Job Title: Linguist III
Location: Remote
Duration: 8+ months
Job Description:
- 0-3 years of experience.
- Must have a graduate degree in Linguistics
- Must be a native speaker of a non-English language (preferably Hindi) with a high level of proficiency in another Indo-Aryan or South Dravidian language, plus broad knowledge of other languages in either of those two groups.
Main duties:
- Perform linguistic analyses on large datasets.
- Perform linguistic error analysis of AI model outputs, determining what the most frequent and severe error categories are.
- Write and revise guidelines for human annotation and other AI projects, including but not limited to translation tasks.
- Conduct typological and sociolinguistic research on a large number of languages, highlighting their similarities and differences.
- Perform linguistic analyses for Responsible AI (toxic language, hate speech, gender bias and other cultural biases) in massively multilingual settings.
- Conduct linguistic literature reviews on various NLP-adjacent topics, and summarize findings.
- Compare the quality of deliveries between vendors, identify error patterns, and provide actionable feedback.
- Provide information or guidance relative to any aspect of linguistic knowledge (typology, morpho-syntax, sociolinguistics, classification, phonetics/phonology, pragmatics, etc.).
- Reach out to and collaborate with native speakers in various languages.
- Communicate results of linguistic analyses to engineers and research scientists.
Skills:
- Must have strong written and spoken communication skills, especially business and research communication.
- Must be a native speaker of a non-English language (preferably Hindi) with a high level of proficiency in another Indo-Aryan or South Dravidian language, plus broad knowledge of other languages in either of those two groups.
- Working knowledge of other languages is a plus. Proficiency in a low-resource language is valued.
- Must be able to code in Python and query databases using SQL; other coding languages used for data analysis are a plus.
- Must be able to independently work through complex requests and perform under pressure.
- Strong ability to work independently, prioritize, plan, and track work, as well as report progress.
- Education or training in the basics of project management is a plus.
- Self-motivation is a must.
- Working knowledge of international language-classification standards is valued.
Education:
- Graduate degree in Linguistics or related field is a must; PhD is a plus
- A background or specialization in corpus linguistics is a plus
- Experience with field work is a plus
- A graduate degree in Literature or English is not an appropriate substitution
- Degree in Computer Science with a specialization in NLP is not an appropriate substitution
- Must have a very firm grasp of the following linguistic fields: language typology, syntax, morphology, sociolinguistics (especially dialectology and discourse analysis), corpus linguistics, writing systems, pragmatics, phonology.
- Must have some experience with applying basic Natural Language Processing techniques.
Experience:
- Years of experience: 0-3
- Experience working cross-functionally
- Experience collaborating with machine learning, NLP, or software engineers, or data scientists
- Experience contributing to research papers
- Important: Preferably no known conflicts of interest in the fields of machine translation, ASR, TTS, or LLM research (as FAIR Linguists need to contribute to research papers)
Top 3 must-have HARD skills:
- Must be a native speaker of a non-English language (preferably Hindi) with a high level of proficiency in another Indo-Aryan or South Dravidian language, plus broad knowledge of other languages in either of those two groups.
- Perform linguistic error analysis of machine translations and identify the most frequent and severe error categories
- Strong skills in pattern recognition, cross-functional communication, and multitasking
- Experience with Python
Good to have skills:
- Experience with creating and/or maintaining specialized lexical resources (e.g., profanity dictionaries) is a plus
Soft Skills:
- Ability to independently work through ambiguous requests, based on priorities established by CWAM, and perform under pressure. Able to work cross-functionally.