Lead Quality Engineer - AI/ML

Grainger
life insurance, parental leave, paid time off, tuition reimbursement, 401(k)
United States, Illinois, Lake Forest
Nov 19, 2024
As a leading industrial distributor with operations primarily in North America, Japan and the United Kingdom, We Keep The World Working by serving more than 4.5 million customers worldwide with products delivered through innovative technology and deep customer relationships. With 2023 sales of $16.5 billion, we're dedicated to providing value for customers, fostering an engaging culture for team members and driving strong financial results. Our welcoming workplace enables you to learn, grow and make a difference by keeping businesses running and their people safe. As a 2024 Glassdoor Best Place to Work and a Great Place to Work-Certified company, we're looking for passionate people to join our team as we continue leading the industry over our next 100 years.

Position Details:

A new and rapidly growing team at Grainger is focusing on transforming a variety of transactional and operational data to support the development of new tools and services aimed at enhancing our ability to respond to customer inquiries. We are looking for an LLM Testing Engineer specializing in automated testing frameworks for large language models (LLMs). You will develop and execute automated scripts to evaluate the accuracy, reliability, and efficiency of our LLM outputs. You will collaborate closely with our AI/ML, QA, and engineering teams to build robust testing frameworks, improve model reliability, and ensure seamless, consistent user interactions. You will report to the Product Engineering Manager.

You Will:

- Define test automation strategy and lead implementation of automated solutions to test LLM responses across multiple scenarios, including edge cases and diverse prompt inputs.
- Analyze and assess the language model's outputs for quality, accuracy, coherence, and adherence to requirements.
- Build and maintain testing frameworks and tools to automate and streamline the testing process, ensuring scalable, efficient testing for large language models.
- Use statistical tools to analyze test results, identify patterns in model errors, and generate reports for the ML and engineering teams.
- Perform manual testing on LLM outputs to address urgent needs, high-stakes scenarios, or particularly challenging prompt responses to ensure quality.
- Conduct regular and on-demand regression, unit, and integration testing to ensure the model's performance remains stable after updates, particularly for high-impact changes.
- Collaborate with engineering teams to ensure the LLM meets high standards for deployment, establishing benchmarks for response quality, performance and consistency.
- Maintain and update test cases, procedures, and detailed documentation for automated and manual test scripts. Provide clear documentation for reproducibility and knowledge-sharing.
- Design and implement test strategies and scripts targeting the API and front-end UI, with potential expansion to other components as needed.
- Identify, document, and report issues in model outputs, ensuring quick turnaround on urgent fixes in collaboration with engineering and QA.
- Test in a cloud (AWS), CI/CD DevOps environment.
- Collaborate closely with AI/ML engineers to provide actionable feedback on model behavior, issues and response generation.
- Work closely with AI/ML engineers, data engineers, and product managers to refine prompt strategies and model performance.
- Focus on increasing depth of understanding of the domain and core systems maintained by the team.
- Lead production issue triage efforts.

You Have:

- Bachelor's degree in Computer Science, Data Analysis/Science, Engineering, or a related field, or equivalent work experience.
- 6+ years in QA engineering, test automation, or LLM/AI model testing.
- Proficiency in scripting languages (e.g., Python, Java) and familiarity with test automation frameworks (e.g., Selenium, Robot Framework, Pytest).
- Familiarity with NLP models, large language models (LLMs), and prompt engineering.
- Strong understanding of language model behavior and testing methodologies.
- Familiarity with data processing and analysis tools such as Pandas, SQL, or similar.
- Familiarity with model training processes, including dataset preparation and prompt testing.
- Strong analytical and problem-solving skills, with the ability to evaluate complex outputs for accuracy and relevance.
- Experience with automation tools, CI/CD pipelines, and test scripting.
- Experience with LLMs or NLP applications, preferably in a production environment.
- Excellent communication and leadership skills, and strong attention to detail.

Rewards and Benefits:

With benefits starting day one, Grainger is committed to your safety, health and wellbeing. Our programs provide choice to meet our team members' individual needs. Check out some of the rewards available to you at Grainger.

- Paid time off (PTO) days and 6 company holidays per year
- Benefits starting on day one, including medical, dental, vision and life insurance
- 6% 401(k) company contribution each pay period with no personal contribution required
- Employee discounts, parental leave, tuition reimbursement, student loan refinancing, free access to financial counseling, education and more

We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal opportunity workplace.

We are committed to fostering an inclusive, accessible environment, which includes providing reasonable accommodations to individuals with disabilities during the application and hiring process as well as throughout the course of their employment. With this in mind, should you need a reasonable accommodation during the application and selection process, please advise us so that we can provide appropriate assistance.