
Senior AI Evaluation Scientist

Last updated: 2026/01/08 at 6:45 PM
  • Permanent
  • United States
  • Posted 19 hours ago

Company: Steampunk

Overview:

We are seeking an experienced Senior AI Evaluation Scientist to design and lead rigorous evaluation programs for predictive and generative AI systems across our enterprise and client engagements. This role is critical to ensuring that AI solutions are accurate, reliable, safe, and aligned with mission outcomes. The Senior AI Evaluation Scientist will develop evaluation frameworks, build automated testing pipelines, and act as a subject-matter expert on AI quality, risk, and performance measurement. This role blends deep technical expertise with analytical rigor, experimentation, and cross-functional collaboration.

Contributions:

  • Lead the design and implementation of comprehensive evaluation frameworks for generative and predictive AI models, covering accuracy, robustness, relevance, trustworthiness, fairness, hallucination rates, and safety.
  • Develop and maintain automated evaluation pipelines that continuously audit model outputs, monitor quality drift, and validate alignment with mission-specific constraints.
  • Create custom benchmark datasets, challenge sets, and adversarial evaluation strategies tailored to client domains and regulatory requirements.
  • Conduct in-depth error analysis, model behavior studies, and sensitivity assessments to inform iterative improvements in prompts, retrieval systems, models, and orchestration frameworks.
  • Partner with AI Product Engineers, LLMOps Engineers, and Data Scientists to drive model improvements through structured experimentation, A/B testing, and scientifically grounded evaluation cycles.
  • Advise teams on measurement methodologies, statistical significance, and best practices for Trustworthy AI evaluation in alignment with NIST AI RMF, MLSecOps, and agency governance requirements.
  • Document evaluation results, risks, and findings for technical and non-technical audiences, including engineering teams, leadership, and government clients.
  • Contribute to the development of standardized tools, reusable templates, and evaluation components to improve repeatability and quality across engagements.
  • Stay informed of advances in LLM assessment, safety science, red-teaming methodologies, and evaluation frameworks emerging from academia and industry.
  • Mentor junior evaluation staff and help grow Steampunk’s AI measurement and evaluation capabilities.
  • You will contribute to the growth of our AI & Data Exploitation Practice!

 

Qualifications:

  • Ability to hold a position of public trust with the U.S. government.
  • One of the following combinations of education and experience:
    • Master's Degree (related program) and 7 years of relevant experience; OR
    • Bachelor's Degree (related program) and 10 years of relevant experience; OR
    • No degree and 16 years of relevant experience
  • Possess at least one professional certification relevant to the technical service provided, and maintain a certification relevant to the product being deployed and/or maintained.
  • 8+ years of experience evaluating machine learning, NLP, or generative AI systems, with strong familiarity with LLMs and retrieval-based architectures.
  • Deep understanding of evaluation metrics, statistical testing, dataset construction, experimental design, and model validation methodologies.
  • Hands-on experience with Python and libraries such as PyTorch, Hugging Face, LangChain, scikit-learn, and evaluation tooling (LLM-as-a-judge, rubric-based evaluators, or custom harnesses).
  • Demonstrated experience designing automated evaluation pipelines and integrating them into CI/CD or LLMOps workflows.
  • Strong understanding of AI governance, responsible AI principles, bias detection, fairness metrics, and risk identification.
  • Experience working with structured and unstructured datasets across multiple modalities (text, tabular, documents).
  • Familiarity with vector databases, RAG architectures, and multi-step LLM workflows.
  • Excellent analytical, written, and verbal communication skills, with the ability to translate evaluation insights into clear technical recommendations.
  • Proven ability to collaborate with cross-functional engineering and product teams while independently driving evaluation strategy.
  • Experience working in agile or iterative development environments and documenting scientific processes clearly.

 

About Steampunk:

Steampunk relies on several factors to determine salary, including but not limited to geographic location, contractual requirements, education, knowledge, skills, competencies, and experience. The projected compensation range for this position is $135,000 to $170,000. The estimate displayed represents a typical annual salary range for this position. Annual salary is just one aspect of Steampunk’s total compensation package for employees. Learn more about additional Steampunk benefits here.

 

Identity Statement

As part of the application process, you are expected to be on camera during interviews and assessments. We reserve the right to take your picture to verify your identity and prevent fraud.

 

Steampunk is a Change Agent in the Federal contracting industry, bringing new thinking to clients in the Homeland, Federal Civilian, Health, and DoD sectors. Through our Human-Centered delivery methodology, we are fundamentally changing the expectations our Federal clients have for true shared accountability in solving their toughest mission challenges. As an employee-owned company, we focus on investing in our employees to enable them to do the greatest work of their careers, and on rewarding them for outstanding contributions to our growth. If you want to learn more about our story, visit (url removed).

 

 
