Certiport is hiring for

LLM Engineer
$60 to $100/hour
Freelance
$480 to $800/day
$127k to $211k/year
  • Technology (3+ years)

    Job department

  • Remote

    Office Policy

  • Remote/anywhere

    Location

  • 6 months

    Project length

Software Development

Specializations

Machine Learning & Artificial Intelligence

Required skills

LangChain
LlamaIndex

Nice-to-have skills


About Certiport

Certiport, a Pearson VUE business, was established in 1997 and is now the leading provider of certification exam development, delivery, and program management services. Certiport exams are delivered through an expansive network of over 14,000 Certiport Authorized Testing Centers worldwide.

Certiport delivers more than three million exams each year through the secondary, post-secondary, workforce, and corporate technology markets in 148 countries and 26 languages.

About the role

Participate as a Subject Matter Expert in the exam development process for an Agentic AI Practitioner exam. The job consists of attending virtual meetings and writing test questions. The project consists of three phases:

Job Task Analysis - Mid-to-late June: 10-12 hours of virtual meetings held over a one- or two-week period

Item Writing and Technical Review - July-September: Independent item writing, collaborative item writing in virtual workshops, and item review meetings. 3-20 hours per week, depending on your availability

Standard Setting - Individual item review and ratings: 4-6 hours; Meetings with the psychometrician: 6-10 hours

Key responsibilities

  • Share your experience in virtual meetings
  • Write items according to our style guide, either independently or collaboratively
  • Review items for technical accuracy and congruence with the objectives
  • Rate items according to the instructions provided by the psychometrician

Ideal experience

  • Built and trained a GPT model
  • Pretrained an LLM
  • Fine-tuned an LLM for a specific task
  • Implemented reinforcement learning using RLHF and DPO
  • Adapted pre-trained models for practical use cases
  • Built applications that leverage LLMs, including RAG solutions
  • Defined and applied evaluation metrics for both automated and human-in-the-loop evaluation