ML / AI Research Engineer
Our client, a venture-backed AI startup, is hiring an ML/AI Research Engineer to join their team in San Francisco. The successful candidate will lead the design, training, evaluation and optimization of agent-native AI systems, working with LLMs, vector search, graph reasoning and reinforcement learning to build the intelligence layer on top of the company's enterprise data fabric.
Responsibilities
- Fine-tune and evaluate open-source LLMs (e.g. Llama 3, Mistral, Falcon, Mixtral) for enterprise-grade applications.
- Build and optimize RAG pipelines using tools such as LangChain, LangGraph, LlamaIndex or Dust.
- Develop and iterate on agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using real-world enterprise workflows.
- Design embedding-based memory systems with efficient, high-performance retrieval strategies.
- Implement reinforcement learning pipelines (RLHF, DPO, PPO) to improve agent behavior and decision-making.
- Create scalable evaluation frameworks, including synthetic evaluations, trace capture and explainability tooling.
- Own model observability, drift detection and alignment strategies across production systems.
- Optimize inference latency and GPU utilization across cloud and on-premise infrastructure.
Skillset
- Strong experience fine-tuning open-source LLMs using frameworks and techniques such as Hugging Face, DeepSpeed, vLLM, FSDP and LoRA/QLoRA.
- Hands-on experience with modern alignment techniques, including SFT, RLHF and DPO pipelines.
- Proven ability to build high-quality training datasets and robust evaluation frameworks for LLM systems.
- Deep understanding of scaling and optimization trade-offs, including batching, context windows, precision and quantization.
- Experience building and deploying production-grade RAG systems.
- Familiarity with orchestration and retrieval tools such as LangChain, LangGraph and LlamaIndex, and with vector databases (Weaviate, Qdrant, FAISS).
- Experience working across structured (SQL, graph) and unstructured data sources.
- Experience designing agent-based systems with memory, tool use and multi-step reasoning.
- Strong understanding of agent workflows (e.g. Plan-Act-Reflect), including self-correction and multi-agent systems.
- Expertise in inference and retrieval optimization, including chunking strategies, reranking and low-latency deployment (e.g. vLLM, TGI).
Benefits
- Salary: $180k – $240k
- Equity