ML / AI Research Engineer

San Francisco, California

  Machine Learning

Permanent

Our client, a venture-backed AI startup, is hiring a talented ML / AI Research Engineer to join their team in San Francisco. The successful candidate will lead the design, training, evaluation and optimization of agent-native AI systems, working at the cutting edge of LLMs, vector search, graph reasoning and reinforcement learning to build the intelligence layer on top of the company's enterprise data fabric.

Responsibilities

  • Fine-tune and evaluate open-source LLMs (e.g. Llama 3, Mistral, Falcon, Mixtral) for enterprise-grade applications.

  • Build and optimize RAG pipelines using tools such as LangChain, LangGraph, LlamaIndex or Dust.

  • Develop and iterate on agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using real-world enterprise workflows.

  • Design embedding-based memory systems with efficient, high-performance retrieval strategies.

  • Implement reinforcement learning and preference-optimization pipelines (RLHF with PPO, DPO) to improve agent behavior and decision-making.

  • Create scalable evaluation frameworks, including synthetic evaluations, trace capture and explainability tooling.

  • Own model observability, drift detection and alignment strategies across production systems.

  • Optimize inference latency and GPU utilization across cloud and on-premise infrastructure.

Skillset

  • Strong experience fine-tuning open-source LLMs using tools and techniques such as Hugging Face, DeepSpeed, FSDP, vLLM and LoRA/QLoRA.

  • Hands-on experience with modern alignment techniques, including SFT, RLHF and DPO pipelines.

  • Proven ability to build high-quality training datasets and robust evaluation frameworks for LLM systems.

  • Deep understanding of scaling and optimization trade-offs, including batching, context windows, precision and quantization.

  • Experience building and deploying production-grade RAG systems.

  • Familiarity with orchestration and retrieval tools such as LangChain, LangGraph and LlamaIndex, and with vector databases and indexes (Weaviate, Qdrant, FAISS).

  • Experience working across structured (SQL, graph) and unstructured data sources.

  • Experience designing agent-based systems with memory, tool use, and multi-step reasoning.

  • Strong understanding of agent workflows (e.g. Plan-Act-Reflect), including self-correction and multi-agent systems.

  • Expertise in inference and retrieval optimization, including chunking strategies, reranking, and low-latency deployment (e.g. vLLM, TGI).

Benefits

  • Salary: $180k – $240k

  • Equity
