Multimodal ML Researcher

Seattle, Washington

  Machine Learning

Permanent

Our client, an exciting venture-backed AI-driven startup, is hiring a Multimodal ML Researcher to join their team in Seattle. The successful candidate will lead research on multi-modal input and output from LLMs, including voice and image encoders and decoders, with a focus on text-to-speech, speech-to-text and speech-to-animation capabilities.

Responsibilities:

  • Conduct research on multi-modal input and output from LLMs, encompassing voice and image encoders and decoders, with a focus on text-to-speech, speech-to-text, and speech-to-animation.

  • Enhance voice and vision models through training on both public and proprietary data sources.

  • Review and optimize the company’s data flywheel to ensure streamlined operations.

  • Develop methodologies to enhance model efficiency, accuracy, and overall quality.

  • Create tools for assessing and monitoring model performance and quality.

Skillset:

  • A highly skilled AI researcher with a proven track record of advancing AI products and systems.

  • Proficiency with Large Language Models or other generative AI models.

  • Experience in developing speech or vision models.

  • Strong proficiency in Python, preferably with PyTorch experience.

  • Demonstrated ability to take initiative and achieve results.

Benefits:

  • Market-leading salary

  • Equity

  • Healthcare coverage

  • 401k matching

Interested? Apply now via the link below.
