GPU Kernel Engineer
Our client, a venture-backed AI company, is hiring a GPU Kernel Engineer to join their R&D team in New York. The successful candidate will lead GPU performance work, from low-level kernel tuning to system-wide optimization, and contribute to building the most efficient engine for deploying generative AI models.
Responsibilities
- Identify and resolve performance bottlenecks in machine learning training and inference.
- Design and optimize high-performance computing kernels using Triton, CUDA or ROCm.
- Develop and implement solutions in C/C++ and Python.
- Conduct in-depth GPU performance optimizations to improve efficiency and speed.
- Work closely with the team to enhance or extend existing machine learning compilers or frameworks.
Skillset
- Master’s degree or PhD in Computer Science, Electrical Engineering or a similar field.
- Proficiency in C/C++ and Python.
- Extensive knowledge of and practical experience in GPU performance optimization.
- Proven expertise in kernel optimization using CUDA, ROCm or other acceleration technologies.
- Experience with training and deploying ML models.
- Familiarity with distributed systems or managing distributed ML workloads.
- Exposure to open-source projects such as FlashAttention, mlc-llm or vLLM is a plus.
- Working knowledge of ML frameworks or compilers such as TVM, MLIR, PyTorch, TensorFlow, ONNX Runtime or TensorRT is a bonus.
Benefits
- Salary: Up to $200k depending on experience.
- Equity.
- Health insurance for you and your family.
Interested? Apply Today!
49938