Infrastructure Engineer

San Francisco, California

  Software Engineering

Permanent

Our client, an exciting AI-driven startup, is hiring an Infrastructure Engineer to join its team in San Francisco. The successful candidate will design, build and maintain the infrastructure and tooling that drive the next generation of AI products, and will help shape how the company builds, operates and monitors its systems in production.

Responsibilities

  • Design, implement and maintain AWS infrastructure across compute, networking, storage, IAM, monitoring, logging and security.

  • Manage infrastructure-as-code tooling to codify, version and deploy systems reliably.

  • Work closely with engineers and stakeholders to map dependencies, build deployment pipelines and ensure seamless rollouts.

  • Balance priorities across feature delivery, reliability, technical debt and infrastructure evolution by making informed decisions, sequencing work effectively and communicating trade-offs clearly.

  • Ensure production system reliability by defining SLAs, setting up monitoring and alerting, managing incident response and contributing to post-mortems for continuous improvement.

  • Identify and execute improvements proactively, enhancing infrastructure performance, scalability and operational efficiency.

Skillset

  • Extensive AWS expertise including EC2, ECS, Lambda, VPC, S3, RDS, IAM, CloudWatch and related services.

  • Proven experience with infrastructure-as-code tools such as Terraform or Pulumi in production environments.

  • Hands-on experience building CI/CD pipelines, for example using GitHub Actions.

  • Strong understanding of reliability engineering, including monitoring, alerting, incident response, capacity planning, chaos testing and load management.

  • Excellent communication skills, with the ability to explain infrastructure, trade-offs, reliability metrics and deployment processes clearly to both technical and non-technical stakeholders.

  • Previous startup experience is a plus.

  • Familiarity with Elixir and TypeScript is desirable.

  • Knowledge of security compliance frameworks such as SOC 2 and ISO 27001.

  • Experience with Kubernetes/EKS is a bonus.

Benefits

  • Salary: $185k – $250k DOE.

  • Health insurance for you and your family.

  • 401(k) plan.
