Location: New York, United States
Alldus is currently seeking a Data Engineer with Kafka experience to join the Technology team at one of our major clients. The successful candidate will support customer-facing applications that are intended to optimize the wellbeing of individuals in indoor spaces. The engineer will work closely with other team members to create ETL pipelines and machine learning infrastructure, and to analyze the results to generate insights. The Data Engineer is expected to be involved in the full life cycle of the product. We are looking for a well-qualified candidate with a background in Computer Science or a related field and experience in building data infrastructure for human-centered applications.
- Collaborate with multiple teams to lay out data infrastructure strategy and execution plans for products
- Create endpoints for data ingestion, and create and maintain databases and data warehouses for larger-scale analytics
- Create workflows that help with automating analytics tasks, and initiate and tune machine learning models
- Serve as a subject matter expert for data infrastructure to help the company innovate in establishing new products and maintaining existing ones
- Advise the VP - Data and the CTO on matters associated with data infrastructure development and innovation
- Work closely with an interdisciplinary team of engineers and scientists to gather data requirements and build research pipelines for creating new prototypes
- Experience with Kafka is a must
- B.S. or higher in Computer Science or a related field
- Fluency in one of the following: Python, Java, Scala
- Fluency in SQL
- Experience with cloud platforms, preferably AWS
- Experience with schema design
- Experience with SQL databases, as well as NoSQL databases like MongoDB
- Experience with big data tools like Spark, Hadoop, HBase, etc.
- Experience with data warehousing solutions like Apache Hive, Redshift, Snowflake, etc.
- Experience with workflow orchestrators like Apache Airflow, Luigi, Metaflow, etc.
- Experience with visualization software like Apache Superset, Tableau, Chartio, Google Data Studio, etc.
- Experience with both classical and deep-learning-based machine learning techniques, supervised as well as unsupervised
- Experience with building data processing pipelines for machine learning with multimodal data
- 2+ years of experience in data engineering or a related field
- Experience with human-centered products and data analytics
- Experience building end-to-end pipelines starting from ingestion to insight generation is a plus
- Experience with deep learning cloud infrastructure such as SageMaker, TensorFlow Enterprise, etc.
- Experience with analytics platforms like Amplitude, Mixpanel, Google Analytics, etc.
- Experience using CI/CD systems like Travis CI, GitHub Actions, etc.