Job Description
OUR PRODUCT IS OPEN-SOURCE AND USED AT ENTERPRISE SCALE
Our distributed data engine Daft is open-sourced and runs on 800k CPU cores daily. This is more compute than Frontier, the world’s largest supercomputer!
Today’s data tooling (Spark, Presto, Snowflake) was built for a world of tabular data analytics, but does not generalize to the needs of modern ML/AI, such as multimodal data, heterogeneous compute and user-defined Python algorithms.
Eventual and Daft bridge that gap, making ML/AI workloads easy to run alongside traditional tabular workloads.
About The Role
At Eventual, we’re pushing the boundaries of artificial intelligence and large-scale distributed data systems. As a Research Engineer focused on AI Pretraining, you will operate at the intersection of cutting-edge AI research and scalable system development. Your work will involve implementing advanced dataset and model training techniques—including multimodal learning, synthetic data generation, and reinforcement learning from human feedback (RLHF)—to drive innovation and performance on the Daft data engine.
In this role, you will collaborate closely with the Daft data engine team to build and optimize a high-performance data engine, ensuring it meets the scale and demands of modern AI workloads.
Key Responsibilities:
- Build model pretraining pipelines: Build training and data pipelines in a principled and observable manner using state-of-the-art data techniques
- Develop a set of benchmarks: Design and define the benchmarks for AI data workloads
- Data Engineering for AI: Collaborate with our data engine team to design and optimize the Daft data engine for these targeted AI data workloads on massive 100TB+ datasets
- AI Research: Stay at the forefront of AI research, incorporating the latest advancements into our data engine and platform capabilities
What we look for:
- Strong programming skills in Python, with experience in deep learning frameworks such as PyTorch or TensorFlow.
- Deep understanding of transformer architectures, self-supervised learning, and AI model training techniques.
- Experience with distributed training frameworks (e.g., DeepSpeed, FSDP, Horovod) and efficient model parallelism.
- Expertise in data pipelines and large-scale dataset management.
- Familiarity with ML compilers, kernel optimizations, and GPU acceleration is a plus.
- Familiarity with systems programming (Rust, C++) is a plus.
- PhD or equivalent research experience in Machine Learning, Computer Science, or related fields is preferred.
Why Join Eventual?
- Work alongside world-class experts in distributed computing and AI research.
- Build the next generation of scalable AI infrastructure and training techniques.
- Competitive salary, equity, and top-tier benefits.
- A collaborative, engineering-driven environment where innovation thrives.
Benefits and Remote Work
We believe in both the flexibility of remote work and the importance of in-person work, especially at the earliest stages of a startup. We take a flexible hybrid approach, with at least 3 days of in-person work per week, typically Monday through Wednesday, at our office in San Francisco.
We believe in providing employees with best-in-class compensation and benefits, including meal allowances and comprehensive health coverage (medical, dental, vision and more).
About The Interview
INTRODUCTORY CALL [15M]
A short video call with one of our co-founders to get acquainted, understand your aspirations and evaluate whether the type of role you are looking for is a good fit.
TECHNICAL PHONE SCREEN [1 HR]
A technical question over video call to assess your technical abilities.
TECHNICAL INTERVIEW PANEL [4 HR]
Technical interviews with the rest of the Eventual team with questions to further understand your technical strengths, weaknesses and experiences.
MEET THE TEAM
As many chats as necessary to get to know us – come have a coffee with our co-founders and existing team members to understand who we are and our goals, motivations and ambitions.
We look forward to meeting you!
WE’RE GROWING – COME GROW WITH US!
We are well funded by investors such as Y Combinator, Caffeinated Capital, Array.vc and top angels in the valley from Databricks, Meta and Lyft.
Our team has deep expertise in high-performance computing, big data technologies, cloud infrastructure and machine learning. Our team members have previously worked at top technology companies such as Amazon, Databricks, Tesla and Lyft.
We are looking for exceptional individuals with a passion for technology and a strong sense of intellectual curiosity.
If that sounds like you, please reach out even if you don’t see a specific role listed that matches your skill set – we’d love to chat!