Research Engineer Intern, Data Engineering Environments

TensorStax
Canada, CA

About TensorStax

TensorStax is building the next generation of autonomous agents for data engineering. Backed by a $5M seed round, we’re developing an LLM Compiler, agent framework, and reinforcement learning infrastructure purpose-built for structured data workflows. Our vision is to unify the modern data stack behind intelligent agents that can plan, debug, and optimize complex pipelines across dbt, Airflow, Spark, and beyond.

The Role

We’re looking for a Research Engineer Intern to help our team build simulation environments that mirror real-world data engineering workflows. These environments are core to our RL training stack: they let us train and evaluate agents safely and at scale across realistic DAGs, job failures, and data state transitions. In this role, you’ll work with our research and systems teams to:

  • Build and maintain simulated environments based on real data stack components (e.g., Airflow DAGs, dbt projects, Spark jobs).
  • Script, parameterize, and modularize workloads across various data tools.
  • Set up realistic failure modes, delays, and edge cases for agents to learn from.
  • Help us wrap environments with consistent interfaces used in RL training.
  • Collaborate with ML researchers to ensure environments are reproducible and scalable.

About You

Candidates must have hands-on experience with data engineering tools such as Spark (PySpark or Scala), Airflow, and dbt. Proficiency in Python, including clean code and testing practices, is crucial. A strong systems mindset is essential: you should be comfortable working across infrastructure, configuration, and orchestration layers. Familiarity with containerization tools (Docker, etc.) and cloud environments is a plus. An interest in reinforcement learning or agent systems is beneficial but not required. Bonus: experience with dataset generation, simulation environments, or pipeline testing.

Why Join

Join us to work directly on infrastructure that trains intelligent agents in realistic, high-leverage environments. Learn how cutting-edge LLM and RL research is translated into production systems while collaborating with a world-class research team focused on real technical depth. Enjoy a flexible work setup, a fast pace, strong mentorship, and the opportunity to own meaningful projects.