Data Platform Software Lead

Cognichip Inc., Toronto, CA

About the Role

We are seeking a skilled and pragmatic Data Platform Engineer to architect and scale intelligent data systems that support our AI and ML pipelines, with a specific focus on code-based text datasets. You will play a central role in building the infrastructure that powers data ingestion, transformation, and delivery for our models. This includes developing systems for web-scale data discovery and crawling, designing robust data pipelines, and enabling our scientists to experiment and iterate with confidence.

Core Responsibilities

  • Design and implement scalable data infrastructure to ingest, transform, and manage large-scale code datasets, ensuring high reliability and modularity.
  • Build systems and tools for automated web crawling, parsing, deduplication, and metadata extraction from open-source and public code repositories.
  • Develop robust data pipelines for ingesting and processing structured text datasets using distributed compute frameworks.
  • Monitor data quality, pipeline throughput, and system performance.
  • Collaborate across research, infrastructure, and compliance teams to meet technical, operational, and regulatory requirements.

Required Skills

  • 5 years of software engineering experience in data-intensive environments.
  • Proven experience building and maintaining scalable data systems and infrastructure.
  • Experience with web crawling, scraping frameworks, and large-scale data ingest.
  • Comfortable with AWS or other cloud environments, including storage, containerized compute, and security.
  • Working experience with a data-centric tech stack including Python, Go, or Scala; Spark or Ray; Airflow or Prefect; Kafka; Redis; PostgreSQL or ClickHouse; and GitHub APIs.

Preferred Qualifications

  • Experience curating and preparing code-based datasets for language models or code intelligence applications.
  • Familiarity with code parsing, tokenization, embedding, and static analysis.
  • Prior experience in a startup or fast-paced, high-ownership engineering environment.
  • Strong written and verbal communication skills.

What We Offer

  • Opportunity to shape the technical direction of a disruptive AI startup.
  • Work with cutting-edge technologies in AI/ML and cloud computing.
  • Competitive compensation package including equity.
  • High-caliber, talented collaborators from diverse disciplines.
  • Collaborative and innovative startup culture.