AI Data Infrastructure Engineer
Abaka AI
Onsite (Palo Alto, CA)
$110k - $160k/yr
Posted 2 weeks ago
Benefits
- Health Insurance
- Dental Insurance
- Vision Insurance
- PTO
Perks
- Equity
- Flexible Work Schedule
Skills
Python
APIs
Scraping
Data Pipelines
Automation
LLM
Agents
Data Engineering
About Abaka AI
Abaka AI is built on one mission: to be the world’s most trusted data partner for AI companies. More than 1,000 industry leaders across Generative AI, Embodied AI, and Automotive AI rely on us to power their data pipelines. With our headquarters in Silicon Valley—and teams in Paris, Singapore, and Tokyo—we support global partners with fast, reliable, and scalable data solutions.
Our offerings include a diverse catalog of off-the-shelf datasets (image, video, multimodal, reasoning, 3D, and beyond) as well as comprehensive data collection and annotation services. Whether teams need raw data, curated datasets, or full-cycle data engineering, Abaka AI provides the foundation for building high-performance AI systems.
About the Role
We’re hiring an AI Data Infrastructure Engineer to build systems that power how large-scale datasets for LLM and multimodal models are discovered, evaluated, and scaled. This is a builder-first engineering role focused on designing LLM-powered agents, automation systems, and data pipelines. You’ll work on problems like:
- Automatically discovering new data sources across the internet
- Using LLMs and agents to evaluate and filter data sources at scale
- Building systems that significantly increase data throughput without increasing headcount
This role sits at the intersection of data engineering, LLM systems, and applied AI infrastructure, and is ideal for someone who enjoys building from scratch and shipping fast.
Responsibilities
- Build LLM-powered agents and automation systems for data discovery and evaluation
- Design and implement data pipelines for ingesting, filtering, and transforming large-scale datasets
- Develop internal tools for data quality scoring, ranking, and selection
- Experiment with scraping, APIs, and programmatic data collection at scale
- Rapidly prototype and iterate on systems that improve data acquisition speed and quality
- Collaborate closely with Data Engineering and Research teams to align data systems with model needs
- Build scalable systems that increase data throughput and efficiency
Qualifications
- Strong technical foundation (engineering, scripting, systems, or data-focused background)
- Experience building tools, automation, or pipelines from 0→1
- Comfortable with Python, APIs, scraping, or backend workflows
- Interest in LLMs, agents, or applied AI systems
- Strong problem-solving ability and a builder mindset
- Ability to operate independently in fast-paced, ambiguous environments
Nice to have:
- Experience with LLM frameworks or agent systems
- Experience with large-scale data processing or distributed systems
- Familiarity with automation tools, workflow builders, or AI-assisted development (e.g., Cursor)
- Startup or high-growth environment experience
Compensation & Benefits
The base salary range for this position is $110,000 - $160,000 USD annually.
Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies, and experience. Base pay is one part of the total package provided to compensate and recognize employees for their work at Abaka AI. This role is eligible for equity, as well as a comprehensive benefits package including health, dental, vision, PTO, and a flexible work schedule.