Role Overview
Before a good model, you need good data and a good evaluation framework.
WoRV's Data & Evaluation Manager is not someone who accumulates data — they are someone who designs and owns the entire data pipeline. This role defines what data the product needs to improve, what criteria determine that something is "ready to ship," and then operates the full collection-refinement-evaluation loop against those criteria.
WoRV continuously searches for the best methods to acquire high-quality data quickly. That could mean running a GPS-based semi-autonomous fleet in the field to collect real-world data, or leveraging operational data from VLA-based autonomous driving. The collection method is not fixed. The essence of this role is defining and accelerating the first cycle of the data flywheel — bridging the gap between the data the Research team (VLA/VLM) needs and the data that can realistically be collected from customer sites.
WoRV is currently collecting data across multiple industry projects simultaneously — agriculture, construction, and ports among others. For data from individual projects to accumulate into shared model capability for the entire team, standardized formats, metadata schemas, quality standards, and collection efficiency management are required. The Data & Evaluation Manager owns all of this.
Responsibilities
1. End-to-End Data Pipeline Ownership
- Define field data collection strategy and maximize collection efficiency (ratio of useful data to setup time).
- Design and operate the end-to-end pipeline for sensor, teleoperation, and field operation data.
- Define data schemas, versioning, metadata frameworks, and standard formats (LeRobot, etc.).
- Manage storage infrastructure (DGX SSD, NAS HDD, etc.) and the training data lifecycle.
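The schema, versioning, and metadata work above can be pictured with a minimal sketch of a per-episode metadata record, assuming a LeRobot-style episodic dataset layout; all field names, values, and the quality-flag convention here are illustrative assumptions, not WoRV's actual schema.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class EpisodeMeta:
    """Hypothetical per-episode metadata record (illustrative fields only)."""
    episode_id: str
    project: str                    # e.g. "agriculture", "construction", "ports"
    sensors: list                   # e.g. ["front_cam", "lidar", "imu"]
    schema_version: str             # bumped on any breaking format change
    quality_flags: list = field(default_factory=list)

    def is_trainable(self) -> bool:
        # An episode enters the training set only if no flag blocks it.
        return "corrupted" not in self.quality_flags

meta = EpisodeMeta("ep_0001", "agriculture", ["front_cam", "imu"], "1.2.0")
print(meta.is_trainable())              # True
print(asdict(meta)["schema_version"])   # 1.2.0
```

Keeping an explicit `schema_version` on every record is what lets data from different projects and collection campaigns be merged later without silent format drift.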
2. Evaluation Framework Design and Operations
- Build an evaluation framework that distinguishes customer-specific KPIs from shared product KPIs.
- Connect model-, module-, and system-level offline and online evaluation.
- Build an experiment framework that quantitatively demonstrates the "data → model performance" relationship.
- Surface failure cases and design retraining / re-evaluation loops.
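Two of the ideas above, separating customer-specific KPIs from shared product KPIs and surfacing failure cases for retraining, can be sketched as follows; the metric names, the shared-KPI set, and the score threshold are all hypothetical placeholders.

```python
# Illustrative set of KPIs shared across all customer projects (assumption).
SHARED_KPIS = {"intervention_rate", "success_rate"}

def split_kpis(results: dict) -> tuple:
    """Partition evaluation results into shared vs. customer-specific KPIs."""
    shared = {k: v for k, v in results.items() if k in SHARED_KPIS}
    customer = {k: v for k, v in results.items() if k not in SHARED_KPIS}
    return shared, customer

def failing_cases(episodes: list, threshold: float = 0.5) -> list:
    """Surface episode IDs scoring below a threshold for the retraining queue."""
    return [e["id"] for e in episodes if e["score"] < threshold]

shared, customer = split_kpis({"success_rate": 0.91, "row_alignment_error": 0.03})
print(shared)    # {'success_rate': 0.91}
print(customer)  # {'row_alignment_error': 0.03}
print(failing_cases([{"id": "ep_7", "score": 0.2}, {"id": "ep_8", "score": 0.9}]))  # ['ep_7']
```

The point of the split is that shared KPIs track whether the product as a whole is improving, while customer-specific KPIs track readiness for a single deployment; conflating the two is one way metrics "pull the team in the wrong direction."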
3. Cross-Project Data Strategy
- Coordinate data collection priorities across multiple projects (Navigation, Manipulation).
- Collaborate with the Research team to define collection scenarios and topics.
- Build structures that allow individual project data to accumulate into shared model capability.
- Work with PM to design customer contract structures around data ownership and usage rights.
Qualifications
- Experience designing and operating data pipelines or data platforms
- Experience formulating data collection strategies and connecting them through to field operations
- Ability to judge "what data we need" from a problem-first perspective
- Ability to design metrics that measure what matters and don't pull the team in the wrong direction
- Ability to communicate with and coordinate priorities across multiple stakeholders: Research, PM, and field operations
- Data processing skills in Python, SQL, etc.
Preferred Qualifications
- Experience building and operating CV / Robotics / Autonomous Driving datasets
- Experience with labeling ops, taxonomy design, and annotation quality control
- Experience with evaluation harnesses, benchmarks, and experiment tracking
- Experience processing multi-modal sensor data (camera, LiDAR, IMU, etc.)
- Experience with active learning, hard case mining, or data flywheels
- Experience with the HuggingFace ecosystem (LeRobot, Datasets, etc.)
- Experience with MLOps or data infrastructure
- Experience leading or managing a data team
Who We're Looking For
- Someone who treats data design as more important than model training
- Someone who relentlessly digs into "which missing data caused this failure?"
- Someone who can see the full flow from collection site to training server and find the bottleneck
- Someone who views data as an asset and designs for reusability
Hiring Process
- Applicants will be contacted individually within 3 days regarding the outcome of the document screening