CÔNG TY CỔ PHẦN NGHIÊN CỨU PHÁT TRIỂN VÀ ỨNG DỤNG NGƯỜI MÁY ĐA NĂNG VINMOTION

Data Engineer (Python, SQL)

ITviec
Vietnam, posted 3 days ago

Job Description

Top 3 Reasons To Join Us

  • Competitive Compensation
  • World-Class Team in Humanoid Robotics
  • Cutting-Edge Humanoid Robot Products

The Job

About the Role

We are looking for a Data Engineer to build and operate the backbone of our robotics data infrastructure. In this role, you will design and maintain scalable data pipelines that collect, process, and store large volumes of multimodal data generated from robots at the edge.

You will work closely with cross-functional teams, including Vision, Conversation AI, and Robotics Engineering, to ensure that high-quality data flows into centralized systems for training, analysis, and intelligent querying.

Key Responsibilities

  • Build and Maintain Data Pipelines

Design and implement end-to-end data pipelines that ingest processed data from edge devices (robots) and deliver it to centralized storage and processing systems.

Ensure reliable, scalable, and efficient data flow across different layers of the system architecture.

  • Manage Knowledge Databases

Deploy and optimize vector databases and graph databases to manage metadata and vectorized multimodal data (audio, text, video).

Enable efficient and intelligent data retrieval for downstream AI systems.

  • Ensure Data Quality

Collaborate with internal teams (e.g., Conversation AI, Vision) to implement high-quality data filtering and distillation pipelines.

Support the development of robust processes for large-scale data processing and refinement.

  • Security and Monitoring

Implement access control, monitoring, and alerting systems to ensure secure and stable data operations across multiple sites.

Monitor pipeline health and system performance to maintain reliability.

Your Skills and Experience

Technical Requirements

  • Programming

Strong proficiency in Python and SQL for building and maintaining automated data pipelines.

  • Cloud Infrastructure

Hands-on experience with AWS, particularly EC2 and S3, including compute and storage resource management.

  • Databases

Experience with vector databases such as Qdrant, Pinecone, or similar technologies.

Familiarity with handling multimodal data (e.g., video, LiDAR, robot state data).

Experience with MCAP or similar robotics data formats is a plus.

  • Systems & Infrastructure

Solid understanding of distributed systems, large-scale data processing, and data synchronization mechanisms.

Expected Outcomes

Build a complete data pipeline infrastructure connecting cloud databases, processing servers, and local storage.

Enable large-scale data movement with fast retrieval, as well as efficient data querying and visualization.

Successfully implement the data distillation pipeline to produce high-quality datasets for downstream AI systems.

Why You'll Love Working Here
  • Competitive Compensation
  • World-Class Team in Humanoid Robotics
  • Cutting-Edge Humanoid Robot Products