
Senior AI Engineer (Infrastructure)
Bitcoin Builders
Job Description
About Us
We run an AI platform with 50K+ daily active users and millions of generations per day, powered entirely by open-weight models running on our own GPU fleet.
New models drop weekly. New hardware ships quarterly. Our job is to be fast at adopting both.
The team
You'll be our second AI/ML Infrastructure Engineer.
The first built the system from scratch: dynamic LoRA serving with 100+ adapters hot-swapped per request, plus inference optimization (DeepCache, torch.compile, quantization, abliteration). They also keep us on the latest GPU hardware as it ships.
Together, you'll own everything between "a new model just dropped" and "it's live, fast, and cost-efficient."
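To give a flavor of the core serving mechanic, here is a minimal sketch of per-request LoRA hot-swapping using the Hugging Face PEFT API. The model and adapter IDs are placeholders, and a real server would add routing, batching, and an eviction policy on top:

```python
# Sketch: per-request LoRA hot-swapping with Hugging Face PEFT.
# Model/adapter IDs are hypothetical; a real system would bound the
# set of resident adapters (e.g., LRU eviction).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder base model
tok = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach one adapter up front; others are hot-loaded on demand.
model = PeftModel.from_pretrained(base, "acme/style-a-lora", adapter_name="style_a")
resident = {"style_a"}

def generate(prompt: str, adapter_name: str, adapter_id: str) -> str:
    """Route a single request through the requested adapter."""
    if adapter_name not in resident:
        model.load_adapter(adapter_id, adapter_name=adapter_name)  # hot-load
        resident.add(adapter_name)
    model.set_adapter(adapter_name)  # activate for this request
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    return tok.decode(out[0], skip_special_tokens=True)
```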
A typical week
- Benchmark a new open-weight model, quantize it, test LoRA compatibility, decide ship or skip (see the sketch after this list)
- Tune block-level caching for Blackwell architecture, measure quality/speed tradeoffs
- Dig into GPU utilization data, find wasted spend, redesign auto-scaling
- Debug a 3 AM latency spike — OOM on two pods, fix it, write up what happened
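To make the first bullet concrete, here is a hypothetical ship-or-skip micro-benchmark: time a batch of generations and report tokens/sec and p95 latency. The model name and prompts are placeholders, and a real evaluation would also weigh quality and cost:

```python
# Hypothetical ship-or-skip micro-benchmark for a candidate model.
import statistics
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-7B-Instruct"  # placeholder candidate
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="cuda"
)

prompts = ["Summarize the benefits of KV caching."] * 16
latencies, new_tokens = [], 0
for p in prompts:
    inputs = tok(p, return_tensors="pt").to("cuda")
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=256)
    torch.cuda.synchronize()
    latencies.append(time.perf_counter() - t0)
    new_tokens += out.shape[1] - inputs["input_ids"].shape[1]

print(f"throughput: {new_tokens / sum(latencies):.1f} tok/s")
print(f"p95 latency: {statistics.quantiles(latencies, n=20)[18]:.2f}s")
```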
You'll thrive here if you
- Have shipped open-weight models to production at scale — not notebooks, not demos. LLMs, VLMs, image models — the more architectures the better.
- Can show real optimization results with numbers — Xs faster, $Y/month saved, Z% latency reduction.
- Think in cost-per-generation, not just raw performance. We care about both.
- Pick up new models and hardware fast. The ecosystem won't wait for you.
- Work independently. You'll figure out what to optimize — we won't hand you a roadmap.
Bonus points
- Built or worked on dynamic adapter serving (LoRA hot-loading, multi-model routing)
- Model surgery beyond default settings: custom quantization, abliteration, architectural pruning
- Evaluated and migrated workloads across GPU generations
What we run
- Models: Various open-weight LLMs, VLMs, and image models — changes constantly
- Optimization: PyTorch, torch.compile, DeepCache, GPTQ/AWQ
- Serving: Custom dynamic LoRA system
- Hardware: RTX 6000 Blackwell, H100 — we evaluate and migrate as new GPUs ship
- Infra: RunPod + on-prem · Docker · Python · Go backend
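As one example from the optimization layer, here is a minimal sketch of block-level caching on a diffusion pipeline using the open-source DeepCache helper. The model ID and cache settings are illustrative; the interval trades speed against quality and should be swept per model:

```python
# Illustrative: DeepCache block-level caching on a diffusers pipeline.
import torch
from diffusers import StableDiffusionPipeline
from DeepCache import DeepCacheSDHelper

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model
    torch_dtype=torch.float16,
).to("cuda")

helper = DeepCacheSDHelper(pipe=pipe)
helper.set_params(
    cache_interval=3,   # reuse cached deep features for runs of 3 steps
    cache_branch_id=0,  # which skip-branch features to cache
)
helper.enable()

image = pipe("an astronaut riding a horse").images[0]
helper.disable()  # disable when measuring the uncached baseline
```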
Why us over a bigger company
- You won't spend 6 months getting access to a GPU cluster. You won't write design docs that never ship. You'll push to production this week.
- The problems are real, the scale is real, and you'll see your work in the numbers every morning.
Benefits
- Social insurance, health insurance & private health insurance
- 13th month salary + year-end bonus based on real contribution
- Breakfast, lunch & afternoon snacks provided
- Flexible working hours
- AI Learning Budget — tools, courses, subscriptions to level up your skills
- Birthday leave
- Competitive pay + bonuses tied directly to impact
- MacBook, iMac, and monitors provided