RL Systems Engineer

$180,000 - $250,000

3+ years of experience

Apply Now

About The Role

RadixArk is looking for an RL Systems Engineer to build large-scale reinforcement learning systems for post-training, fine-tuning, and alignment. You'll work on Miles, our open-source RL framework, scaling workloads across multi-node clusters and optimizing throughput, stability, and cost for frontier model development. You'll partner closely with model researchers, systems engineers, and infrastructure teams to productionize RL at scale.

Requirements

  • 3+ years of experience building reinforcement learning systems or large-scale ML training infrastructure

  • Bachelor's or Master's degree in Computer Science or Machine Learning, or equivalent industry experience

  • Strong experience with reinforcement learning systems and algorithms (PPO, RLHF, DPO, or similar)

  • Hands-on experience scaling RL or online training workloads across distributed clusters

  • Solid systems background in distributed training or large-scale ML systems

  • Proficiency in Python and PyTorch/JAX with production-quality code standards

  • Experience with multi-node training frameworks (DeepSpeed, FSDP, Megatron) is a plus

Responsibilities

  • Build large-scale reinforcement learning systems for post-training, fine-tuning, and alignment

  • Scale RL workloads across multi-node clusters, optimizing throughput, stability, and cost

  • Work closely with model, systems, and infra teams to productionize RL at scale

  • Design and implement efficient sample collection, reward modeling, and policy optimization pipelines

  • Debug and resolve training stability issues, convergence problems, and distributed synchronization bugs

  • Optimize memory efficiency and computation for large-scale RL training (billions of parameters)

  • Contribute to Miles and other open-source projects with features, optimizations, and best practices

  • Create monitoring, observability, and debugging tools for RL training workflows

  • Write technical documentation and guides for scaling RL systems

About RadixArk

RadixArk is an infrastructure-first company built by engineers who've shipped production AI systems at xAI, created SGLang (20K+ GitHub stars, the fastest open LLM serving engine), and developed Miles (our large-scale RL framework). We're on a mission to democratize frontier-level AI infrastructure by building world-class open systems for inference and training. Our team has optimized kernels serving billions of tokens daily, designed distributed training systems coordinating 10,000+ GPUs, and contributed to infrastructure that powers leading AI companies and research labs. We're backed by well-known investors in the infrastructure field and partner with Google, AWS, and frontier AI labs. Join us in building infrastructure that gives real leverage back to the AI community.

Compensation

We offer competitive compensation with significant founding team equity, comprehensive health benefits, and flexible work arrangements. The US base salary range for this full-time position is $180,000 - $250,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and demonstrated expertise in RL systems and ML infrastructure.

Equal Opportunity

RadixArk is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Copyright © 2025 RadixArk

contact@radixark.ai
