About The Role
RadixArk is looking for an RL Systems Engineer to build large-scale reinforcement learning systems for post-training, fine-tuning, and alignment. You'll work on Miles, our open-source RL framework, scaling workloads across multi-node clusters and optimizing throughput, stability, and cost for frontier model development. You'll partner closely with model researchers, systems engineers, and infrastructure teams to productionize RL at scale.
Requirements
3+ years of experience building reinforcement learning systems or large-scale ML training infrastructure
Bachelor's or Master's degree in Computer Science or Machine Learning, or equivalent industry experience
Strong experience with reinforcement learning systems and algorithms (PPO, RLHF, DPO, or similar)
Hands-on experience scaling RL or online training workloads across distributed clusters
Solid systems background in distributed training or large-scale ML systems
Proficiency in Python and PyTorch/JAX with production-quality code standards
Experience with multi-node training frameworks (DeepSpeed, FSDP, Megatron) is a plus
Responsibilities
Build large-scale reinforcement learning systems for post-training, fine-tuning, and alignment
Scale RL workloads across multi-node clusters, optimizing throughput, stability, and cost
Work closely with model, systems, and infra teams to productionize RL at scale
Design and implement efficient sample collection, reward modeling, and policy optimization pipelines
Debug and resolve training stability issues, convergence problems, and distributed synchronization bugs
Optimize memory efficiency and computation for large-scale RL training (billions of parameters)
Contribute to Miles and other open-source projects with features, optimizations, and best practices
Create monitoring, observability, and debugging tools for RL training workflows
Write technical documentation and guides for scaling RL systems
About RadixArk
RadixArk is an infrastructure-first company built by engineers who've shipped production AI systems at xAI, created SGLang (20K+ GitHub stars, the fastest open LLM serving engine), and developed Miles (our large-scale RL framework). We're on a mission to democratize frontier-level AI infrastructure by building world-class open systems for inference and training. Our team has optimized kernels serving billions of tokens daily, designed distributed training systems coordinating 10,000+ GPUs, and contributed to infrastructure that powers leading AI companies and research labs. We're backed by well-known investors in the infrastructure field and partner with Google, AWS, and frontier AI labs. Join us in building infrastructure that gives real leverage back to the AI community.
Compensation
We offer competitive compensation with significant founding-team equity, comprehensive health benefits, and flexible work arrangements. The US base salary range for this full-time position is $180,000 - $250,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and demonstrated expertise in RL systems and ML infrastructure.
Equal Opportunity
RadixArk is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
