MLOps Engineer - AI/ML Systems & Deployment (TS/SCI Preferred)
Job description
At Rackner, we build systems where advanced technologies move beyond prototypes and into real-world operational use. We are seeking an MLOps Engineer to support the deployment and lifecycle management of AI/ML systems within a secure, mission-focused environment.
This is not a research role.
This is where models become reliable, deployable, and auditable systems.
You will operate at the intersection of:
- machine learning
- cloud-native infrastructure
- distributed systems
…and ensure AI/ML systems are production-ready in environments where reliability and performance matter.
What You'll Do
Own the ML Lifecycle (End-to-End)
- Build and operate production-grade ML pipelines
- Orchestrate workflows using Kubeflow, Airflow, or Argo
- Implement model versioning, lineage, and reproducibility standards
Operationalize AI/ML Systems
- Deploy models into secure and constrained environments
- Transition workflows from experimentation to containerized pipelines to production systems
- Enable both batch and real-time inference architectures
Engineer for Reliability
- Design systems for reproducibility, auditability, and stability
- Monitor model performance and system health using Prometheus, Grafana, and OpenTelemetry
- Detect and resolve issues such as model drift and system degradation
Build Cloud-Native ML Infrastructure
- Deploy and manage Kubernetes-based ML workloads
- Containerize pipelines using Docker
- Support scalable training and inference workflows
Establish Data Discipline
- Support feature engineering and dataset preparation
- Implement data versioning and governance practices (e.g., lakeFS)
- Apply metadata and data management standards
Create Repeatable Systems
- Develop runbooks, playbooks, and documentation
- Build systems that are operationally sustainable and transferable
This role is a career accelerator for engineers who want to:
- Move beyond experimentation and own production systems
- Work across ML, infrastructure, and deployment pipelines
- Build in high-trust, secure environments
- Develop high-demand MLOps expertise in constrained systems
- Deliver systems that are used, not just built
Requirements
- Experience deploying ML systems into production environments
- Strong programming skills in Python
- Hands-on experience with:
  - ML pipeline tools (Kubeflow, Airflow, Argo)
  - Experiment tracking tools (MLflow, ClearML)
Infrastructure & Systems
- Experience with Kubernetes and containerized systems (Docker)
- Familiarity with CI/CD pipelines
- Understanding of distributed systems and scalable architectures
ML Application Exposure
- Experience working with:
  - LLMs or transformer-based models
  - Computer vision systems (YOLO, Faster R-CNN)
- Focus on deployment and integration, not pure research
Mindset
- Systems thinker who prioritizes reliability over novelty
- Comfortable operating in complex, evolving environments
- Focused on delivering real-world outcomes
Clearance Requirements
- Active TS/SCI clearance strongly preferred
- Candidates with an active Secret clearance may be considered and supported for upgrade
- Candidates without an active clearance must be:
  - U.S. citizens
  - eligible to obtain and maintain a clearance
  - able to work in a CAC-enabled or secure environment
Benefits & conditions
Health insurance, 401(k) matching, Paid time off, Vision insurance, Dental insurance, Life insurance, Disability insurance