This job has been archived and is no longer active.
ML OPS (DevOps)
$5500
Prohibited locations: Russia, Ukraine, Belarus
English: B2
Experience: ML 6+ months, DevOps 3+ years of commercial experience
Key Responsibilities
Development and support of end-to-end ML pipelines (training, validation, deployment, monitoring, retraining)
Construction and operation of CI/CD for models (test automation, packaging, and deployment)
Design of LLM/RAG pipelines: context management, embedding quality/dynamics dashboards, index regeneration, prompt and fact-check testing (grounding/citation)
MLOps platform setup: experiment tracking, model registry, feature store, monitoring
Management of ML infrastructure and environments (GPU/CPU pools, Kubernetes/EKS, Docker)
Implementation of deployment strategies: canary, shadow, A/B testing
Ensuring model quality monitoring (accuracy drift, data drift, PSI, SLO/SLA)
Artifact management (data, models, metadata, versions)
Security compliance (encryption, access control, auditing, operation in private VPCs)
Integrating ML models into backend services (API, gRPC, REST)
Collaborating with Data Engineering and Data Science teams
Documenting processes and best practices for ML infrastructure
Managing the cost and scaling of ML infrastructure in AWS
Data governance: storage policies (S3 lifecycle), dataset versioning (DVC/LakeFS), data lineage (OpenLineage), quality gates in CI/CD
Requirements
MLOps Tools
MLflow or Kubeflow (experiments, registry)
Feature Store (Feast, Tecton, or custom)
Airflow, Prefect, or Kubeflow Pipelines (ML workflow orchestration)
Infrastructure and Containerization
Docker, Kubernetes/EKS
AWS S3, ECR, EKS, IAM, KMS, VPC
Terraform or Pulumi (IaC)
GitHub Actions, GitLab CI, or Jenkins (CI/CD)
Autoscaling, AWS Batch/Step Functions for offline processing and retrieval
Monitoring and Observability
Prometheus, Grafana, CloudWatch, CloudTrail
Model Quality Metrics (AUC, F1, Brier, logloss)
Stability metrics (drift detection, PSI)
LLM-specific metrics: tokens/sec, context length, prompt/response size, grounding rate, citation coverage, hallucination rate
Key Competencies
Building a stable and secure ML infrastructure
Full-cycle ML automation: from data to inference services
Quality control and stability of models in production
Effective collaboration with data science and data engineering teams
Joining Valletta Software Development means:
🌍 A Global, Thriving Team
Join 100+ specialists from 20+ countries, united by a passion for outstanding IT solutions.
🚀 Diverse projects: Fintech, MedTech, AI/ML, e-commerce, and more. Switch teams or industries to broaden your skills.
💡 Support at Every Step
Client interview prep: We train you to succeed and give actionable feedback.
✔️ Strategic stability: Well-structured processes, strong management, and long-term vision.
✔️ Core values: Honesty, flexibility, innovation, and a people-first approach.
💸 Regular salary reviews based on your personal results
✨ Paid rest days and sick leave
Published on: 10/17/2025

Valletta Software
Valletta Software - custom mobile/web software developer in the US and Europe.