Site Reliability Engineer (SRE)

Worldwide | Remote | Middle, Senior

We are looking for a Site Reliability Engineer (SRE) who is passionate about infrastructure reliability, automation, and building scalable production systems.

Your main tasks will be:

  • Own and improve production infrastructure reliability and stability

  • Prepare, execute, and support deployments and infrastructure changes

  • Build and maintain Infrastructure-as-Code solutions using Ansible and Terraform

  • Support and optimize Kubernetes-based and containerized environments

  • Develop automation scripts and internal operational tooling

  • Monitor system health, investigate incidents, and proactively improve observability

  • Participate in CI/CD improvements together with Development, QA, DevOps, and SRE teams

  • Work with monitoring and alerting systems to reduce downtime and improve system performance

  • Maintain technical documentation, runbooks, and operational procedures

  • Support DNS, WAF, CDN, and caching infrastructure where required

What we expect from you:

  • 3+ years of experience in SRE, DevOps, System Administration, or Build/Release Engineering

  • Strong Linux administration and troubleshooting skills

  • Hands-on experience with Kubernetes and containerization technologies (Docker/Podman)

  • Experience with CI/CD pipelines, preferably GitLab CI

  • Practical experience with Infrastructure-as-Code and configuration management tools (Ansible and/or Terraform)

  • Strong Bash scripting skills

  • Experience with observability and monitoring tools such as Prometheus, Grafana, Zabbix, or VictoriaMetrics

  • Good understanding of networking fundamentals, DNS, HTTP/HTTPS, load balancing, and troubleshooting

  • Experience with Git and modern software delivery workflows

  • Ability to work independently, take ownership, and proactively improve infrastructure

  • English level B1–B2 for technical documentation and team communication

Nice to have:

  • AWS or GCP experience

  • RabbitMQ / AMQP experience

  • Cloudflare, Akamai, WAF, or CDN experience

  • Python scripting skills

  • Experience with tracing and advanced observability tooling

  • Public GitHub profile, open-source contributions, or personal engineering projects

What we offer:

  • 🏠 Fully remote work format

  • 🌍 International team of 1000+ professionals

  • ✈️ Work-from-anywhere culture

  • 🎁 Strong benefits package

Published on: 5/23/2026

Social Discovery Group

Social Discovery Group (SDG) is the third-largest social discovery company in the world, uniting over 60 brands with 500 million users. We solve the problems of loneliness, isolation, and disconnection by transforming virtual intimacy into the new normal.
