Site Reliability Engineer

Europe, Cyprus, Remote, Senior

We're looking for a Site Reliability Engineer (SRE) to support and optimize large-scale distributed systems, ensuring high availability, performance, and reliability across production environments. You will monitor system health, troubleshoot complex issues, and drive improvements through automation, observability, and site reliability engineering best practices.

Responsibilities

  • Production Support and Incident Response

  • Identify, analyze, and resolve issues in production and non-production systems.

  • Participate in incident response, root cause analysis, and follow-up actions.

  • Take part in an on-call rotation and support production incidents when needed, including outside regular working hours.

  • Help develop and improve the observability system.

  • Collect and analyze metrics from operating systems, infrastructure, and applications.

  • Use monitoring data to support performance tuning, fault finding, and capacity planning.

  • Implement, maintain, and improve CI/CD processes.

  • Create sustainable systems and services through automation and continuous improvement.

  • Reduce manual work and improve operational efficiency.

  • Partner with development teams to improve service reliability, testing, deployment, and release processes.

  • Support platform stability, scalability, and operational readiness.

  • Work closely with development, QA, infrastructure, and other cross-functional teams.

  • Create and maintain clear technical documentation, runbooks, operational guides, and support procedures.

Requirements

  • Strong SQL skills (T-SQL preferred), including query optimization, performance tuning, and data integrity management.

  • Hands-on experience with Microsoft SQL Server, database design, migrations, and partitioning strategies.

  • Experience with monitoring and observability tools such as Prometheus, Grafana, and ELK.

  • Familiarity with cloud platforms (AWS, GCP, Azure).

  • Proficiency in Python and scripting (Bash/PowerShell) for automation, ETL processes, data manipulation, and API integrations.

  • Basic understanding of networking concepts and protocols (HTTP, DNS, CDN).

Additional Skills

  • Experience with Apache Airflow, Docker, Kubernetes, Ansible/IaC, and CI/CD tools (GitLab, Jenkins).

  • Strong communication and collaboration skills, with a proactive, problem-solving mindset.

  • English level: Intermediate (B1) or higher.

Benefits

  • Quarterly bonuses based on company performance

  • 24 working days of annual leave 

  • Corporate events and team building activities

  • Unlimited Udemy Business membership and language training courses

  • Professional and personal development opportunities in a fast-growing environment 

Published on: 5/29/2026

Libertex

Libertex is a multi-award-winning online trading platform that lets traders invest in stocks or trade CFDs on underlying assets such as commodities, Forex, ETFs, and cryptocurrencies.
