SRE Manager

Cyprus / Georgia / Hybrid / Lead

TBILISI (GEORGIA), CYPRUS

We are the HUB SRE team — the group responsible for the reliability, availability, and performance of some of the most critical and heavily loaded services in the company. Our infrastructure is a hybrid environment: a mix of cloud services and our own bare-metal servers, each with its own operational model and failure domains. We don't just keep things running — we engineer reliability into the system.

Responsibilities

  • Lead and manage the HUB SRE team, building a culture grounded in SRE principles: SLOs as contracts, error budgets as decision-making tools, toil reduction as a continuous practice.

  • Define, implement, and evangelize SLOs/SLIs/error budgets for the company's most critical services — make reliability measurable and actionable.

  • Drive toil reduction: identify repetitive operational work, set toil budgets, and ensure the team spends the majority of its time on engineering, not firefighting.

  • Own and evolve incident management processes: on-call rotations, structured incident response, blameless post-mortems, and follow-through on action items.

  • Build and improve observability across the stack: metrics, alerting, distributed tracing, and dashboards that give teams real-time understanding of system behavior — not just system status.

  • Drive capacity planning and performance engineering: ensure critical services handle growth without degradation, model capacity needs, and prevent outages before they happen.

  • Collaborate with HUB backend teams as a reliability partner: review architectures for failure modes, advocate for reliability improvements, and push back when error budgets are exhausted.

  • Build and evolve CI/CD pipelines toward one-click deployments with automated rollbacks and progressive delivery — make deploying safe and boring.

  • Champion runbook-driven operations: ensure every critical procedure is documented, tested, and ready for execution under pressure.

  • Mentor engineers in SRE practices and thinking, help them grow, and build a team that balances operational excellence with engineering ambition.

What makes you the perfect fit

  • Proven experience as an Engineering Manager, SRE Lead, or Reliability Engineering Lead managing a team of engineers.

  • Deep understanding of SRE as a discipline: SLOs/SLIs, error budgets, toil classification, capacity planning, incident management — not just tooling, but the philosophy and organizational practices.

  • Strong technical background in backend systems, Linux, networking, and distributed systems — you understand the services your team is responsible for at a deep level.

  • Experience working with hybrid infrastructure: cloud providers and bare-metal servers, understanding the reliability trade-offs of each.

  • Solid experience building and improving observability: monitoring, alerting strategies, distributed tracing, and meaningful dashboards.

  • Experience building and optimizing CI/CD pipelines for complex, multi-service environments.

  • Strong incident management skills: structured response, blameless post-mortems, driving systemic improvements from incidents.

  • Excellent communication and people-management skills, and the ability to influence engineering teams you don't directly manage.

What will be a plus

  • Background with high-load systems serving millions of requests with strict latency and availability requirements.

  • Experience with bare-metal server operations: provisioning, networking, hardware failure handling.

  • Familiarity with chaos engineering or proactive reliability testing (game days, fault injection).

  • Experience defining on-call compensation models, sustainable on-call rotations, and escalation frameworks.

  • Background in performance engineering: profiling, load testing, bottleneck analysis.

  • Knowledge of Infrastructure-as-Code tools (Terraform, Ansible).

What we offer you

  • Flexible working hours and a hybrid work format.

  • Well-equipped offices for focused and collaborative work.

  • A global, distributed team of 500+ professionals.

  • Learning, mentorship, and long-term career growth.

  • Relocation support and private health insurance.

  • Performance-based bonuses.

  • TradingView Premium access.

  • Regular team events and company-wide meetups.

Join the TradingView team and help us build a product used by millions of traders and investors worldwide. We look forward to hearing from you!

TradingView is an equal opportunity employer. We embrace diversity and are dedicated to fostering a diverse and inclusive workplace. Our success is driven by 600+ professionals from 40+ countries who speak nearly 20 languages.

Published on: 3/14/2026

TradingView

TradingView is the world’s largest financial analysis platform with more than 100M users across 180+ countries.
