SRE Manager
TBILISI, CYPRUS
We are the HUB SRE team — the group responsible for the reliability, availability, and performance of some of the most critical and heavily loaded services in the company. Our infrastructure is a hybrid environment: a mix of cloud services and our own bare-metal servers, each with its own operational model and failure domains. We don't just keep things running — we engineer reliability into the system.
Responsibilities
Lead and manage the HUB SRE team, building a culture grounded in SRE principles: SLOs as contracts, error budgets as decision-making tools, toil reduction as a continuous practice.
Define, implement, and evangelize SLOs/SLIs/error budgets for the company's most critical services — make reliability measurable and actionable.
Drive toil reduction: identify repetitive operational work, set toil budgets, and ensure the team spends the majority of its time on engineering, not firefighting.
Own and evolve incident management processes: on-call rotations, structured incident response, blameless post-mortems, and follow-through on action items.
Build and improve observability across the stack: metrics, alerting, distributed tracing, and dashboards that give teams real-time understanding of system behavior — not just system status.
Drive capacity planning and performance engineering: ensure critical services handle growth without degradation, model capacity needs, and prevent outages before they happen.
Collaborate with HUB backend teams as a reliability partner: review architectures for failure modes, advocate for reliability improvements, and push back when error budgets are exhausted.
Build and evolve CI/CD pipelines toward one-click deployments with automated rollbacks and progressive delivery — make deploying safe and boring.
Champion runbook-driven operations: ensure every critical procedure is documented, tested, and ready for execution under pressure.
Mentor engineers in SRE practices and thinking, help them grow, and build a team that balances operational excellence with engineering ambition.
What makes you the perfect fit
Proven experience as an Engineering Manager, SRE Lead, or Reliability Engineering Lead managing a team of engineers.
Deep understanding of SRE as a discipline: SLOs/SLIs, error budgets, toil classification, capacity planning, incident management — not just tooling, but the philosophy and organizational practices.
Strong technical background in backend systems, Linux, networking, and distributed systems — you understand the services your team is responsible for at a deep level.
Experience working with hybrid infrastructure: cloud providers and bare-metal servers, understanding the reliability trade-offs of each.
Solid experience building and improving observability: monitoring, alerting strategies, distributed tracing, and meaningful dashboards.
Experience building and optimizing CI/CD pipelines for complex, multi-service environments.
Strong incident management skills: structured response, blameless post-mortems, driving systemic improvements from incidents.
Excellent communication and people-management skills, and the ability to influence engineering teams you don't directly manage.
Nice to have
Background with high-load systems serving millions of requests with strict latency and availability requirements.
Experience with bare-metal server operations: provisioning, networking, hardware failure handling.
Familiarity with chaos engineering or proactive reliability testing (game days, fault injection).
Experience defining on-call compensation models, sustainable on-call rotations, and escalation frameworks.
Background in performance engineering: profiling, load testing, bottleneck analysis.
Knowledge of Infrastructure-as-Code tools (Terraform, Ansible).
What we offer you
Flexible working hours and a hybrid work format.
Well-equipped offices for focused and collaborative work.
A global, distributed team of 500+ professionals.
Learning, mentorship, and long-term career growth.
Relocation support and private health insurance.
Performance-based bonuses.
TradingView Premium access.
Regular team events and company-wide meetups.
Join the TradingView team and help us build a product used by millions of traders and investors worldwide. We look forward to hearing from you!
TradingView is an equal opportunity employer. We embrace diversity and are dedicated to fostering a diverse and inclusive workplace. Our success is driven by 600+ professionals from 40+ countries who speak nearly 20 languages.
Published on: 3/14/2026

TradingView
TradingView is the world’s largest financial analysis platform with more than 100M users across 180+ countries.