SRE Observability Engineer – Ticketmaster team
Ticketmaster gives millions of fans – worldwide – fair and easy access to the biggest and best in live entertainment.
Driven by innovation, unparalleled scalability, and unmatched support, Ticketmaster is the definitive leader in professional ticketing solutions. Over 12,000 artists, teams, and venues around the world trust Ticketmaster to power their performances every day, with more than 500 million tickets sold each year.
Bakson Ltd has collaborated with Ticketmaster for 19 years, delivering software used by millions of users across the globe.
The Ticketmaster SRE team builds and runs large, complicated, resilient, and reliable distributed systems that operate at a huge scale. Our focus is the continual improvement of our systems: keeping capacity and saturation optimal, looking for opportunities to improve performance, simplifying infrastructure and architecture, using automation to eliminate work that doesn’t directly add value, and providing information and actionable insights that help us understand our systems when they inevitably misbehave. You will be part of the Marketplace SRE team, which is responsible for the stability of our systems and works closely with the engineering teams to deploy all requirements and keep the systems safe. The role is remote, but there is the option to work in any Ticketmaster office, and the team convenes to work together throughout the year.
THE JOB
You will be working on our observability stack, collaborating with various development teams across US and EU time zones. Most of them have services deployed in Kubernetes, either on-prem (Rancher) or in AWS (EKS), so we would expect you to have experience with K8s or a similar container orchestration platform. We have many complex and highly distributed systems, spanning decades of history, so we would expect you to be comfortable working in a complex, multi-technology platform. Our business model demands that we sustain exceptionally high traffic at specific times, so staying calm and being able to troubleshoot under high pressure is essential.
Our team works closely with engineers developing a wide range of fan-facing services, and we would expect you to integrate quickly with these teams and with your own team members, mindful of the challenges posed by different time zones and first languages (fewer than a third of our team speak English as a first language). Networking with engineers across the organisation is an important part of what we do, so we would expect you to build connections with senior engineers across Ticketmaster.
We believe that monitoring, alerting, and observability are foundational to reliability, so we would want you to contribute to the continuous improvement of how we measure and alert on both potential causes and symptoms, as we define service level indicators and objectives and strive to meet them.
WHAT YOU WILL BE DOING
- Support the full system lifecycle for automation and tools including the design, assessment, selection, commissioning, validation, and implementation of systems.
- Provide input into the design, development and implementation of systems automation and tooling for software engineering teams to achieve their goals.
- Work closely with peers in software engineering teams to implement solutions that are scalable, secure, and easily maintained.
- Develop command-line and web-based tools for maintaining and managing development and production systems.
- Work with systems and software engineers to develop and document requirements and functional specifications.
WHAT YOU NEED TO KNOW (TECHNICAL SKILLS)
Minimum qualifications:
- Strong knowledge of the Prometheus monitoring stack.
- Expert in building informative dashboards in Grafana.
- Good knowledge of JavaScript.
- Good knowledge of caching services.
- Knowledge of and experience with containers and Kubernetes clusters.
- Technical writing skills for documenting environments and procedures.
Preferred qualifications:
- Good knowledge of load testing via k6.
- 3 years of experience in progressively more complex environments.
- A strong understanding of core network protocols and services.
- Strong Linux experience.
- Experience architecting, developing, and troubleshooting observability systems.
- Solid knowledge of working with third-party APIs.
- Knowledge of CDNs (Fastly) is beneficial.
- Experience with alternative observability solutions is beneficial.
YOU (BEHAVIOURAL SKILLS/COMPETENCIES)
- Autonomous and proactive.
- Passionate about technology and transformation.
- Self-motivated, energetic, and tenacious; someone who gets things moving.
- Comfortable working in cross-functional and multidisciplinary teams.
- Excited about taking on challenging technical problems and devising creative solutions.
WHAT TO EXPECT?
- A flat team structure and a highly collaborative culture that values progress over perfection and encourages creativity, innovation and diversity
- A highly motivated global team of colleagues
- A flexible working style: a primarily remote position, with working from the office on demand
WHAT WE OFFER
- A mix of serious projects and a great working atmosphere, well recognized in the market.
- Dynamic international work environment.
- Skilled and senior co-workers.
- Proper financial compensation.
- Private medical care
- Personal and professional development: personal education budget, internal tech talks, and soft-skills training
WHO WE ARE?
Bakson Ltd is a software development company based in Belgrade. We work with teams around the world and take pride in the variety of projects we handle and the technologies we use.
HOW DO WE WORK?
Our workflow is inspired by Agile and Lean principles. We’re not devoted to Scrum or any other framework, but we try to work in small batches, with fast feedback and very close interaction with product owners.
The emphasis in our team is on collaboration and mutual support – sharing project workflow with globally distributed teams, contributing code to core global services and applications, and encouraging cultural exchange between development groups. Ticketmaster encourages working from home, and the distributed nature of our teams requires us to have flexibility around working hours. We’re familiar with asynchronous and remote work. A Software Engineer in our company is a core writer of code, but also an inspirer and an exemplar to other developers…
Basically, what we care about is that you are a self-starter, happy to work with others, and prepared to adapt and do your best.
We aim for our hiring process to be as collaborative and realistic as possible, so it’ll be focused on writing and reviewing code, both your own and other people’s. We want you to feel comfortable working with us, and we want to feel the same way about you, so you’ll meet quite a few of the team and interact with them in as lifelike a way as possible. This is a two-way street: we’re keen for you to like us as much as the other way around. If you’d like to get started, you can apply by pressing the “apply” button on this webpage or by sending a CV or an introductory email to [email protected]