Imagine a monitoring system so smart that it not only alerts you to actionable problems but actively guides you to a solution - transforming reactive troubleshooting into proactive operation. This session, led by Panos Tsilopoulos, will tackle today’s critical challenge: the deluge of data from digital ecosystems and the escalating costs of downtime. He’ll dive into how cutting-edge machine learning techniques, from anomaly detection and forecasting to GenAI copilots enabling Root Cause Analysis (RCA) and agentic AI performing automated incident management, can revolutionize observability systems. Learn more about:
Session Reserved for Honeycomb
SLOs are a powerful tool for aligning engineering teams around reliability goals - but only when they’re grounded in real user experience and business impact. This panel explores how leading organisations are designing meaningful service level objectives, enforcing error budgets, and using them to guide trade-offs between reliability, velocity, and innovation.
S&P Global brought together several acquired businesses - each with its own tools, processes, and culture - into one cohesive Site Reliability Engineering (SRE) practice. In this session, Paul Maddocks shares his journey, with reference to:
Session Available For Sponsorship
Toil — manual, repetitive, and low-value work — drains engineering capacity and slows down innovation. This session explores how SRE and platform teams are automating routine operational tasks to reclaim developer time and improve reliability. From self-healing infrastructure to intelligent alerting and workflow orchestration, we’ll look at the strategies and tools helping teams scale their operations without scaling their headcount.
Most enterprises begin their journey toward observability with a fragmented landscape of tools. This session explores how platform and SRE teams are consolidating these tools into cohesive observability platforms that reduce noise, enhance usability, improve incident response, and enable end-to-end observability at scale. The presenter will share practical strategies and insights for building not just an observability platform, but an ecosystem that empowers teams. The session will explore the following key themes:
As modern systems grow more complex, engineering teams must evolve from reactive troubleshooting to delivering measurable, user-focused value. This session, led by Martin McLarnon, explores how OpenTelemetry—paired with Coralogix—enables precise definition and tracking of Service Level Objectives, helping teams reduce resolution times and improve system reliability. Through real-world insights and two live demos, attendees will learn more about:
As observability data volumes explode, so do the costs - leaving many organizations struggling to balance visibility with financial sustainability. This panel explores how engineering, platform, and finance leaders are working together to control observability spend without sacrificing critical insights. We’ll discuss strategies for smarter data collection, tooling consolidation, and aligning observability investments with business value.
Session Available For Sponsorship
Join Henry and Martin to see how their team built an end-user device monitoring solution that keeps 90,000 end-user devices in check. By combining SRE principles with on-site support workflows and the open-source LGTM stack, they’ve turned reactive break-fix into proactive monitoring.
• How did we even get here? Why did we build this thing?
• Lessons from scaling monitoring to tens of thousands of endpoints.
• The not-so-obvious benefits of having a presence on all end-user devices.
Henry Kühl, Senior Engineering Manager Observability, A.P. Moller - Maersk
Martin Jaeger, Lead Engineer – Platform Engineering, A.P. Moller - Maersk
As enterprises scale their observability programs, developer teams are often caught between rigid corporate tooling mandates and the need for fast, flexible response. This keynote explores how leading organizations are striking the right balance - building scalable, secure observability platforms without slowing down engineers or compromising developer autonomy. We’ll examine the cultural tensions between central IT and engineering teams, and how platform thinking, internal advocacy, and intuitive tooling can help observability scale with developers, not against them.