
Observability in 2026: The New Stack for Spring Boot Apps

May 4, 2026

Observability · OpenTelemetry · Spring Boot · Grafana · DevOps

If you're still treating observability as "logs over here, metrics over there, and traces if we have time," you're running an old mental model.

In 2026, the big change is not that teams suddenly got better at telemetry. It's that the stack finally converged. OpenTelemetry won the standard war, and that matters more than which vendor UI you buy or which storage backend you run.

For Spring Boot teams, that convergence has a practical consequence: you no longer need to invent your own observability strategy from scratch. There is a default shape now.

It looks roughly like this:

- Micrometer Observation as the in-app instrumentation API
- Micrometer Tracing bridging into an OpenTelemetry tracer
- OTLP export to an OpenTelemetry Collector
- Prometheus-style metrics, Tempo-style traces, and Loki-style logs behind Grafana, or a managed equivalent

That is the new baseline.

The remaining hard parts are no longer "how do I emit a span?" They're:

- which propagation format crosses your service boundaries
- how much baggage you allow, and who is allowed to add it
- where manual instrumentation adds meaning instead of noise
- which signals page a human and which stay explorable

That's where teams still get this wrong.


OpenTelemetry is the standard now

This is the most important thing to get right conceptually.

When people say "we use Datadog" or "we use Grafana" or "we use New Relic," they're usually talking about where telemetry ends up. That is not the same as the instrumentation standard inside the app.

The standard in 2026 is OpenTelemetry:

- a shared data model for traces, metrics, and logs
- OTLP as the common wire protocol
- semantic conventions that make telemetry comparable across services
- SDKs and auto-instrumentation across the major languages

That standardization matters because observability problems usually show up at system edges:

- a service on a new SDK calling one on an old agent
- a gateway or proxy between you and the caller
- a queue between producer and consumer
- a polyglot boundary between a JVM service and something else

If your telemetry is built on a standard model, those boundaries are annoying. If it isn't, they become archaeology.

For Spring Boot specifically, Micrometer remains the application-facing abstraction, and that's a good thing. Spring Boot's observability model is built around Micrometer Observation, with Micrometer Tracing bridging observations into a concrete tracer implementation such as OpenTelemetry.

That's the right split:

- Micrometer is the application-facing API your code talks to
- OpenTelemetry is the interoperability layer underneath it
- your code stays portable while the tracer implementation stays swappable

You should optimize for that portability.


Micrometer Tracing is the right layer for Spring Boot

Some teams still ask whether they should "use Micrometer or OpenTelemetry." In a Spring Boot app, that's the wrong framing.

Use them at different layers.

Micrometer Observation is your in-process instrumentation model. It fits Spring Boot, Actuator, and the rest of the ecosystem. It gives you timers, observations, low-cardinality tags, and the bridge into tracing.

Micrometer Tracing is the Spring-facing tracing facade. It lets Boot auto-configure tracing without hard-coding your app to one tracer implementation.

OpenTelemetry is the interoperability layer underneath and around that.

This matters because the worst Spring Boot observability setups are the ones that mix abstractions randomly:

- some code calls the OpenTelemetry API directly
- some uses Micrometer Observation
- some still carries a vendor SDK from two migrations ago

Pick a dominant application-facing model and stick to it. In Spring Boot, that model should usually be Micrometer.

Then use OpenTelemetry for:

- context propagation across process boundaries
- OTLP export and collector pipelines
- interoperability with non-JVM services

That split is boring. Boring is good.


W3C vs B3: use W3C by default

This is the part that still causes real production confusion.

There are two header families you will most often encounter in Spring and JVM systems:

- W3C Trace Context: the traceparent and tracestate headers
- B3: Zipkin's format, either the single b3 header or the multi-header X-B3-* variant

If you're starting fresh in 2026, use W3C Trace Context.

Not because B3 is broken. B3 still exists in plenty of environments, especially older Zipkin- and Brave-shaped systems. But W3C is the cross-vendor standard, and standardization is the whole point. It is what gives you the best odds that your traces survive service boundaries, proxies, SDK differences, and future migrations.

W3C also matters because trace propagation is not mainly an application-internal concern. It's a boundary concern.

Inside one service, almost any decent library can keep context together. Problems start when requests cross:

- service-to-service HTTP or gRPC calls
- gateways, proxies, and meshes that touch headers
- message brokers, where context rides in message headers
- SDK and language boundaries

That is where traces get fragmented.

Why this matters in practice

If one side emits W3C headers and the other only extracts B3, you don't get one broken span. You get two unrelated traces that each look fine in isolation and are useless together.

That is a horrible failure mode because nothing crashes. You just lose causality.

This gets worse when:

- services were instrumented in different eras with different defaults
- a proxy or framework silently drops headers it doesn't recognize
- some libraries emit both formats while others extract only one

So the practical guidance is simple: default to W3C Trace Context everywhere, and treat B3 as a compatibility concern, not a choice.

If you still have B3 services, decide whether you are:

  1. migrating them to W3C
  2. dual-reading during transition
  3. translating at the edge

What you should not do is leave that behavior implicit and hope every library makes the same assumption.
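If you choose option 3, translating at the edge, the mechanical core is small. The header names and formats below are the standard ones; everything else is an illustrative sketch, not a production propagator:

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class B3ToW3c {
    // Build a W3C traceparent header from Zipkin-style X-B3-* headers.
    // traceparent format: version "00" - 32-hex trace-id - 16-hex span-id - 2-hex flags.
    static Optional<String> toTraceparent(Map<String, String> headers) {
        String traceId = headers.get("X-B3-TraceId");
        String spanId = headers.get("X-B3-SpanId");
        if (traceId == null || spanId == null) {
            return Optional.empty();
        }
        // B3 allows 64-bit (16 hex char) trace IDs; W3C requires 128-bit, so left-pad.
        if (traceId.length() == 16) {
            traceId = "0000000000000000" + traceId;
        }
        String flags = "1".equals(headers.get("X-B3-Sampled")) ? "01" : "00";
        return Optional.of(String.format(Locale.ROOT, "00-%s-%s-%s",
                traceId.toLowerCase(Locale.ROOT), spanId.toLowerCase(Locale.ROOT), flags));
    }

    public static void main(String[] args) {
        // prints 00-463ac35c9f6413ad48485a3953bb6124-0020000000000001-01
        System.out.println(toTraceparent(Map.of(
                "X-B3-TraceId", "463ac35c9f6413ad48485a3953bb6124",
                "X-B3-SpanId", "0020000000000001",
                "X-B3-Sampled", "1")).orElse("no trace context"));
    }
}
```

In practice you would do this in a gateway filter or a custom propagator rather than by hand, but the point stands: the translation is trivial; the discipline of deciding where it happens is what teams skip.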

Baggage is another reason W3C wins

Spring Boot's tracing docs call out an operationally important detail: when you're using W3C propagation, baggage is propagated automatically. With B3, it is not. That difference alone is enough to create surprising cross-service behavior if you rely on request-scoped business context.
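In Spring Boot the propagation choice is one property away. A hedged config fragment, assuming Boot 3.x property names:

```yaml
management:
  tracing:
    propagation:
      type: w3c   # or b3 / b3_multi if a legacy peer forces it
```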

Which brings us to the next common mistake.


Baggage is useful, but only in small doses

Baggage is one of those features that sounds magical right up until a team uses it as a distributed dumping ground.

The good use case is narrow and valuable: a handful of small, low-cardinality, request-scoped values that downstream services genuinely need for correlation.

Examples:

- a tenant identifier
- a plan or pricing tier
- an experiment or feature-flag bucket

Bad baggage candidates:

- anything resembling PII or secrets
- high-cardinality IDs and free-form strings
- large payloads or serialized objects
- anything a downstream system should not see

OpenTelemetry's own baggage guidance makes the risk explicit: baggage is transported in headers and may be forwarded to downstream systems you didn't intend to enrich.

So the rule I recommend is: keep an explicit allowlist of baggage keys, keep the values tiny and low-cardinality, and treat every key as visible to everything downstream.

If you need broad, rich business context everywhere, fix the event or request model. Don't smuggle it through tracing.
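The allowlist idea is mechanical enough to sketch. This plain-Java fragment is illustrative only; in a real app the allowed keys would live in config and enforcement would sit in a propagator or filter:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class BaggagePolicy {
    // The only keys allowed to ride along as baggage.
    private static final Set<String> ALLOWED = Set.of("tenant-id", "plan-tier");
    private static final int MAX_VALUE_LENGTH = 64;

    // Drop anything not on the allowlist, and anything suspiciously large.
    static Map<String, String> sanitize(Map<String, String> requested) {
        Map<String, String> out = new TreeMap<>();
        requested.forEach((k, v) -> {
            if (ALLOWED.contains(k) && v != null && v.length() <= MAX_VALUE_LENGTH) {
                out.put(k, v);
            }
        });
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sanitize(Map.of(
                "tenant-id", "acme",
                "user-email", "someone@example.com",   // rejected: not allowlisted
                "plan-tier", "enterprise")));
    }
}
```

The size cap matters as much as the allowlist: baggage travels in headers on every hop, so every byte you add is paid for on every downstream call.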


What to actually wire up

This is where observability discussions usually become too abstract. So here's the opinionated version.

1. Start with auto-instrumentation

For Spring Boot apps, the default should be: let the framework and libraries do the obvious work first.

That means capturing, at minimum:

- inbound and outbound HTTP (and gRPC) calls
- database client calls
- messaging produce and consume
- the JVM and system metrics Actuator already exposes

OpenTelemetry's Java ecosystem now gives you two practical "mostly automatic" choices:

- the OpenTelemetry Java agent, attached at startup with no code changes
- Spring Boot's own auto-configuration via Micrometer Tracing plus the OTel bridge

If you're unsure, use the simplest thing your platform can operate consistently. Standardization across services is more important than winning a local purity contest.
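If you go the agent route, attachment is a deployment concern rather than a code change. A sketch, with the jar path and endpoint as placeholders (OTEL_SERVICE_NAME and OTEL_EXPORTER_OTLP_ENDPOINT are standard OpenTelemetry environment variables):

```shell
OTEL_SERVICE_NAME=orders \
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318 \
java -javaagent:/opt/otel/opentelemetry-javaagent.jar -jar app.jar
```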

2. Add manual spans only around business-significant boundaries

Auto-instrumentation gives you technical topology. It does not automatically give you business meaning.

Add manual spans or observations around things like:

- placing an order
- capturing a payment
- a saga or workflow step
- a call into a critical external dependency

Not around every helper method.

Good manual instrumentation answers questions like:

- which step of checkout dominated this slow request?
- did this failure happen before or after the side effect?
- which tenant or channel is this degradation concentrated in?

Bad manual instrumentation creates span soup.

If every method is a span, none of them are useful.

3. Use observations by default, low-level tracer APIs selectively

In Spring Boot, prefer ObservationRegistry and Observation for most custom instrumentation. A single observation produces metrics and trace spans together, so the two signals stay aligned.

Drop down to the lower-level Tracer API when you specifically need tracing-only behavior, like tighter baggage handling or explicit span lifecycle control.

That keeps the codebase consistent and avoids prematurely hard-wiring your app to one tracing implementation.

4. Wire logs for correlation, not as a primary query model

Logs still matter, but the role has shifted.

In a healthy 2026 stack:

- metrics detect, traces explain, logs add detail
- every log line carries the current trace and span IDs
- you jump from a trace to its logs rather than grepping from scratch

Logs are not your first alert surface, and they should not be your only debugging tool.

They should be structured, correlated, and easy to jump into from a trace.
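Concretely, "correlated" means every structured log line carries the active trace and span IDs, so a trace view can link straight to the matching log lines. This plain-Java sketch fakes the IDs; in a real Spring Boot app they come from the logging MDC via Micrometer's log correlation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CorrelatedLog {
    // Render one structured (JSON-ish) log line.
    // In a real app, traceId/spanId come from the MDC, not method parameters.
    static String line(String level, String message, String traceId, String spanId) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("level", level);
        fields.put("message", message);
        fields.put("trace_id", traceId);   // links this line to its trace
        fields.put("span_id", spanId);     // and to the exact span within it
        StringBuilder sb = new StringBuilder("{");
        fields.forEach((k, v) -> sb.append('"').append(k).append("\":\"").append(v).append("\","));
        sb.setLength(sb.length() - 1);      // drop trailing comma
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(line("INFO", "order placed",
                "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"));
    }
}
```

The exact field names are a backend convention; what matters is that the IDs are machine-parseable fields, not substrings buried in the message.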

5. Turn on exemplars

Exemplars are one of the most underrated parts of a modern observability stack.

They're the bridge between aggregated metrics and individual traces.

Without exemplars, you see that the p95 latency spike happened. Then you go hunting manually for the trace that explains it.

With exemplars, the metric point can link directly to a representative trace from that interval.

That changes the workflow from "the graph is bad, now start guessing" to "the graph is bad, click into the trace that explains the bad point."

If you're using Prometheus-style metrics and Grafana/Tempo, this is one of the highest-leverage features you can enable.
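The mechanism is easy to picture without any Prometheus machinery: alongside each histogram bucket, the metrics library keeps one recent trace ID as a representative sample. A toy sketch; the bucket bounds and names are invented for illustration:

```java
import java.util.Map;
import java.util.TreeMap;

public class ExemplarSketch {
    // bucket upper bound (ms) -> trace ID of a recent request that landed there
    private final Map<Long, String> exemplars = new TreeMap<>();
    private final long[] bucketBoundsMs = {50, 200, 1000};

    // Record a latency observation together with the trace that produced it.
    void record(long latencyMs, String traceId) {
        for (long bound : bucketBoundsMs) {
            if (latencyMs <= bound) {
                exemplars.put(bound, traceId); // newest sample wins
                return;
            }
        }
    }

    // When the latency graph looks bad, this is the trace you click into.
    String exemplarFor(long bucketBoundMs) {
        return exemplars.get(bucketBoundMs);
    }

    public static void main(String[] args) {
        ExemplarSketch h = new ExemplarSketch();
        h.record(30, "trace-fast");
        h.record(800, "trace-slow");
        System.out.println(h.exemplarFor(1000L)); // prints trace-slow
    }
}
```

Real implementations are more careful (sampled exemplars, OpenMetrics exposition, trace-ID validity), but the shape is exactly this: the slow bucket remembers a slow trace.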


A short Spring Boot baseline

If I were setting up a new Spring Boot service today, I would keep the application-side baseline very small:

<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
  </dependency>
  <dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-otel</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-otlp</artifactId>
  </dependency>
  <dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
  </dependency>
</dependencies>

And a minimal config like this:

management:
  endpoints:
    web:
      exposure:
        include: health,prometheus
  tracing:
    sampling:
      probability: 0.1
    baggage:
      remote-fields: tenant-id,plan-tier
      correlation:
        fields: tenant-id,plan-tier
  otlp:
    tracing:
      endpoint: http://otel-collector:4318/v1/traces

Then I would add custom observations only where business flows actually matter:

Observation.createNotStarted("place-order", observationRegistry)
    .lowCardinalityKeyValue("channel", "web") // low-cardinality: safe as a metric tag
    .observe(() -> orderService.placeOrder(command));

A few practical notes:

- sampling.probability: 0.1 samples 10% of traces; raise it in low-traffic services
- the OTLP endpoint above is the collector's HTTP port (4318); gRPC uses 4317
- baggage fields must be listed in remote-fields to propagate, and in correlation.fields to show up in your logs
- keep observation names and tags low-cardinality, or your metrics backend will make you pay for it

That is enough to get real value without turning observability into a side project.


Loki + Tempo + Grafana is a strong OSS default

If you want an open stack, the most coherent default in 2026 is still:

- Prometheus (or Mimir) for metrics
- Loki for logs
- Tempo for traces
- Grafana in front of all three
- an OpenTelemetry Collector feeding them

Why this stack works well:

- one query front end instead of three tools with three logins
- shared labels and trace IDs make cross-signal links cheap
- everything can speak OTLP at the ingestion edge

If you wire this correctly, you get the core navigational loops you actually need during incidents:

- metric spike → exemplar → the trace behind it
- trace → the log lines of that exact request
- log line → back to its trace and its neighbors

That is the point. Not "all three pillars" as a slogan, but fast movement between them.

When I would not run this stack myself

The open stack is a good default when:

- a platform team owns it as a product, not a side quest
- your scale makes managed pricing genuinely painful
- you need data residency or tight cost control

I would choose a managed platform when:

- nobody owns observability as a real responsibility
- engineering time is scarcer than the subscription
- you need mature cross-signal UX today, not after a quarter of wiring
The mistake is not choosing managed. The mistake is pretending "we run open source" is free.

Operating collectors, retention tiers, cardinality control, auth, tenancy, and query performance is real work. If you don't want that work, buy it.


What should alert you, and what should just be available

Most observability stacks fail here, not in instrumentation.

If everything can page, your observability stack becomes a sleep deprivation pipeline.

The best alerting guidance still holds: page on symptoms and user-impact, not on every internal cause.

That means your paging alerts should usually come from things like:

- SLO burn rates on user-facing availability and latency
- elevated error rates at the system edge
- hard dependency outages users can feel

Things I usually do not want paging by default:

- CPU, memory, and GC pressure on a single instance
- a pod restart that self-healed
- queue depth wobbles that clear on their own
- every 5xx from every internal hop

Those are useful signals. They should be visible in dashboards and investigation workflows. They just should not all wake a human up.

The operational split I like is:

Page on

- user-visible symptoms: SLO burn, edge error rate, edge latency

Ticket or notify on

- capacity trends, slow error-budget consumption, expiring certificates and quotas

Keep explorable

- everything else: per-instance resources, GC, queue depths, retries
This is where exemplars and trace-log correlation earn their keep. The things that do not page should still be easy to reach once a real alert fires.

That's the difference between useful telemetry and telemetry hoarding.


The real upgrade in 2026 is not more data

The real upgrade is coherence.

In older stacks, observability often meant separate tools, separate formats, separate teams, and separate assumptions. You could collect a lot and still not answer simple questions during an incident.

The 2026 stack is better because the pieces finally line up:

- one instrumentation standard inside the app
- one propagation format across boundaries
- one protocol to the backend
- cross-signal links instead of three silos

So if you're building a Spring Boot service today, my default recommendation is simple:

- Micrometer Observation in the code, OpenTelemetry underneath
- W3C Trace Context everywhere, B3 only where legacy forces it
- a small baggage allowlist, exemplars on, logs correlated
- alerts on symptoms, everything else explorable

That's enough to build a stack that helps during incidents instead of becoming one.
