Linking OpenTelemetry spans across processes

Hi,
I’m having a hard time making sense of how to link child spans from Oban to the parent span, and I hope somebody can clear that up for me.

Here is roughly the code from the first pass

span_ctx = OpenTelemetry.Tracer.start_span(:parent)
Task.async(fn ->
  link = OpenTelemetry.link(span_ctx) |> IO.inspect(label: "link")
  OpenTelemetry.Tracer.with_span :"my-task", %{links: [link]} do
    :hello
  end
end)
:otel_span.end_span(span_ctx, :undefined)

In the output, I can see that the trace_id inside the link matches the second element of the span_ctx tuple, but in Zipkin these are two different traces and “my-task” doesn’t have a parent.


iex(5)> span_ctx = OpenTelemetry.Tracer.start_span(:parent)
{:span_ctx, 98688298365639109931104546606962518618, 12736530893077507678, 1, [],
 true, false, true,
 {:otel_span_ets, #Function<2.133377841/1 in :otel_tracer_server.on_end/1>}}
iex(6)> Task.async(fn ->
...(6)>   link = OpenTelemetry.link(span_ctx) |> IO.inspect(label: "link")
...(6)>   OpenTelemetry.Tracer.with_span :"my-task", %{links: [link]} do
...(6)>     :hello
...(6)>   end
...(6)> end)
link: %{
  attributes: %{},
  span_id: 12736530893077507678,
  trace_id: 98688298365639109931104546606962518618,
  tracestate: []
}
%Task{
  mfa: {:erlang, :apply, 2},
  owner: #PID<0.887.0>,
  pid: #PID<0.899.0>,
  ref: #Reference<0.3396824287.1941766146.188407>
}
iex(7)> :otel_span.end_span(span_ctx, :undefined)
{:span_ctx, 98688298365639109931104546606962518618, 12736530893077507678, 1, [],
 true, false, false,
 {:otel_span_ets, #Function<2.133377841/1 in :otel_tracer_server.on_end/1>}}
iex(8)>

Second pass

I tried start_span / end_span instead of with_span, but the result was the same: two uncorrelated spans.



span_ctx = OpenTelemetry.Tracer.start_span(:parent) |> IO.inspect(label: "parent_ctx")
Task.async(fn ->
  link = OpenTelemetry.link(span_ctx) |> IO.inspect(label: "link")
  child_span_ctx = OpenTelemetry.Tracer.start_span(:child_async, %{links: [link]})  |> IO.inspect(label: "child span ctx")
  Process.sleep(20)
  :otel_span.end_span(child_span_ctx, :undefined)
end)
:otel_span.end_span(span_ctx, :undefined)

However, I noticed that the child span doesn’t contain anything relating it to the parent span.

Third pass

I briefly looked at otel_span.erl and decided to set the parent span as the current span before starting a new span in the child process, and it worked:

span_ctx = OpenTelemetry.Tracer.start_span(:parent) |> IO.inspect(label: "parent_ctx")
Task.async(fn ->
  OpenTelemetry.Tracer.set_current_span(span_ctx)
  OpenTelemetry.Tracer.with_span(:child_third_pass) do
    Process.sleep(20)
  end
end)
:otel_span.end_span(span_ctx, :undefined)

Am I confused about the role of the links?

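It’s been a few years, but I think the interpretation of a “link” is up to the renderer, which decides how to show the relationship, so a parent seems closer to what you want. A link only records a reference to another span; it does not make that span the parent, so without a current span in the task the new span starts its own trace.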
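
For what it's worth, here is a minimal sketch of the parent approach. It assumes the opentelemetry_api package and its OpenTelemetry.Ctx.get_current/0 and OpenTelemetry.Ctx.attach/1 functions; the span names are made up. The idea is to capture the caller's context, which holds the current span, and attach it inside the task so the new span picks up :parent as its parent:

```elixir
require OpenTelemetry.Tracer, as: Tracer

Tracer.with_span :parent do
  # Capture the caller's context; inside with_span it holds :parent as the current span.
  ctx = OpenTelemetry.Ctx.get_current()

  task =
    Task.async(fn ->
      # The task is a separate process with no context of its own,
      # so attach the captured one before starting the child span.
      OpenTelemetry.Ctx.attach(ctx)

      Tracer.with_span :child do
        :hello
      end
    end)

  Task.await(task)
end
```

This only covers a process started from the same code path. For a job that runs later or on another node, the context would have to be serialized and restored some other way, which this sketch does not attempt.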

In other languages, you also have to be careful to thread the parent span along properly as you switch between threads. Passing the span explicitly between function calls gets noisy, so each language uses its own idiom for implicitly propagating a value within one thread across function calls. I haven’t thought about how a tracer would be implemented in Elixir.
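
On the BEAM, the usual idiom for this kind of implicit per-process value is the process dictionary, and as far as I know that is where the OpenTelemetry Erlang/Elixir API keeps its current context. The sketch below uses plain Elixir only, with a made-up :current_span key (not an OpenTelemetry internal), to show why a value stored that way is invisible in a new process unless it is passed along explicitly:

```elixir
# Hypothetical key, used only for illustration; not an OpenTelemetry internal.
key = :current_span

# Store a value in the calling process's dictionary.
Process.put(key, :parent)

# A task runs in a new process, which starts with its own empty dictionary.
inherited = Task.async(fn -> Process.get(key) end) |> Task.await()

# Capturing the value in the closure and re-installing it in the task works.
captured = Process.get(key)

passed =
  Task.async(fn ->
    Process.put(key, captured)
    Process.get(key)
  end)
  |> Task.await()

IO.inspect(inherited, label: "inherited")
IO.inspect(passed, label: "passed")
```

The first task sees nil and the second sees :parent, which matches what the three passes in the question showed: the span has to be handed to the child process and made current there.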