Contributor

@dashpole dashpole commented Jan 9, 2026

Objective

Part of #7743. I need a benchmark that demonstrates the performance of our API, SDK, and attribute packages when they are used together according to our performance guide: https://github.com/open-telemetry/opentelemetry-go/blob/main/CONTRIBUTING.md#attribute-and-option-allocation-management

I settled on benchmarking three scenarios: "Precomputed", "Dynamic", and "Naive".

In the "Precomputed" scenario, the attribute sets being measured against are assumed to be known ahead of time: the instrumentation author can enumerate every possible set, precompute whatever they want, and keep references to the results.

In the "Dynamic" scenario, it is assumed that the attribute set being measured against is not known ahead of time, and that it is not feasible to enumerate all possible attribute sets ahead of time. However, this scenario still assumes bounded cardinality, as writing metrics with an unbounded cardinality is not the intended use of the API. I had originally written these benchmarks with varying overall cardinality, but the cardinality does not impact the test results, as long as it is reasonable and bounded (e.g. < 100,000).

In the "Naive" scenario, it is assumed the user uses the API in the simplest, most ergonomic way. This is an attempt to measure the "default" experience of our API + SDK that users get when they use it.
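
To make the three call patterns concrete, here is a minimal sketch using only the standard library. KeyValue, Set, NewSet, and Counter are hypothetical stand-ins for the attribute and metric types, not the real OpenTelemetry API, and the sketch does not claim to mirror how the benchmark in this PR is written. It only shows where the set-building work happens in each scenario.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// KeyValue and Set are hypothetical stand-ins for attribute.KeyValue and
// attribute.Set. They are NOT the OpenTelemetry types.
type KeyValue struct{ Key, Value string }

// Set is a canonical (sorted) encoding of attributes, usable as a map key.
type Set string

// NewSet copies, sorts, and encodes the attributes. This is the work the
// Precomputed scenario does once and the other scenarios do on every call.
func NewSet(kvs ...KeyValue) Set {
	sorted := append([]KeyValue(nil), kvs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Key < sorted[j].Key })
	var sb strings.Builder
	for _, kv := range sorted {
		sb.WriteString(kv.Key)
		sb.WriteByte('=')
		sb.WriteString(kv.Value)
		sb.WriteByte(';')
	}
	return Set(sb.String())
}

// Counter is a stand-in for a counter instrument backed by an SDK.
type Counter struct{ sums map[Set]int64 }

func NewCounter() *Counter { return &Counter{sums: map[Set]int64{}} }

// AddSet records against an already-built set (in the spirit of WithAttributeSet).
func (c *Counter) AddSet(n int64, s Set) { c.sums[s] += n }

// AddAttrs builds the set on every call (in the spirit of WithAttributes).
func (c *Counter) AddAttrs(n int64, kvs ...KeyValue) { c.sums[NewSet(kvs...)] += n }

func main() {
	c := NewCounter()

	// Precomputed: the set is known ahead of time and built exactly once.
	okSet := NewSet(KeyValue{"method", "GET"}, KeyValue{"status", "200"})
	c.AddSet(1, okSet)
	c.AddSet(1, okSet)

	// Dynamic: values arrive at call time, so the caller reuses a scratch
	// slice but still has to build a set for each call.
	scratch := make([]KeyValue, 0, 2)
	for _, status := range []string{"200", "404"} {
		scratch = append(scratch[:0], KeyValue{"status", status}, KeyValue{"method", "GET"})
		c.AddSet(1, NewSet(scratch...))
	}

	// Naive: pass attributes inline and let the callee do all the work.
	c.AddAttrs(1, KeyValue{"method", "GET"}, KeyValue{"status", "200"})

	fmt.Println(c.sums[okSet], len(c.sums))
}
```

In this model the Precomputed path does no per-call set construction at all, which is consistent with the zero-allocation rows in the results below.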

I also found that relative benchmark results did not change when different levels of parallelism are used, so all benchmark results are single-threaded.

Code location

@MrAlias IIRC you have pushed back against including benchmarks like these in the metrics SDK package, so I've put them in an internal directory. LMK if I should move them. I just want them somewhere in this repo so it is easy to evaluate changes to the attributes package that involve the metrics SDK.

Results

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/internal/benchmark
cpu: AMD EPYC 7B12
                                                               │   main.txt   │
                                                               │    sec/op    │
CounterAdd/NoFilter/Attributes/1/Precomputed/WithAttributeSet    68.14n ± 12%
CounterAdd/NoFilter/Attributes/1/Precomputed/WithAttributes      68.44n ±  4%
CounterAdd/NoFilter/Attributes/1/Dynamic/WithAttributeSet        290.8n ±  4%
CounterAdd/NoFilter/Attributes/1/Dynamic/WithAttributes          367.2n ±  4%
CounterAdd/NoFilter/Attributes/1/Naive/WithAttributes            358.7n ±  6%
CounterAdd/NoFilter/Attributes/5/Precomputed/WithAttributeSet    68.62n ±  4%
CounterAdd/NoFilter/Attributes/5/Precomputed/WithAttributes      68.38n ±  2%
CounterAdd/NoFilter/Attributes/5/Dynamic/WithAttributeSet        735.4n ±  4%
CounterAdd/NoFilter/Attributes/5/Dynamic/WithAttributes          885.1n ±  6%
CounterAdd/NoFilter/Attributes/5/Naive/WithAttributes            934.6n ±  7%
CounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributeSet   68.65n ±  2%
CounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributes     68.12n ±  2%
CounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributeSet       1.316µ ±  2%
CounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributes         1.493µ ±  3%
CounterAdd/NoFilter/Attributes/10/Naive/WithAttributes           1.578µ ±  4%
CounterAdd/Filtered/Attributes/1/Precomputed/WithAttributeSet    270.4n ±  4%
CounterAdd/Filtered/Attributes/1/Precomputed/WithAttributes      282.2n ±  2%
CounterAdd/Filtered/Attributes/1/Dynamic/WithAttributeSet        515.6n ±  4%
CounterAdd/Filtered/Attributes/1/Dynamic/WithAttributes          604.4n ±  2%
CounterAdd/Filtered/Attributes/1/Naive/WithAttributes            592.7n ±  2%
CounterAdd/Filtered/Attributes/5/Precomputed/WithAttributeSet    801.6n ±  2%
CounterAdd/Filtered/Attributes/5/Precomputed/WithAttributes      798.6n ±  5%
CounterAdd/Filtered/Attributes/5/Dynamic/WithAttributeSet        1.516µ ±  3%
CounterAdd/Filtered/Attributes/5/Dynamic/WithAttributes          1.656µ ±  7%
CounterAdd/Filtered/Attributes/5/Naive/WithAttributes            1.703µ ±  4%
CounterAdd/Filtered/Attributes/10/Precomputed/WithAttributeSet   1.469µ ±  4%
CounterAdd/Filtered/Attributes/10/Precomputed/WithAttributes     1.416µ ±  9%
CounterAdd/Filtered/Attributes/10/Dynamic/WithAttributeSet       2.819µ ±  2%
CounterAdd/Filtered/Attributes/10/Dynamic/WithAttributes         3.028µ ±  6%
CounterAdd/Filtered/Attributes/10/Naive/WithAttributes           3.023µ ±  5%
geomean                                                          548.9n

                                                               │    main.txt    │
                                                               │      B/op      │
CounterAdd/NoFilter/Attributes/1/Precomputed/WithAttributeSet      0.000 ± 0%
CounterAdd/NoFilter/Attributes/1/Precomputed/WithAttributes        0.000 ± 0%
CounterAdd/NoFilter/Attributes/1/Dynamic/WithAttributeSet          88.00 ± 0%
CounterAdd/NoFilter/Attributes/1/Dynamic/WithAttributes            168.0 ± 0%
CounterAdd/NoFilter/Attributes/1/Naive/WithAttributes              232.0 ± 0%
CounterAdd/NoFilter/Attributes/5/Precomputed/WithAttributeSet      0.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Precomputed/WithAttributes        0.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Dynamic/WithAttributeSet          344.0 ± 0%
CounterAdd/NoFilter/Attributes/5/Dynamic/WithAttributes            680.0 ± 0%
CounterAdd/NoFilter/Attributes/5/Naive/WithAttributes             1000.0 ± 0%
CounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributeSet     0.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributes       0.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributeSet         728.0 ± 0%
CounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributes         1.414Ki ± 0%
CounterAdd/NoFilter/Attributes/10/Naive/WithAttributes           2.102Ki ± 0%
CounterAdd/Filtered/Attributes/1/Precomputed/WithAttributeSet      64.00 ± 0%
CounterAdd/Filtered/Attributes/1/Precomputed/WithAttributes        64.00 ± 0%
CounterAdd/Filtered/Attributes/1/Dynamic/WithAttributeSet          152.0 ± 0%
CounterAdd/Filtered/Attributes/1/Dynamic/WithAttributes            232.0 ± 0%
CounterAdd/Filtered/Attributes/1/Naive/WithAttributes              296.0 ± 0%
CounterAdd/Filtered/Attributes/5/Precomputed/WithAttributeSet      576.0 ± 0%
CounterAdd/Filtered/Attributes/5/Precomputed/WithAttributes        576.0 ± 0%
CounterAdd/Filtered/Attributes/5/Dynamic/WithAttributeSet          920.0 ± 0%
CounterAdd/Filtered/Attributes/5/Dynamic/WithAttributes          1.227Ki ± 0%
CounterAdd/Filtered/Attributes/5/Naive/WithAttributes            1.539Ki ± 0%
CounterAdd/Filtered/Attributes/10/Precomputed/WithAttributeSet   1.312Ki ± 0%
CounterAdd/Filtered/Attributes/10/Precomputed/WithAttributes     1.312Ki ± 0%
CounterAdd/Filtered/Attributes/10/Dynamic/WithAttributeSet       2.023Ki ± 0%
CounterAdd/Filtered/Attributes/10/Dynamic/WithAttributes         2.727Ki ± 0%
CounterAdd/Filtered/Attributes/10/Naive/WithAttributes           3.414Ki ± 0%
geomean                                                                       ¹
¹ summaries must be >0 to compute geomean

                                                               │   main.txt   │
                                                               │  allocs/op   │
CounterAdd/NoFilter/Attributes/1/Precomputed/WithAttributeSet    0.000 ± 0%
CounterAdd/NoFilter/Attributes/1/Precomputed/WithAttributes      0.000 ± 0%
CounterAdd/NoFilter/Attributes/1/Dynamic/WithAttributeSet        2.000 ± 0%
CounterAdd/NoFilter/Attributes/1/Dynamic/WithAttributes          4.000 ± 0%
CounterAdd/NoFilter/Attributes/1/Naive/WithAttributes            5.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Precomputed/WithAttributeSet    0.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Precomputed/WithAttributes      0.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Dynamic/WithAttributeSet        2.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Dynamic/WithAttributes          4.000 ± 0%
CounterAdd/NoFilter/Attributes/5/Naive/WithAttributes            5.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributeSet   0.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributes     0.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributeSet       2.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributes         4.000 ± 0%
CounterAdd/NoFilter/Attributes/10/Naive/WithAttributes           5.000 ± 0%
CounterAdd/Filtered/Attributes/1/Precomputed/WithAttributeSet    1.000 ± 0%
CounterAdd/Filtered/Attributes/1/Precomputed/WithAttributes      1.000 ± 0%
CounterAdd/Filtered/Attributes/1/Dynamic/WithAttributeSet        3.000 ± 0%
CounterAdd/Filtered/Attributes/1/Dynamic/WithAttributes          5.000 ± 0%
CounterAdd/Filtered/Attributes/1/Naive/WithAttributes            6.000 ± 0%
CounterAdd/Filtered/Attributes/5/Precomputed/WithAttributeSet    2.000 ± 0%
CounterAdd/Filtered/Attributes/5/Precomputed/WithAttributes      2.000 ± 0%
CounterAdd/Filtered/Attributes/5/Dynamic/WithAttributeSet        4.000 ± 0%
CounterAdd/Filtered/Attributes/5/Dynamic/WithAttributes          6.000 ± 0%
CounterAdd/Filtered/Attributes/5/Naive/WithAttributes            7.000 ± 0%
CounterAdd/Filtered/Attributes/10/Precomputed/WithAttributeSet   2.000 ± 0%
CounterAdd/Filtered/Attributes/10/Precomputed/WithAttributes     2.000 ± 0%
CounterAdd/Filtered/Attributes/10/Dynamic/WithAttributeSet       4.000 ± 0%
CounterAdd/Filtered/Attributes/10/Dynamic/WithAttributes         6.000 ± 0%
CounterAdd/Filtered/Attributes/10/Naive/WithAttributes           7.000 ± 0%
geomean                                                                     ¹
¹ summaries must be >0 to compute geomean

Observations

  • When the attributes are known ahead of time (Precomputed), WithAttributes and WithAttributeSet have equal performance.
  • When attributes are not known ahead of time (Dynamic), WithAttributes is slower than WithAttributeSet (roughly 7% to 26% in the sec/op results above), mostly because of 2 extra allocations per call.
  • When an attribute filter is applied, the performance of the SDK degrades significantly. With 10 precomputed attributes, time per operation goes from ~70 ns to ~1,400 ns, and allocations go from 0 to 2 per operation.
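
As a quick check on the last observation, the slowdown factor can be computed directly from the sec/op rows above. This is plain arithmetic on the reported numbers (10 attributes, Precomputed); the slowdown helper is just an illustrative name.

```go
package main

import "fmt"

// slowdown returns how many times slower the filtered path is than the
// unfiltered path, given both timings in nanoseconds per operation.
func slowdown(noFilterNs, filteredNs float64) float64 {
	return filteredNs / noFilterNs
}

func main() {
	// Values copied from the results table (1.469µ = 1469 ns, 1.416µ = 1416 ns).
	fmt.Printf("WithAttributeSet: %.1fx\n", slowdown(68.65, 1469))
	fmt.Printf("WithAttributes:   %.1fx\n", slowdown(68.12, 1416))
}
```

Both variants come out at roughly a 21x slowdown once the filter is applied.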

@dashpole dashpole added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Jan 9, 2026
codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.1%. Comparing base (aec1082) to head (3160196).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #7768   +/-   ##
=====================================
  Coverage   86.1%   86.1%           
=====================================
  Files        302     302           
  Lines      22046   22046           
=====================================
+ Hits       18992   18993    +1     
+ Misses      2674    2673    -1     
  Partials     380     380           

see 2 files with indirect coverage changes

@dashpole dashpole force-pushed the additional_attribute_benchmarks branch 2 times, most recently from a4632e9 to 472ef54 Compare January 9, 2026 21:18
@dashpole dashpole force-pushed the additional_attribute_benchmarks branch from f11a492 to 7c8a0ad Compare January 13, 2026 21:10
@dashpole
Contributor Author

I added more benchmark cases to try to make it comprehensive, but the results are now unreadable. I'll probably get rid of the varying cardinality (it doesn't impact any of the results), only run 1 and 10 attributes, and get rid of the no-op results (since soon we can just use the Enabled() method anyway).

@dashpole dashpole requested a review from dmathieu January 14, 2026 18:22
@dashpole
Contributor Author

I've updated the description to explain the scenarios, and why those were chosen, and removed some of the unnecessary permutations of the benchmark. @MrAlias, you had some issues with the framing of the "Dynamic" benchmark case which I tried to address above. I did implement varying degrees of cardinality for the test, but it did not impact the results at all, so I removed it.

@dashpole
Contributor Author

I would probably rather have this just in the metrics SDK package. @MrAlias are you ok with that? Otherwise I can keep it in a new module.

@MrAlias MrAlias mentioned this pull request Jan 16, 2026
39 tasks
Comment on lines +152 to +163
attrsSlice := attrPool.Get().(*[]attribute.KeyValue)
defer func() {
	*attrsSlice = (*attrsSlice)[:0] // Reset.
	attrPool.Put(attrsSlice)
}()
appendAttributes(attrsLen, attrsSlice)
addOpt := addOptPool.Get().(*[]metric.AddOption)
defer func() {
	*addOpt = (*addOpt)[:0]
	addOptPool.Put(addOpt)
}()
counter.Add(ctx, 1, metric.WithAttributes(*attrsSlice...))
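
For readers unfamiliar with the pooling pattern in the excerpt above, here is a standalone sketch of it using only the standard library. KeyValue is a hypothetical stand-in for attribute.KeyValue, and record and sink are illustrative names that do not appear in the PR. The point is the borrow, fill, reset, and return cycle on a pooled slice pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// KeyValue is a hypothetical stand-in for attribute.KeyValue.
type KeyValue struct{ Key, Value string }

// attrPool hands out reusable slices. Storing *[]KeyValue rather than
// []KeyValue avoids allocating to box the slice header on every Put.
var attrPool = sync.Pool{
	New: func() any {
		s := make([]KeyValue, 0, 10)
		return &s
	},
}

// record borrows a slice from the pool, fills it with one attribute per
// value, hands it to sink, and returns it to the pool with its length
// reset so the backing array can be reused by the next caller.
func record(values []string, sink func([]KeyValue)) {
	attrs := attrPool.Get().(*[]KeyValue)
	defer func() {
		*attrs = (*attrs)[:0] // Reset length, keep capacity.
		attrPool.Put(attrs)
	}()
	for _, v := range values {
		*attrs = append(*attrs, KeyValue{Key: "key", Value: v})
	}
	sink(*attrs)
}

func main() {
	record([]string{"a", "b", "c"}, func(kvs []KeyValue) {
		fmt.Println(len(kvs), cap(kvs))
	})
}
```

The sink must not retain the slice after it returns, because the backing array goes back to the pool and will be overwritten by a later call.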
Contributor

Similar to what was mentioned in this thread, this seems to be biasing results. #7790 is using these results as a comparison, but is using an internal map to further optimize its results. A similar approach could be taken, as pointed out here #7770 (comment).

If we are going to use these benchmarks to make comparisons, they need to be done in an apples-to-apples way.

Contributor Author

@dashpole dashpole Jan 22, 2026

#7790 does use an additional sync.Map for the PoC, but that was just to avoid a larger refactor that would re-use the existing internal storage of the SDK. The bound instrument API can be implemented without using an additional sync.Map. If that is the blocker, I can do the refactor (it is just a day or two of work I'd rather not do if it doesn't matter).

The real question is whether we are going to recommend using a sync.Map for performance-sensitive users. Today, a user would need to:

  • Copy our hashing algorithm for []attribute.KeyValue
  • Implement (or copy) a sync.Map that limits cardinality, and (potentially) includes a TTL.
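
To give a sense of what those two items involve, here is a rough sketch of a cardinality-limited cache built on sync.Map, using only the standard library. Everything in it is an assumption for illustration: KeyValue stands in for attribute.KeyValue, hashKVs is a plain FNV-1a hash and not the attribute package's hashing algorithm, and BoundedCache is a hypothetical type. It has no TTL, ignores hash collisions, enforces the limit only approximately under concurrency, and expects callers to pass attributes in a consistent order.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
	"sync/atomic"
)

// KeyValue is a hypothetical stand-in for attribute.KeyValue.
type KeyValue struct{ Key, Value string }

// hashKVs is an illustrative FNV-1a hash over the pairs in the order given.
// It is NOT the hashing algorithm used by the attribute package.
func hashKVs(kvs []KeyValue) uint64 {
	h := fnv.New64a()
	for _, kv := range kvs {
		h.Write([]byte(kv.Key))
		h.Write([]byte{0}) // Separator so ("ab","c") differs from ("a","bc").
		h.Write([]byte(kv.Value))
		h.Write([]byte{0})
	}
	return h.Sum64()
}

// BoundedCache caches one value per attribute hash, up to limit entries.
type BoundedCache[V any] struct {
	m     sync.Map // uint64 -> V
	size  atomic.Int64
	limit int64
}

func NewBoundedCache[V any](limit int64) *BoundedCache[V] {
	return &BoundedCache[V]{limit: limit}
}

// GetOrCreate returns the cached value for kvs, building it on a miss.
// Once the limit is reached, new values are built but not stored.
func (c *BoundedCache[V]) GetOrCreate(kvs []KeyValue, build func([]KeyValue) V) V {
	key := hashKVs(kvs)
	if v, ok := c.m.Load(key); ok {
		return v.(V)
	}
	v := build(kvs)
	if c.size.Load() >= c.limit {
		return v // Over the cardinality limit: do not grow the cache.
	}
	actual, loaded := c.m.LoadOrStore(key, v)
	if !loaded {
		c.size.Add(1)
	}
	return actual.(V)
}

func main() {
	cache := NewBoundedCache[string](1)
	build := func(kvs []KeyValue) string { return fmt.Sprint(kvs) }
	a := cache.GetOrCreate([]KeyValue{{"method", "GET"}}, build)
	b := cache.GetOrCreate([]KeyValue{{"method", "POST"}}, build) // Not stored.
	fmt.Println(a, b, cache.size.Load())
}
```

Even this small sketch leaves open the questions raised above (eviction, TTL, and memory cost when no SDK is installed), which is part of why it reads as a separate proposal rather than the current state.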

So I don't think it is viable today. Or at least it isn't viable enough that it is in our contributor guidelines. We could add libraries and functions to address those downsides. Maybe attribute.NewSetWithCaching(kvs ...attribute.KeyValue) would be ergonomic? But I consider that an alternative proposal, not the current state. It has other downsides, like incurring memory usage when no SDK is involved.

Contributor

I'm not sure what to say. I was told to have performance conversations here instead of #7790. Now I'm being told this should not be compared to the performance expectation of future API changes.

I'm fine not talking about future API changes here. The problem is that I don't know where to have this conversation.

Contributor Author

I would propose that we:

  1. Agree that this PR (or something similar) represents the current state of the performance of the API + SDK, and merge it.
  2. Co-write a proposal and/or PoC for how we could make a cache-based approach viable.
  3. Debate the two proposals.

There are drawbacks to a cache-based approach, but I can't argue against a proposal that doesn't exist. I can compare prototypes against the current state of the API + SDK, which I've tried to do fairly. But if we are serious about a cache-based alternative, we should work through the design before we assume it is a viable solution to the performance problem.

Contributor Author

I've documented what I think the best cache-based approaches are, and why they are not an acceptable solution to the stated problem in #7743 (comment)
