Help with structs benchmarks

Hi to everyone!

I am looking for some Elixir benchmarks that use structs. I would appreciate it if you could share some with me.

Why? I’m working on a project (my undergraduate thesis) whose goal is to improve the performance of the map lookup operation in BEAM. Now, I’m at a stage where I need to perform tests to analyze the behavior of the current BEAM implementation when using Elixir structs.

5 Likes

Here’s some prior discussion that seems like a good starting point:

A big change worth learning about for this specific case (Elixir structs) is the new_map_lit machinery that landed in OTP 21:

2 Likes

Thank you @al2o3cr!

In particular, we are looking for any function or library that makes heavy use of structs, especially under a loop.

So for example, if you have code or library that is traversing a large list and modifying a struct as it goes, that would be very helpful.

1 Like

In Membrane, we use nested structs and (usually small) maps for element state, which is basically a GenServer state. There are rather no heavy loops over it, but we figured out that access to structs impacts performance significantly. So we replaced Elixir’s Access by macros that construct pattern matches in the compile time. That turned out to provide significant speedup in some of our benchmarks. Not sure if that’s what you’re looking for, but I guess speeding up structs/maps themselves could give Membrane another boost.

3 Likes

I think rebuilding Commanded’s aggregate from a lot of events (when there’s no snapshot) could be a good example of that.
We’re doing that in our company too. I can provide you with some test code and data or maybe even a Benchee script to run, if you want.

2 Likes

@stefanluptak that would be perfect! An Elixir script with Mix.install at the top that installs the required depsand then proceeds to aggregate would be perfect!

@mat-hek if that can be isolated, that would be perfect too! Especially knowing the code was a bottleneck in the past.

1 Like

OK, I will do my best. But since it’s Friday already, I will send you that on Monday.

Oh!

Here’s a thing that might help. I’ve been (experimenting) with XML parsing. One iteration I did was creating a Saxy handler that parses the XML into a struct, something like the following,

Given this xml:

<ParentNode price="1">
  <MyNode>Stuff</MyNode>
  <AnotherNode>
    <Child>Stuff</Child>
  </AnotherNode>
</ParentNode>

The Saxy handler returns:

 %XMLNode{
   attributes: [%XMLAttr{name: "price", value: "1"}],
   content: [
     %XMLNode{attributes: [], content: ["Stuff"], name: "MyNode"},
     %XMLNode{
       attributes: [],
       content: [%XMLNode{attributes: [], content: ["Stuff"], name: "Child"}],
       name: "AnotherNode"
     }
   ],
   name: "ParentNode"
 }

THEN we have a library called data_schema that queries into that representation to create a different struct from it. It does that by reducing through a schema and querying into the result from Saxy, so if you have large XML you’ll be doing lots and lots of Map.gets into the result of the Saxy handler to get the values for your new struct.

I have created a branch on the data_schema repo called xpath_experiment that has the Saxy handler and a data accessor that you can use to parse XML.
See this example from the tests:

Doing this will result in a lot of Map.fetch into a struct - 1 per node and attr.

If you think this will be useful to you I can get together a large XML (~15mb) example to use, along with schemas for that.

2 Likes

Yes, I would love to use this benchmark! However, note that we won’t be able to optimize Map.fetch but we should be able to optimize pattern matching:

case map do
  %{^key => value} -> ...
  %{} -> ...
end

Or map.field.

1 Like

Nice, yea no problem it is trivial to turn the Map.fetch! into a map.field

I will get something together today for you.

Sorry for the delay, took a bit longer to get the schemas in order than I expected.

Here is a benchark that you can run with mix run bench.exs at the root of the project. It will parse a large XML in the manner I mentioned.

Let me know if you can’t access it for whatever reason.

Adz

2 Likes

Sorry, it’s a bit later… :man_facepalming:

I realized that the core what we’re doing is working with a tree which is inspired by this GitHub - japplegame/avl_tree: Pure Elixir AVL tree implementation

But that one above is using tuples, so I changed it to use structs. Here’s the fork: GitHub - stefanluptak/avl_tree at use-structs-instead-of-tuples

There’s even a benchmark script already, so you can just run mix run bench/run.exs Just do not forget to use that particular branch on my fork.

I hope it will be helpful somehow. :slightly_smiling_face:

2 Likes

It’s been some time, but the benchmark we had wasn’t straightforward to run, so I made another one: membrane_benchmarks/core_beamchmark at master · membraneframework-labs/membrane_benchmarks · GitHub

Hope it’s useful :wink:

3 Likes