Hi,
As a learning project I’m trying to write an AWS client in Elixir. I’m using the API definitions from the Ruby SDK to auto-generate Elixir source files (I subsequently found aws-codegen, but half the fun at this point is figuring out how to do these things).
Some of the source files I’m generating take a long time to compile - the one that prompted me to write this post took 20 minutes. Partly this is due to memory usage - it routinely gets to 9G and above, swaps and may get killed. Smaller files which don’t induce swapping can still take 20-30s to compile (this is on a machine with 16G of RAM).
My issue persists if I cut the generated code back to something like this:
    defmodule Aws.Dynamodb do
      @shapes %{
        "SomeInput" => %{
          type: "structure",
          required: ["Foo", "Bar"],
          members: %{"Foo" => %{"shape" => "AShapeName"}, "Bar" => "AnotherShapeName"}
        },
        "AnotherShapeName" => %{ ... },
        ... # many other shapes
      }

      def some_api_method(data) do
        @shapes["SomeInput"]
      end

      ... # more methods, each referencing @shapes
    end
The API definition files provide a list of definitions of the inputs each API method expects. These shapes usually reference other shapes; for example, a structure shape gives the shapes of each of its members. My pathological sample has about 800 of these shapes (see gist 69e163947ef170c31388518b6616f334). The API methods will eventually check input data against the correct shape, make the HTTP request and then use another shape to decode the response; in the gist, however, all they do is reference @shapes.
If I delete all the methods that reference @shapes then the file compiles in under a second. For every method I add back, compilation gets slower - with 5 methods it takes 3 seconds, with 20 it takes about 14s. At some number of methods swapping kicks in and the compile times explode.
However, if I have a single get_shapes method that just returns @shapes, and all the other methods use get_shapes instead of @shapes directly, then the file compiles in 1s. I also noted that the generated .beam file is a lot smaller.
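To make the workaround concrete, here is a minimal sketch of what I mean. The module name Aws.ShapesSketch, the second method and the three tiny shapes are made up for illustration; the real generated map has about 800 entries.

```elixir
defmodule Aws.ShapesSketch do
  # Hypothetical, heavily trimmed stand-in for the generated shapes map.
  @shapes %{
    "SomeInput" => %{
      type: "structure",
      required: ["Foo", "Bar"],
      members: %{
        "Foo" => %{"shape" => "AShapeName"},
        "Bar" => %{"shape" => "AnotherShapeName"}
      }
    },
    "AShapeName" => %{type: "string"},
    "AnotherShapeName" => %{type: "integer"}
  }

  # The only function in the module that reads @shapes directly.
  def get_shapes, do: @shapes

  # Every other function goes through get_shapes/0 instead of @shapes.
  def some_api_method(_data), do: get_shapes()["SomeInput"]
  def another_api_method(_data), do: get_shapes()["AnotherShapeName"]
end
```

With this layout @shapes is read in exactly one place, no matter how many API methods the module ends up with.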
http://elixir-lang.org/getting-started/module-attributes.html#as-constants says module attributes are used as constants, and indeed that’s what I’m trying to do. It also says
Notice that reading an attribute inside a function takes a snapshot of its current value
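As far as I understand it, the snapshot behaviour in that sentence looks like the small sketch below (SnapshotSketch is a made-up module, not part of my generated code): each function keeps the value the attribute had at the point where the function was defined.

```elixir
defmodule SnapshotSketch do
  @value 1
  # Reads @value while it is 1.
  def first, do: @value

  @value 2
  # Reads @value after it has been reassigned to 2.
  def second, do: @value
end

SnapshotSketch.first()  # returns 1
SnapshotSketch.second() # returns 2
```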
Does this mean that each time I declare a function that references the attribute, Elixir is actually creating / storing a new copy of the attribute? This would explain the size difference in the .beam file when funnelling access via get_shapes, although I don’t understand why it makes compilation so much slower. Am I horribly misusing module attributes?
Thanks,
Fred