Strange behavio(u)r in my tests

Hi,

I’m currently working on a client for the Tumblr API. It turns out to be way more work than anticipated, but it’s still a good way to practice.

However, I’m currently running into a problem with a test case where each test takes about 2 seconds to run. These tests set up their own Cowboy server (thanks to Bypass) in order to simulate the API calls. I have written such tests before and never had this issue.
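
For context, the setup those tests rely on looks roughly like this (a sketch based on the diff further down in this thread; the client map is a placeholder, not the exact contents of the linked test file):

defmodule ExTumblr.PostsTest do
  use ExUnit.Case

  setup do
    # Bypass starts a real Cowboy server on a random local port.
    bypass = Bypass.open()
    # Placeholder client pointing at the local Bypass server; the real
    # test file builds an ExTumblr client instead of a bare map.
    client = %{base_url: "http://localhost:#{bypass.port}", api_key: "ck"}
    # The :bypass and :client keys are what the tests below read from
    # their context.
    {:ok, bypass: bypass, client: client}
  end
end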

This is the file I’m working on:
https://github.com/svarlet/ex_tumblr/blob/rework_testing_with_bypass/test/ex_tumblr/posts_test.exs#L49

As you can see, the JSON response is read from a file, ‘posts.json’, which is also in the GitHub repo. It is a rather large file (400KB), so my first suspicion was the size of the file and the time taken by Poison to parse it.

So I used ExProf to check that theory:

FUNCTION                                                  CALLS        %    TIME  [uS / CALLS]
--------                                                  -----  -------    ----  [----------]
[...]
'Elixir.Enum':'-reduce/3-fun-1-'/4                          207     0.03     196  [      0.95]
'Elixir.ExTumblr.Utils.Parsing':'-to_struct/3-fun-0-'/2     206     0.04     198  [      0.96]
lists:reverse/1                                             105     0.04     226  [      2.15]
lists:foldl/3                                               236     0.04     246  [      1.04]
erlang:binary_to_atom/2                                     206     0.04     246  [      1.19]
hackney_bstr:'-to_lower/1-lbc$^0/2-0-'/2                    130     0.05     282  [      2.17]
inet_tcp:recv/3                                             280     0.05     303  [      1.08]
hackney_tcp_transport:recv/3                                280     0.05     303  [      1.08]
hackney_http:ce_identity/1                                  280     0.05     306  [      1.09]
prim_inet:recv/3                                            280     0.06     311  [      1.11]
prim_inet:recv0/3                                           280     0.06     320  [      1.14]
hackney_http:content_decode/3                               280     0.06     321  [      1.15]
'Elixir.String.Chars':to_string/1                           342     0.06     322  [      0.94]
'Elixir.String.Chars.Atom':to_string/1                      342     0.06     326  [      0.95]
gen_tcp:recv/3                                              280     0.06     326  [      1.16]
erlang:atom_to_binary/2                                     343     0.06     333  [      0.97]
hackney_response:stream_body_recv/2                         279     0.06     340  [      1.22]
'Elixir.Map':do_take/3                                      363     0.06     342  [      0.94]
erlang:port_get_data/1                                      282     0.06     344  [      1.22]
hackney_response:stream_body/2                              279     0.06     353  [      1.27]
hackney_http:te_identity/2                                  280     0.06     358  [      1.28]
hackney_http:transfer_decode/2                              280     0.06     363  [      1.30]
hackney_response:read_body/3                                281     0.07     382  [      1.36]
maps:'-fold/3-fun-0-'/3                                     207     0.07     398  [      1.92]
maps:find/2                                                 343     0.07     402  [      1.17]
hackney_bstr:char_to_lower/1                                115     0.08     475  [      4.13]
hackney_http:'-parse_body/1-fun-3-'/1                       280     0.11     608  [      2.17]
'Elixir.String.Chars':impl_for/1                            342     0.11     626  [      1.83]
'Elixir.ExTumblr.Post':'-parse/1-fun-0-'/1                  340     0.11     632  [      1.86]
'Elixir.String.Chars':'impl_for!'/1                         342     0.11     639  [      1.87]
hackney_http:execute/1                                      286     0.12     652  [      2.28]
'Elixir.Enum':'-map/2-lists^map/1-0-'/2                     381     0.12     670  [      1.76]
hackney_http:'-parse_body/1-fun-2-'/2                       280     0.12     674  [      2.41]
hackney_response:stream_body1/2                             560     0.12     685  [      1.22]
inet_db:lookup_socket/1                                     282     0.12     693  [      2.46]
prim_inet:async_recv/3                                      280     0.12     700  [      2.50]
hackney_response:recv/1                                     280     0.13     714  [      2.55]
prim_inet:enc_time/1                                        281     0.13     724  [      2.58]
prim_inet:ctl_cmd/3                                         287     0.13     741  [      2.58]
hackney_http:parse_body/1                                   561     0.13     761  [      1.36]
erlang:port_control/3                                       287     0.14     785  [      2.74]
'Elixir.Poison.Parser':array_values/3                       711     0.14     791  [      1.11]
'Elixir.Poison.Parser':number_frac/2                        788     0.15     830  [      1.05]
'Elixir.Poison.Parser':number_start/1                       788     0.15     838  [      1.06]
erlang:binary_to_integer/1                                  788     0.15     839  [      1.06]
'Elixir.Poison.Parser':number_int/2                         784     0.15     853  [      1.09]
hackney_response:stream_body/1                              281     0.16     883  [      3.14]
'Elixir.Poison.Parser':number_exp/3                         788     0.16     903  [      1.15]
maps:from_list/1                                            729     0.27    1514  [      2.08]
'Elixir.Poison.Parser':number_complete/2                    788     0.28    1571  [      1.99]
'Elixir.Poison.Parser':number_digits/1                      784     0.29    1620  [      2.07]
hackney_http:execute/2                                      567     0.29    1641  [      2.89]
erts_internal:port_control/3                                287     0.34    1905  [      6.64]
erlang:setelement/3                                        1728     0.39    2228  [      1.29]
'Elixir.Poison.Parser':object_pairs/3                      5077     1.00    5634  [      1.11]
'Elixir.Poison.Parser':number_digits_count/2               5947     1.20    6765  [      1.14]
'Elixir.Poison.Parser':object_name/2                       5077     1.80   10167  [      2.00]
'Elixir.Poison.Parser':value/2                             5769     2.14   12061  [      2.09]
erlang:iolist_to_binary/1                                  9324     2.23   12558  [      1.35]
erlang:binary_to_integer/2                                11587     2.58   14537  [      1.25]
'Elixir.Poison.Parser':string_escape/2                    19399     4.10   23114  [      1.19]
'Elixir.Poison.Parser':skip_whitespace/1                  21712     7.95   44850  [      2.07]
'Elixir.Poison.Parser':string_continue/2                  46845     9.96   56241  [      1.20]
'Elixir.Poison.Parser':string_chunk_size/2               303284    60.04  338888  [      1.12]
-------------------------------------------------------  ------  -------  ------  [----------]
Total:                                                   457768  100.00%  564404  [      1.23]
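
(For reference, a profile like the one above can be produced with ExProf’s profile macro; the exact expression being profiled here is my assumption:)

defmodule ProfileParsing do
  import ExProf.Macro

  # Profiles one request/parse round trip and prints a table like the
  # one above, one row per called function.
  def run(client) do
    profile do
      ExTumblr.Posts.request(client, "peacecorps.tumblr.com", nil)
    end
  end
end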

Indeed, parsing takes time (values are in µs), but that’s only about half a second. With my 4 tests running sequentially I would therefore expect the test suite to take around 2 seconds. Yet the test suite takes more than 10 seconds to run.

mix test --trace test/ex_tumblr/posts_test.exs

ExTumblr.PostTest

  • test parse/1 can transform a generic map into a Post struct (6.5ms)

ExTumblr.PostsTest

  • test emits request to the right path (195.3ms)
  • test parses the http response into a Posts struct (55.6ms)
  • test specifies the api key in the query (33.7ms)
  • test sends a GET request (28.5ms)

Finished in 10.1 seconds
5 tests, 0 failures

What I notice every time the suite runs is a ~10 second pause between the two test cases. To be more specific:

mix test --trace test/ex_tumblr/posts_test.exs

ExTumblr.PostTest

  • test parse/1 can transform a generic map into a Post struct (6.5ms)
    ############### 10 SECONDS PAUSE HERE ##############

ExTumblr.PostsTest

  • test emits request to the right path (195.3ms)
  • test parses the http response into a Posts struct (55.6ms)
  • test specifies the api key in the query (33.7ms)
  • test sends a GET request (28.5ms)

Finished in 10.1 seconds
5 tests, 0 failures

If I comment out all these tests but one, I observe the same behaviour with a ~2 second pause. Every test basically adds its own ~2 second pause.

I already posted this in the Slack chat, where @micmus and @ericmj were really kind and helpful, but we couldn’t find the cause.

Can anybody find the problem?

Thanks

2 Likes

I also ran :observer to get more data about this problem. Running just this one test case uses about 700MB of memory!
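
(One way to do this, in case anyone wants to reproduce it; how exactly :observer was started is not in the original post:)

# Run the test suite inside an IEx session, e.g. `iex -S mix test`,
# then open the Observer GUI from the shell:
:observer.start()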

1 Like

I’d love to hear the reason behind it, but the cause of the slowdown in this case seems to come from using a module attribute to load the prebaked posts.

If you switch to a value set in a setup callback, the test duration goes down dramatically (from 3.3 seconds to 0.3 seconds on my machine).

This is the whole diff needed for that improvement:

diff --git a/test/ex_tumblr/posts_test.exs b/test/ex_tumblr/posts_test.exs
index c10f477..a8bd765 100644
--- a/test/ex_tumblr/posts_test.exs
+++ b/test/ex_tumblr/posts_test.exs
@@ -49,12 +49,14 @@ defmodule ExTumblr.PostsTest do
 
   alias ExTumblr.Posts
 
-  @prebaked_response File.read!(Path.join(__DIR__, "posts.json"))
+  setup context do
+    {:ok, prebaked_response: File.read!(Path.join(__DIR__, "posts.json"))}
+  end
 
   test "sends a GET request", context do
     Bypass.expect context.bypass, fn conn ->
       assert "GET" == conn.method
-      Plug.Conn.resp conn, 200, @prebaked_response
+      Plug.Conn.resp conn, 200, context[:prebaked_response]
     end
     Posts.request(context.client, "peacecorps.tumblr.com", nil)
   end
@@ -62,7 +64,7 @@ defmodule ExTumblr.PostsTest do
   test "specifies the api key in the query", context do
     Bypass.expect context.bypass, fn conn ->
       assert "api_key=ck" == conn.query_string
-      Plug.Conn.resp conn, 200, @prebaked_response
+      Plug.Conn.resp conn, 200, context[:prebaked_response]
     end
     Posts.request(context.client, "peacecorps.tumblr.com", nil)
   end
@@ -70,14 +72,14 @@ defmodule ExTumblr.PostsTest do
   test "emits request to the right path", context do
     Bypass.expect context.bypass, fn conn ->
       assert "/v2/blog/peacecorps.tumblr.com/posts" == conn.request_path
-      Plug.Conn.resp conn, 200, @prebaked_response
+      Plug.Conn.resp conn, 200, context[:prebaked_response]
     end
     Posts.request(context.client, "peacecorps.tumblr.com", nil)
   end
 
   test "parses the http response into a Posts struct", context do
     Bypass.expect context.bypass, fn conn ->
-      Plug.Conn.resp conn, 200, @prebaked_response
+      Plug.Conn.resp conn, 200, context[:prebaked_response]
     end
     %Posts{blog: blog, posts: posts, total_posts: total_posts} = Posts.request(context.client, "peacecorps.tumblr.com", nil)
     assert blog != nil

1 Like

Hi @mosic,

Thanks for sharing your solution. It is extremely counterintuitive, though, because:

  1. a module attribute is created at compile time, therefore the JSON file is read only once, and only at compile time.
  2. reading the file in the setup callback means the file is read at runtime, as many times as there are tests (see the quick check after this list).
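
To put a rough number on point 2 (my own quick check, not something from the original exchange):

# Reading the 400KB fixture from disk at runtime, once:
{micros, _contents} = :timer.tc(fn -> File.read!(Path.join(__DIR__, "posts.json")) end)
IO.puts("File.read! took #{micros / 1_000} ms")  # typically a fraction of a millisecond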

I’m about to check your measurements and, like you, I’m curious about the explanation if I observe the same results.

EDIT: I confirm that my entire test suite now takes 0.5 seconds to run vs 10.1 seconds before your patch. Amazing. I really don’t understand why, though.

2 Likes

You can use setup_all, which runs a single time for all tests in the module.
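
In the test above that would look like this (the fixture is still read only once, but at runtime instead of compile time):

setup_all do
  # Runs once before all tests in the module; the returned values are
  # merged into every test's context, so context[:prebaked_response]
  # keeps working unchanged.
  {:ok, prebaked_response: File.read!(Path.join(__DIR__, "posts.json"))}
end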

1 Like

Good point @ericmj!

1 Like

A note for the future: try not to post questions on both the forum and the mailing list. It would be a shame to duplicate the efforts of the people helping.

1 Like

@ericmj sure, I originally posted on the mailing list, but I didn’t see any activity there for a couple of days after posting. Then I saw this message:

This is the (deprecated) Elixir talk mailing list. We welcome developers to join the Elixir Forum for questions, discussions and everything else related to Elixir.

Perhaps this message was misleading?

So I posted my message here too, and then I started receiving replies on the mailing list.

I’m the first casualty of this duplicated thread, as I have to keep both updated all the time.

The problem here is that the module attribute is evaluated at compile time and its value is compiled into the caller. So the slowdown you see is the Elixir compiler compiling the module attribute, in this case a 400KB file, into every test that uses it.

In other words, the cost of compiling the 400KB file far outweighs the cost of reading it at runtime.
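
A tiny illustration of the inlining (my own example; the file name is made up):

defmodule AttributeInlining do
  # Evaluated once, at compile time.
  @data File.read!("large_fixture.json")

  # Every reference to @data embeds the whole literal into the compiled
  # code of that function, so a 400KB attribute used in four tests means
  # the compiler has to chew through roughly 4 x 400KB of literal data.
  def a, do: byte_size(@data)
  def b, do: byte_size(@data)
end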

Thank you for further investigating it!

3 Likes

the cost of compiling the 400KB file

What extra step(s) is the compiler doing, exactly? Isn’t it just a string? Isn’t it just inlining a giant string wherever the attribute is used?

Thank you!

1 Like

A compiler has multiple passes. So the first step is to convert the string into Elixir AST. That’s easy because strings are valid AST literals. Then we need to convert it to Erlang AST. At this point, the string "foo" becomes:

{:bin, 0,  [{:bin_element, 0, {:string, 0, 'foo'}, :default, :default}]}

Now every other compiler pass will traverse such nodes. Later we need to convert the node above into Core Erlang and then into other formats. So you can imagine that a big string implies a lot of work for the compiler.
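
You can see the first step in IEx: a string literal is already its own Elixir AST node, so quoting it gives the string straight back.

iex> quote do: "foo"
"foo"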

However, I haven’t measured. Maybe most compiler passes handle it really fast, with the exception of one really slow pass that could be quickly optimized. It would be worth investigating.

2 Likes

More information. I have created an Erlang file with a single exported function that returns a large char list (14MB). Here are the results:

$ erlc +time foo.erl
Compiling "foo"
 remove_file                   :      0.000 s       2.7 kB
 parse_module                  :      5.969 s  210108.2 kB
 transform_module              :      0.000 s  210108.2 kB
 lint_module                   :      0.016 s  210108.2 kB
 expand_module                 :      0.004 s  210109.0 kB
 v3_core                       :      0.009 s  210115.7 kB
 sys_core_fold                 :      0.007 s  210111.3 kB
 core_transforms               :      0.000 s  210111.3 kB
 sys_core_dsetel               :      0.001 s  210111.3 kB
 v3_kernel                     :      0.004 s  210111.4 kB
 v3_life                       :      0.002 s  210110.6 kB
 v3_codegen                    :      0.005 s  210109.8 kB
 beam_a                        :      0.003 s  210109.6 kB
 beam_block                    :      0.005 s  210109.8 kB
 beam_except                   :      0.001 s  210109.8 kB
 beam_bool                     :      0.002 s  210109.8 kB
 beam_type                     :      0.002 s  210109.8 kB
 beam_split                    :      0.001 s  210109.6 kB
 beam_dead                     :      0.001 s  210109.8 kB
 beam_jump                     :      0.000 s  210109.8 kB
 beam_peep                     :      0.001 s  210109.8 kB
 beam_clean                    :      0.000 s  210109.8 kB
 beam_bsm                      :      0.002 s  210109.8 kB
 beam_receive                  :      0.001 s  210109.8 kB
 beam_trim                     :      0.001 s  210109.8 kB
 beam_flatten                  :      0.001 s  210109.6 kB
 beam_z                        :      0.001 s  210109.5 kB
 beam_validator                :      0.003 s  210109.5 kB
 beam_asm                      :      0.662 s       2.8 kB
 save_binary                   :      0.000 s       2.8 kB

However, if I convert it to a binary, there are two passes that take a really long time:

$ erlc +time foo.erl
Compiling "foo"
 remove_file                   :      0.000 s       2.7 kB
 parse_module                  :      5.837 s  210108.3 kB
 transform_module              :      0.000 s  210108.3 kB
 lint_module                   :      0.018 s  210108.3 kB
 expand_module                 :     21.961 s 2836421.5 kB
 v3_core                       :     26.641 s      10.8 kB
 sys_core_fold                 :      0.009 s       6.4 kB
 core_transforms               :      0.000 s       6.4 kB
 sys_core_dsetel               :      0.002 s       6.4 kB
 v3_kernel                     :      0.004 s       6.5 kB
 v3_life                       :      0.002 s       5.7 kB
 v3_codegen                    :      0.005 s       4.9 kB
 beam_a                        :      0.003 s       4.6 kB
 beam_block                    :      0.007 s       4.9 kB
 beam_except                   :      0.001 s       4.9 kB
 beam_bool                     :      0.002 s       4.9 kB
 beam_type                     :      0.002 s       4.9 kB
 beam_split                    :      0.001 s       4.7 kB
 beam_dead                     :      0.002 s       4.9 kB
 beam_jump                     :      0.000 s       4.9 kB
 beam_peep                     :      0.001 s       4.9 kB
 beam_clean                    :      0.000 s       4.9 kB
 beam_bsm                      :      0.002 s       4.9 kB
 beam_receive                  :      0.001 s       4.9 kB
 beam_trim                     :      0.001 s       4.9 kB
 beam_flatten                  :      0.001 s       4.6 kB
 beam_z                        :      0.001 s       4.6 kB
 beam_validator                :      0.003 s       4.6 kB
 beam_asm                      :      0.128 s       2.8 kB
 save_binary                   :      0.000 s       2.8 kB
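
(In Elixir terms, the only difference between the two runs is the kind of literal the function returns:)

# An Erlang string / Elixir char list: a linked list of integer codepoints.
'foo'   #=> [102, 111, 111]

# A binary (what Elixir calls a string): a contiguous sequence of bytes.
"foo"   #=> <<102, 111, 111>>
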
1 Like

And voilà: Move expansion of strings in binaries to v3_core by josevalim · Pull Request #1131 · erlang/otp · GitHub

4 Likes

Thanks for taking the time to explain and to fix it!

1 Like