PragTob
Tail Call Optimization in Elixir & Erlang – not as efficient and important as you probably think
I wrote up a detailed blog post about tail call optimization in Elixir/Erlang and its performance. The TLDR; sort of is that none tail call optimized recursive functions (body-recursive) can be faster and more memory efficient than TCO functions. This is something that I never thought before, that TCO is always faster seems to be a common misconception. My example is also not contrived - it’s an implementation of map.
Posting here to share as I think it’s important knowledge
Also to get feedback, maybe someone has another implementation that blows mine out of the water.
PS: Hope this is the appropriate forum/tag?
Most Liked
rvirding
Sorry, there are some cases where TCO functions are absolutely critical. Kill your system and make it worthless if you don’t do it critical. You just have to get it right.
The examples you show are all calls to functions which (eventually) return and in those cases TCO is a nice but not need. It may not save you heap but it will save you stack, which can actually save you GC as well[*].
No, the main case where is TCO is critical is in process top-loops. These functions never return (unless the process dies) so they will build up stack never to release it. Here you have to get it right. There are no alternatives. The same applies if you top-loop is actually composed of a set of mutually calling functions. There there are no alternatives. Sorry for pushing this again, and again, but it is critical. 
In other cases my way of choosing is the one which gives me the most intelligible code. And that varies depending on what the function is supposed to do and how I choose to do it.
Robert
[*] This is because the heap and stack share the same memory area and when it is full you get a gc. So keeping the stack small can delay gc.
PragTob
And more resurrection, I used benchee 1.0 to take this (as one of the “original” benchee benchmarks) up for a spin to see how things have improved and found some peculiar behaviour: https://pragtob.wordpress.com/2019/04/08/revisiting-tail-call-optimization-in-elixir-erlang-with-benchee-1-0/
PragTob
I just updated the post after all this time (somebody commented) - when the list size is increased it seems like tail-recursive functions get faster again. E.g. instead of the original 10_000 element list I used a 10_000_000 list and recorded the benchmarking results here. The tail-recursive functions are almost twice as fast as the body-recursive version and the standard library version.
So, always benchmark your own use case and it’s time that I give benchee its multiple inputs feature 
(sorry for doing a little resurrection but I thought this was a rather important point to raise here)







