Announcement here
Erlang literals are no longer copied when sending messages.
NICE. Anyone know if this would apply to :ets tables as well?
Aww, I was just coming here to post this. ^.^
Here is the Release doc on Github at least: Release OTP 20.0 · erlang/otp · GitHub
I was keeping a watch on this feature, so I am pretty sure the answer is "Yes", but it only works on literals. A literal in this context is not something like 42; rather, a literal is something like:
defmodule Blah do
def bloop(), do: %{a: 42, b: %{zwoop: 3}}
end
The %{a: 42, b: %{zwoop: 3}} part is the literal, if my understanding is right. It is "baked into the compiled source", thus it "always" exists, meaning garbage collection does not need to be run on it and it can be passed in messages via a pointer (internally) instead of being copied. Thus this should apply to ETS as well (as long as the information was passed in from the running system and not loaded from a file or so).
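To make the distinction concrete, here is a minimal sketch of a term the compiler can treat as a literal next to one that has to be built at runtime. The module name LiteralSketch and both function names are made up for illustration, and whether a given term really ends up in the literal area is the compiler's decision, so treat this as my understanding rather than a guarantee.

```elixir
defmodule LiteralSketch do
  # Hypothetical module, for illustration only.

  # Every part of this map is known at compile time, so the compiler
  # can store the whole term as a literal alongside the module code.
  def literal, do: %{a: 42, b: %{zwoop: 3}}

  # Here the value under :a is only known at runtime, so the outer map
  # has to be built on the calling process's heap on every call.
  def built(n), do: %{a: n, b: %{zwoop: 3}}
end

# Both are ordinary terms and compare equal; the difference is only
# in where the VM keeps them.
IO.inspect(LiteralSketch.literal() == LiteralSketch.built(42))
```

Sending the result of literal/0 to another process is the case the OTP 20 change is about; the result of built/1 is copied as before.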
This is a huge performance feature for certain styles, I've been wanting it for a long time.
Yeah it's a pretty big boon to Absinthe as well because our schema types involve a lot of literals, and removing the copy penalty for those opens up a lot of interesting techniques for us that would have performed poorly before.
Hmmm…
defmodule Foo do
  def foo(pid, list), do: send(pid, list ++ [1, 2, 3])
end
Okay, list will be copied, that's beyond question, but what happens to the appended tail?
Some notes I want to focus on:
- erlang:garbage_collect/2 for control of minor or major GC
  Whoo hoo! This will make Benchee and the like much more accurate for testing memory usage.
- In the OTP 20 release candidates the function erlang:term_to_binary/1 changed the encoding of all atoms from ATOM_EXT to ATOM_UTF8_EXT and SMALL_ATOM_UTF8_EXT. This is now changed so that only atoms actually containing unicode characters are encoded with the UTF8 tags while other atoms are encoded ATOM_EXT just as before.
  Time to update our ETF libraries! Take note, anyone who uses one!
- Dirty schedulers enabled and supported on VM with SMP support.
  Whoo, we can assume Dirty Schedulers exist in the system now!
- erlang:system_info/1 atom_count and atom_limit
  This is quite useful for detecting runaway atom growth "before" it becomes a problem.
- Pattern matching for maps is optimized
  Maps are even faster for matching now!
- Atoms may now contain arbitrary unicode characters.
  No, I don't think Elixir will allow smiley-emoji function names.
- Significantly updated string module with unicode support and many new functions
  I wonder how much of this can be offloaded from Elixir back to Erlang now…
- A new event manager to handle a subset of OS signals in Erlang
  This is utterly awesome and I've wanted it for so long!
- erl_tar support for long path names and new file formats
  Whoo-hoo, Release improvements!
- New math:fmod/2
  Whooo, finally! I was just needing this a few days ago!
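A few of the items above are easy to poke at from Elixir. This is only a quick sketch, assuming a VM running OTP 20 or later; the variable names are mine.

```elixir
# Atom table usage: both keys are accepted by :erlang.system_info/1
# as of OTP 20, per the release notes quoted above.
count = :erlang.system_info(:atom_count)
limit = :erlang.system_info(:atom_limit)
IO.puts("atoms in use: #{count} of #{limit}")

# Floating-point remainder via the new math:fmod/2.
remainder = :math.fmod(7.5, 2.0)
IO.inspect(remainder)

# Ask for a minor collection of the current process only; type: :major
# would request a full sweep instead.
collected = :erlang.garbage_collect(self(), type: :minor)
IO.inspect(collected)
```

Comparing atom_count against atom_limit on a timer is one way to get an early warning about runaway atom creation.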
Well, ++ will get called every time that foo is called, meaning that list will be copied, meaning that it is not a "literal" send, so it is copied normally, as it already is now. However, this could be optimized if list were a literal and it were sent verbatim (consequently the literal [1,2,3] will also not be copied in this case):
defmodule Foo do
  def foo(pid, list), do: send(pid, {list, [1, 2, 3]})
end
It depends on what you send, not how you transform it.
"However", the "contents" of the list in your ++ might not be copied.
It has to get copied, because the last cons cell gets changed, and we have copy-on-write in the BEAM.
The pointers to the content would get copied, but not necessarily the contents. If the list
was populated by calling, say, this:
def blah, do: [{:blah, 42}, {:bleep, 64}]
Then those two tuples would not be copied but rather pointed to directly. It would have to recreate the list itself, but not necessarily the "contents" of the list.
Here's the spec for Unicode support in syntax in Elixir 1.5: https://hexdocs.pm/elixir/master/unicode-syntax.html
This means it's possible to use emojis in quoted atoms/function names, but unquoted emoji functions are not supported. All Unicode letters should be supported, though. This probably has the biggest impact on test names, since they can now contain arbitrary Unicode.
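As a rough illustration of the quoted versus unquoted distinction, assuming Elixir 1.5+ on OTP 20+ (UnicodeSketch and both function names are invented for the example):

```elixir
defmodule UnicodeSketch do
  # Hypothetical module, for illustration only.

  # Unicode letters are accepted in an unquoted function name.
  def olá, do: :mundo

  # An emoji is not a letter, so the name has to be given as a quoted
  # atom and injected with unquote.
  def unquote(:"🙂")(), do: :smile
end

IO.inspect(UnicodeSketch.olá())
# The emoji-named function is called through its quoted atom.
IO.inspect(apply(UnicodeSketch, :"🙂", []))
```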
The tail of the list doesn't need to be copied on ++, which means it will remain a literal and shouldn't be copied under the new optimisation.
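The tail sharing can be checked with a small sketch. It relies on :erts_debug.same/2, an internal debugging helper that compares two terms by identity rather than by value, so take it as an experiment and not something to use in real code. AppendSketch and the variable names are made up.

```elixir
defmodule AppendSketch do
  # Hypothetical helper: ++ rebuilds the cons cells of `list` and
  # reuses `tail` as-is for the end of the result.
  def append(list, tail), do: list ++ tail
end

tail = [{:blah, 42}, {:bleep, 64}]
combined = AppendSketch.append([:x, :y], tail)

# Step past the two rebuilt cells; what remains should be the very
# same term as `tail`, not a copy of it.
shared = :erts_debug.same(tl(tl(combined)), tail)
IO.inspect(shared)
```

This only shows sharing within one process; whether the tail is also skipped when the result is sent in a message depends on it being a literal, as discussed above.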
We don't plan to offload much because our implementation is faster, since it works exclusively on binaries: on average, 3x faster. The exception is String.normalize/2, which is faster in Erlang. We could likely make ours faster, but since their version is fairly encapsulated in the :unicode module, it makes sense to depend on their implementation.
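For reference, the two normalization entry points mentioned here can be compared side by side. A minimal sketch, assuming OTP 20+ for the :unicode function; the variable names are mine:

```elixir
# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
decomposed = "e\u0301"
# The single precomposed code point U+00E9.
composed = "\u00E9"

# Elixir's API.
from_elixir = String.normalize(decomposed, :nfc)

# The Erlang function from the :unicode module.
from_erlang = :unicode.characters_to_nfc_binary(decomposed)

IO.inspect(from_elixir == composed)
IO.inspect(from_erlang == composed)
```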
For integration between Elixir and OTP 20, there is this issue: Support Erlang 20 new features · Issue #5851 · elixir-lang/elixir · GitHub
José, I suspect you don't hear it enough, but I am very thankful and grateful to you for keeping up with all of these sorts of details and directing the Elixir language in a wise way. Thank you.
Whoo, that is definitely nice!
Very much so, I see how much you talk on the OTP tracker as well, you push a lot of development.