Erlang 21.2 erts-10.2 released

http://www.erlang.org/news/125

A few good fixes and enhancements: SSL gained more functionality and is faster, socket polling is faster, and there are two big things that I particularly like!

  • New counters and atomics modules supply access to highly efficient operations on mutable, fixed, word-sized variables.

The new :atomics module wraps the low-level hardware atomic instructions without any locking. Its use cases are limited, but where they apply it is EXTREMELY useful. I find it odd that the array part is 1-indexed though; Erlang has a lot of mixed 0-indexed and 1-indexed code, and 1-indexing makes the math harder… blah…
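
A quick sketch of the API (1-indexed, as noted; the printed reference and the prompt numbers are just illustrative):

iex(1)> ref = :atomics.new(3, signed: true)
#Reference<0.2580978159.1919811585.110641>
iex(2)> :atomics.add_get(ref, 1, 5)
5
iex(3)> :atomics.compare_exchange(ref, 1, 5, 42)
:ok
iex(4)> :atomics.get(ref, 1)
42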

The new :counters module uses the new :atomics module to do ETS-style counter handling, but using hardware atomic instructions, so there is no locking overhead. Same thing with the really weird, non-mathematical 1-indexed arrays though.
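
A similar sketch for :counters (again 1-indexed; the reference shown is illustrative):

iex(1)> cnt = :counters.new(2, [:atomics])
#Reference<0.1160633471.1116143620.87362>
iex(2)> :counters.add(cnt, 1, 1)
:ok
iex(3)> :counters.sub(cnt, 2, 3)
:ok
iex(4)> :counters.get(cnt, 1)
1
iex(5)> :counters.get(cnt, 2)
-3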

  • New module persistent_term! Lookups are in constant time! No copying the terms!

You know FastGlobal, the module Discord put out that recompiles a module to bake terms into it for super-fast, no-copy referencing? That’s basically what the new Erlang persistent_term module gives you! Here’s a session in Elixir:

iex(1)> :persistent_term.info()                        
%{count: 0, memory: 152}
iex(2)> :persistent_term.put(:blah, 42)
:ok
iex(3)> :persistent_term.info()        
%{count: 1, memory: 192}
iex(4)> :persistent_term.get(:blah)
42
iex(5)> :persistent_term.get()     
[blah: 42]
iex(6)> :persistent_term.put(:blorp, make_ref())
:ok
iex(7)> :persistent_term.get()                  
[blah: 42, blorp: #Reference<0.3525248772.1957167107.195526>]
iex(8)> :persistent_term.erase(:blah)
true
iex(9)> :persistent_term.get()                  
[blorp: #Reference<0.3525248772.1957167107.195526>]
iex(10)> :persistent_term.info()
%{count: 1, memory: 216}

So it’s slow to put but crazy-fast to get, reference, and pass around the values stored within! In general you shouldn’t use it unless you know what you are doing, as there is no garbage collection of the stored terms; it should generally be reserved for global settings and similarly static data.
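
For example, the pattern it’s built for is write-once-at-startup, read-everywhere. A minimal sketch (the module and key names here are hypothetical):

defmodule MyApp.Settings do
  # Hypothetical wrapper around :persistent_term; using a {__MODULE__, key}
  # tuple as the key avoids collisions with other applications.

  # Call once from your application's start/2 callback.
  def load(config) when is_map(config) do
    :persistent_term.put({__MODULE__, :config}, config)
  end

  # Constant-time lookup; the term is not copied onto the caller's heap.
  def config do
    :persistent_term.get({__MODULE__, :config})
  end
end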

24 Likes

These are cool, thanks for the examples! :slight_smile: I use ETS to store my blog’s data in memory; do you think I could use this storage instead? :smiley: Does it have atomic updates of terms? If so, I would be really interested.

Think of it this way: if it’s something you would normally compile ‘into’ a module’s code for efficiency, then use it; otherwise you probably want ETS (or mnesia, to keep it serialized to disk while holding it in memory once loaded).

So for a blog I’d personally use either flat files on disk or mnesia with disk and memory copies; I’m not sure I’d use this for that, though your data is static enough that it ‘should’ be fine.

Just remember, :persistent_term has a max storage size of 1 gig unless you raise it with the +MIscs emulator flag (the value is in megabytes, so +MIscs 2048 would raise it to 2 gigs).
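
For example (assuming a release-style vm.args file; the flag can also be passed straight through on the command line):

# in vm.args
+MIscs 2048

# or ad-hoc when starting iex
iex --erl "+MIscs 2048" -S mix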

When a new value is put, it is immediately visible everywhere on the node. When an old value is replaced or erased, however, a global garbage collection is run, which can pause things: that collection replaces any references to the global data in all processes with a local copy of the data.

Just remember that accessing data in :persistent_term is way fast, but changing data in any way is very slow, and in some cases it can be really, really slow. It’s still faster than recompiling a module to ‘intern’ the terms, though (and a module purge would kill any processes still referencing the old code, so :persistent_term is definitely better there). :slight_smile:

2 Likes

How slow is slow in this case? Any rough estimates?

In my use case, all the blog posts are stored in the filesystem so persistence is not an issue. ETS works fine for serving them from memory, but I’m enticed by this API that is much simpler (ETS is just awful in that regard).

BTW, is Application env similar? It has put_env, which is global; how is that implemented? Don’t worry, I’m not planning on abusing it for this. :stuck_out_tongue:

Depends on many factors, like how many terms are in the persistent store, how many processes are running in the system, how many modules are loaded, etc… etc… On average expect up to a second; some cases could potentially take a few seconds, while on an empty system with almost nothing running in the BEAM it would be milliseconds, I’d think.

Application envs are stored in ETS (I think?).
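
A quick illustration of that global behavior (the app and key names are made up):

iex(1)> Application.put_env(:my_app, :answer, 42)
:ok
iex(2)> Application.get_env(:my_app, :answer)
42

Under the hood I believe the lookup hits an ETS table owned by the application controller, which is what makes it node-global.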

Changing anything in persistent_term means running a GC over all the processes in the system and copying all of the data in persistent_term. It can get very expensive on bigger systems.
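
You can get a rough number for your own node with :timer.tc; replacing an existing key takes the expensive path (the timing shown is illustrative):

iex(1)> :persistent_term.put(:key, :value)
:ok
iex(2)> :timer.tc(fn -> :persistent_term.put(:key, :other_value) end)
{1874, :ok}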

7 Likes

4 posts were split to a new topic: The use of 1 or 0 for indexing

This would totally work for my blog then. I update the data maybe once a month at most (when I write a post), and the only things running are the Raxx request handlers.

One thing to ask is whether it would actually be beneficial. Large binaries (and I assume your entries are plain binaries well over 64 bytes) are stored off-heap, so they are not copied when retrieved from ETS anyway. This removes the biggest advantage of persistent_term over plain old ETS.

5 Likes

Ah very good point. Maybe I’ll just forget about it for this use case. :slight_smile: Thanks for all the responses.

Are there any anecdotes yet reflecting the relative performance of :persistent_term versus functions that bake in data?

:persistent_term looks like a possibly good approach for storing locale data in my ex_cldr package since it’s large and static. However it is a fairly complex Map.t(), so there would be some penalty to pay in keeping the data as one large map, as opposed to the current approach which decomposes the map into different functions.

1 Like

Very interesting stuff!

Would :persistent_term in theory be the proper way, in new Erlang versions, to build a dispatch system akin to Elixir’s protocols?

Hmm, I don’t know; let’s test. I whipped up a simple benchmark comparing persistent_term, ETS, and head dispatch (which is how protocols dispatch), using 1000 trivial integer entries and 1000 entries of 32-byte binaries. The source:

defmodule BenchDefs do
  def range(:ints), do: 1..1000

  # The binaries are generated at compile time via unquote/1, so every run
  # matches against the same 1000 random 32-byte values.
  def range(:binaries32),
    do: unquote(Enum.map(1..1000, fn _ -> :crypto.strong_rand_bytes(32) end))
end

defmodule HeadDispatch do
  for i <- [BenchDefs.range(:binaries32), BenchDefs.range(:ints)], j <- i do
    def match(unquote(j)), do: unquote(j)
  end
end

defmodule PersistentTermBench do
  def classifiers(), do: [:ints, :binaries32]

  def time(_), do: 2

  def inputs(cla),
    do: %{
      "First" => BenchDefs.range(cla) |> Enum.into([]) |> List.last(),
      "Last" => BenchDefs.range(cla) |> Enum.into([]) |> List.last()
    }

  def setup(cla) do
    Enum.each(BenchDefs.range(cla), &:persistent_term.put(&1, &1))
    tab = :ets.new(PersistentTermBench, [])
    Enum.each(BenchDefs.range(cla), &:ets.insert_new(tab, {&1}))
    tab
  end

  def teardown(_, tab) do
    :ets.delete(tab)
  end

  def actions(_, tab),
    do: %{
      ":persistent_term" => fn inp -> :persistent_term.get(inp) end,
      ":ets" => fn inp -> :ets.lookup(tab, inp) end,
      "HeadDispatch" => fn inp -> HeadDispatch.match(inp) end
    }
end

And the results:

Benchmarking Classifier: ints
=============================

Operating System: Linux
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: First, Last
Estimated total run time: 36 s


Benchmarking :ets with input First...
Benchmarking :ets with input Last...
Benchmarking :persistent_term with input First...
Benchmarking :persistent_term with input Last...
Benchmarking HeadDispatch with input First...
Benchmarking HeadDispatch with input Last...

##### With input First #####
Name                       ips        average  deviation         median         99th %
HeadDispatch           22.21 M      0.0450 μs     ±6.97%      0.0440 μs      0.0540 μs
:persistent_term       15.28 M      0.0655 μs     ±3.23%      0.0650 μs      0.0740 μs
:ets                    8.07 M       0.124 μs   ±172.07%       0.120 μs       0.180 μs

Comparison: 
HeadDispatch           22.21 M
:persistent_term       15.28 M - 1.45x slower
:ets                    8.07 M - 2.75x slower

Memory usage statistics:

Name                Memory usage
HeadDispatch               136 B
:persistent_term           136 B - 1.00x memory usage
:ets                       200 B - 1.47x memory usage

**All measurements for memory usage were the same**

##### With input Last #####
Name                       ips        average  deviation         median         99th %
HeadDispatch           22.38 M      0.0447 μs     ±6.12%      0.0440 μs      0.0530 μs
:persistent_term       15.21 M      0.0658 μs     ±7.30%      0.0650 μs      0.0750 μs
:ets                    8.08 M       0.124 μs   ±143.17%       0.120 μs       0.180 μs

Comparison: 
HeadDispatch           22.38 M
:persistent_term       15.21 M - 1.47x slower
:ets                    8.08 M - 2.77x slower

Memory usage statistics:

Name                Memory usage
HeadDispatch               136 B
:persistent_term           136 B - 1.00x memory usage
:ets                       200 B - 1.47x memory usage

**All measurements for memory usage were the same**

Benchmarking Classifier: binaries32
===================================

Operating System: Linux
CPU Information: AMD Phenom(tm) II X6 1090T Processor
Number of Available Cores: 6
Available memory: 15.67 GB
Elixir 1.7.4
Erlang 21.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 2 s
memory time: 2 s
parallel: 1
inputs: First, Last
Estimated total run time: 36 s


Benchmarking :ets with input First...
Benchmarking :ets with input Last...
Benchmarking :persistent_term with input First...
Benchmarking :persistent_term with input Last...
Benchmarking HeadDispatch with input First...
Benchmarking HeadDispatch with input Last...

##### With input First #####
Name                       ips        average  deviation         median         99th %
HeadDispatch           10.98 M      0.0911 μs   ±271.68%      0.0800 μs       0.180 μs
:persistent_term        8.17 M       0.122 μs   ±240.94%       0.120 μs       0.180 μs
:ets                    4.74 M        0.21 μs    ±85.30%        0.20 μs        0.31 μs

Comparison: 
HeadDispatch           10.98 M
:persistent_term        8.17 M - 1.34x slower
:ets                    4.74 M - 2.32x slower

Memory usage statistics:

Name                Memory usage
HeadDispatch               184 B
:persistent_term           136 B - 0.74x memory usage
:ets                       248 B - 1.35x memory usage

**All measurements for memory usage were the same**

##### With input Last #####
Name                       ips        average  deviation         median         99th %
HeadDispatch           10.86 M      0.0921 μs   ±481.96%      0.0900 μs       0.170 μs
:persistent_term        8.10 M       0.123 μs   ±254.71%       0.120 μs       0.180 μs
:ets                    4.79 M        0.21 μs    ±72.56%        0.20 μs        0.30 μs

Comparison: 
HeadDispatch           10.86 M
:persistent_term        8.10 M - 1.34x slower
:ets                    4.79 M - 2.27x slower

Memory usage statistics:

Name                Memory usage
HeadDispatch               184 B
:persistent_term           136 B - 0.74x memory usage
:ets                       248 B - 1.35x memory usage

**All measurements for memory usage were the same**

So interned head matchers are still the fastest, barely, followed closely by persistent_term, with ETS over two times slower after that. Keeping protocols as head matchers thus remains the fastest option.

I’d say keep it as functions in a module.

Thus no. :slight_smile:

1 Like

Hi there, I created a Hex package to experiment with distributed code compilation within an Elixir cluster, which takes away the “pain” of writing it yourself: https://github.com/archan937/clustorage

I actually started building this with Discord’s FastGlobal package as inspiration.

I’m curious for your thoughts about it :slight_smile:

2 Likes

Hmm, interesting style, API seems good. Since it looks like the functions run on the cluster side, what happens if they crash? What happens if they crash the entire node?

1 Like

Good questions. The package is not built out to that extent right now. Fellow alchemists are welcome to contribute :sweat_smile:

At the moment, all I can say is that the designated loader node will crash first if the compilation is corrupt.