Is this a correct way to avoid wasting or reaching the atom limit?

Well, if you know for a fact that the number of possible names for :"node_name@DNS_name" is limited (which it probably is, since you don't really create an unlimited number of nodes, right?), then it should be safe enough to use.


Yes, that's what I thought… until Mr. Virding said that dynamically creating atoms (even occasionally) is BAD! :laughing:

Fortunately we can't have that many nodes in a distributed system. Yet. :wink:

OK, I will qualify that and add "unless you know there will only be a few new atoms". Recreating existing atoms is OK, as that just reuses them.


YES!!! Thank you, Robert!!!
I promise NOT to generate too many nodes! :smiley:
…Hmm, BTW, what do you consider "a few"? :thinking:
In one of my experiments, I currently use each node as a client/server entity that exchanges files with my other distributed nodes. So, inside one node, I have to "generate", at some point, the (atom) names of each node that this node communicates (i.e. exchanges files) with…
But perhaps I'm going about it the wrong way (I'm an Elixir noob, so…)
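One safe pattern for this situation (a sketch with made-up host names, not the poster's actual code) is to mint the bounded set of node-name atoms once at startup with String.to_atom/1, and only look atoms up at runtime with String.to_existing_atom/1, which raises instead of creating a new atom:

```elixir
# Assumed example: the set of peers is known up front (host names are made up).
known_hosts = ["alpha.example.com", "beta.example.com"]

# Run once at startup: creates a small, bounded set of atoms.
Enum.each(known_hosts, fn host -> _ = String.to_atom("file_server@" <> host) end)

# At runtime, only *reuse* atoms; an unknown name raises ArgumentError
# instead of leaking a new atom into the atom table.
node_name = fn host -> String.to_existing_atom("file_server@" <> host) end

node_name.("alpha.example.com")  # => :"file_server@alpha.example.com"
```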

1 million atoms is actually a metric shit ton of possible atoms; ask the folks working on Absinthe (that library creates a lot of atoms and wants us to create even more, with all the objects, fields, etc.). It is usually plenty and then some if you are building a regular system. If you create 100 atoms with that node thing, it's close to nothing; 10_000 would be a few; 100_000 would be a sizable amount, but still not enough to threaten your system.

Let's say you create 100 new nodes every day: at the default limit of 1,048,576 atoms, it would take you well over two decades to reach it (provided the system keeps running, there are no duplicates, and the rest of the system doesn't create an equally insane amount).
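To check the arithmetic, assuming the default atom table limit of 1,048,576 (it can be raised with the +t emulator flag):

```elixir
atom_limit = 1_048_576          # BEAM default; tunable with the +t flag
atoms_per_day = 100

days = div(atom_limit, atoms_per_day)
years = Float.round(days / 365.25, 1)

# days  => 10485
# years => 28.7
```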

But not at runtime, based on some user input. They are generated once, and after the system has loaded all of its modules, the atom count won't change anymore (at least not because of Absinthe).

Yes, don't get me wrong! I was not trying to suggest Absinthe is doing it wrong or anything; just saying that it requires you to create a lot of atoms, and we are still not even near the danger zone.

After asking my Project Manager: one of my node file servers (the big "supervisor" one) should be connected to a max of 2000 other nodes, but the other node file servers should be connected to an average of 5 to 10 others.
So I have to "generate" some 2000 atoms on one server (node) and only 5 to 10 on the others…

Is this using "distributed Erlang"? If yes, then you'll end up with 2000 × (2000 − 1) / 2 = 1,999,000 connections unless you are very careful in how you grow the cluster.
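A fully connected mesh of n nodes needs n(n − 1)/2 TCP connections, so the count grows quadratically with cluster size:

```elixir
# Number of links in a full mesh of n nodes: each pair is connected once.
mesh_connections = fn n -> div(n * (n - 1), 2) end

mesh_connections.(10)    # => 45
mesh_connections.(2000)  # => 1999000
```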

Given these numbers, I would not rely on distributed Erlang, but would connect to the other servers by other means that are easier to control. As a side effect, you probably won't rely on Erlang node names anymore, but on regular hostnames/IP addresses.


Yeah, as @NobbZ notes, "connect" can't mean "connect via distributed Erlang"; it simply can't support clusters that large. If it's a file server, I'd consider serving files via ordinary TCP/HTTP.

Thanks for the advice.
I'll give up on this "node" (distributed Erlang) option then, and focus on my other one (an Elixir SFTP server).

Interesting. I was under the impression that we are talking about nodes with static names.

The whole thing made me think about how Elixir/Erlang creates process IDs. Are they always unique, or could there have been a finished process with the same ID?

You could have 2000 nodes connecting to a "manager" if you make it a hidden node or use -connect_all false. The default in Erlang clustering is a fully connected mesh, but it's not mandatory, if you can live without :global and :pg2.
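For reference, these are the relevant emulator flags (shown here as a vm.args sketch; how you pass them depends on your release setup):

```
## Start this node hidden: it connects to peers without joining the mesh
-hidden

## Or: only connect to nodes you explicitly reach via Node.connect/1,
## instead of transitively connecting to the whole cluster
-connect_all false
```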

Yes, PIDs get reused.

Or you use :erlang.system_info(:atom_count).
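For example (the atom name below is made up, and the count can also drift if other code creates atoms concurrently):

```elixir
# Count interned atoms before and after creating one new atom.
before = :erlang.system_info(:atom_count)
_ = String.to_atom("some_definitely_new_atom")
after_count = :erlang.system_info(:atom_count)

after_count - before  # at least 1: one new atom was interned
```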


My current project loads hundreds of TOML files into a shallow tree of maps. The top-level keys are strings (relative path names), but the interior keys are all converted to atoms. For safety, this conversion uses String.to_existing_atom/1.

Before loading the tree, I load several “schema by example” files, using String.to_atom/1. This defines atoms for a few dozen legal keys. I believe that this is a safe approach, because the schema files are small and I’m in control of their content.
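As a sketch of that pattern (the key names here are illustrative, not the actual schema):

```elixir
# Trusted schema keys, loaded once at startup; String.to_atom/1 is safe
# here because the set is small and under our control.
schema_keys = ["title", "owner", "max_size"]
Enum.each(schema_keys, &String.to_atom/1)

# Untrusted data keys may only *reuse* existing atoms; an unknown key
# raises ArgumentError instead of growing the atom table.
atomize = fn key -> String.to_existing_atom(key) end

atomize.("owner")  # => :owner
```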


Anyway, even though I managed to read/save files between distributed nodes, I didn't find a way to stream files with this configuration.
This works (nearly) OK in my SFTP version…