Is this a correct way to avoid wasting or reaching the atom limit?

Ok. So, in my case, I only need String.to_atom(str) as this call will not create a new atom in the table nor throw an error if the atom already exists.

If you expect the atom to already exist, I’d always use String.to_existing_atom to prevent new ones from being created accidentally. It raises an ArgumentError if the atom you are trying to convert to does NOT yet exist.
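
A minimal sketch of the difference between the two functions, runnable as a script or in `iex`. The name is built at runtime so that it does not already exist as an atom just because it appears as a literal in the code; the `"demo_atom_"` prefix is only an illustrative assumption.

```elixir
# Build a name at runtime that is not in the atom table yet.
name = "demo_atom_" <> Integer.to_string(System.unique_integer([:positive]))

# to_existing_atom/1 raises ArgumentError when the atom does not exist yet.
result =
  try do
    String.to_existing_atom(name)
  rescue
    ArgumentError -> :not_an_atom_yet
  end

# to_atom/1 creates the atom if needed and returns it.
atom = String.to_atom(name)

# Now that the atom exists, to_existing_atom/1 returns that same atom.
same = String.to_existing_atom(name)

IO.inspect({result, atom == same})
```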

I’m a bit confused now… :pensive:
My only need is to generate an atom from a string:
If the atom already exists, then ok, I use the existing one (BTW if my node name already exists, that’s trapped later in my code).
If it doesn’t then create it.

As you wrote that String.to_atom(str) does that and won’t ever (accidentally) recreate an existing atom, this function is sufficient for me and I don’t need to check with String.to_existing_atom.

From your replies, I think you have a slight misunderstanding of atoms. There can only ever be one atom with a given name. For example, there will only ever be one :foo. Once it has been created, all attempts at creating or referencing an atom with the same name will just reference the existing atom.

The danger of overflowing the atom table comes from an attacker using their input to generate :foo1, :foo2, :foo3, and so on, until the system has too many atoms and crashes. But even in that case there will only be one copy of any specific atom in the atom table.
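
A rough way to observe this, assuming a recent OTP release where `:erlang.system_info(:atom_count)` and `:erlang.system_info(:atom_limit)` are available. The `"demo_..."` names are made up for illustration, and the exact growth figures can vary slightly if the VM loads modules in between, so treat the numbers as approximate.

```elixir
# Warm-up so lazily loaded modules do not distort the counts below.
_ = for i <- 1..2, do: String.to_atom("demo_warmup_" <> Integer.to_string(i))

count_start = :erlang.system_info(:atom_count)

# Converting the same string 1_000 times adds at most one atom.
_ = for _ <- 1..1_000, do: String.to_atom("demo_same_name")
count_after_same = :erlang.system_info(:atom_count)

# Converting 1_000 distinct strings adds one atom per new name.
_ = for i <- 1..1_000, do: String.to_atom("demo_distinct_" <> Integer.to_string(i))
count_after_distinct = :erlang.system_info(:atom_count)

IO.inspect(count_after_same - count_start, label: "growth, same name")
IO.inspect(count_after_distinct - count_after_same, label: "growth, distinct names")
IO.inspect(:erlang.system_info(:atom_limit), label: "atom limit")
```

The second loop is the pattern an attacker would exploit: every distinct input string permanently occupies one slot in the atom table.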

If you expect strings that don’t yet exist as atoms, then you need String.to_atom. I think you’re now aware of the issues around it.

Just one last thing. There’s no such thing as “recreating an atom”. When an atom is created, it is registered once by being put into the atom table, and it continues to exist from that point onward. Using or converting to an already registered atom just reuses its value. Creating or converting to a new one triggers the registration in the atom table.

As others have already pointed out there is only one truly safe way to handle dynamically creating atoms and that is DON’T. Even if you feel you must: don’t. Even if it is only occasionally: don’t.

Why, concretely, do you need to convert to atoms? Perhaps we can suggest an alternative.

Sir, Yes, Sir, Master !!!.. :smile:
… If I could catch the ones who decided to associate node names with atoms instead of garbage-collected strings! :innocent:

Yep, that one’s just unfortunate :sweat_smile:

As I wrote: to generate several nodes, which are created using… atoms? :"node_name@DNS_name"
If you know some other simple way (not via atoms) to create nodes…

Well, if you know for a fact that the number of possible names for :"node_name@DNS_name" is limited (which it probably is, since you don’t really create an unlimited number of nodes, right?) then it could be safe enough to use.
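
One way to make “limited” explicit is to convert only names that appear on a known list. The sketch below is an illustration under assumptions, not something from this thread: the `NodeNames` module, its `to_node/1` function, and the example host names are all hypothetical.

```elixir
defmodule NodeNames do
  # Hypothetical allow-list; in a real system this might come from configuration.
  @known_nodes ["server@files1.example.com", "client@files2.example.com"]

  # Converts only allow-listed names, so the number of atoms this function
  # can ever create is bounded by the length of @known_nodes.
  def to_node(name) when is_binary(name) do
    if name in @known_nodes do
      {:ok, String.to_atom(name)}
    else
      {:error, :unknown_node}
    end
  end
end

IO.inspect(NodeNames.to_node("server@files1.example.com"))
IO.inspect(NodeNames.to_node("evil@attacker.example"))
```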

Yes, that’s what I thought… Until Mr. Virding said that dynamically creating atoms (even occasionally) is BAD! :laughing:

Fortunately, we can’t have that many nodes in a distributed system. Yet. :wink:

OK, I will qualify that and add “unless you know there will only be a few new atoms”. Recreating existing atoms is ok as that just reuses them.

YES!!! Thank you, Robert!!!
I promise NOT to generate too many nodes! :smiley:
…Hmm, BTW, what do you consider “few”? :thinking:
In one of my experiments, I currently use each node as a client/server entity which exchanges files with my other distributed nodes. So, inside one node, I have to “generate”, at some point, the (atom) names of each node that this node communicates (i.e. exchanges files) with…
But perhaps I am going about this the wrong way (I’m an Elixir noob so…)

1 million atoms (the default limit is 1,048,576) is actually a metric shit ton of possible atoms. Ask the guys working on Absinthe (that library creates a lot of atoms and wants us to create even more with all the objects, fields, etc.): it is usually plenty and then some if you are building a regular system. If you create 100 atoms with that node thing, it’s close to nothing; 10_000 would be a few; 100_000 would be a sizable amount but still not enough to threaten your system.

Let’s say you create 100 new nodes every day: it would take you roughly 28 years to reach the default limit (provided the system keeps running, there are no duplicates, and the rest of the system doesn’t create an equally insane amount).
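
The back-of-the-envelope arithmetic can be checked directly. This sketch assumes the default atom limit of 1,048,576 (it can be raised with the `+t` emulator flag) and ignores the atoms the system itself has already created at boot, so the real headroom is somewhat smaller.

```elixir
atom_limit = 1_048_576     # default limit, assumed unchanged
new_atoms_per_day = 100    # one new node name per atom

days = div(atom_limit, new_atoms_per_day)
years = Float.round(days / 365, 1)

IO.puts("#{days} days, about #{years} years")
# prints "10485 days, about 28.7 years"
```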

But not at runtime, based on some user’s input. They are generated once, and after the system has loaded all of its modules, the atom count won’t change anymore (at least not because of Absinthe).

Yes, don’t get me wrong! I was not trying to suggest Absinthe is doing it wrong or anything, just saying that it requires you to create a lot of atoms, and we are still not even near the danger zone.

After asking my Project Manager: one of my node file servers (the big “supervisor” one) should be connected to a maximum of 2000 other nodes, but the other node file servers should each be connected to 5 to 10 others on average.
So I have to “generate” some 2000 atoms on one server (node) and only 5 to 10 on the others…

Is this using “distributed Erlang”? If yes, then you’ll end up with 2000 × (2000 − 1) / 2 = 1,999,000 connections unless you are very careful in how you grow the cluster.
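
The figure comes from the number of links in a full mesh, where every node is connected to every other node, which is how distributed Erlang connects nodes by default. A quick check of the formula n × (n − 1) / 2; the `full_mesh_links` name is just for illustration:

```elixir
# Number of links in a full mesh of n nodes: n * (n - 1) / 2.
full_mesh_links = fn n -> div(n * (n - 1), 2) end

IO.inspect(full_mesh_links.(10))     # 45
IO.inspect(full_mesh_links.(2_000))  # 1999000
```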

Given these numbers, I would not rely on distributed Erlang but connect to the other servers by other means that are easier to control and, as a side effect, probably no longer rely on Erlang node names but on regular hostnames/IP addresses.
