How are PIDs generated?

How are PIDs generated? What does each segment, i.e. <a.b.c>, denote?

The problem: I'm using Task.async_stream/5 to parse very large files. There are occasions where a value meets the search criteria but comes earlier in the file, in which case it should be ignored. I'm using ordered: true but need to ensure that newer values are favored over older ones.

The way I want to do this is by converting the PID to a 3-part tuple and determining which process is newer by its ID. I have noticed that they seem to be somewhat sequential, with the last number denoting something like a series.

That said I really need to understand how PIDs are generated to know if this is a viable option.

The PID format is opaque, so relying on any expectation about it is likely to cause issues. Maybe you could Stream.with_index/1 the list of files into your Task.async_stream/5 so that you know the order definitively?

No, you really don’t. If a type’s values are opaque that should give you a hint you are focusing on the wrong imagined solution.

Can you please describe your exact problem? I can’t understand it from your explanation. You want things to come in an ordered fashion, but since each “thing” (a file?) has a different size you are worried the smaller ones will finish first regardless, or?

Can you tell us more about what you want to do?

Yeah as @dimitarvp notes this is not a viable option. Pid values are not guaranteed to be ordered. They may appear to be for a while, but as the system goes on it’ll start reusing numbers from earlier dead pids and the order will be lost.

Can you show the code you’re using?

I’d go with the solution of giving an additional index as input to the tasks themselves, so they can return it along with the result and you can order them by it if you really need to.

Just to add to the already mentioned problems with relying on opaque data for information: afaik the first number in the PID’s triple is the node number, but I don’t know about the others.

You can use Enum.with_index/1 on your data before calling in async stream.
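
A minimal sketch of that suggestion, assuming "newer" means "appears later in the input". The `chunks` list and the `matches` function are hypothetical stand-ins for the real file data and search criterion; only Enum.with_index, Task.async_stream and Enum.reduce are actual library calls.

```elixir
# Hypothetical input: each element stands in for one parsed piece of a file.
chunks = ["alpha", "beta match", "gamma", "delta match"]

# Hypothetical search criterion, for illustration only.
matches = fn chunk -> String.contains?(chunk, "match") end

latest =
  chunks
  |> Enum.with_index()
  |> Task.async_stream(
    fn {chunk, index} ->
      # Return the index with the hit so ordering never depends on the PID.
      if matches.(chunk), do: {index, chunk}, else: nil
    end,
    # Results may arrive in any order; the index restores it.
    ordered: false
  )
  |> Enum.reduce(nil, fn
    # Task found nothing: keep the current best.
    {:ok, nil}, acc -> acc
    # First hit seen so far.
    {:ok, hit}, nil -> hit
    # Keep whichever hit has the higher index, i.e. comes later in the input.
    {:ok, {index, _} = hit}, {best, _} = acc -> if index > best, do: hit, else: acc
  end)

IO.inspect(latest)
```

Because every result carries its own index, the comparison works whether or not ordered: true is set.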

What the fields in a pid mean is an internal implementation detail, and it is up to the implementation to decide and interpret their meaning. For the sake of compatibility there will always be 3 fields. The only thing you can get out of a pid with more or less certainty is the meaning of the first field: if it is 0 it is the pid of a local process, otherwise it is a process on another node. There is no way to work out which node from the value of the first field, except for 0. The only real way to get the node is to use node(pid).

You have the same issue with references and ports as well.
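
As a small illustration of asking the runtime instead of reading the fields, here is a sketch using node/1, which accepts pids, references and ports. The variable names are only for the example.

```elixir
# Spawn a throwaway local process and create a reference to inspect.
pid = spawn(fn -> :ok end)
ref = make_ref()

# node/1 is the supported way to find out where a pid or reference lives.
pid_node = node(pid)
ref_node = node(ref)

# Both were created on this node, so they match node/0.
is_local = pid_node == node() and ref_node == node()

# inspect/1 gives the printed form, which is fine for logs but not for logic.
IO.puts("#{inspect(pid)} lives on #{pid_node} (local: #{is_local})")
```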
