With the BEAM JIT, should we skip NIFs in more situations?

Rustixir · January 8, 2022, 11:57am

Before the JIT, the BEAM was already pretty fast, and in many situations a NIF made an app slower than plain Elixir/Erlang code.

With the JIT, the BEAM has become very fast; even a JSON parser written in pure Elixir/Erlang is faster than a NIF-based one.

So for now, do we only need NIFs in the situations below, or are there more?

- when you need a mutable data structure (like the one Discord used before)
- when you need low-level access to the machine
- when you need video/image processing
- when you need to implement machine learning (Nx does this in Elixir, and it has released its first version)

Are there other situations that need a NIF?

hauleth · January 8, 2022, 12:33pm

A few more:

- you need to access data that is exposed only via library code (for example, reading systemd journal files; you can circumvent this, but if you want direct access, a NIF is the only option out there)
- you need to access OS data that is not exposed via OTP; for example, right now there is no way to get the UID, GID, EUID, or EGID from the OS without writing your own NIF
- high-performance tight loops (even the JIT does not always help there), for example if you want to run WASM code from within your VM
- implementing features that aren’t provided by the ERTS/OTP NIFs, for example broader control over subprocesses. Ports are quite limited in their capabilities and cannot:
  - close stdin while keeping stdout running, which is useful for data processing where the output is produced only after the whole input has been read
  - pass an additional FD to the subprocess, which is useful when you want to keep data out-of-band and leave stdin/stdout only for logging, or when you want to pass a socket to a separate process for further processing
- interacting with external libraries that are too expensive to rewrite in Erlang/Elixir/Gleam/another BEAM language (for example, HTTP/3 and QUIC libraries written in other languages like C or Rust are much more production-ready than anything written in Erlang right now)
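
To make the "you can circumvent this" point concrete for the UID/GID case, here is a minimal sketch that shells out instead of using a NIF. It assumes a Unix-like OS with the `id` utility on the PATH; `read_id` is a made-up helper name. Each call spawns an external OS process, so this is a workaround rather than direct access.

```elixir
# Assumption: a Unix-like OS with the `id` utility on the PATH.
# `read_id` is a hypothetical helper, not part of OTP or Elixir.
read_id = fn flag ->
  # System.cmd/2 returns {output, exit_status}; matching on 0 makes
  # the sketch crash loudly if `id` fails.
  {output, 0} = System.cmd("id", [flag])
  output |> String.trim() |> String.to_integer()
end

uid = read_id.("-u")  # effective user ID
gid = read_id.("-g")  # effective group ID

IO.puts("uid=#{uid} gid=#{gid}")
```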

Rustixir · January 8, 2022, 1:01pm

Thanks,
I have two questions.

How much does a NIF call cost?
(I know a NIF must run in under 1 millisecond.)

(The tables are in ETS.)
I need to fetch data from 3 tables (each with ~200 entries), apply some filters to get about 30 entries in total, and then run a rule-based sort (on integers) over them. Is Elixir fine for this kind of work, or should I use a NIF?

hauleth · January 8, 2022, 1:07pm

I have no idea, you need to bench it. It also depends on the amount of data you pass to and from the NIF.

Bench it. But with such small inputs I think it will be faster to do everything in the VM, without a NIF.
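
A rough way to bench the pure-VM version might look like the sketch below. The table names, the `{id, score}` row shape, the filter (ids divisible by 20) and the sort rule are all invented for illustration; only the sizes (3 tables, ~200 entries each, 30 results) come from the question. For serious numbers a benchmarking library would be a better fit than a bare `:timer.tc/1` loop.

```elixir
# Hypothetical data: three ETS tables with 200 {id, score} rows each.
tables =
  for name <- [:bench_a, :bench_b, :bench_c] do
    table = :ets.new(name, [:set, :public])
    rows = for id <- 1..200, do: {id, :erlang.phash2({name, id}, 1_000)}
    :ets.insert(table, rows)
    table
  end

# Made-up filter and sort rule: keep ids divisible by 20 (10 rows per
# table, 30 in total), then order the rows by their integer score.
query = fn ->
  tables
  |> Enum.flat_map(fn table ->
    # Match spec: return whole rows {id, score} where rem(id, 20) == 0.
    :ets.select(table, [{{:"$1", :"$2"}, [{:==, {:rem, :"$1", 20}, 0}], [:"$_"]}])
  end)
  |> Enum.sort_by(fn {_id, score} -> score end)
end

# Warm up once, then time 1_000 runs; :timer.tc/1 reports microseconds.
result = query.()
{micros, :ok} = :timer.tc(fn -> Enum.each(1..1_000, fn _ -> query.() end) end)

IO.puts("#{length(result)} rows, about #{Float.round(micros / 1_000, 2)} µs per query")
```

If the per-query time printed here is already far below the budget you care about, a NIF has little room to help once the cost of passing data in and out is added.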

dom · January 8, 2022, 5:36pm

Regular NIFs must return quickly, but dirty NIFs can run for a long time if they want: they run in a separate thread pool, so they don’t block the scheduler threads.
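
As a small illustration of those separate pools, the running VM can report how many threads of each kind it has. This only inspects the pools; marking a NIF as dirty happens on the native side and is not shown here.

```elixir
# Normal schedulers run regular processes and regular (fast) NIFs.
schedulers = :erlang.system_info(:schedulers_online)

# Dirty CPU and dirty I/O schedulers are separate thread pools for
# long-running NIFs, so those calls do not occupy a normal scheduler.
dirty_cpu = :erlang.system_info(:dirty_cpu_schedulers_online)
dirty_io = :erlang.system_info(:dirty_io_schedulers)

IO.puts("normal: #{schedulers}, dirty CPU: #{dirty_cpu}, dirty I/O: #{dirty_io}")
```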

Regarding overhead, it’s a bit old so the numbers may be off nowadays, but check out: https://medium.com/@jlouis666/erlang-dirty-scheduler-overhead-6e1219dcc7

Rustixir · January 8, 2022, 7:58pm

I know dirty schedulers are great, without a doubt!

But every resource recommends creating one thread per core.

So how can the BEAM create many schedulers for running processes and then another pool for work like this?

Doesn’t this hurt performance, for example through context switches between so many threads?

josevalim · January 8, 2022, 8:48pm

Typically no, because your “dirty” workload is often a fraction of the work being done by the VM. But if you are in a position where you know you will be using dirty threads a lot, and have many context switches, I believe you can set some specific cores to be used only for dirty purposes, for example.
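
A related knob, which is not the core pinning mentioned above, is the number of dirty CPU schedulers that are online. The sketch below caps it at runtime; the "half of them" policy is an arbitrary assumption chosen only for illustration, and whether it helps would have to be measured.

```elixir
# Current number of dirty CPU schedulers online.
before = :erlang.system_info(:dirty_cpu_schedulers_online)

# Hypothetical policy: let dirty CPU work use at most half of them,
# but never fewer than one.
target = max(div(before, 2), 1)

# :erlang.system_flag/2 returns the previous value of the flag.
previous = :erlang.system_flag(:dirty_cpu_schedulers_online, target)
IO.puts("dirty CPU schedulers online: was #{previous}, requested #{target}")

# Restore the original setting so the sketch leaves the VM unchanged.
:erlang.system_flag(:dirty_cpu_schedulers_online, before)
```

The same limit can also be given at VM start with the `+SDcpu` emulator flag.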