Thanks! I’m still in the process of writing up documentation, but I’ll update this thread once I have that done
I was just looking at this the other day wondering if there were any libraries doing active development on this. I can’t wait to see how this works with my library that’s diff’ing AST (HTML parsed into AST) and pushing patches over the wire. I imagine berts will be quite the optimization
Let me know how that turns out! I’m curious as well
@dgmcguire Benchmark it. ETF serialization is not as fast as you’d think, even JSON can be faster at times (though less expressive).
Yup, I’ve learned to trust nothing but benchmarks and to not trust them either
Bert.js with your AST diffing (is that iolistt?), they could work with Phoenix.LiveView. That would be interesting.
I’m not really sure this is a good idea: using :erlang.binary_to_term() may lead to atom exhaustion.
A malicious user might craft a payload that contains a huge number of new atoms, and just a few such payloads could be enough to hit the limit: “Atoms are not garbage-collected. Once an atom is created, it is never removed. The emulator terminates if the limit for the number of atoms (1,048,576 by default) is reached.” In my opinion this amounts to a denial of service.
I think that JSON might be enough for your case. Also, like I said before, don’t diff AST. Diff a special intermediate representation.
I think bert-encoded HTML over websocket was already done by the n2o project a few years ago. I don’t remember it using any kind of diffing back then, though.
@alexdovzhanyn you might be interested in reading through the sources of some other bert encoders/decoders in js like https://github.com/synrc/n2o/blob/master/priv/protocols/bert.js and https://github.com/discordapp/erlpack, in case you haven’t already.
:erlang.binary_to_term/2 has an option to prevent DoS via the atoms table
I still need to go back and reread all of your advice, but I’m wondering if you have a different idea of what I mean by AST. I’m not talking about diffing elixir AST. I’m diffing html AST
(and really it’s not even a tree diff right now, like I said about using floki at runtime - right now I’m focusing on a usable library before an optimized library, but I am getting very close to moving to optimizations)
when you say intermediate representation do you mean the flattened out nodes list?
We are talking about the same thing then. Sorry.
I think decoding functions would also create atoms.
The safe version of the decoder also prevents that.
You can pass an option to :erlang.binary_to_term/2 to tell it not to create new atoms (it will error if someone tries). (EDIT: as @dgmcguire says further below that post)
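For anyone following along, here is a minimal sketch of what that looks like in practice. The `safe_decode` helper, the `:unsafe_term` error tag, and the generated atom name are made up for illustration; the only real API used is `:erlang.binary_to_term/2` with the `:safe` option. The payload is built by hand so that the atom is never created locally before decoding.

```elixir
# An atom name this VM has (almost certainly) never seen. It is kept as a
# string on purpose: building it as an atom would defeat the demonstration.
name = "untrusted_atom_#{System.unique_integer([:positive])}"

# Hand-built ETF payload: 131 is the format version byte, 119 is the
# SMALL_ATOM_UTF8_EXT tag, followed by a 1-byte length and the atom name.
payload = <<131, 119, byte_size(name)::8, name::binary>>

# Hypothetical helper: decode untrusted input without allowing new atoms.
# With [:safe], binary_to_term raises ArgumentError (badarg) instead of
# creating an atom that does not already exist.
safe_decode = fn binary ->
  try do
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    ArgumentError -> {:error, :unsafe_term}
  end
end

# The unknown atom is rejected rather than added to the atom table.
IO.inspect(safe_decode.(payload))

# Terms that only use existing atoms (here :ok) still decode normally.
IO.inspect(safe_decode.(:erlang.term_to_binary(%{ok: [1, 2, 3]})))
```

The trade-off is that every atom a client is allowed to send has to exist on the server already, so legitimate payloads with unexpected keys get rejected too.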
Thanks, good point
I made a pull request to Bert.js, https://github.com/ElixiumNetwork/bert-elixir/pull/1 to make it clear
I think we should always keep the Elixir community security-aware
There is a discussion on a related topic here: Anyone using Erlang External Term Format (ETF) instead of e.g. JSON?, so I posted there the link I was going to post here.
Hey, thanks for mentioning our library!
Just wanted to note that we updated the bert.js URL in master:
and I also wrote some comments on our implementation: