Hex: bin_struct | Hex
Git: GitHub - 4ait/bin_struct
Docs: bin_struct v0.2.13 — Documentation
BinStruct is a library whose main purpose is to turn declarations that are as readable as possible into robust and performant implementations. It generates a set of functions that make the parse → decode → create_new → send process trivial.
What BinStruct is not
BinStruct is not a protocol itself, and there is no goal to replace ASN.1, protobuf, the Erlang binary term format, or any other protocol. If you can solve your problem using an existing protocol, stick with it.
BinStruct is not a replacement for binary pattern matching. If your job can be done with pattern matching alone, it will always be better to use it directly. This library adds a layer of complexity in order to achieve its main goal: write declarations, get implementations generated automatically. When complexity grows, the only sane way to keep up with it is to have a general declarative structure for each part you are working with.
BinStruct is by no means a framework and does not force you to follow any specific structure of how its parts will be used together. Each BinStruct you create is completely self-contained and can be used as you see fit. Whether you want to validate CRC, add encryption, or implement something else inside or outside—it’s entirely up to you, and the library imposes no restrictions on these choices.
What BinStruct is primarily
BinStruct is a tool. It supports the developer from the very beginning, with a rich set of generated features that let you explore data at every step, to the very end, when your app is running in production.
I believe BinStruct is an essential tool for developers. Simply transferring declarations from your protocol documentation into BinStruct special syntax is enough to start parsing your data, decoding it, and exploring its structure. This lets you build an understanding of how to proceed next. It is especially helpful when working with a protocol that is new to you. If you’re unsure where to start or what to focus on, just transfer what you see in the documentation into BinStruct declarations and experiment. At some point, things will start falling into place, and you might even find that the application almost writes itself before you realize it. Even the smallest fragments you implement can already be put to use. You can parse and decode binary data to gain a better understanding of what you’re dealing with without needing to fully implement every detail or dynamic callback. You’ll gradually build out your protocol implementation step by step, and over time, these pieces will naturally connect as your codebase grows. You don’t need all the advanced features like virtual fields, auto-generated fields (builders), or type conversions beyond the basic managed (human-readable) one right away. You can always add them later if you think they’ll make the process easier.
Basic syntax overview
defmodule PngChunk do
  use BinStruct

  # All dynamic behaviour is expressed as callbacks.
  # If no type_conversion is specified, it is always 'managed', also known as 'human readable'.
  register_callback &data_length/1, length: :field

  # Fields describe the shape of your binary data.
  field :length, :uint32_be

  # Use expanded constructs whenever possible: they are easier to read and are validated at parse time.
  # It is always better to expand arrays/flags/enums even if you don't use them yet,
  # because it gives you a more complete picture as you move forward.
  # It also lets this struct be dispatched as a dynamic variant later: if we receive a type
  # that is not listed below, the data is not this struct, and an upper variant_of can catch that.
  field :type, {
    :enum,
    %{
      type: :binary,
      values: [
        "IHDR",
        "PLTE",
        "IDAT",
        "IEND",
        "cHRM",
        "gAMA",
        "iCCP",
        "sBIT",
        "sRGB",
        "bKGD",
        "hIST",
        "tRNS",
        "pHYs",
        "sPLT",
        "tIME",
        "tEXt",
        "zTXt",
        "iTXt"
      ]
    }
  }, length: 4

  # The dynamic behaviour is consumed here through length_by.
  field :data, :binary, length_by: &data_length/1
  field :crc, :uint32_be

  # Implementation of the dynamic behaviour.
  # Callbacks always receive the 'managed' type conversion here, so the length field
  # is automatically converted to an Elixir number, and we return that number as is.
  defp data_length(length), do: length
end
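For comparison, here is a minimal sketch of what the same chunk layout looks like as a hand-written binary pattern match in plain Elixir. It does not use BinStruct and is not the code the library generates. The module name `RawPngChunk` and its two functions are hypothetical, introduced only for illustration. The CRC check relies on `:erlang.crc32/1`, which computes the same CRC-32 that PNG applies to the chunk type and data.

```elixir
defmodule RawPngChunk do
  # Hypothetical module for illustration; plain Elixir, no BinStruct involved.
  @known_types ~w(IHDR PLTE IDAT IEND cHRM gAMA iCCP sBIT sRGB bKGD hIST tRNS pHYs sPLT tIME tEXt zTXt iTXt)

  # Parses one chunk from the front of the binary and returns the unparsed rest.
  # `length` is bound earlier in the same pattern, so it can size the `data` segment.
  def parse(
        <<length::32-big, type::binary-size(4), data::binary-size(length), crc::32-big,
          rest::binary>>
      )
      when type in @known_types do
    {:ok, %{length: length, type: type, data: data, crc: crc}, rest}
  end

  # Anything else (too short, unknown type) is not a chunk we recognise.
  def parse(_other), do: :error

  # PNG computes the CRC over the type and data bytes, not over the length field.
  def crc_valid?(%{type: type, data: data, crc: crc}) do
    :erlang.crc32([type, data]) == crc
  end
end
```

For a single fixed layout like this one, the raw pattern is short and direct. The declarative form starts to pay off once there are many such structures that need consistent validation, decoding, and construction.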
Performance notes
The library compiles to Elixir binary pattern matches and applies optimizations such as composing all parts of known size into a single pattern, always inlining encoders and static values, and caching every requested value.
If, in registered callbacks, field A is requested by both B and C, A will be converted to the requested type conversion just before B (as late as possible), and the same converted value will later be passed to C.
All functions, except for the main public functions like parse/2, are declared in the same module and marked as private (defp), giving the Erlang compiler (erlc) maximum optimization opportunities.
You can expect performance equal to manually written pattern matches, with some differences: modular structure, validation after each step, and creating structs as the result. It is not correct to compare simple manual parsing patterns directly to what this library does.
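To illustrate what "composing every part with known size into a single pattern" means, here is a small sketch in plain Elixir. It is an assumption-laden illustration of the idea, not the code BinStruct actually emits. The module `PatternSketch` and both function names are hypothetical.

```elixir
defmodule PatternSketch do
  # Hypothetical fixed-size header: a uint32_be length followed by a 4-byte type.

  # Field by field: two separate matches, each producing an intermediate rest.
  def parse_stepwise(bin) do
    with <<length::32-big, rest::binary>> <- bin,
         <<type::binary-size(4), rest::binary>> <- rest do
      {:ok, {length, type}, rest}
    else
      _ -> :error
    end
  end

  # Composed: both known-size fields are matched by one pattern in the function head.
  def parse_composed(<<length::32-big, type::binary-size(4), rest::binary>>),
    do: {:ok, {length, type}, rest}

  def parse_composed(_other), do: :error
end
```

Both functions return the same result for the same input. The composed version simply gives the compiler one pattern to work with instead of a chain of smaller ones.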
I wrote a small intro post about it a few days ago: What is the Elixir way of decoding/parsing binary data? - #20 by Ridtt
I also included an example implementation of a PNG parser using BinStruct, as an alternative to the raw pattern matching approach suggested in the article in that thread.
For anyone interested, I suggest starting with the PNG example: bin_struct/examples/png.exs at master · 4ait/bin_struct · GitHub
Then the docs for the main macros: BinStruct — bin_struct v0.2.13
And then the docs for the types: binary — bin_struct v0.2.13
More complex examples:
When things are hidden in an integer: bin_struct/examples/extraction_from_integer.exs at master · 4ait/bin_struct · GitHub
When things are hidden in a buffer: bin_struct/examples/extraction_from_buffer.exs at master · 4ait/bin_struct · GitHub
Implementing a transport packet: bin_struct/examples/packet_via_higher_order_macro.exs at master · 4ait/bin_struct · GitHub
Dynamically working with recursive data structures: bin_struct/examples/recursive_sequence.exs at master · 4ait/bin_struct · GitHub
Future performance improvements:
Problem: you don't always need all values to be decoded, and the result will never be optimal, whether or not there is a function for decoding a single field. We can solve this with compile-time use cases.
Compile-time decode_only use case: bin_struct/examples/compiled_decode_use_case.exs at master · 4ait/bin_struct · GitHub