The table is persisted using the :ets.file2tab/2 and :ets.tab2file/3 functions.
The table is created with PersistentEts.new/3 in place of :ets.new/2. After that, all functions from :ets can be used as with any other table, except :ets.give_away/3 and :ets.delete/1 - replacement functions for those are provided in this module. Using :ets.setopts/2 to change the heir is not supported either - the heir setting is leveraged by the persistence mechanism.
As with a regular ets table, the table is destroyed once the owning process (the one that called PersistentEts.new/3) dies, but the table data is persisted, so it will be re-read when the table is opened again.
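For example, a minimal sketch of the usage described above (the exact argument order of PersistentEts.new/3 and the name PersistentEts.delete/1 are assumptions based on this description - check the module docs):

```elixir
# Sketch only: assumes PersistentEts.new/3 takes (name, file_path, ets_options)
# and that PersistentEts.delete/1 is the replacement for :ets.delete/1.
table = PersistentEts.new(:users, "users.tab", [:named_table, :set])

# Regular :ets functions work on the returned table as usual.
:ets.insert(table, {:joe, "joe@example.com"})
:ets.lookup(table, :joe)
#=> [{:joe, "joe@example.com"}]

# Use the module's replacement instead of :ets.delete/1.
PersistentEts.delete(table)
```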
With Dets every operation (read or write) hits the disk. For many applications such a performance penalty (compared to ets) is not acceptable. Furthermore, Dets tables are limited to 2 GB, and Dets doesn't support the ordered_set table type either.
With PersistentEts, the table remains in memory, so all read and write operations have the same performance they would have with pure ets. The table state is only saved to a file periodically. There's also no file size limit beyond the memory and disk limitations. Since it's a regular ets table, all table types are fully supported.
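To make the comparison concrete, here is a rough micro-benchmark sketch (not part of the library) contrasting in-memory :ets writes with disk-backed :dets writes; the absolute numbers will of course depend on hardware:

```elixir
# Illustrative only: times 100k inserts into an in-memory ets table
# versus a disk-backed dets table.
ets = :ets.new(:bench_ets, [:set, :public])
{:ok, dets} = :dets.open_file(:bench_dets, file: ~c"bench_dets.tab", type: :set)

{ets_us, _} = :timer.tc(fn ->
  for i <- 1..100_000, do: :ets.insert(ets, {i, i})
end)

{dets_us, _} = :timer.tc(fn ->
  for i <- 1..100_000, do: :dets.insert(dets, {i, i})
end)

IO.puts("ets:  #{ets_us} µs")
IO.puts("dets: #{dets_us} µs")

:dets.close(dets)
```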
This is serendipitous; I'm prototyping something at the minute, and this fits the bill exactly. I wanted ETS tables that held a specific state for users while they were all connected, and that could easily be saved for recovery when users came back online (it's a procedural generation toy; the ETS table provided "terrain" that all users of the toy, and all their controlled processes, can access). Mnesia didn't quite seem to fit the bill and seemed a bit of a faff; I just wanted something brutally simple to get things running quickly, so thanks for this.
Is there a benchmark of it compared to Mnesia with duplicate_bag tables using dirty read/writes (basically ETS that is DETS backed at that point) and similar settings for PersistentEts?
Does it only persist to disk "on occasion" or after every write? Does it do it when the owner process is terminated? I'm guessing from the use of file2tab and such that it serializes out the entire ETS table on every write, instead of only the differences?
Data access, sure. But the performance of PersistentEts cannot be the same as Ets: Ets does not include persistence. So while data access is surely the same via the :ets API, the overall performance of PersistentEts is still interesting imho…
I don't find those numbers easy to interpret without knowing the hardware involved, and they're also not overly meaningful without understanding how often persistence occurs and in which circumstances. Looking at the code, by default it writes the table to disk once per minute, regardless of changes to the table (… as well as on table owner exit). Am I reading that correctly?
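If that reading is right, the flush interval is presumably tunable at table creation. A purely hypothetical sketch (the :persist_every option name is a guess from skimming the code, not a documented API):

```elixir
# Hypothetical: assumes a :persist_every option (in milliseconds) controls
# how often the table is flushed to disk; the real option name may differ.
table =
  PersistentEts.new(:terrain, "terrain.tab", [
    :named_table,
    :set,
    persist_every: :timer.seconds(5)
  ])
```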
What is the intended use case for this? It can't be for "valuable" data, as any changes to it within the persist timeout will be lost. It can't be for large tables, since hitting disk for such a time period could be pretty undesirable if there is other I/O happening, not to mention messages to the PersistentEts process which may be piling up behind it?
Looking at the code, I'm also unsure what happens if PersistentEts.new is called while another PersistentEts is busy writing out a table … ?
So … perhaps this is intended for "well behaved" applications with "small" datasets in ets tables?
I'm sure there must be a good use case for this, just trying to understand what it is. (So please don't take the above as too critical … just walking through my thoughts as I look over the code)
That is basically how it is with Mnesia as well: using dirty reads/writes is nearly the same as calling ETS straight, except it can still serialize the data out in the background after a write instead of needing dumping periods (see the sketch below).
Although I could not get to 10 GB with it, at least on a 32-bit system. ^.^
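For reference, a rough sketch of the kind of Mnesia setup being described here (a disc_copies duplicate_bag table accessed with dirty operations); the table and attribute names are made up for illustration:

```elixir
# Sketch: a duplicate_bag table with on-disk copies, accessed via dirty
# operations, which skip transactions and behave close to raw :ets calls.
:mnesia.create_schema([node()])
:ok = :mnesia.start()

{:atomic, :ok} =
  :mnesia.create_table(:terrain, [
    {:type, :duplicate_bag},
    {:disc_copies, [node()]},
    {:attributes, [:coord, :tile]}
  ])

:ok = :mnesia.dirty_write({:terrain, {1, 2}, :grass})
:mnesia.dirty_read({:terrain, {1, 2}})
#=> [{:terrain, {1, 2}, :grass}]
```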
I tried to look into the sources of tab2file but didn't have enough patience to dig deep into what they're using there for file manipulation.
I just saw that they're using message passing, and that was enough for me to understand that there will definitely be overhead.