Ai.google: KV storage is old, "stateful application servers or custom in-memory stores" are better

Currently #1 on HN https://ai.google/research/pubs/pub48030.

Remote, in-memory key-value (RInK) stores such as Memcached and Redis are widely used in industry and are an active area of academic research. Coupled with stateless application servers to execute business logic and a database-like system to provide persistent storage, they form a core component of popular data center service architectures. We argue that the time of the RInK store has come and gone: their domain-independent APIs (e.g., PUT/GET) push complexity back to the application, leading to extra (un)marshalling overheads and network hops. Instead, data center services should be built using stateful application servers or custom in-memory stores with domain-specific APIs, which offer higher performance than RInKs at lower cost.

Just interesting because Elixir seems to have a lot of the ability to do this out of the gate. It's still a little harder to scale across a number of nodes than using Redis, but the building blocks for stateful servers are all immediately accessible, which is awesome. Anyway, just thought it was interesting =)

3 Likes

:mnesia.start()
:shushing_face:

2 Likes

The paper is about replacing RInK (Remote, In-memory, Key-value) stores with LInK (Linked, In-memory, Key-value) stores.

A LInK store is a high level abstraction over an auto-sharder that provides a distributed, in-memory, key-value map with rich application objects as values, rather than strings or simple data structures.

I wonder if the next step would push those “stateful application servers” to the edge.

But mnesia can be used as an in-memory store (or as a disc-based store, or both), automatically distributed, with ACID transactions included and table sharding (although that does seem to require some more knowledge to get going correctly). So you could keep the application stateless if wanted (mentioned as the motivator for RInK), while still having a "local" cache (or a distributed yet transparent one, or both), covering the issues mentioned as the reason for those LInK stores?

(plus it works not only as a pure KV store but also as a "bag" or an ordered KV store)
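To make that concrete, here's a minimal single-node sketch of what mnesia gives you out of the box: a RAM-only table with transactional writes and reads. The table and field names are made up for illustration; a real setup would also configure disc copies and replication.

```elixir
# mnesia ships with OTP; on a single node it starts with a ram-only schema.
:mnesia.start()

# A RAM-only table; the first attribute is the key.
{:atomic, :ok} =
  :mnesia.create_table(:sessions,
    attributes: [:user_id, :state],
    ram_copies: [node()]
  )

# Transactional write...
{:atomic, :ok} =
  :mnesia.transaction(fn ->
    :mnesia.write({:sessions, "alice", :in_queue})
  end)

# ...and transactional read.
{:atomic, [{:sessions, "alice", :in_queue}]} =
  :mnesia.transaction(fn ->
    :mnesia.read(:sessions, "alice")
  end)
```

Swapping `ram_copies:` for `disc_copies:` (after `:mnesia.create_schema/1`) is what flips it from a cache into a persistent store.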

I’m not saying you are wrong that mnesia could be used in that capacity.

I was casting a much wider net to speculate where this could be going (considering another recent trend)?

I wonder whether, in the long run, this is pushing towards an architectural style that allows an ephemeral deploy of a stateful application (or application shard) to an edge compute node to serve 1 to n users primarily from a local cache - something that is neither "serverless" nor "serverful", because most of the processing happens neither on the client nor near persistent storage.

1 Like

I never believed in Redis or Memcached even in my Rails days – to me they were a band-aid. I always thought, even as far back as when I coded C, C++ and Java, that every runtime needs its own builtin caching system, or at least a system allowing you to put stuff in an in-memory database and use it almost transparently.

That’s one of my reasons for sticking with Elixir – it has that and more. OTP itself, plus ETS, persistent_term, atomics / counters, and a few more, give you an experience that Redis and Memcached can only dream of (sans some querying though).
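For anyone who hasn't used them, here's a small sketch of the built-in tools mentioned above, all shipped with OTP (the table name, config map, and data are made up for illustration):

```elixir
# ETS: a general-purpose in-memory read/write cache.
table = :ets.new(:my_cache, [:set, :public, read_concurrency: true])
true = :ets.insert(table, {:user_42, %{name: "Jane"}})
[{:user_42, %{name: "Jane"}}] = :ets.lookup(table, :user_42)

# :persistent_term: read-mostly data with near-zero-cost reads
# (writes are expensive, so use it for config, not hot state).
:persistent_term.put(:app_config, %{rate_limit: 100})
%{rate_limit: 100} = :persistent_term.get(:app_config)

# :counters: lock-free integer counters (great for metrics).
ref = :counters.new(1, [:atomics])
:counters.add(ref, 1, 1)
1 = :counters.get(ref, 1)
```

Each sits at a different point on the read/write trade-off curve, which is the "and more" part - you pick the primitive per use case instead of pushing everything through one remote GET/PUT API.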

I don’t think Amazon, Google and Microsoft (or any cloud compute/storage provider) would risk their business like that. But even in my small country, some internet businesses prefer to co-locate their own servers with their corporate ISP of choice.

1 Like

I think these "edge" solutions are great for content and assets. They take care of geographical distribution, caching (it's just a fall-through stop: if it's not in cache, go fetch from the origin), and resilience (no way to DoS it at a reasonable cost); they take load off the servers, improve response times, and probably save on bandwidth (and devops) as well? Especially if you're building an MVP or something - but once you have a team working full-time, it might become cheaper to build your own "edge" system, tuned to your needs?

But there are other things for which I think a "local" cache that can distribute itself is really useful (and many things that can't be cached in any useful manner by a server at the edge operating as a fall-through). For instance, if you need to keep a cache of expired tokens, or rate-limiting info, and you need them shared across a group of nodes, mnesia makes it very easy to do that and share it across a cluster automatically (without requiring a centralised store/instance to be queried) - and you get to store all of it deserialised, in any term format (or even as iolists) - you can access it at ETS speed if needed, or with transaction guarantees, on any node.
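A hedged sketch of the expired-token case: a mnesia table accessed with dirty (non-transactional) operations, which skip the transaction machinery and run at roughly ETS speed - a reasonable trade when, as noted above, strict consistency isn't crucial. The `:revoked_tokens` table and its fields are invented for the example.

```elixir
:mnesia.start()

{:atomic, :ok} =
  :mnesia.create_table(:revoked_tokens,
    attributes: [:token, :revoked_at],
    ram_copies: [node()]
  )

# Dirty write: no transaction, no locks - each node's copy
# converges via mnesia's replication rather than 2-phase commit.
:ok = :mnesia.dirty_write({:revoked_tokens, "abc123", System.system_time(:second)})

# Dirty read: served from local RAM on whichever node has a copy.
revoked? =
  case :mnesia.dirty_read(:revoked_tokens, "abc123") do
    [] -> false
    [_record] -> true
  end
```

With `ram_copies` on several nodes, that `dirty_read` never leaves the local VM - which is exactly the "no centralised store to query" property the post is after.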

I’ve started playing with it because of some issues I was solving with keeping consistency across nodes, storing some "session" state about the users. Basically, if they're in a queue they can't join or open a game, and if they're doing something else (have an open game) they can't try to do anything else - it's a key plus a struct describing what they're doing. I shoved that info into an ETS table initially, but once you go multi-node it's no longer as simple, since requests can hit different nodes. So I started replicating the operations with rpc casts/calls across the nodes, and at that point I thought: instead of writing all this (definitely bug-ridden) replication myself, it's better to just use mnesia.

And the same applies to "open" games that people can join (it has some complexities and validations: a player creating one, then a player joining, then a response period for the creator to start, reject or cancel; locking resources; rejecting the second player if the accept times out, etc. etc.). So basically these became two tables in mnesia: when a node joins the network it loads its own local copy from the other nodes, and no matter where the request hits, you can access that data locally from RAM, which is super fast.
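The "node joins and loads its own local copy" step can be sketched roughly as below. This is a cluster-setup fragment, not a runnable script: it assumes the nodes are already connected via `Node.connect/1`, and `:existing_node@host` plus the two table names are placeholders, not names from the thread.

```elixir
:mnesia.start()

# Tell mnesia about a node that already holds the tables...
{:ok, _joined} = :mnesia.change_config(:extra_db_nodes, [:existing_node@host])

# ...then pull a RAM replica of each table onto this node,
# so subsequent reads are served from local memory.
for table <- [:sessions, :open_games] do
  {:atomic, :ok} = :mnesia.add_table_copy(table, node(), :ram_copies)
end
```

After `add_table_copy`, mnesia keeps the replica in sync automatically, which is what makes "access it locally no matter which node the request hits" work.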

All of this could be stored in Postgres, but where's the fun in that - I'm kidding. It's just that, if you imagine a lot of requests to open, join and start games, and these are requests a user expects to be fast, hitting the db every time is overhead - and the same goes for rate limiting or token validation: if you store those in a db, every request hits the db n times through its lifecycle.

Plus it’s also quite versatile, in that you can do some operations as ACID transactions while others run without those guarantees (say token invalidation or rate limiting, if 100% accuracy isn't crucial), so it lets you fine-tune quite a bit. (And you can mix, doing transactions in mnesia alongside transactions through your "db", aborting both if something goes wrong in either.)
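The ACID side of that can be sketched with a transactional "move a user from the queue into a game" operation that aborts as a unit if validation fails - the sort of guarantee you'd want for the game flow above but can skip for rate limiting. Table names and the validation rule are illustrative.

```elixir
:mnesia.start()

for {tab, attrs} <- [queue: [:user_id, :since], games: [:user_id, :game_id]] do
  {:atomic, :ok} =
    :mnesia.create_table(tab, attributes: attrs, ram_copies: [node()])
end

result =
  :mnesia.transaction(fn ->
    case :mnesia.read(:queue, "bob") do
      [] ->
        # Validation failed: rolls back every write in this transaction.
        :mnesia.abort(:not_in_queue)

      [{:queue, "bob", _since}] ->
        :mnesia.delete({:queue, "bob"})
        :mnesia.write({:games, "bob", "game-1"})
    end
  end)

# "bob" was never enqueued, so the whole transaction aborts:
{:aborted, :not_in_queue} = result
```

Mixing this with a database transaction, as the post suggests, would mean running the mnesia transaction inside (or alongside) the db one and aborting whichever is still open when the other fails.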

I don’t feel brave enough to move the whole DB into mnesia (I'm still using Postgres for plain users, etc.), as you kinda need to know what you're doing to make sure you don't mess up the copies, replication, etc., and right now it still works fine loading from pg into ETS or processes for certain things. (Disclaimer: I'm no expert in mnesia or anything else - this is just my reading of the docs and the small experiments I've done. I'm sure it's no silver bullet, but it seems like a very nice tool. PS: I mean no disrespect by referring to mnesia as a "cache".)