I have serious doubts that a small entity can pull off a Spanner/F1-inspired DB.
For distributed document databases, I’m surprised that no one has mentioned Elasticsearch. They recently released version 5 which has improved resiliency greatly. Some of that is based on feedback from this Aphyr article. Elasticsearch doesn’t yet have real-time change feeds like CouchDB or MongoDB, but I don’t know that change feeds belong in the database. You can just implement change feeds in your application layer, especially when you’re using Elixir.
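To make the "change feeds in the application layer" idea a bit more concrete, here is a rough sketch in Python (chosen only so it runs anywhere; in Elixir you would more likely reach for processes and message passing). Everything in it is hypothetical: `ChangeFeedStore` is an in-memory stand-in for whatever actually talks to the database, and none of these names come from Elasticsearch or any client library. The point is only that every write goes through one place, which appends to a sequence-numbered log and notifies subscribers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Change:
    seq: int              # monotonically increasing sequence number
    doc_id: str
    doc: Optional[dict]   # None means the document was deleted


class ChangeFeedStore:
    """Hypothetical write path that emits a change feed (in-memory stand-in)."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}
        self._log: List[Change] = []
        self._subscribers: List[Callable[[Change], None]] = []

    def subscribe(self, callback: Callable[[Change], None], since: int = 0) -> None:
        # Replay every change with seq > since, then deliver new ones live.
        for change in self._log[since:]:
            callback(change)
        self._subscribers.append(callback)

    def put(self, doc_id: str, doc: dict) -> Change:
        self._docs[doc_id] = doc  # a real app would write to the database here
        return self._publish(doc_id, doc)

    def delete(self, doc_id: str) -> Change:
        self._docs.pop(doc_id, None)
        return self._publish(doc_id, None)

    def _publish(self, doc_id: str, doc: Optional[dict]) -> Change:
        change = Change(seq=len(self._log) + 1, doc_id=doc_id, doc=doc)
        self._log.append(change)
        for callback in list(self._subscribers):
            callback(change)
        return change
```

A subscriber that connects late can pass the last `seq` it saw as `since` and catch up from there. The obvious catch is that writes which bypass this layer never show up in the feed, which is the usual argument for putting change feeds in the database after all.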
I use Elasticsearch every day at work. It is a PITA. It scales poorly, it is super hard to configure properly, their environment is a mess, and even with v5 they still have tons of resiliency problems. (In particular, Aphyr broke it again… but he needed more time this time.)
And stability is really not that great. We have clusters dying under us regularly.
And let’s not talk about the “facets” problem. So far, I would not advise using ES for another couple of years, until the problems are cleaned up.
It is not a bad search engine, but it is not a great document DB.
Using it as a primary store is not a very good idea; it’s better used as it was originally intended, as an indexing and search engine. I can’t say it is unstable, but by design it is definitely not suitable as a primary store.
Oh, I know. But even as an indexing engine it is… fiddly.
Honestly, I’ve yet to see anything that PostgreSQL is not suited for once properly configured other than maybe Riak for its utterly extreme scaling capabilities. ^.^
PostgreSQL can be a relational database, a KV database, or a document store, and you can write queries and functions in PL/pgSQL, PL/Python, and PL/Lua, among others. It has extensions (foreign data wrappers) that can bring in external databases on other systems as if they were local tables. Etc… etc…
I heard that you need a monitoring tool to check whether Elasticsearch is alive. Another person told me that they put Kafka in front of Elasticsearch to absorb high load before it hits Elasticsearch.
Quite controversial, but I heard that PostgreSQL is not good when you have too many updates compared to inserts.
Yes, you can use Kafka for inserts, which is something we are going to do, but so far most of our problems come from queries far more than from inserts.
This is a bit more complex.
They decided to build their own database and schema on top of an existing engine.
Basically, Uber built their own key-value store on top of Postgres nodes. It happens that MySQL is better at being a KV store than Postgres, which makes sense.
I recommend this interesting post about it: http://use-the-index-luke.com/blog/2016-07-29/on-ubers-choice-of-databases
Not really. If you read their article and the Postgres mailing list discussion about it, you see that it was only because Uber’s initial database design was not efficient; they could not fix it later on, and the inefficiency in their design hit a worst-case scenario in PostgreSQL (which they are now working on). Basically, if you treat it as a proper KV store with immutable date-stamped data, then it will probably be the fastest database out right now. Well, even more basically, if you do not treat it as something that you can edit secondary indexes on repeatedly, it is still one of the fastest databases out. ^.^
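As a rough illustration of what "a proper KV store with immutable date-stamped data" could look like: writes only ever insert a new timestamped row, and reads pick the newest row for a key. The sketch below uses Python's built-in `sqlite3` purely as a stand-in so it runs without a server; the table and column names (`kv`, `written_at`) are made up, this is not Uber's schema, and it says nothing about performance. The same shape of schema should carry over to PostgreSQL.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with an append-only table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE kv (
            key        TEXT    NOT NULL,
            written_at INTEGER NOT NULL,  -- caller-supplied timestamp
            value      TEXT    NOT NULL,
            PRIMARY KEY (key, written_at)
        )
        """
    )
    return conn


def put(conn: sqlite3.Connection, key: str, value: str, written_at: int) -> None:
    # Never UPDATE: every write is a new immutable row.
    conn.execute(
        "INSERT INTO kv (key, written_at, value) VALUES (?, ?, ?)",
        (key, written_at, value),
    )
    conn.commit()


def get_latest(conn: sqlite3.Connection, key: str):
    row = conn.execute(
        "SELECT value FROM kv WHERE key = ? ORDER BY written_at DESC LIMIT 1",
        (key,),
    ).fetchone()
    return row[0] if row else None


def get_as_of(conn: sqlite3.Connection, key: str, timestamp: int):
    # Newest version written at or before the given timestamp.
    row = conn.execute(
        "SELECT value FROM kv WHERE key = ? AND written_at <= ? "
        "ORDER BY written_at DESC LIMIT 1",
        (key, timestamp),
    ).fetchone()
    return row[0] if row else None
```

Because existing rows are never modified, there are no in-place updates that have to touch secondary indexes, which is the pattern being argued for here. The trade-off is that old versions pile up and need some separate pruning policy.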
We’ve used both PostgreSQL and MySQL (and MSSQL and Oracle) here at work. PostgreSQL was so consistently faster than MySQL that we eventually got rid of MySQL (well, I think it is still running on the old Red Hat server because of something old there, but still).
It would be quite interesting to see a database written purely in Elixir, or maybe Elixir + Rust (or whatever). Every time I think about starting a new project I think about it. If we had something with:
- Riak core with its ring (Distributed Hash Table) for distribution
- ONLY accepting CRDTs for ease of implementation on the client side and avoiding plenty of concurrency dilemmas
- fully queryable (maybe with GraphQL)
- With a performant data transfer format like MessagePack (one that is schemaless and not schemaful like Protocol Buffers)
- With persistent connections to leverage Elixir’s strengths
- Demand-driven under high load (GenStage)?
This could be the Mnesia of Elixir. It would follow some of the goals of Lasp and its lasp-lang HanoiDB, though it could be configured much more simply in Mix.
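To show why a CRDT-only store sidesteps a lot of concurrency dilemmas, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter (G-Counter). It is written in Python just for illustration and is not taken from Riak, Lasp, or any library; the class and method names are made up. Each node only bumps its own slot, and merging takes the per-node maximum, so replicas converge no matter the order or number of merges.

```python
from typing import Dict, Optional


class GCounter:
    """Grow-only counter CRDT: one non-decreasing slot per node."""

    def __init__(self, node_id: str, counts: Optional[Dict[str, int]] = None) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = dict(counts or {})

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        # A node only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Per-node maximum: commutative, associative, and idempotent.
        merged = dict(self.counts)
        for node, count in other.counts.items():
            merged[node] = max(merged.get(node, 0), count)
        return GCounter(self.node_id, merged)
```

With two replicas `a` and `b`, `a.merge(b)` and `b.merge(a)` report the same value, and merging the same state twice changes nothing, so clients never need to resolve conflicts themselves.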
Ah nice - good to see all their hard work didn’t go to waste
CockroachDB is a distributed, scale-out SQL database which relies on hybrid logical clocks to provide serializability, given semi-synchronized node clocks. In this Jepsen analysis, we’ll discuss multiple serializability violations in CockroachDB beta-20160829 through beta-20160908. As a result of our collaboration, fixes for these issues are included in beta-20160915 and beta-20161013. This work was funded by Cockroach Labs, and conducted in accordance with the Jepsen ethics policy. Cockroach Labs has also written a blog post with more context.
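For anyone wondering what a hybrid logical clock is, here is a small Python sketch of the general algorithm as I understand it: each timestamp pairs a wall-clock component with a logical counter, so timestamps keep moving forward even when a node's physical clock lags behind a peer's. This is not CockroachDB's implementation; the class names and the injectable `physical_clock` parameter are assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True, order=True)
class Timestamp:
    wall: int     # highest physical time observed so far (milliseconds)
    logical: int  # tie-breaker when wall time does not advance


class HybridLogicalClock:
    def __init__(self, physical_clock: Optional[Callable[[], int]] = None) -> None:
        # Injectable clock so the sketch can be driven with fixed values.
        self._now = physical_clock or (lambda: time.time_ns() // 1_000_000)
        self._wall = 0
        self._logical = 0

    def tick(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self._now()
        if pt > self._wall:
            self._wall, self._logical = pt, 0
        else:
            self._logical += 1
        return Timestamp(self._wall, self._logical)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge in the timestamp carried by an incoming message."""
        pt = self._now()
        new_wall = max(self._wall, remote.wall, pt)
        if new_wall == self._wall and new_wall == remote.wall:
            logical = max(self._logical, remote.logical) + 1
        elif new_wall == self._wall:
            logical = self._logical + 1
        elif new_wall == remote.wall:
            logical = remote.logical + 1
        else:
            logical = 0  # physical time moved past both clocks
        self._wall, self._logical = new_wall, logical
        return Timestamp(self._wall, self._logical)
```

If a node whose physical clock is behind receives a message from a node that is ahead, `update` adopts the larger wall value and bumps the logical counter, so the receive event still sorts after the send.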
ArangoDB and Neo4j.
Graphs are awesome!
I use SQLite. But then, I make server-plus-desktop-client software that installs like typical software, kinda like ownCloud, where you get to choose the database.
No Mnesia love? I liked it back in the day - sharding was nice : )