It's awesome! And I think of more interest to our forum-mates would be its CAP theorem characteristics (technically CP, but can be treated as CA). And the following post comes from Eric Brewer himself - the guy who coined the CAP theorem. I didn't know he worked at Google…
https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Spanner-and-the-CAP-Theorem.html
And the white paper: https://research.google.com/pubs/pub45855.html
To be totally precise: they are neither CP nor CA in terms of the CAP theorem.
- They only have "5 nines of availability", and they do not define exactly what they mean by availability here (transactions? uptime of the server?).
- They are not Consistent in the sense used by the CAP theorem proof. They are serializable, not linearizable. That is not a big loss, but still.
- To achieve that, they use a sufficiently synchronised clock, built on atomic clocks and GPS clocks, which is quite heavy machinery.
- They "cheat" a bit on which operations they allow: no DML operations, and a lot of locking. Your response time will also not be consistent across identical requests. They claim "ACID"… but that term is loosely defined when it comes to distributed systems and transactions.
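For a sense of scale on the availability figure above, here is a minimal sketch that converts a number of nines into a yearly downtime budget. It assumes availability means uptime over a 365-day year, which is my assumption, since the definition is not spelled out; the function name is hypothetical.

```python
def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per 365-day year for an availability of `nines` nines."""
    unavailability = 10.0 ** (-nines)   # five nines -> 0.00001
    minutes_per_year = 365 * 24 * 60    # 525,600 minutes
    return unavailability * minutes_per_year


if __name__ == "__main__":
    for n in (3, 4, 5):
        minutes = downtime_minutes_per_year(n)
        print(f"{n} nines: {minutes:.2f} minutes of downtime per year")
```

Under that assumption, five nines leaves a bit over five minutes of downtime per year.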
All in all: yes, it is great. Use it if you want it. But no, it is not a real solution to the CAP problem, if that was a problem that mattered to you.
Basically… nearly no one should need that, but everyone will use it because "aaaaaaaah we are losing data!!!". This is a great business decision from Google's point of view, at the Oracle DB level. But you probably do not need it.
On initial reading it actually seems very similar to Riak (though not exactly).
Other than being ACID and SQL
Which makes me wonder, without looking too deep into it: what happens when two servers on opposite sides of the world both create the same file with conflicting contents? There has to be "some" communication or resolver. But if it uses a resolver that runs "after" the set, what happens if the servers that made both writes have already run code assuming the data is written? Or do they have to "wait" until a quorum is reached (as Riak does) to ensure success?
they wait for a quorum as far as I remember
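To make "waiting for a quorum" concrete, here is a minimal sketch of a majority-quorum write: the caller only treats the write as successful once more than half of the replicas have acknowledged it. This is a generic illustration, not Spanner's or Riak's actual API; `quorum_write`, `make_replica`, and the in-memory stores are all hypothetical.

```python
from typing import Callable, Dict, List

Replica = Callable[[str, str], bool]


def majority(n_replicas: int) -> int:
    """Smallest number of replicas that is more than half."""
    return n_replicas // 2 + 1


def make_replica(store: Dict[str, str], up: bool = True) -> Replica:
    """Toy in-memory replica; `up=False` simulates an unreachable node."""
    def send(key: str, value: str) -> bool:
        if not up:
            raise ConnectionError("replica unreachable")
        store[key] = value
        return True
    return send


def quorum_write(replicas: List[Replica], key: str, value: str) -> bool:
    """Send the write to every replica; succeed only if a majority acked."""
    needed = majority(len(replicas))
    acks = 0
    for send in replicas:
        try:
            if send(key, value):
                acks += 1
        except ConnectionError:
            pass  # an unreachable replica simply does not count as an ack
    return acks >= needed
```

Any two majorities of the same replica set share at least one replica, which is why two conflicting writes cannot both reach a quorum without some replica seeing both.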
So it is Riak but with SQL. ^.^
Something like CockroachDB
CockroachDB is inspired by Google's Spanner and F1 technologies, and it's completely open source.
So Google wants to take a slice of the cloud pie with GCP, through big data and machine learning.
Hard to compare Cockroach to Spanner, as Spanner has been running in production for at least 5 years, handling a huge production load in a cluster composed of hundreds of thousands of nodes. Spanner also relies on having proper hardware and network infrastructure that is hard to replicate on your own.
I agree 100%. As I remember from the cockroachdb-ben-darnell podcast, Spanner relies heavily on network time synchronization, which is hard to achieve outside Google's infrastructure.
But it is nice to play with something similar on your own computer.
They "cheat". They use a quorum lock, but they do it faster because they have a distributed clock they can rely on…
Hmm, how does that work? I'm not sure I can see how that would save on communication time to make sure no conflicts happened at the same time?
It does not, but it enables them to use timestamps, because they consider that they have a shared synchronised time that has a finer precision than their machines.
So that simplifies the problem a lot in the end.
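A rough sketch of how a bounded-uncertainty clock lets you order commits by timestamp: the clock returns an interval instead of a single instant, and a commit waits until its timestamp is guaranteed to be in the past everywhere. This is a toy model of the interval clock and "commit wait" idea described in the Spanner paper, not Google's implementation; `FakeTrueTime`, `epsilon`, and `commit` are hypothetical names, and the error bound is an assumed constant.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class FakeTrueTime:
    """Toy clock with an assumed uncertainty bound `epsilon` (seconds)."""

    def __init__(self, epsilon: float, clock: Callable[[], float] = time.time):
        self.epsilon = epsilon
        self._clock = clock

    def now(self) -> TTInterval:
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True once `t` is guaranteed to have passed, given the bound."""
        return self.now().earliest > t


def commit(tt: FakeTrueTime, sleep: Callable[[float], None] = time.sleep) -> float:
    """Pick a commit timestamp, then wait out the clock uncertainty."""
    s = tt.now().latest            # timestamp no earlier than the true time
    while not tt.after(s):         # commit wait, roughly 2 * epsilon
        sleep(tt.epsilon / 10)
    return s
```

The wait costs about twice the clock uncertainty per commit, which is why a tighter clock bound matters: it does not remove the quorum round trip, it only makes timestamps safe to compare across machines.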
Hmm, true. That still leaves the overall latency unresolved, which an initial reading implies they mostly eliminated, and there are methods to compare without time (Riak uses vector clocks, for example)…
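For reference, comparing versions without wall-clock time can be sketched with vector clocks: each node keeps a counter, and one version is newer than another only if it has seen at least as much from every node. This is a minimal generic illustration, not Riak's actual API; the function names and node labels are hypothetical.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter bumped by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen everything `b` has (a >= b for every node)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "after"
    if b_ge_a:
        return "before"
    return "concurrent"  # neither saw the other: a real conflict


base = increment({}, "eu")          # {"eu": 1}
from_eu = increment(base, "eu")     # {"eu": 2}
from_us = increment(base, "us")     # {"eu": 1, "us": 1}
print(compare(from_eu, base))       # after
print(compare(from_eu, from_us))    # concurrent
```

A "concurrent" result detects the conflict but does not resolve it; something still has to pick a winner or merge the two versions.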
Dude, you really just have to waste a bit of time watching the video, or waste a bit more time reading the paper.
Building type systems has been distracting me. ^.^
Plus, starting tomorrow, a huge project is kicking off at work, so my mostly free time will be gone for a bit again (it always ebbs and flows).
Very useful talk. They need to make it clearer on the product site that when you buy 1 node you are actually getting 3 replicas across 3 zones.
Oh well, no magic, the cloud strikes again: for 30% writes / 70% reads, to get to 30K tps you are looking at 30 nodes. WTF!!! That's 20K a month; you can get the same with synchronous commit on PG on two 3K boxes.