"Smartening" up Postgrex for YugabyteDB

Folks,

I’m having a serious look into YugabyteDB (YB) and engaging with that team on several matters. In some respects YB is plug-compatible with PostgreSQL (currently still v11.2, going on v15), but they’ve all but replaced PostgreSQL’s storage layer with their own in order to achieve a performant, ACID-compliant distributed database, as opposed to PostgreSQL, which supports read replicas but is fundamentally single-master for writes.

I personally believe there’s a lot of natural synergy between Elixir/Phoenix and YugabyteDB. Both ecosystems are edging towards the distributed approach and horizontal scaling. I’d be a lot more comfortable with my application’s future scaling opportunities and challenges if my Phoenix code weren’t dependent on a vertically scaling, monolithic database server architecture but rather on something like YB that also scales horizontally.

On a practical level, Postgrex works unmodified against YSQL, which is YB’s PostgreSQL-compatible layer (there is also YCQL, a Cassandra-style NoSQL API, and YEDIS, their Redis-compatible API, both also inherently distributed).

However, being a multi-master distributed database, YSQL allows clients to choose which node of which cluster to connect to and to fail over to other nodes if that one becomes unavailable. That is something Postgrex is blissfully unaware of.

YB is also Kubernetes-aware, so that’s one deployment option. It’s also the easiest way to get some semblance of load balancing across the nodes. But running in that mode has drawbacks, such as making it almost impossible to run custom extensions. Without K8s in the mix, load balancing would require hardware or a cloud provider’s facilities.

To help clients get better use out of the database, YB themselves wrote a number of what they call “smart drivers” for some popular platforms (only Java, Go, Node.js and Python at this stage), which are aware of the nodes and clusters and can distribute an application’s connections accordingly. Unsurprisingly, Elixir/Phoenix/Ecto/Postgrex didn’t make it onto their list of platforms to provide drivers for, so now I’m trying to see what it would take to conjure up a version of Postgrex that can be configured as such a “smart driver” for YugabyteDB.

For the sake of simplicity I’m hoping (probably in vain) that all it would take is a well-conceived Postgrex extension, but I’ve no idea whether such extensions can involve themselves in any of the connection methods I’ve seen in the code, and I have even less of an understanding of how Postgrex goes about its business to even guess at where to start.

May I please ask that someone familiar with Postgrex and/or the mechanism and limitations of Postgrex extensions discuss this with me? If it’s anywhere near my capabilities I’ll be more than happy to write the code and present it as a PR, but I don’t want to go barking up the wrong tree.


Hi, Marthin. Postgrex extensions are for types, so I don’t think they help for your use-case.

I’m not sure exactly what you would be implementing, but if I understand correctly it’s some sort of connection orchestration. That could perhaps be achieved by implementing an alternative to DBConnection, or an alternative EctoSQL adapter, while still using Postgrex for communication with the database.
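To make that concrete, a very rough sketch of where an alternative adapter would plug in might look like the following. The module names are hypothetical, and it assumes (based on how ecto_sql’s own Postgres adapter is built) that `use Ecto.Adapters.SQL, driver: :postgrex` pulls in the SQL machinery; a real adapter would also need the storage/structure callbacks and, crucially, the connection-handling smarts that are the whole point here.

```elixir
# Hypothetical sketch only: a thin adapter module reusing ecto_sql's
# Postgrex-backed SQL adapter, as a place where YugabyteDB-aware
# connection handling could eventually live. Not a working smart driver.
defmodule MyApp.YugabyteAdapter do
  # Same starting point as Ecto.Adapters.Postgres (assumption based on ecto_sql).
  use Ecto.Adapters.SQL, driver: :postgrex

  # Node selection, failover and pool distribution would have to be layered
  # in around the connection options / pool, which is the open question here.
end

defmodule MyApp.Repo do
  use Ecto.Repo, otp_app: :my_app, adapter: MyApp.YugabyteAdapter
end
```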


Skimming through the Node driver’s documentation, it seems like the changes are around connection handling:

In particular, the loadBalance flag and the topologyKeys list.

IMO a good place to start looking would be at the boundary between DBConnection and Postgrex, specifically where the metadata needed to make a connection is assembled.
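For illustration only, the repo-config equivalent of those two options could look something like this. Neither `:load_balance` nor `:topology_keys` exists in Postgrex or Ecto today; they are hypothetical placeholders mirroring the Node driver's settings.

```elixir
# config/runtime.exs (hypothetical): options mirroring the Node driver's
# loadBalance/topologyKeys. Postgrex/Ecto do not understand these today.
config :my_app, MyApp.Repo,
  hostname: "yb-node-1.internal",
  username: "yugabyte",
  database: "yugabyte",
  load_balance: true,
  topology_keys: ["cloud1.region1.zone1", "cloud1.region1.*"]
```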


I didn’t check all the other smart clients - are there other, query-specific choices that could be made (similar to the prefix option in most Repo functions)?


Ah, yeah, to my seriously untrained eye everything between my Ecto repository and the database is just “Postgrex”. If I’m picking up the right vibes there are quite a few components involved, and if Postgrex is only the part that formulates queries etc. then it would be the part least affected. Thanks for that insight.

I can confirm from my understanding of what makes the YB drivers different that it most certainly is all about the connection and not at all about what happens after the connection is made. Possibly connection pooling is involved as well, but I guess that’s part of DBConnection, right?

I’m clearly out of my depth here - didn’t even know about the existence of DBConnection. Any reference to where I can pick up the thread about that would be much appreciated.

Ultimately what is required is a way to specify not the usual single address at which to connect to the database but a set of addresses/credentials arranged in some priority order, so the driver can spread the load across multiple servers and have alternatives at hand for any that stop responding.

I suspect the main parts to implement would be the repo definitions in the environment files and some logic that keeps track of which of those DB endpoints are “live” and spreads the pool of connections at least evenly across the live endpoints. That’s on the assumption that the connection pool is used in a way that sees all slots being used equally. If the reality is more like the pool always gets used from one end, so that how many slots are in use depends on the overall load, then we’d have to introduce round-robin selection (and possibly queue-length monitoring). The objective remains to send the Ecto traffic through not one but several possible database “heads” in a way that makes sense. For example, if a client would benefit from going through the same connection to the same server session as last time then we should make that happen, but if sessions are cleared out completely at the end of a transaction then there would be little to gain from such sticky connections.
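For what it’s worth, here’s a minimal sketch of that idea, assuming the repo’s connection options end up in DBConnection, whose `:configure` option (a function run before each connect attempt) would let us rewrite the host/port per connection. The module, the endpoint list and the omitted liveness check are all hypothetical.

```elixir
# Minimal sketch, not a working smart driver: round-robins connect attempts
# over a fixed endpoint list. Liveness tracking and pool rebalancing are
# deliberately left out. All names here are hypothetical.
defmodule MyApp.YBEndpoints do
  use Agent

  # `endpoints` is a list of keyword lists, e.g.
  # [[hostname: "yb-node-1", port: 5433], [hostname: "yb-node-2", port: 5433]]
  def start_link(endpoints) do
    Agent.start_link(fn -> endpoints end, name: __MODULE__)
  end

  # Intended to be called via DBConnection's :configure hook before every
  # connect attempt: take the head of the list, rotate it to the back,
  # and merge its host/port into the connection options.
  def configure(opts) do
    endpoint =
      Agent.get_and_update(__MODULE__, fn [head | tail] ->
        {head, tail ++ [head]}
      end)

    Keyword.merge(opts, endpoint)
  end
end
```

It would then be wired in via something like `configure: {MyApp.YBEndpoints, :configure, []}` in the repo options, though whether Ecto passes that option through untouched is exactly the kind of thing that needs verifying.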

I’ve not looked yet but I sense we can probably get a fair idea about what config data the original YB drivers found useful or essential and mimic that.

@chrismccord It’s hard to guess where your experiments with YugabyteDB (which I noticed on the YugabyteDB Slack channel) will lead, but if they get anywhere near Phoenix and Ecto you’ll probably face this issue as well before long. I’d love to take your thoughts about it on board before we go off on a silly tangent.

I’ve compared a few of the special drivers now, drawn some conclusions, and figured perhaps we might want to approach it a little differently.

Firstly, let me set out what’s at play with these smart drivers and YugabyteDB. YB sets itself up to run in connected clusters it calls universes, which replicate synchronously on the assumption that, even though a universe may have geographically distributed clusters, the network latency between them is low enough. When latency is too high there is an asynchronous form of replication called xCluster, which trades overall performance for more latency tolerance. But the name of the game is enabling very high write and read performance by distributing the workload across multiple machines. That workload can be a mix of lots of fast requests coming from many clients and long-running requests of any origin, with the work being shared over multiple nodes in the cluster.

The special connection parameters for YB’s “smart” drivers are normally:

  • a flag to turn load balancing on or off,
  • a backup address to use if the server at the usual name or address isn’t available, and
  • a topology preference that specifies which set of servers the connection prefers to work with, in the form of a list of region+cluster+node “wildcards” listed in order of preference.

Importantly, apart from the alternative address, these parameters only really come into play once the initial connection has been made. Once that has happened, the driver is meant to query the topology of the universe it has joined from the server it connected to, and then initialise its connection pools according to the preferences specified in the topology parameter, compared against what the metadata says about which actual regions, clusters and nodes are available for service at any given time. By overridable default that metadata is refreshed every five minutes and the pools adjusted accordingly.
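As far as I can tell the official drivers get that metadata from a `yb_servers()` function on the YSQL side; treating that name and its columns as an assumption, the refresh loop could look roughly like this in Elixir (the pool-rebalancing step is omitted):

```elixir
# Rough sketch of the five-minute topology refresh; yb_servers() and its
# columns are assumed from how the official smart drivers appear to work.
defmodule MyApp.YBTopology do
  use GenServer

  @refresh_ms :timer.minutes(5)

  def start_link(conn_opts), do: GenServer.start_link(__MODULE__, conn_opts, name: __MODULE__)

  def servers, do: GenServer.call(__MODULE__, :servers)

  @impl true
  def init(conn_opts) do
    send(self(), :refresh)
    {:ok, %{conn_opts: conn_opts, servers: []}}
  end

  @impl true
  def handle_info(:refresh, state) do
    # Ask any reachable node for the current universe topology.
    {:ok, conn} = Postgrex.start_link(state.conn_opts)
    %{rows: rows} = Postgrex.query!(conn, "SELECT host, port, cloud, region, zone FROM yb_servers()", [])
    GenServer.stop(conn)

    # A real smart driver would now compare `rows` with the configured
    # topology preferences and rebalance its pools; that part is omitted.
    Process.send_after(self(), :refresh, @refresh_ms)
    {:noreply, %{state | servers: rows}}
  end

  @impl true
  def handle_call(:servers, _from, state), do: {:reply, state.servers, state}
end
```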

That seems to be the general modus operandi of a smart driver for YugabyteDB.

Secondly, I have to mention that YugabyteDB positions itself in the high-performance, general-purpose, big-data corporate database space. Nothing wrong with that. But it does mean that among their base assumptions they expect to be serving a multitude of different applications running mostly on independent clients, which often means end-users’ personal computers.

The Elixir/Phoenix/Ecto applications we’re all about on this forum are nothing like that. We write applications serving those end-users by running on servers and accessing the database servers directly on their behalf. We also arrange ourselves to scale horizontally so we can serve more clients simultaneously. Most of the time, though, our application servers run very close to where our databases are hosted, if not on the exact same machines.

*It’s an open and interesting question whether our distributed, load-balanced applications have anything to gain from staying alive and accessing data from more remote servers, rather than letting the “death” of our closest database server take down the application instances it serves, so that the load from our clients gets diverted to other nodes that can get the job done more quickly with local data access.*

It’s almost akin to the principle of failing hard and early rather than trying to patch things up and recover from unexpected input or conditions on the fly.

It strikes me that it might be useful to distinguish between an application running against some corporate database it shares with many other applications and with end-users accessing it directly through other means, versus an application and database that are largely dedicated to each other. Neither is right or wrong, but at the upper end of scaling and performance, in the face of really large numbers of active users and really large volumes of data, the two kinds of application-database pairing would result in dramatically different strategies.

How this relates to Postgrex for YugabyteDB is perhaps less dramatic, simple even. If such a driver/adaptation were available, it would be easy for both types of application to configure database access appropriately. But applications using dedicated and colocated YugabyteDB databases could potentially assert sufficient control over the database connections and load balancing from the application layer itself. It would be a load of extra work and complexity to deal with, but it can be done. Applications that need to run against a distributed set of database servers shared with many other applications and end-users would have a much more pressing need for a YugabyteDB-specific version of Postgrex (or of its DBConnection and connection-pooling setup).

I’ve only been playing with Yugabyte in the context of fly.io, and I think Fly removes the need for any extra handling on the postgrex/driver side. For example, with our DNS you can do top1.nearest.of.my-yuga-cluster.internal, top2.nearest.of.my-yuga-cluster.internal, etc., so you can get the closest replica connection and/or redundant connections out of the box. If you need to spread connections across replicas, you could start multiple repos, one against each replica, and then randomly load balance across them with a Repo interface that proxies requests to the randomly chosen one, but I’d only reach for that if you run into replica bottlenecks. For my needs, running a few replicas with app instances next to those replicas naturally spreads the load, so I haven’t needed inter-region/zone load distribution.
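A minimal sketch of that proxying idea, assuming one repo module per replica (the replica repo names and the handful of forwarded functions are placeholders; a fuller version would forward the rest of the read API):

```elixir
# Hypothetical proxy over per-replica repos: pick one at random per call.
defmodule MyApp.ReplicaRepo do
  @replicas [MyApp.Repo.Replica1, MyApp.Repo.Replica2, MyApp.Repo.Replica3]

  defp random_repo, do: Enum.random(@replicas)

  # Forward the read-only calls we actually use to a randomly chosen replica.
  def all(queryable, opts \\ []), do: random_repo().all(queryable, opts)
  def one(queryable, opts \\ []), do: random_repo().one(queryable, opts)
  def get(queryable, id, opts \\ []), do: random_repo().get(queryable, id, opts)
end
```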


You may also want to check out put_dynamic_repo: Ecto.Repo — Ecto v3.11.2

With yuga’s built-in connection pooling, you could create a new connection per client and tailor that session with a yb_follower_read_staleness_ms specific to each client. You can also handle follower reads at the standard repo level by applying the follower-read settings in an after_connect callback for each connection in the DB pool.
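A sketch of the after_connect variant, assuming the yb_read_from_followers / yb_follower_read_staleness_ms session settings from YB’s follower-read docs (verify the exact names and values, and whether the session also needs to be read-only for follower reads to kick in):

```elixir
# Runs once for every connection in the pool, right after it connects.
defmodule MyApp.FollowerReads do
  def set_session(conn) do
    Postgrex.query!(conn, "SET yb_read_from_followers = true", [])
    Postgrex.query!(conn, "SET yb_follower_read_staleness_ms = 30000", [])
  end
end

# In the repo config:
#
#   config :my_app, MyApp.Repo,
#     after_connect: {MyApp.FollowerReads, :set_session, []}
```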


Thank you, that could become very useful in time.

On balance of evidence I’ve chosen not to adjust my plans to leverage or accommodate YugabyteDB in any form until further notice. They’ve chosen their target market, which sets their priorities and focus, but neither my application nor my organisation can be regarded as part of that market. If you’re so inclined you’re welcome to go through my rationale with me before finalising what you present on the 10th.

Not everyone needs distributed SQL, so it’s fine if it didn’t fit your needs. For posterity, for others: you can also use Repo.checkout to check out a connection, set yb_follower_read_staleness_ms, and do the rest of your queries inside that checkout. This lets you scope staleness per query without dynamic repos. I’ll show this stuff off next week in my screencast, as well as some other neat things like stale but consistent reads for each user by tracking side effects.
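Roughly like this (MyApp.Repo, the staleness value and the query are just placeholders):

```elixir
# Everything inside the checkout runs on the same pooled connection,
# so the session setting applies to the queries that follow it.
MyApp.Repo.checkout(fn ->
  MyApp.Repo.query!("SET yb_follower_read_staleness_ms = 30000")
  MyApp.Repo.query!("SELECT * FROM posts WHERE published")
end)
```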


Yeah, reacting to staleness is a luxury problem. Despite both of them being horizontally scaling, globally distributed systems, there’s an impedance mismatch between my application and YugabyteDB. The impact on my design, even if I stick exclusively to the open-source version, completely overshadows any potential benefit I could expect to derive from using it. I wish it weren’t so, because on paper it reads like a match made in heaven. Reality’s been a real party-pooper in this case.