I am looking through an existing code base and saw database lock being used. Alarms started sounding off in my head. Isn’t database lock bad practice ? Am I over worrying? If I had multi processes editing the same row I would put an actor / process in front of the database and all processes will have to go through it to execute the updating. Is that a good idea? Or are there other ways to avoid using locks
it all depends…
and mostly comes down to guarantees - I’ve used DB locks in the past for selling items with limited quantity… ensuring that you never oversell…
same could be done by having a single GenServer or similar queuing it up - but here you end up with more loose guarantees, and what if you have multiple servers, clustered?, cluster split?, green/blue deploys? all sorts of complexities…
so depends on your philosophy - I like to have absolute requirements/guarantees in the DB layer, so no matter what is going on that guarantee is enforced…
what are the locks for? something related to money/important stuff - or less important things?
Yes something important related to money. So from your answer I assume there is actually nothing essentially wrong with database locking. Is there every a chance I will land up in a dead lock situation? Cause if it’s managed on the otp level, process can fail fast. But if database fails. Someone will need to restart it or something
If you’re using postgres, locks are tied to transactions, so when the Elixir process crashes due to a timeout, the transaction aborts and the lock is released. Unless the code is going out of it’s way to use session based advisory locks, you aren’t going to end up with the database itself locked up.
TLDR; A deadlock is not a database failure any more than “row not found” is not a database failure. Modern database will detect a deadlock in nearly all cases and return an error to you - but careful app design will avoid deadlocks in most cases. Avoiding deadlocks does not mean you can avoid unexpected database errors at any time.
As @outlog say, it depends on the guarantees you require. A lot of applications consider the database as the source of truth as a given point in time. Since there is typically concurrent access and update to a database then the guarantees you require need to be considered in that context. Its not typically a major issue if the count of the number of comments on a post in a social system isn’t accurate. But as you said, when money goes missing its bad.
Typically what you are looking for is an appropriate level of transaction isolation. That means what state does the data base needs to be in when a transaction starts to update it. And what state is visible to other transactions when I am doing the update.
There isn’t a one size fits all and so while “optimistic locking” (which isn’t really a lock in the database sense at all) is a common pattern its definitely not appropriate for all use cases.
imho a DB lock where needed is a safe, great, strong and simple tool - of course only use them where absolutely needed, and be aware of the potential bottleneck, and design around that if needed (eg don’t sell 1 million game consoles on launch day through a single DB row lock )
well anything can happen, but I’ve personally never seen it happen, and I’m unaware of how exactly it should occur, you open transaction, lock that row, do stuff, close things down… I assume if things go wrong there are larger issues at play…
Let’s say one process does a
SELECT […] FOR UPDATE. Other processes wait for the first to do the actual update so that they can take their turn. But the first one is dragging far too long. What then? You wrote “if it crashes due to timeout”. Which timeout did you have in mind? One of the PostgreSQL defined ones? Like
idle_in_transaction_session_timeout? Or something else? IOW how does one ensure that a) locks are not held “forever” and b) that other processes do not wait for lock acquisition “forever”?
You can tell Postgres how long to wait on locks with lock_timeout.
In any case Postgrex (the elixir Postgres driver) will also abort a query that is taking too long (also configurable).
Right, I can configure PostgreSQL DB with the documented set of settings, which include some timeouts. Still more interested if Ecto or Postgrex have mechanisms that allow some control/fallbacks on the application level. Other than manually manipulating Postgres runtime variables, I mean. You wrote that Postgrex has some, right? Quick checking through the docs didn’t reveal anything though. Any link(s)?
My last post has a link in it to the Postgres docs.
It’s not possible to manage locks at an application level - the necessary lock state is in the database, so has to be managed there. The only thing the app level can do is abort a statement if it takes too long.
(I’m in a mobile device so apologies for the tersness)
Sure, no problem!
In general I know PostgreSQL docs and its runtime variables but I still can imagine some supporting mechanisms on the application level, even if implemented only as wrappers on the PGSQL ones. I was curious about this part:
And if the timeout mentioned by @benwilson512 is the DB level one or application level one (like e. g. what you mention).
The one mentioned by Ben is the Postgrex timeout.
There are timeouts enforced by both postgres and Elixir. Postgres is generally configured to terminate queries after a certain amount of time. Also Ecto will terminate a process if it takes too long. What’s cool is that because Ecto terminates the actual DB connection, this guarantees that Postgres will also release the lock at that point because the connection has been terminated.
A slightly different scenario… let’s say one needs to
SELECT [sth] FOR UPDATE in order to prevent other processes from changing the data while he works with it. This is a part of transactional chain of operations that need to either all be successful or be all rolled back. Now, in the rollback case, the update will not be done but the lock needs to be released. When done as a transaction that should be taken care of, right? So a
Ecto.Multi doesn’t seem to provide
SELECT [sth] FOR UPDATE equivalent. Should
run/3 function be crafted instead? Will that work? Other suggestions?
run will work, and it’s probably your best option.