I gave a talk on this subject about a month ago at a conference in Warsaw/Kraków. I don’t believe the videos are up just yet, but I’ll link them here when they are.
In short, the answer to Joe’s question is: because of several factors.
- Our stacks have/had limitations. Poor concurrency and synchronization pushed people towards using relational databases as the default back-ends.
- Our tools tell us to do so. If you generate a Rails app, it requires - by default - that you set up a database. Again, the default.
- We’ve been told to do so. At university, by colleagues. Everyone. This is something we no longer even think about.
I hired a programmer a few months ago and gave her the task of building some software to help me manage VPS instances that run in the cloud - a UI to start/stop/scale them, etc. I did not even think we needed a database here, but the next day she came to me with two sheets of paper with UML diagrams of 1) a database and 2) classes that largely map to the database tables. And she was in deep shock when I told her to simply use the API AWS provides as a back-end.
So yeah, it’s a combination of factors that resulted in programmers defaulting to a relational (or NoSQL!) database.
There is also a false notion that using a relational database will be faster than storing stuff in memory. I suspect this is because in languages that force you to create everything and drop everything during the HTTP request lifecycle, it actually is true.
But hardly anyone has proof that this is slower. People just assume - the “don’t map/select stuff in memory, use your database to filter records” mantra. Largely because, again, it is true in many, many cases, and because of the tools. Say you use ActiveRecord, where creating and destroying objects is incredibly expensive and slow. Filtering stuff in memory will be slow. And you have no really good way to persist those objects between requests, so you are forced to go to the database.
Getting rid of the database, or limiting its use, is only possible when you make a mental shift: from having a stateless back-end to having a stateful back-end. And you need proper tools to do so. Ruby won’t cut it. Elixir/Erlang will.
There are excellent examples of such architecture. You can orchestrate one yourself with GenServers, serializing and restoring state with term_to_binary/binary_to_term and saving to files as you please. A ready-to-use “framework” would be here:
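To make that concrete, here is a minimal sketch of the GenServer-plus-snapshot idea - the module name, API, and file path are all invented for illustration; it keeps everything in memory and snapshots the whole state to disk with `:erlang.term_to_binary/1`:

```elixir
defmodule StatefulStore do
  # Sketch only: a GenServer holding all state in memory, snapshotted to a
  # file so it survives restarts. Names here are made up for the example.
  use GenServer

  def start_link(path), do: GenServer.start_link(__MODULE__, path, name: __MODULE__)

  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})
  def get(key), do: GenServer.call(__MODULE__, {:get, key})

  @impl true
  def init(path) do
    # Restore the previous snapshot if one exists, otherwise start empty.
    state =
      case File.read(path) do
        {:ok, bin} -> :erlang.binary_to_term(bin)
        {:error, _} -> %{}
      end

    {:ok, {path, state}}
  end

  @impl true
  def handle_call({:put, key, value}, _from, {path, state}) do
    state = Map.put(state, key, value)
    # Snapshot on every write here for simplicity; in practice you might
    # snapshot on a timer or after N writes instead.
    File.write!(path, :erlang.term_to_binary(state))
    {:reply, :ok, {path, state}}
  end

  def handle_call({:get, key}, _from, {path, state}) do
    {:reply, Map.get(state, key), {path, state}}
  end
end
```

All reads and writes hit the in-memory map; the file only exists so a restarted process can pick up where it left off.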
Good example - it actually does use PostgreSQL, but not in the usual way. Your domain is modeled in memory. The database is used to store events/an audit trail, and it can be used to generate projections for the read side. But it does not have to be. I was thinking the read projections could be exactly the same as the domain models - just kept in memory.
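The shape of that idea can be sketched in a few lines - the `Counter` module and its event names are invented for the example, and a plain list stands in for the PostgreSQL event table:

```elixir
defmodule Counter do
  # Event-sourced sketch: the events are the source of truth, and the
  # current state is just a fold over them, kept in memory. In the real
  # setup described above the events would live in PostgreSQL.

  def apply_event(state, {:incremented, n}), do: state + n
  def apply_event(_state, :reset), do: 0

  # The "read projection" is rebuilt in memory by replaying all events.
  def project(events), do: Enum.reduce(events, 0, &apply_event(&2, &1))
end
```

With `events = [{:incremented, 2}, {:incremented, 3}, :reset, {:incremented, 5}]`, `Counter.project(events)` replays them and yields `5` - no table of current values anywhere, just the event log and an in-memory fold.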
There’s another false assumption people have about keeping state in memory: they think it will eat up a lot of RAM and they won’t be able to do it. I have heard it multiple times: “but we can’t just keep it in memory, it won’t fit”.
Well, this is false because:
a) computers have crazy high amounts of RAM these days
b) if you want your SQL database to perform well, you need to make sure the data fits into memory anyway
c) you don’t have that much data - do you?