A Database-less App Framework?

I suspect many apps these days have enough memory available for many tasks.

The problem with applications is that they crash and lose memory, so you need some storage, but we have Mnesia etc.

After the recent talk about component and design etc. I started thinking about this.

Could you have a ‘Phoenix’ with an Elm/Redux-like architecture that backs up state to disk every so often (automagically)?

Of course you could include DB-reliant features, but by default state would be backed up in the background.
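To make the idea concrete, here is a minimal sketch in Python of an Elm/Redux-style store that keeps state in memory and snapshots it to disk in the background of normal use. Everything in it (`reducer`, `Store`, the `snapshot_every` parameter, the JSON file format) is an assumption made up for illustration, not part of Phoenix or any existing library. It snapshots after every N actions rather than on a timer, purely to keep the example deterministic.

```python
import json
import os
import tempfile


def reducer(state, action):
    """Pure function: (state, action) -> new state. Hypothetical counter app."""
    if action["type"] == "increment":
        return {**state, "count": state["count"] + action.get("by", 1)}
    if action["type"] == "reset":
        return {**state, "count": 0}
    return state


class Store:
    """Holds state in memory and snapshots it to disk every few actions."""

    def __init__(self, path, initial, snapshot_every=3):
        self.path = path
        self.snapshot_every = snapshot_every
        self.pending = 0
        # Restore the last snapshot if one exists, otherwise start fresh.
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)
        else:
            self.state = initial

    def dispatch(self, action):
        self.state = reducer(self.state, action)
        self.pending += 1
        if self.pending >= self.snapshot_every:
            self.snapshot()

    def snapshot(self):
        # Write to a temp file, then rename, so a crash mid-write
        # never leaves a half-written snapshot behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)
        self.pending = 0
```

The trade-off is visible in `dispatch`: any actions applied since the last snapshot are lost on a crash, which is exactly the "perhaps too simple" part of the idea.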

I suspect that would simplify things quite a bit (perhaps too much - but at least you could get up and running quickly).

Perhaps this already exists in other frameworks?

4 Likes

You might like this post from Joe Armstrong:

And the discussion that followed:

7 Likes

Yeah, his words have stuck with me - I remember him saying this years ago…

Databases make things tough, but we kind of accept them…

2 Likes

Well, Phoenix itself is totally dbless, just remove Ecto and :tada:.

The automagical state backup would be a nice thing to have though!! But I would prefer it to be at least a little bit explicit and not embedded in a framework; a new lib for that would be really interesting. :slight_smile:

3 Likes

What you are, most likely, looking for is something like Microsoft Orleans in Elixir:
https://dotnet.github.io/orleans/

There is none in Elixir, I think, but here is something in Erlang that we could possibly use: https://github.com/SpaceTime-IoT/erleans

I haven’t tried it, and I suspect it might be at an early stage. But such a “framework” would allow us to avoid many design issues such as persistence etc. It should be a rock-solid base to build on, one that lets us focus on building the application instead of doing infrastructure tasks like saving/restoring state, which are also error-prone.
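For anyone unfamiliar with the virtual-actor idea, here is a rough sketch in Python of what "the framework saves and restores state for you" could look like. This is not the Orleans or erleans API; `Grain`, `CounterGrain`, `Runtime`, and the dict-based `storage` are all hypothetical names invented for the example.

```python
class Grain:
    """Base class for a hypothetical virtual actor holding a dict of state."""

    def __init__(self, key, state):
        self.key = key
        self.state = state


class CounterGrain(Grain):
    def add(self, n):
        self.state["total"] = self.state.get("total", 0) + n
        return self.state["total"]


class Runtime:
    """Activates grains on demand and persists their state after each call."""

    def __init__(self, storage):
        self.storage = storage  # any dict-like durable store; a plain dict here
        self.active = {}        # key -> in-memory activation

    def call(self, grain_cls, key, method, *args):
        grain = self.active.get(key)
        if grain is None:
            # Activate on demand, restoring whatever was last persisted.
            grain = grain_cls(key, dict(self.storage.get(key, {})))
            self.active[key] = grain
        result = getattr(grain, method)(*args)
        # Persist after every call so application code never does it by hand.
        self.storage[key] = dict(grain.state)
        return result

    def crash(self):
        """Simulate a node crash: all in-memory activations are lost."""
        self.active.clear()
```

Callers only ever name a grain by key, for example `rt.call(CounterGrain, "cart-1", "add", 2)`; whether the grain is already in memory or has to be restored from storage is the runtime's problem, not the application's.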

3 Likes

In a way that’s what riak_core and now lasp are - your application is your database.

3 Likes

Who needs state at all? Much of Phoenix is stateless : ) and that’s something I really like…

1 Like

Yeah! I share your thoughts. Unfortunately, a lot of the products people value nowadays are a stateful solution to problems that could be solved by stateless ones. I keep forcing myself to find stateless solutions to the problems I have, but unfortunately I mostly fail. Sometimes I even think state is really an inevitable evil for almost every solution. :confused:

2 Likes

IMO a lot of the RDBMS inertia also comes from the added value that SQL can bring – namely reports. Business people, legal teams and finance departments just LOVE reports.

5 Likes

GraphQL gives similar flexibility, without needing a database. ^.^

1 Like

Yeah but how will you do the various aggregations, means etc. that a professional SQL dev can do? GraphQL is just an interface; something below it has to actually create the report.

1 Like

We are talking about state here. In my opinion, if you have business data that needs to be persisted, you should be using a SQL database. I also think that you should design and manage this outside of an application context, i.e. use SQL scripts for DDL and migrations.

The business data is likely to outlive your application by a big margin. Having this completely separate from the app will make it much easier to maintain in the long run. Perhaps after 5 years you want to use a completely different framework. Your business data, though, is still the same, and having it managed outside of the application makes this transition much easier. You can also add multiple different applications to your data without having the design choices of one framework affect any other.

On the other hand, if the data is only there to help run the application and is not a thing in itself, then the database-less approach is interesting. This auxiliary data can be stored in many different ways and can be quite tightly coupled with your application, as it is there to help the application rather than being data you care about in the long run.

A third type of data would be ephemeral data, which can be recreated at any time. This is usually stored in memory and can be recreated when needed.

Likely your application will contain all three, and there may be some overlap between them. What I think is important is that the business data and the auxiliary data are loosely coupled. One is the core of your business and the other is just there to help.

8 Likes

Object databases are great when you know the shape of every query you will ever run on your database. Otherwise, it can be very slow and laborious to write those reports, and many reports won’t be worth writing at all, when you could have done the same thing in 5 minutes in SQL. You can also run into major performance challenges when you simply need to group your data differently from how it is stored in an object database.

In most domains I’ve worked in, an RDBMS is the right choice for these reasons. Even specialized use cases where you can make a strong case for a different model can turn out to be more trouble than they are worth in the long run.

6 Likes

I am not disputing the three kinds of data you are talking about but it’s my experience that most project members opt for a singular storage to avoid potential complexity. That’s not an invalid concern.

I guess we are starting to tread into the configuration storage again as a by-product of this discussion, though.

1 Like

And that is fine, as long as they are somewhat logically separated and not coupled with the business data (i.e. many foreign key constraints and other relationships between tables).

Yes, configuration also comes in multiple layers. I’d say that the same separation works here too, i.e. business configuration, auxiliary configuration needed just to run the software (ports, number of processes, etc.), and ephemeral configuration (temporary log levels, tracing).

1 Like

The reality is that most commercial apps need some sort of database, not files. File-based storage was there way before the relational database, and file-based storage didn’t work for most companies and corporations. That’s why folks like Boyce and Codd invented the relational database, and C.J. Date preached about it. That’s why Oracle, Microsoft and IBM make billions of dollars on it: because they help companies and corporations do their business.

Yes, relational databases are slow… We know it really well. Guys with 50 years of experience know it, fresh grads know it. It’s our job as engineers to tackle it. There are plenty of ways to do performance optimisations.

2 Likes

In fact, the reality is that most sizable apps need more than one storage mechanism. Keeping everything in a database may lead to disaster, as may keeping everything in files, not to mention keeping everything in memory.

If the commercial app is big and complicated, it usually requires some sort of relational database, some (possibly remote) file storage, something to record events quickly and something analytical, and possibly a persisted caching layer…

That’s something to remember when you try to build a “database-less” application: you will most likely need a database at some point anyway, so do not exclude the possibility that it may be beneficial to save some stuff in a database, and query it as well. I guess we have to be pragmatic about what we use.

4 Likes

Okay, well not always. Try implementing in-memory joins / mapreduce on your virtual actors to filter them out to some subset matching given criteria, and then compare the speed to your SQL query that’d do the same. Or implement a full text search and compare it with elasticsearch.

Generally, SQL/a database will likely be quicker when you need to query large data sets, and an in-memory solution will be quicker for accessing individual records.
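A small Python illustration of the two access patterns, using the standard-library `sqlite3` module. The `orders` data and the function names are made up for the example, and at this size it says nothing about speed; it only shows that the single-record lookup is trivial in memory, while the regrouped report has to be hand-written in memory but is one line of SQL.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer, category, amount).
orders = [
    ("alice", "books", 12.0),
    ("bob", "books", 8.0),
    ("alice", "games", 30.0),
    ("carol", "games", 10.0),
]

# In memory: fetching one customer's orders is a direct dictionary lookup.
by_customer = defaultdict(list)
for customer, category, amount in orders:
    by_customer[customer].append((category, amount))


def mean_by_category_in_memory(rows):
    """A report grouped differently from how the data is stored: coded by hand."""
    totals = defaultdict(lambda: [0.0, 0])
    for _, category, amount in rows:
        totals[category][0] += amount
        totals[category][1] += 1
    return {category: s / n for category, (s, n) in totals.items()}


def mean_by_category_sql(rows):
    """The same report as a single GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, category TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = dict(
        conn.execute("SELECT category, AVG(amount) FROM orders GROUP BY category")
    )
    conn.close()
    return result
```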

9 Likes

I have thought that a database-less framework would be a good thing to have, and for a lot of cases just having enough in-memory replicas would be fine. You could perhaps have a follower process backing data up to a DB to deal with catastrophic failure, but in normal operation the DB simply gets no writes at all.
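A minimal Python sketch of the follower idea, assuming a single in-memory primary and SQLite as the backup target. `ReplicatedStore`, `recover`, and the `backup` table are hypothetical names invented for this example. Reads and writes on the request path only touch memory; a background thread drains a change queue into the database.

```python
import os
import queue
import sqlite3
import tempfile
import threading


class ReplicatedStore:
    """In-memory primary with a background follower that backs up to SQLite."""

    def __init__(self, db_path):
        self.data = {}                # primary copy, served from memory
        self.changes = queue.Queue()  # change feed consumed by the follower
        self.db_path = db_path
        self.follower = threading.Thread(target=self._follow, daemon=True)
        self.follower.start()

    def put(self, key, value):
        # The request path never touches the database.
        self.data[key] = value
        self.changes.put((key, value))

    def get(self, key):
        return self.data[key]

    def _follow(self):
        # The connection is created inside the follower thread that uses it.
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS backup (key TEXT PRIMARY KEY, value TEXT)"
        )
        conn.commit()
        while True:
            key, value = self.changes.get()
            conn.execute("INSERT OR REPLACE INTO backup VALUES (?, ?)", (key, value))
            conn.commit()
            self.changes.task_done()

    def flush(self):
        """Block until the follower has written every queued change."""
        self.changes.join()


def recover(db_path):
    """After a catastrophic failure, rebuild the in-memory data from the backup."""
    conn = sqlite3.connect(db_path)
    rows = dict(conn.execute("SELECT key, value FROM backup"))
    conn.close()
    return rows
```

Anything still sitting in the queue when the machine dies is lost, so this only covers the "catastrophic failure" case to the extent the follower keeps up.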

I am currently working on a project that experiments with such a solution. Temporary Erlang processes that die when a machine dies are replaced with entities that are transparently moved between nodes when their host node dies. These can still be thought of as actors, but there are a few interesting differences from normal processes, e.g. you don’t use self().
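A toy Python model of that idea, not the actual project: entities are addressed by a stable ID rather than by a handle to their current host (the analogue of not using `self()`), and when a node dies its entities are re-homed from a replicated copy of their state. The `Cluster` class, its methods, and the integer "score" state are all assumptions made up for the sketch.

```python
class Cluster:
    """Toy cluster: entities are addressed by ID, never by their current host."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}  # node -> {entity_id: state}
        self.replicas = {}                              # entity_id -> last replicated state

    def locate(self, entity_id):
        for name, entities in self.nodes.items():
            if entity_id in entities:
                return name
        return None

    def _place(self, entity_id, state):
        # Pick the least-loaded surviving node; ties broken by name for determinism.
        name = min(self.nodes, key=lambda n: (len(self.nodes[n]), n))
        self.nodes[name][entity_id] = state
        return name

    def send(self, entity_id, amount):
        """Deliver a message (here: add to a score) to wherever the entity lives."""
        name = self.locate(entity_id)
        if name is None:
            name = self._place(entity_id, self.replicas.get(entity_id, 0))
        self.nodes[name][entity_id] += amount
        self.replicas[entity_id] = self.nodes[name][entity_id]
        return self.nodes[name][entity_id]

    def kill_node(self, name):
        """Simulate a host dying and re-home its entities from the replicas."""
        lost = self.nodes.pop(name)
        for entity_id in lost:
            self._place(entity_id, self.replicas[entity_id])
```

Because callers only hold the entity ID, a `send` after `kill_node` works unchanged even though the entity now lives on a different node.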

That is certainly a consideration for many cases, and I’m probably first thinking about tackling state that doesn’t live that long: in-game state that needs to live for perhaps days, but after the game has completed only the results are needed.

4 Likes

I suppose what I was thinking about was: how about changing the default away from DB storage…