Using Elixir Processes To Persist Data (No Database)

Woody88 · May 22, 2017, 7:14pm

So I have been reading this post, Create Phoenix based application without database but with Ecto, and what I learned from it was very interesting. I also purchased the book Functional Web Development with Elixir, OTP, and Phoenix; unfortunately it’s still in beta. I have a few questions about some concepts that they talk about in the book, but that will be for another post.

Now there are a few things that I would like some of you to clarify for me, please. The other post did not really answer some of these questions (or did I miss the answer?):

If you use Elixir processes as a means to persist your data (save your data), it is because you are assuming that your system/application will always be running. What concepts/principles apply if the system/app crashes or needs to be restarted? Pretty much, I have users’ info and some transaction data; how do I get them back? (How do I return to my original state before the crash?)
What questions do I have to answer to know that I do not need a database? Or, how do you gauge that a system/application does not need a database? Since I do not fully understand the power of processes/memory, I do not know how to go about deciding not to use a database. (I’m kind of a visual person… I need to be able to form a sort of link to the concept.)
What are the advantages/disadvantages of using a stateful server (I believe this is the term used when persisting in memory rather than in a database)?
Is it OK to keep sensitive data in memory? Something tells me yes, because I think that it would be harder to directly access the memory from outside…?

I don’t think the title is right; if anyone has a better suggestion, please do say so. I just want specific answers about not using a database, or using other means, instead of always using a database to build an application. As I’m reading the book, reading posts, and viewing some videos, it seems to me that Phoenix is trying to discourage developers from automatically modelling their system or app around a database, and to get them to think more about what is actually needed. The problem for me is that I lack knowledge in both areas (memory vs database), so I’m not sure how to make these decisions.

Can anyone help? Thanks in advance.

OvermindDL1 · May 22, 2017, 7:43pm

If you don’t want to lose the data, serialize it out somewhere. ^.^

Do you need to keep data around between restarts? Do you need to keep more data in memory than about half the size of your RAM? Then you need to serialize it out somewhere, whether to a file, a database, or something else.
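
As a rough illustration of "serialize it out somewhere", here is a minimal sketch that keeps a map in an Agent and rewrites a snapshot file on every change, so a restarted process can reload it. The module name `SnapshotStore`, the file format, and the rewrite-on-every-update policy are all assumptions made for the example, not something proposed in this thread.

```elixir
defmodule SnapshotStore do
  # Hypothetical module: state lives in an Agent, and a full snapshot
  # is written to disk after every update.
  use Agent

  # Starts the process and loads the previous snapshot if one exists.
  def start_link(path) do
    Agent.start_link(fn -> %{path: path, data: load(path)} end)
  end

  # Updates the in-memory map, then serializes the whole map to the file.
  def put(pid, key, value) do
    Agent.update(pid, fn state ->
      data = Map.put(state.data, key, value)
      File.write!(state.path, :erlang.term_to_binary(data))
      %{state | data: data}
    end)
  end

  # Reads come straight from memory; the file is never touched.
  def get(pid, key) do
    Agent.get(pid, fn state -> Map.get(state.data, key) end)
  end

  # A missing file means a fresh start. Any other read error crashes
  # the caller (no matching clause), which is deliberate in this sketch.
  defp load(path) do
    case File.read(path) do
      {:ok, binary} -> :erlang.binary_to_term(binary)
      {:error, :enoent} -> %{}
    end
  end
end
```

Rewriting the whole file on each update is only reasonable for small data; it is here to show the restart path, not as a recommendation.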

Data access is fast! In most cases the difference is not big enough for a human to notice, though.
It is easier to deploy.
There are not really a lot of benefits; easier deployment is probably the best one.

Tons of disadvantages, especially losing the ability to query and having to worry about restarts losing data.

If you have to access it at any time, then it has to be loaded into memory eventually in any case, so that does not matter. If someone has root/debug access to your process, well, you are screwed anyway: they can get anything. Lock down your box better. ^.^

Woody88 · May 22, 2017, 8:26pm

Then why not just use a database in the end? Is reading from a file faster than a database?

Then what is the point of introducing it as an option for persisting data? (Let’s set aside very, very small projects.)
If fast access and fast deployment are the only advantages, then based on your answer I do not really see why someone would ever model their app in such a way… I feel like I’m missing something. Do you have a real-life example where you would use this concept?

Thanks!

OvermindDL1 · May 22, 2017, 8:32pm

Not usually. Databases do a lot of optimizations.

Most don’t; most servers (not just Elixir, but anywhere) use a database.

Something real that I’ve made before was a simple file server with nginx, like my blog.overminddl1.com site, no database on it. ^.^

peerreynders · May 22, 2017, 9:15pm

The short version: “your database is not your application - in fact it’s just an implementation detail.”

The motivation behind this thinking is expressed to some degree in

Robert Martin: No DB (and more recently A Little Architecture)

One approach would be to simply log all the data changes to a file (system) and on startup “replay” these changes in the system to ultimately reproduce the system’s state before the crash. Event sourcing is essentially a refined version of this approach.
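
The log-and-replay idea can be sketched roughly as follows. The modules `EventLog` and `Accounts`, the event tuples, and the one-Base64-encoded-term-per-line file format are all hypothetical choices for this example; a real event-sourced system would also need to handle things like partial writes and log compaction.

```elixir
defmodule EventLog do
  # Hypothetical append-only log: each event is an Erlang term,
  # Base64-encoded so that it fits on a single line of the file.

  # Appends one event to the end of the log file.
  def append(path, event) do
    line = event |> :erlang.term_to_binary() |> Base.encode64()
    File.write!(path, line <> "\n", [:append])
  end

  # Rebuilds state by folding every logged event, in order, over `initial`.
  # `apply_fun` receives (event, state) and returns the new state.
  def replay(path, initial, apply_fun) do
    if File.exists?(path) do
      path
      |> File.stream!()
      |> Stream.map(&String.trim/1)
      |> Stream.reject(&(&1 == ""))
      |> Stream.map(fn line -> line |> Base.decode64!() |> :erlang.binary_to_term() end)
      |> Enum.reduce(initial, apply_fun)
    else
      initial
    end
  end
end

defmodule Accounts do
  # Hypothetical domain: a map of user => balance, changed only by events.
  def apply_event({:deposited, user, amount}, balances),
    do: Map.update(balances, user, amount, &(&1 + amount))

  def apply_event({:withdrew, user, amount}, balances),
    do: Map.update(balances, user, -amount, &(&1 - amount))
end
```

On startup the application would call something like `EventLog.replay(path, %{}, &Accounts.apply_event/2)` and hand the result to whichever process owns the state; while running, every change is appended to the log before (or as) it is applied in memory.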

The issue is that a database is always assumed to be part of most applications, when in fact it needs to justify its existence just like any other part of an application’s architecture. All too often the database is used because it is already there or because it simply makes life easier for the developers (who are already familiar with it) - without considering the full potential cost and limitations that the database may burden the overall application design with. Even if you do have a valid reason for employing a database, not all data you are dealing with has to go through that database; while data storage may be cheap, managing some non-essential data could impose an additional complexity cost (like having to archive it in order to keep the database at peak operational efficiency).

That is not my understanding of “stateful”. From RESTful Web Services (p. 86):

Statelessness means that every HTTP request happens in complete isolation. When the client makes an HTTP request, it includes all information necessary for the server to fulfill that request. The server never relies on information from previous requests. If that information was important, the client would have sent it again in this request.

So in a stateless implementation of the Islands game from Functional Web Development with Elixir, OTP, and Phoenix, the client would send the entire game board and the current move to the server, and the server:

would process the move to generate the updated game board
pass the new board to the opponent (via webSocket or polling)
respond to the last player with the updated game board
and then the server would “forget” about the game board entirely.

So “stateful” simply refers to the notion that client (and session) specific state is stored on the server - it does not refer to the means (process, database, flat file, etc.). In the case of Islands the state of the game board is the shared state between both players (clients). Using databases to store client state can quickly become an impediment to scalability when you are dealing with a large number of simultaneous clients.
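
To make the stateless/stateful contrast concrete, here is a small sketch. The board is reduced to a set of guessed coordinates, and the modules `Game`, `StatelessHandler`, and `StatefulGame` are hypothetical names invented for this example; none of this is the book’s actual Islands code.

```elixir
defmodule Game do
  # Hypothetical, heavily simplified board: just the set of guessed cells.
  def new_board, do: MapSet.new()

  # Pure function: everything needed arrives in the arguments.
  def apply_move(board, {row, col}), do: MapSet.put(board, {row, col})
end

defmodule StatelessHandler do
  # Stateless: the client sends the whole board with every request,
  # and the server keeps nothing once it has replied.
  def handle(%{board: board, move: move}) do
    %{board: Game.apply_move(board, move)}
  end
end

defmodule StatefulGame do
  # Stateful: a process holds the board between requests,
  # so the client only has to send the move.
  use Agent

  def start_link(_opts \\ []), do: Agent.start_link(&Game.new_board/0)

  # Applies the move and returns the updated board.
  def move(pid, move) do
    Agent.get_and_update(pid, fn board ->
      new_board = Game.apply_move(board, move)
      {new_board, new_board}
    end)
  end
end
```

Both versions share the same `Game.apply_move/2`; the only difference is who holds the board between requests. The stateful version could just as well keep the board in a database or a flat file instead of a process.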

mkunikow · May 22, 2017, 10:11pm

Maybe something helpful
https://www.confluent.io/blog/making-sense-of-stream-processing/
https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/
https://softwareengineeringdaily.com/2017/05/02/data-intensive-applications-with-martin-kleppmann/