How to choose correct database system?

hariharasudhan94 · May 16, 2017, 6:45am

I am going to start new Phoenix project, but i am not clear about choosing proper database for my application. For example, My app fully based on relational database but in Relational Database we have mysql, postgresql , mariadb ,etc, In Phoenix they use PostgreSQL as default, So which is the best database ?

cjk · May 16, 2017, 7:03am

This highly depends on your workload. Most web applications are suited
very well with using PostgreSQL, but there are cases where it might make
more sense to use something different. So as a rule of thumb: use
PostgreSQL if you are planning to write a basic CRUD application.

IMHO, of course.

cmkarlsson · May 16, 2017, 7:07am

From my point of view Postgres is the better choice. It behaves more like a database should and is built on a sound foundation.

MySQL might have more tooling, better support in the cloud and there are likely more people with MySQL knowledge out there.

My advice is if you don’t have a strong preference for a particular database: choose postgres.

Hoegbo · May 16, 2017, 7:12am

If you need a relational database and have to have deeply nested datastructures, Postgres will get the job done. As for best that would vary from project to project. also because you kan turn postgres to a KV store.

If you should find you didn`t need relational data. or that at one point you need flat filesystem, or a in memory system for perfomance. Maybe you even want to combine disk based and memory based storage for optimized read speeds.

Starting with the database decision at the beginning might lock you into that decision later ,especially if other people are involved.
Use ecto or make a layer that wraps data to any data source and decide what to use sometime before deployment.

Especially with larger projects the designed software is usually never the actually built software.

hariharasudhan94 · May 16, 2017, 7:36am

thanks a lot, I started using mongodb as database, later i find lack of transactions in mongo, so i plan to shift relational database, my app involves lot of read operations, as per your suggestion i will try postgres

marcuslankenau · May 16, 2017, 8:08am

I have seen big applications work fine with both postgresql and mysql. And I have seen a lot of failure with MongoDB. Using anything besides the big sql dbms is an experiment that can work fine or fail. Just my $0.02 to that topic
Of course there are challenges with postgresql/mysql in production, like migrations of big tables but there are solutions as well. I personally use postgresql with some jsonb to store structured data but beeing preprod I cannot say how well that will work.

svsdehh · May 16, 2017, 8:41am

we are using mongodb successfully. and there are probably many failures with postgres and mysql, too.

but if any one want to use elixir and can choose the database one should not choose mongoDB in the moment, as the driver support is subpar.

marcuslankenau · May 16, 2017, 10:02am

My intention was not to say that MongoDB is not a proper database and you cant fail while using mysql/postgresql. I think it all depends on what you are about to to do.

hipertrix · December 7, 2017, 10:58am

Hi, @svsdehh
I’m starting a new project using mongoDB and Phoenix.
Do you have any experience or path you can share?

brightball · January 18, 2018, 4:56pm

It definitely depends on your needs but Postgres is the best starting point for most web applications. It is flexible enough to handle the needs of just about any style of web application without needing to reach for outside tooling, can be tuned endlessly, has custom data types for just about everything and as you grow there are EXCELLENT scaling options like Citus (for multitenant) and Aurora (for most everything else).

If you have an application that would heavily depend on one specific piece of functionality, it will make sense to pull in a 3rd party tool but one of the other great things about PG is that is makes it extremely easy to hook to outside systems and push data to them directly from the database.

It’s harder and harder to justify anything else as a starting point for a web project.

http://www.brightball.com/articles/why-should-you-learn-postgresql

psabhay · January 18, 2018, 11:25pm

Thanks for the important information you just shared. After reading all the comments I can clearly see that the Elixir community is heavily inclined towards Postgresql. I, being new to Elixir and Postgresql, have no idea why. But I trust your advice and have started trying out Postgresql with Phoenix. And here the first problem appears. I want to use full text search. I went through your blog post but didn’t succeed to make it work. The schema complains invalid or unknown type :tsvector for field. I have done the indexing for full text search following the official postgresql documentation page. It would be great if you can help me out with some sample code or point me to some other place on internet where I can get some more details about how to do full text search in postgresql + Phoenix?

brightball · January 19, 2018, 3:27am

The actual Postgres documentation is the ideal starting point. If I remember correctly it’s about 4 pages that walks through setting up full text search.

You should definitely go through it with straight SQL the first time you do it just to see how all of the various parts work. I’ll see if I can find a Elixir focused write up.

My blog post isn’t really a walk through, so definitely don’t use that as a working basis.

In terms of the community, PG is the default for Phoenix but your database is all about your project needs. PG just happens to cover so many of those cases that it’s an ideal starting point MOST of the time.

Learning a new database and language at the same time can be asking a lot though. I’d opt for one at a time, personally.

kjwvanijk · January 19, 2018, 3:46pm

I think the first step is trying to figure out if you need a database to begin with. then when the answer is yes you need to figure out what storage solution best fits your problem domain (Document, key-value, relational, etc.) only after you answered these questions can you make an informed choice about what DB storage solution to choose. And I think you can only realistically know this after you have at least implemented some of your business logic.

Some nice insight from someone working a lot longer than me in this inustry

Uncle Bob -No DB

leifericf · July 30, 2018, 12:59pm

Disclaimer: I’m by no means and expert in this area, but I would like to share my thoughts to contribute to the discussion and possibly get some input on how I’m mistaken in my reasoning. Please continue reading with a skeptical attitude; any corrections will be greatly appreciated.

My current understanding is that the particular database technology one ought to choose depends largely on on two primary factors:

1) The essential structure which is expressed naturally by the data. Example questions: Is the data you want to persist deeply nested or relatively flat; relational or non-relational? Textual or binary?

Asking and answering such questions about the size, shape and interconnectedness of data entities, might help in choosing between the broadly categorized SQL and NoSQL databases.

2) The context in which your data will be mutated and consumed. Example questions: Is the data ephemeral? Will consuming applications only do reading, or also writing? How frequently will data entities need to be read and written? Will the database need to be cached and/or replicated?

Asking such questions about how the consumers would interact with your data, might help in choosing the best possible way to persist and expose your data. If your data is ephemeral, for example, you might not need a database at all; you could perhaps get away with keeping the data only in memory, and rebuilding it when/as necessary (e.g. directly from disk).

If your Phoenix application is a typical “CRUD application,” then I suppose the best bet would be to stick with PostgreSQL, which is the default for a reason. Depending on the particulars of your Phoenix project, you might want to opt for a different kind of database for the whole application or maybe just for certain parts of the application.

Also, based on my limited knowledge (I’m still in learning the ropes), you might be able to use Ecto as a database abstraction layer, such that you could relatively easily swap out the underlying database with something else in the future, if need be. If so, you could defer the decision of which database to use until you have gotten some real-world experience and knowledge (measurements) from your application running in a production setting.

I don’t know if this makes sense, but I hope it might contribute to the discussion.

andre1sk · July 30, 2018, 1:31pm

I think given the current state of things I am hard pressed to come up with a use case outside of caching that would specifically call for NoSQL solution.

leifericf · July 30, 2018, 2:00pm

I can think of some potential use cases:

In the games industry, I have seen NoSQL databases used with great success for large-volume semantic logging, using REST over HTTP. We would “fire-and-forget” log messages from millions of game clients and servers into a NoSQL database for later analysis. We implemented it using NHibernate, so that we could swap out the underlying database to compare Oracle to CouchDB and MongoDB. This was many years ago, though. It might not make sense today.

Another use case, which I’m currently considering, is to leverage CouchDB’s unique replication capabilities to replicate certain kinds of data between our ~400 retail stores in the Nordics, whilst supporting off-line editing and two-way synchronization between nodes. In combination with “CouchApps,” this would allow applications within each store to continue satisfactory operations without a stable internet connection. We would also get document versioning for free out-of-the-box.

A third example (as you mentioned) would be using a Redis NoSQL database as a caching mechanism of sorts. This seems to be fairly common these days, although I haven’t tried that myself (yet).

Of course, such challenges might be possible to overcome with SQL databases as well, although one could argue that NoSQL databases could be better suited for certain kinds of problems.

andre1sk · July 30, 2018, 5:08pm

Yes logging is often another use case where it can make sense.

Hard to comment not that familiar with CouchDB replication capabilities. This is not a very generic use case though.

brightball · July 30, 2018, 6:00pm

The primary use case for CouchDB is “database over REST” and data syncing over remote nodes that can become disconnected. CouchDB and PouchDB pair up well for that use case. It’s a popular solution for fully-in-browser applications.

But you’re right, other than very specific use cases it’s hard to justify starting with anything other than Postgres because it fits “most” of them. Postgres can be interesting for logging too, depending on how long you want to keep the logs. One of the features people utilize Mongo for for logging is capped-collections that don’t grow beyond a certain size. You can accomplish the same thing with Postgres by using a sequence on your primary key with a mod value so it keeps rotating between the same set of keys and just overwrites 1 when it gets back around to it.