24) ElixirConf 2017 - Elixir + Neo4j - Regina Imhoff

axelson · October 9, 2017, 8:03am

Posting a little bit late today but here you go:

ElixirConf 2017 - Elixir + Neo4j - Regina Imhoff

Neo4j is a non-relational graph database with its own query language, Cypher, which means it doesn’t work with Ecto. However, graph databases are great at modeling social networks. You will be learning how to combine Elixir Phoenix with Neo4j to make a clone of a popular social networking site with real time updates to the social graph.

Audience: Intermediate

All talks are available in the Elixir Conf 2017 Talks List or via the elixirconf2017 tag

josevalim · October 9, 2017, 9:13am

I have always kept an eye on graph databases and it is really exciting to see some work in using them directly from Elixir!

AstonJ · October 9, 2017, 12:35pm

I’m really interested in graph dbs too and I watched most of this talk when it was first uploaded. Great talk Regina, if you are reading this please join up as I might need to pick your brains one day

JEG2 · October 9, 2017, 1:06pm

I agree that graph databases are interesting and this was a solid introduction. I did wonder if skipping Ecto in this case could make interactions more natural.

AstonJ · October 9, 2017, 1:37pm

I love the idea of using separate DBs for different components of your app, so say Postgres for a registration system, Neo4j for a user relationships system, etc.

I’m hoping @PragDave will cover something like this in his next course as I’d love to see how his approach would tie something like this together.

WolfDan · October 9, 2017, 9:26pm

I tried to integrate separate databases for that, but at the end of the day you end up having mostly the same data in the two databases… So why not use a graph database for everything, IMO?

kylethebaker · October 9, 2017, 9:49pm

I really like the query syntax; it’s somewhat isomorphic to the graph relationships themselves, with the arrow and the ‘edges’ (-).

MATCH (p:User)-[:OWNS]->(y:yard)-[*]-(q:User)

The wildcard is particularly cool, how it cuts out all of the inner joins. Basically it’s saying “Use any path/relationship you can find to get from PointA to PointB”. I wonder if it’s possible to do a sort of “meta query” where you can have it show you all of the relationships it’s traversed, maybe something like MATCH (f:Foo)-[r:*]-(b:Bar) RETURN r?
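To make the “use any path you can find” idea concrete, here is a minimal in-memory sketch in Python of what a variable-length match conceptually does. It is not how Neo4j evaluates Cypher; the toy data, the `find_paths` name, and the `max_hops` cap are all assumptions made for illustration. Like the `-[*]-` pattern it ignores edge direction and never reuses a relationship within one path, and it returns the relationships traversed, which is the “meta query” result asked about above.

```python
from collections import namedtuple

# Hypothetical toy graph: each edge is (start, relationship type, end).
Edge = namedtuple("Edge", ["start", "rel", "end"])

EDGES = [
    Edge("alice", "OWNS", "yard1"),
    Edge("bob", "MOWS", "yard1"),
    Edge("carol", "RENTS", "yard1"),
]


def find_paths(edges, source, target, max_hops=3):
    """Return every path from source to target as a list of edges.

    Edges are followed in either direction, like an undirected
    -[*]- pattern, and no edge is used twice within one path.
    """
    paths = []

    def walk(node, used):
        # Record the path whenever we are standing on the target.
        if node == target and used:
            paths.append(list(used))
        if len(used) >= max_hops:
            return
        for edge in edges:
            if edge in used:
                continue
            if edge.start == node:
                walk(edge.end, used + [edge])
            elif edge.end == node:
                walk(edge.start, used + [edge])

    walk(source, [])
    return paths
```

Calling `find_paths(EDGES, "alice", "bob")` walks from alice through the yard to bob, and the relationship types along each returned path can be read off with `[e.rel for e in path]`.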

AstonJ · October 9, 2017, 10:11pm

Primarily because different types of DBs have different strengths and weaknesses. And since Elixir makes it much easier to build your app as a series of components, it, to me at least, makes sense to make full use of that; I really really like the idea of having independent systems that are built without having to make compromises that you might have to otherwise.

What kind of data was being duplicated in your app/s?

WolfDan · October 9, 2017, 10:40pm

Well, it’s an app that I’m working on myself, kind of a little “startup”, basically a social network (I’m a n00b, so all of this is based on my own opinion and experience; I have never worked on a real project before, so any opinion or suggestion is welcome!)

At first I decided to simply use Postgres for my whole project (pretty robust, amazing search features, etc.), but after seeing this talk I decided to use a graph database because it really fits what I’m doing and makes the work easier. So I decided to use Postgres and a graph database called Dgraph.

So I began to work on it, creating some data and making a sketch of what I wanted, but I ended up with these “problems”:

- I need a way to recognize the data that I need in the two databases, so I need the same identifiers (duplicate data).
- I’m working with Absinthe, so the data is fetched with GraphQL queries. Since getting relationships from GraphQL queries is pretty easy, I needed a way to query the main requested object and its relationships, so I saw two options:
  - Parse the GraphQL query and query the relationships from the graph database, then construct a SQL query to send to Postgres (here I think you need to join data etc., so what is the point of using a graph database?) and send back the data.
  - Save the same data directly in the graph database and query the relationships together with the requested data, and query only the basic data of the main object from Postgres (duplicate data).
  In my opinion the second option was easier.
- If I wanted to use the graph database for search, I would need to add all the data needed to produce the results I want (again, duplicate data).
- You can build cool things with graph databases, like recommendation systems and so on, but you need the same data in both databases for the feature to work correctly.

The problem with two databases and duplicate data is keeping the same state in both: what happens if one write fails and the other doesn’t when you update data, etc.?

At this point it looks reasonable to me to just use the graph database, but I haven’t made a final decision yet. As mentioned before, I’m a n00b, so possibly I’m wrong about some or all of what I said; I hope to be corrected if that is the case.

Feel free to correct any grammatical problems.

AstonJ · October 9, 2017, 11:14pm

If you want to build your app as a series of components, I highly recommend PragDave’s online course - it has given me a huge insight into this sort of architecture and if you’re interested in this too, I think you will love the course just as much as I did

WolfDan · October 9, 2017, 11:24pm

I have really wanted to get the course ever since it was released, but unfortunately I don’t have any way to buy it. I am still quite young and that amount of money is a lot in my country ^^’

axelson · October 10, 2017, 12:26am

@WolfDan could you provide more detail about what type of data you’re storing and how you’re choosing which database should get which data? Generally I don’t think you’d want to be duplicating the data into both types of database, but it’s hard to say for sure without understanding your use case better.

vic · October 10, 2017, 1:35am

A year ago I started working at a mobility startup (it pretty much works like an airline) that uses Elixir/OTP as its main backend technology. Yay!

Having worked on some reservation systems in the past (a legacy Java system we inherited about 5 years ago, which had a huge relational-table mess with lots of views to represent hotel room availability), this time we decided not to go with a relational model and to use a graph database instead. It turns out that modeling the whole availability thing (routes and geo-located stations, ETAs, trips, reservations, people and vehicles) using graphs was much easier for our minds than thinking in terms of lots of joins or views. Actually, we spent a few weeks just modeling graphs on a board before coding anything, trying to answer some of the questions we knew the API would have to answer. Of course, graph DBs excel at problems where you have lots of relationships, and I wouldn’t recommend them for storing all of your business data. For example, we still use Postgres for most of the business stuff like users, payments and reservations, but we keep nodes in the graph DB with the same UUIDs that Ecto generates, and we store only as much data on them as we need for solving the availability search.
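The shared-UUID arrangement described above can be sketched roughly as follows. This is a minimal illustration in Python with plain dictionaries standing in for both stores; it is not the poster’s code, and every name here (`create_user`, `users_boarding_at`, the `BOARDS_AT` relationship, the station values) is hypothetical. The point is only that the full record lives in one place, the graph node carries the same UUID plus whatever the search needs, and results are hydrated from the relational side.

```python
import uuid

# Stand-in for the relational store: full business records keyed by UUID.
relational_users = {}

# Stand-in for the graph store: nodes carry only the UUID plus the few
# fields the search needs; edges link node UUIDs to other nodes.
graph_nodes = {}
graph_edges = []  # (from_uuid, relationship, to_node)


def create_user(name, email, home_station):
    """Write the full record once, then a slim node with the same UUID."""
    user_id = str(uuid.uuid4())
    relational_users[user_id] = {"name": name, "email": email}
    graph_nodes[user_id] = {"label": "User"}
    graph_edges.append((user_id, "BOARDS_AT", home_station))
    return user_id


def users_boarding_at(station):
    """Answer the relationship question in the graph, then hydrate
    the matching UUIDs from the relational records."""
    ids = [src for (src, rel, dst) in graph_edges
           if rel == "BOARDS_AT" and dst == station]
    return [relational_users[i] for i in ids]
```

Because the graph side holds only identifiers and search fields, the duplicated surface between the two stores stays small, though keeping both writes in step is still the application’s job.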

One nice thing, IMHO, is that using Neo4j from Elixir moved me to contribute to both bolt_sips and the underlying boltex driver. Actually, many of the Elixir libs I share on my GitHub have come to life because we use them in some way at work (besides those that are just for fun). And that is nice, I guess: solving real problems causes more libraries to be born and existing ones to become more mature, and the Elixir community benefits from that.

So far using Neo4j has been a nice experience for us. OTP has also been a very nice choice: we basically have a process per vehicle, which drives the vehicle with live data (via websockets and GraphQL subscriptions) to the next station that has people getting on or off, and we skip the places where no one has a reservation. In Mexico City that can save a lot of time in traffic and make for a much better experience for our customers.
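The station-skipping rule can be sketched as a small pure function. This is only an illustration in Python under assumed data shapes (an ordered list of station names and a list of board/alight pairs), not the poster’s implementation, and `next_stop` is a hypothetical name. In the setup described above, each vehicle process would presumably run something like this against live reservation data.

```python
def next_stop(route, current_index, reservations):
    """Return the next station after current_index where someone
    boards or alights, or None if no remaining station has activity.

    route: ordered list of station names (hypothetical data shape).
    reservations: list of (board_station, alight_station) pairs.
    """
    # Every station that appears in any reservation is worth stopping at.
    active = {station for pair in reservations for station in pair}
    for station in route[current_index + 1:]:
        if station in active:
            return station
    return None
```

For a route `["A", "B", "C", "D"]` with a single reservation from A to C, a vehicle at A skips B entirely. The sketch does not check whether the alighting passenger is actually on board, which a real system would need to do.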

ltrls · October 10, 2017, 1:39am

I’ve been working on a Neo4j driver (not recently, tbh) that would implement the same kind of API as Apache TinkerPop (which is kind of “pipeline-y”) and generate Cypher queries. It still needs a lot of work and polishing before I’m comfortable making it public (and before it’s simply usable), so it’s in a private gitlab.com repo, but if someone would like to help, feel free to send me a message!

AstonJ · October 10, 2017, 2:46pm

How do you architect your app, Vic? Is it split into a series of components? Do you use Umbrellas or something else?

(I hate to keep going on about it, but if you haven’t seen it already, I think you might like PragDave’s approach in his online course).

WolfDan · October 10, 2017, 4:04pm

Hmm, I can give you an example with something already created that is pretty similar to what I’m doing:

So for example, if I take the second approach I mentioned, I’ll need to add the image URL along with the kind of relationship and the name of every character and voice actor.

But as mentioned before, it works like an average social network such as Facebook or Twitter: you can post to a feed, share things, etc.

WolfDan · October 10, 2017, 4:25pm

Thanks for sharing your experience! It really helps me a lot to know what path to take.

PS: I sent you a PM; I’m not sure if that works correctly in the forum.

swelham · October 10, 2017, 8:32pm

IIRC you can do exactly that. I just think you would need to drop the : in the relationship match: -[r*]-.

Florin · October 13, 2017, 4:02pm

I am hearing, though not directly, that Regina is working on an Ecto version for Neo4j. I contemplated the same at the beginning of my experience with Neo4j in Elixir, but soon realized I would just be limiting access to Neo4j, and hence went for giving users more transparent access to Neo4j’s own query language, Cypher. And I don’t regret that decision. In fairness, at that time the Model was still a thing in Ecto, and I didn’t like it very much. Now Ecto is probably much more suitable for trying again, but I’ll let Regina drive this journey and focus on the lower-level Neo4j bindings instead.

yatender-oktalk · November 12, 2018, 5:39pm

I have been using Neo4j since October 2016 and have found it quite good. I used neo4j_sips with raw query syntax and the performance is superb; I haven’t faced a single issue with Neo4j in the last 2 years. The catch with Neo4j is that you can’t run a cluster in the free version, only a single instance, and the licence fee is on the higher side, so you are limited to a single node with a maximum of 5 billion nodes. Dgraph is better than Neo4j, but its Elixir library is in alpha stage, so I wouldn’t recommend it if you are using Elixir (you can use Go with Dgraph and it’s awesome).