Mental bridge from Ruby to Elixir?

josefrichter · September 18, 2017, 1:35pm

Hi guys,

I came across this table from Saša Jurić’s book summarizing which common components of a (Ruby) web app can be replaced with Erlang.

Chris McCord mentioned in his ElixirConf 2017 Closing Keynote that it’s rather difficult for experienced Elixir devs to see the Elixir world through the eyes of newcomers again, so here we go

For a newcomer (from the Ruby world) like me, it would be quite helpful to get a bit more detail about which specific parts of the Elixir ecosystem replace those components. There are a lot of new terms like GenServer, Supervisors, ETS, Mnesia, etc. that don’t ring a bell for a newcomer, so such a mental map could fix that.

To be more specific, these are some of the questions I’m trying to find answers for:
– what do I use instead of Redis and why?
– what do I use instead of Sidekiq and why?
– how does Erlang ecosystem render some components, that are common in Ruby world, unneeded?
– how does the whole ‘concurrency’ promise help me deal with the fact that at some point all the concurrent connections might need to write into a database at once?
– what are some BAD use cases for Erlang/Elixir, where I’m better off sticking with Ruby?
– etc.

Thank you very much!

wmnnd · September 18, 2017, 1:52pm

Hey there and welcome!

Saša’s list is, of course, a little tongue-in-cheek but ultimately true.

Let’s look at some of the points from the list:

HTTP Server: You don’t need to use a third-party public-facing HTTP server because the solutions written in Erlang (like Cowboy) are ready to handle this already.
Redis/Sidekiq: You don’t need to use third-party software for handling background data processing or in-memory caching since this can all easily be achieved with Elixir/Erlang processes.
It’s kind of a running joke in the community to ask whether you even need an external database. Theoretically, you could also use Erlang’s in-memory database ETS and occasionally save it to your hard disk. But many people like to use SQL or NoSQL databases and they work just fine with Elixir/Erlang.
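
To make the in-memory caching point a bit more concrete, here is a minimal sketch of an ETS-backed cache. The module name `CacheSketch` and the table name are made up for illustration, there is no expiry or eviction, and the table disappears when its owning process exits, so treat it as a starting point rather than a Redis replacement:

```elixir
defmodule CacheSketch do
  # Hypothetical module and table names, not from any library.
  @table :cache_sketch

  # Creates a named, public ETS table owned by the calling process.
  # Does nothing if the table already exists.
  def start do
    if :ets.info(@table) == :undefined do
      :ets.new(@table, [:set, :public, :named_table, read_concurrency: true])
    end

    :ok
  end

  # Stores (or overwrites) a value under the given key.
  def put(key, value) do
    true = :ets.insert(@table, {key, value})
    :ok
  end

  # Returns {:ok, value} on a hit and :miss otherwise.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :miss
    end
  end
end

CacheSketch.start()
CacheSketch.put(:user_42, %{name: "Jane"})
IO.inspect(CacheSketch.get(:user_42))
IO.inspect(CacheSketch.get(:unknown))
```

Because the table is public and named, any process in the VM can read from it directly without going through a single server process.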

Regarding the advantages of concurrency when you eventually still end up writing to a database: Well, not every HTTP request needs to write something to a database and even then, you profit from increased stability, responsiveness and availability of your server if it is able to handle concurrency better.

I can’t really think of use-cases in which you’d want to go with Ruby over Elixir/Erlang. There are, however, cases in which you might want to go with a language that compiles to native code (like C/C++/Rust) instead in order to get some performance benefits.

There are some nice introduction books and courses for Elixir out there. I have personally read »Programming Elixir« and it covers many important aspects of both the language and Erlang/OTP.

I can also recommend this little video series about GenServer and Supervisors on YouTube if you want to get a quick fix:

kokolegorille · September 18, 2017, 2:07pm

ETS (Erlang Term Storage)

– what do I use instead of Sidekiq and why?

Running background work is as easy as spawning a process, usually a GenServer.
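
As a rough illustration of that idea, here is a small GenServer that accepts work via `cast` so the caller does not wait. `MailerSketch` and its functions are hypothetical names, and the actual work is only simulated by remembering the email address. Note that, unlike Sidekiq, nothing here is persisted: queued messages are lost if the process or the VM goes down.

```elixir
defmodule MailerSketch do
  use GenServer

  # Client API (hypothetical names, for illustration only)

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Fire-and-forget: returns immediately, the work happens in the server process.
  def deliver_later(server, email), do: GenServer.cast(server, {:deliver, email})

  # Synchronous call, used here only to inspect what has been processed so far.
  def delivered(server), do: GenServer.call(server, :delivered)

  # Server callbacks

  @impl true
  def init(:ok), do: {:ok, []}

  @impl true
  def handle_cast({:deliver, email}, sent) do
    # The real work (sending the email) would go here.
    {:noreply, [email | sent]}
  end

  @impl true
  def handle_call(:delivered, _from, sent) do
    {:reply, Enum.reverse(sent), sent}
  end
end

{:ok, pid} = MailerSketch.start_link()
MailerSketch.deliver_later(pid, "a@example.com")
MailerSketch.deliver_later(pid, "b@example.com")
IO.inspect(MailerSketch.delivered(pid))
```

Messages from one sender are handled in the order they were sent, so the final `call` only returns after both casts have been processed.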

– how does Erlang ecosystem render some components, that are common in Ruby world, unneeded?

Which components?

– how does the whole ‘concurrency’ promise help me deal with the fact that at some point all the concurrent connections might need to write into a database at once?

When you have a limited resource and lots of requests, you can use a pool such as poolboy, like Ecto does for DB access.
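
poolboy is a third-party library, so the sketch below does not use it. Instead it shows the same underlying idea (capping how many processes touch a limited resource at once) with the built-in `Task.async_stream/3` and its `max_concurrency` option. The `write` function is a made-up stand-in for a database write:

```elixir
# Hypothetical stand-in for a database write that takes a little while.
write = fn id ->
  Process.sleep(10)
  {:written, id}
end

# 20 pieces of work, but never more than 5 running at the same time.
results =
  1..20
  |> Task.async_stream(write, max_concurrency: 5, timeout: 5_000)
  |> Enum.map(fn {:ok, result} -> result end)

IO.inspect(length(results))
```

A real connection pool additionally keeps the expensive resource (the connection) alive between uses, which this sketch does not attempt to model.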

– what are some BAD use cases for Erlang/Elixir, where I’m better off sticking with Ruby?

Often, people coming from Rails complain about not having Devise out of the box. I am also coming from Rails, and what I miss the most is a plugin called awesome_nested_set for managing trees in the DB.

Maybe the gem world is still bigger than the Hex world. But Elixir gains so much from Erlang/OTP that I would not consider going back to Rails from Phoenix.

kokolegorille · September 18, 2017, 2:21pm

I forgot to mention

service crash recovery

But that is so obviously a strong point of Elixir/Erlang, where you can recover any process with the help of a supervision tree.
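
A minimal sketch of what that recovery looks like, assuming Elixir 1.5 or later (for the child spec that `use GenServer` generates). `FragileSketch` is a hypothetical worker that does nothing; the script kills it and checks that the supervisor has started a fresh one under the same name:

```elixir
defmodule FragileSketch do
  use GenServer

  # Registered under its module name so it can be found again after a restart.
  def start_link(_arg), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok), do: {:ok, :ok}
end

# A supervision tree with a single child.
{:ok, _sup} = Supervisor.start_link([FragileSketch], strategy: :one_for_one)

old_pid = Process.whereis(FragileSketch)
ref = Process.monitor(old_pid)

# Simulate a crash.
Process.exit(old_pid, :kill)

# Wait until the old process is confirmed dead.
receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
end

# Give the supervisor a moment to start the replacement.
Process.sleep(100)

new_pid = Process.whereis(FragileSketch)
IO.inspect(is_pid(new_pid) and new_pid != old_pid)
```

The restarted process begins with fresh state from `init/1`, so anything that must survive a crash has to live somewhere else.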

orestis · September 18, 2017, 3:00pm

I’ve never used Ruby, but at least from a Python perspective, the main difference between Elixir/BEAM and those languages is that the BEAM VM is designed to be effectively started once and never restarted, and it can use all the cores of a machine without needing to spawn new OS-level processes to maintain responsiveness.

So, the main mental leap you have to do is:

Trust the VM: It will not crash, it will not leak memory, it will not block.

hubertlepicki · September 18, 2017, 3:08pm

ETS, DETS, Mnesia. Or Redis. Nothing stops you from using Redis.

Maybe nothing. Maybe you just spawn a process/task and it does some job. Or maybe you use one of the libraries for background jobs.

I do not think it does. It makes building certain things on your own easier, however. Think about a background job queue that you might not need to build or use because you’re good with async tasks that you can just create ad hoc.

In my experience (>10 years writing Ruby code), this problem occurs when you have many connections open to the database while most of them are doing nothing. Ruby uses database connections very inefficiently. Starting a request opens a connection, in which you might open a transaction, then do some Ruby computations, then write something, and only at the end is the connection closed. Ecto, the default DB library for many in Elixir, uses a connection only when it needs to read or write something, and then immediately checks it back into the pool. The second thing is that the explicit need to preload stuff when you make a query makes it easier to reduce the N+1 queries your app does. So it’s using the DB more efficiently.

If you really have multiple writes from multiple threads/workers then you’re toast either way

When you have a limited budget and there are plenty of existing components out there that you can glue a Ruby app together from. This is especially true if you are just starting out with Elixir. In general, an Elixir app will take slightly more effort and slightly more code than a similar Ruby/Rails app.

AstonJ · September 18, 2017, 3:27pm

I would probably use Elixir’s Task: it’s built in and lets you start a process in the background extremely easily.

Tasks are processes meant to execute one particular action throughout their lifetime, often with little or no communication with other processes. The most common use case for tasks is to convert sequential code into concurrent code by computing a value asynchronously

https://hexdocs.pm/elixir/Task.html
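
A small sketch of the “sequential to concurrent” use case from the docs quote above. `slow_square` is a made-up function that stands in for any slow computation or external call:

```elixir
# Hypothetical slow computation.
slow_square = fn n ->
  Process.sleep(50)
  n * n
end

# Both tasks start right away and run concurrently in their own processes.
task_a = Task.async(fn -> slow_square.(3) end)
task_b = Task.async(fn -> slow_square.(4) end)

# Task.await/1 blocks until the result arrives (default timeout: 5 seconds).
sum = Task.await(task_a) + Task.await(task_b)
IO.inspect(sum)
```

For pure fire-and-forget work where no result is needed, `Task.start/1` is the simpler choice, though such tasks are not supervised.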

NobbZ · September 18, 2017, 3:56pm

In general I’d sign that, but you can still produce huge space leaks easily.

When you read a full GiB file into memory in raw binary mode, it will be allocated in the bin_heap. When you now simply do the following:

def foo(<<c :: binary-size(1), _ :: binary>>), do: foo(c)
def foo(<<c :: binary-size(1)>>), do: foo(c)

This will not only loop forever, but keep a reference to the original binary, therefore it can’t get garbage collected ever.

This leak can be avoided by doing as follows:

def foo(<<c :: binary-size(1), _ :: binary>>), do: c |> :binary.copy() |> foo()
def foo(<<c :: binary-size(1)>>), do: foo(c)

This will enforce copying the subbinary and therefore not keep a reference to the original binary, therefore it can be garbage collected.

In Erlang this has shot many holes in my feet.

orestis · September 18, 2017, 4:16pm

Fair enough. My point though was more about memory leaks you cannot reasonably fix yourself, rather than memory leaks that happen in code you directly control.

OvermindDL1 · September 18, 2017, 4:17pm

@Nobbz that is not a ‘leak’ though, it is still pointed to and referenced. A leak is something that is dereferenced but never released, meaning that it can never ever again be reclaimed, that is definitely not your example. ^.^

michalmuskala · September 18, 2017, 4:23pm

The described situation is no longer true in OTP 20. The GC will copy small fragments of big binaries (under 64 bytes) directly to the process heap, instead. This does not solve all the problems but does reduce the issue significantly.

NobbZ · September 18, 2017, 4:23pm

One has to distinguish between a memory leak (unreleasable and inaccessible memory) and a space leak (releasable, but much later than expected, if at all).

Wikipedia (as of 2017-09-18 18:20 UTC+2) describes the difference as this:

A space leak occurs when a computer program uses more memory than necessary. In contrast to memory leaks, where the leaked memory is never released, the memory consumed by a space leak is released, but later than expected. [3]
[3]: Leaking Space - ACM Queue

This is exactly the difference I’ve learned during study as well.

NobbZ · September 18, 2017, 4:26pm

So, it will still occur when I match on binary-size(65)? Then I’ll stick to :binary.copy/* for now when extracting subbinaries which potentially might be involved in a long-running process.

michalmuskala · September 18, 2017, 4:40pm

Yes, for larger binaries this can still happen, so the advice on using :binary.copy/1 on data that you plan to store for a long time still applies. But it’s harder to hit this issue than it was before.
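
One way to see the effect is `:binary.referenced_byte_size/1`, which reports how much binary data a given binary actually keeps alive. This is only a sketch: the sizes are arbitrary, chosen so that the extracted part is above the 64-byte threshold mentioned earlier.

```elixir
# A 1 MB reference-counted binary.
big = :binary.copy("x", 1_000_000)

# Matching out the first 100 bytes yields a sub-binary that still
# points into the original 1 MB binary.
<<part::binary-size(100), _rest::binary>> = big

# :binary.copy/1 creates an independent binary with the same contents.
copy = :binary.copy(part)

# How many bytes does each binary keep alive?
IO.inspect(:binary.referenced_byte_size(part))
IO.inspect(:binary.referenced_byte_size(copy))
```

If `part` were stored in long-lived process state, the whole 1 MB binary would stay in memory with it, whereas `copy` only holds on to its own 100 bytes.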

sasajuric · September 18, 2017, 5:07pm

I think others have answered your specific questions, but I’d just like to address a more general point, since this table tends to be misinterpreted in a couple of ways.

It’s worth mentioning that this table is a true story, not a contrived example. I was working on these things side-by-side, and Server B was implemented in plain Erlang, as a single project. Moreover, having been involved in server A from the very start, there was no doubt in my mind that we could have moved most (if not all) of it to Erlang. In fact, since we had a lot of problems in production with server A (unlike server B which had almost no problems), I proposed we move it gradually to Erlang, and argued that this move will solve many of our problems. Reflecting back, I still feel the same.

I however caution against a general conclusion that Erlang can always replace 3rd party products. In more complex cases, an external component will likely be a better choice than a built-in Erlang option. That said, Erlang tools (such as processes or ETS tables) can definitely handle many different scenarios, so my feeling is that compared to other languages (especially scripting languages with no proper concurrency story), an Erlang (or Elixir) based project will in general require less 3rd party products and external OS processes.

josefrichter · September 18, 2017, 5:22pm

Thanks everyone! This is really about “practical recipes”. I typically work with early-stage startups where for 90% of use cases you’re good to go with out-of-the-box Rails + Postgres, and sometimes maybe a bit of Sidekiq and a Redis cache, based on the “premature optimization is the root of all evil” principle. I’ve even been told once that Elixir is ‘premature optimization’.

So I’m really looking for best practices for common problems that keep repeating in the usual clones of instagram / snapchat / foursquare / tinder with thousands of users, not millions. It appears to me that Elixir ecosystem could actually let you focus on the core of your business, rather than waste your time on convoluted setups and caching strategies…

Thanks again, this is a great community!

josefrichter · September 18, 2017, 5:28pm

This is a crucial piece of info, thanks for that!

Fully understood! 80/20. I was interested in those 80% of use cases here

sasajuric · September 18, 2017, 6:29pm

Yes, my position is that for simpler projects Elixir will help simplify the tech stack. So while I see occasional complaints that Elixir is overkill for smaller projects, I personally think that with Elixir we’ll produce simpler solutions for simple problems, and at the same time we’re certain it can take us very far if things become more complex.

orestis · September 18, 2017, 9:09pm

This 1000 times. I’m working on very small scale projects that run on a single machine. Not having to install a ton of packages for common functionality is a huge boon. In fact I’m thinking of collecting all this philosophy into a talk for some next ElixirConf… my working title is “Scaling down with Elixir”.

sasajuric · September 18, 2017, 9:35pm

I think this is a wonderful idea!

Many of us try to promote Elixir by talking about more complex scenarios such as massive scalability, fault-tolerance, and high availability. I believe that this leaves many people with a feeling that Elixir is overkill for their smaller-scale problems.

Having a talk which showcases how in simple scenarios you can do a bunch of things with Elixir alone, without needing to reach for external products, might help people understand that Elixir is not only made for large-scale systems, but can also do wonders for smaller ones.

This is precisely why I feel that Elixir is a win-win technology. It leads to simple solutions for simple problems, yet at the same time it can take us far, so we don’t have to consider swapping it for something else down the line.

You should totally submit such a talk, and I hope I’ll be there to see it live.