Re-writing the backend of an API with Phoenix, converting old projects to Elixir / Erlang
I’m currently working on a payment gateway for Elixir. So far I have been working on a PayPal integration and have the basics up and running.
- I’m learning Elixir by doing work for small customers that need a quick fix for something.
- Having successfully deployed 6 small CRUD apps, I am now taking one of them, a logistics application, and improving it into a fully fleshed-out system.
- And I am making a food/health application in my spare time to track allergens and other abnormalities that could have an impact on your health (mainly made for lupus patients, but it can be used by many others as well).
I’m using the Absinthe library to learn GraphQL. It was going well, but got interrupted by a holiday weekend.
Using Phoenix to build a game server. It is my first job.
A library for interacting with the Lob API (sending mail, checks, postcards, etc.)
I’ve been building various web crawlers we use for my day job in Elixir over the past 8-12 months - but nothing real big until recently. Last week I launched my first fully distributed production system (1 ‘coordinator’ node and 4 worker nodes) and it was quite a positive experience.
Took me a little over two months from scratch and a little under 6000 lines of Elixir (excluding tests, comments, etc) with just me working on it. I know it is not a real big codebase, but I’m proud of the fact I was able to get it out the door by myself and in hindsight it would have been extremely difficult to accomplish what I have in another language. I’ve had it running in some form since about a week into the project, so I was able to stand up some MVP tests early on. The plan to go distributed came at the end, so it was a ‘trial by fire’ situation: turning a lot of reading and small tests of distributed Elixir/Erlang into real life.
The performance since launch has been awesome as well - by far our best performing system and most stable, even in the short amount of time it has been running for clients. My current capacity on four very modest worker nodes will allow me to retire around 8-10 legacy servers from these tasks.
I’ve used Phoenix before, but this project is just a traditional OTP app. There is a small web UI that shows node status and work being done in realtime - but I just used Plug for handling routing and cowboy for a simple websocket setup. If I have to do a deeper admin interface down the road - I’ll just add a Phoenix project into the umbrella.
Releases and deployment are done via Distillery and edeliver.
If anyone is in this space (crawling/scraping) and has questions - feel free to DM me and I’ll do my best to answer them.
@Ljzn is the game server source private? I am planning on building a small game when I get the chance and would be interested in seeing someone else’s implementation for reference.
I think elixir battleship is a good open source project for learning to build games.
Cool, I will take a look, thanks!
Thanks for sharing your experience, @adammokan. I’m wondering what were your biggest challenges in building your system, and what background processing queue are you using? For example, I’m working on a small project that involves scraping, and so far I’m using exq + Floki.
I’ll explain the general processing workflow I have in place. I apologize if this is too long/verbose. My goal with this project wasn’t so much a cost-cutting move or to increase speed - but more for predictable performance and stability. So keep that in mind while reading. Maybe it will help someone think through their problem in a different way - but not suggesting my approach is perfect at all.
TLDR - try to break your work down into processes and leverage what the BEAM excels at.
Since this was my second big project, I’d say I still struggled with using processes effectively early on and fully embracing the benefits of “let it crash”.
The source of my processing jobs is a Redis list with hundreds of thousands of ‘jobs’ to pop off throughout the day. For this particular case, that Redis queue runs about 150k in a 24 hour period. So, just think of everything I mention below as a job being generated from the LPOP against that list in Redis. (The Redis portion has been in place for years - it’s not something introduced with the Elixir effort.)
I started out pulling jobs into the system and generating structs/maps that I would pass around through various steps or stages (crawling, parsing, validation, post-processing, etc). My struct had a state/status attribute that was an atom like :crawling or :parsing, which I matched on, and then I would change that value as the job progressed through the system. I used a lot of GenStage for portions of this. For other things, where I had to restrict concurrency, I would use Conqueuer - which is just a worker abstraction on top of Poolboy. So, let’s say I want to have 50 things in the pipeline, I may not want all 50 to be doing some post-processing step that involved uploading data to S3 or something. I’d tighten that step up with a limited number of workers and let it queue right there.
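To make the "status atom on a struct" approach concrete, here is a minimal sketch. The Job module, its fields, and the fake work done in each step are all hypothetical and only illustrate matching on the status and moving the job to the next stage:

```elixir
defmodule Job do
  # Hypothetical job struct; the field names are illustrative only.
  defstruct [:url, :body, :result, status: :crawling]

  def new(url), do: %__MODULE__{url: url}

  # Each clause matches on the current status and returns the job
  # moved on to the next status.
  def step(%__MODULE__{status: :crawling} = job) do
    # Placeholder for the real HTTP fetch.
    %{job | body: "<html></html>", status: :parsing}
  end

  def step(%__MODULE__{status: :parsing} = job) do
    # Placeholder for the real parsing work.
    %{job | result: String.length(job.body), status: :done}
  end

  # A finished job is returned unchanged.
  def step(%__MODULE__{status: :done} = job), do: job
end
```

Something then has to hold this struct between steps and hand it to the next stage, which is the part that tends to get complicated once failures enter the picture.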
The issue came down to how I was passing the job data around in the form of a map/struct - my code got really rough around how to handle scenarios where things did not go as planned. I had to be real careful to not bring down certain internal processes that would then dump job data in progress. I had supervision trees in place and so on, but I found myself managing internal queues all over the place for this struct data. Things just became overly complex after the project grew.
So about a month ago I stepped back and realized that creating a GenServer process for each job (under supervision) would make things much simpler. No more passing data structures around to different processes and all of that.
I now use the DynamicSupervisor in GenStage for handling the supervision of these processes and also to limit the number of job processes running concurrently. I feed that DynamicSupervisor with a GenStage producer stage that pulls data from Redis as needed, using the demand settings to get the back-pressure mechanism GenStage provides. GenStage keeps me from pulling more data from Redis than I can handle. Rather than passing around a struct for the job data attributes - that data is now just the internal process state. I just handle updates and progression through standard GenServer calls.
Also note that I spawn the child job processes off in the DynamicSupervisor using a restart strategy of :transient. This will restart a job process automatically if it exits/crashes with something other than a normal exit. By default the supervisor will just restart the job with the data it got originally from Redis, so any steps I have completed before crashing will not be persisted. I could have done that with ETS, but decided that I’d rather just start it over rather than risk restoring a bad state from ETS upon crashing. Note that I do have a simple global Agent process that keeps track of process restart counts, though. If something crashes more than three times - I log it and throw it away. I do not want an inherently bad job eating up a spot in my DynamicSupervisor if it will fail repeatedly.
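A rough sketch of the process-per-job idea, under some loud assumptions: it uses the DynamicSupervisor that ships with Elixir itself (1.6 and later) as a stand-in for the GenStage one described above (which, as far as I know, has since been renamed ConsumerSupervisor), it leaves out the GenStage producer and Redis entirely, and the module names, the job map with an :id key, and the exact threshold are hypothetical.

```elixir
defmodule RestartCounter do
  # Hypothetical global Agent: %{job_id => number of times the job was started}.
  use Agent

  def start_link(_opts \\ []), do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  # Records one more start for job_id and returns the updated count.
  def bump(job_id) do
    Agent.get_and_update(__MODULE__, fn counts ->
      n = Map.get(counts, job_id, 0) + 1
      {n, Map.put(counts, job_id, n)}
    end)
  end
end

defmodule JobWorker do
  # restart: :transient means the supervisor restarts this process only
  # after an abnormal exit, never after a normal one.
  use GenServer, restart: :transient

  # Assumed threshold: the first start plus three restarts.
  @max_starts 4

  def start_link(job), do: GenServer.start_link(__MODULE__, job)
  def status(pid), do: GenServer.call(pid, :status)
  def advance(pid, status), do: GenServer.call(pid, {:advance, status})

  @impl true
  def init(%{id: id} = job) do
    if RestartCounter.bump(id) > @max_starts do
      # Crashed too often: log it and give up instead of starting again.
      IO.puts("dropping job #{inspect(id)}")
      :ignore
    else
      # A restart begins again from the original job data.
      {:ok, %{job: job, status: :crawling}}
    end
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state.status, state}

  def handle_call({:advance, status}, _from, state),
    do: {:reply, :ok, %{state | status: status}}
end
```

With this in place, something like DynamicSupervisor.start_link(strategy: :one_for_one, max_children: 50) caps the number of concurrent jobs, and DynamicSupervisor.start_child(sup, {JobWorker, job}) returns {:error, :max_children} once the cap is reached. One thing to watch: a supervisor gives up after 3 restarts within 5 seconds by default, so with many jobs under one supervisor the :max_restarts option probably needs raising.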
To keep hundreds of these independent processes progressing ahead smoothly and to give me a way to globally control how fast I want states to progress forward - I implemented a real basic ‘clock’ process, similar to how some video games and simulation systems are designed. When each job process is spawned, it subscribes to the clock messages (which are sent as a tuple with the current timestamp/epoch like {:tick, 1480100323}) using a simple pubsub mechanism. So this allows everything in my job process to pattern match on a combination of the :tick and the current state it is in. I also did this because certain steps in my processing require the job to wait a few seconds - so this tick allows me to have a process chill on certain stages for like 5 seconds by scheduling it to wait until a certain timestamp threshold is met. It’s also kinda a handy approach because I can adjust the ‘clock rate’ at runtime and slow things down or even pause it to make debugging a live issue simpler. The thing to note on this clock idea is to be sure your job processes can process the tick messages at least as fast as you are sending them, so you do not fill message queues up. I’m pushing my ticks out at about 1.5 second intervals right now, for example.
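For anyone who wants to picture the clock idea, here is a minimal sketch. It is not the actual implementation: it assumes Elixir’s built-in Registry (with duplicate keys) as the pubsub mechanism, and the names Clock, TickJob and ClockRegistry, the statuses, and the 5 second hold are all made up for illustration.

```elixir
defmodule Clock do
  use GenServer

  # Hypothetical registry name; start it before any subscriber with
  # Registry.start_link(keys: :duplicate, name: ClockRegistry)
  @registry ClockRegistry

  # Called by a job process, from inside that process, to receive ticks.
  def subscribe, do: Registry.register(@registry, :tick, nil)

  # Sends {:tick, now} to every subscribed process.
  def broadcast(now) do
    Registry.dispatch(@registry, :tick, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {:tick, now})
    end)
  end

  def start_link(interval_ms),
    do: GenServer.start_link(__MODULE__, interval_ms, name: __MODULE__)

  # The clock rate can be changed while the system is running.
  def set_interval(interval_ms),
    do: GenServer.call(__MODULE__, {:set_interval, interval_ms})

  @impl true
  def init(interval_ms) do
    schedule(interval_ms)
    {:ok, interval_ms}
  end

  @impl true
  def handle_call({:set_interval, ms}, _from, _old), do: {:reply, :ok, ms}

  @impl true
  def handle_info(:beat, interval_ms) do
    broadcast(System.system_time(:second))
    schedule(interval_ms)
    {:noreply, interval_ms}
  end

  defp schedule(ms), do: Process.send_after(self(), :beat, ms)
end

defmodule TickJob do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)
  def status(pid), do: GenServer.call(pid, :status)

  @impl true
  def init(_opts) do
    {:ok, _owner} = Clock.subscribe()
    {:ok, %{status: :crawling, wait_until: nil}}
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state.status, state}

  # Each clause matches on the tick plus the current status.
  @impl true
  def handle_info({:tick, now}, %{status: :crawling} = state) do
    # Pretend the crawl is done, then hold in :waiting for 5 seconds.
    {:noreply, %{state | status: :waiting, wait_until: now + 5}}
  end

  def handle_info({:tick, now}, %{status: :waiting, wait_until: t} = state)
      when now >= t do
    {:noreply, %{state | status: :parsing}}
  end

  # Any other tick (still waiting, or nothing to do) is ignored.
  def handle_info({:tick, _now}, state), do: {:noreply, state}
end
```

Keeping broadcast/1 separate from the timer makes it possible to drive jobs by hand, for example Clock.broadcast(100) from iex while the clock itself is not running, which is one way to get the "pause and step through it" behaviour when debugging.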
Again, I’m not suggesting anyone get onboard with this clock idea - but it works for me.
As far as parsing goes, I’m using Floki like you are.
Coming from the front-end, I find wrapping my head around the issue of synchronization in distributed systems is probably the most challenging part of Elixir/Phoenix/OTP.
I’m still trying to grok Lamport timestamps and Vector Clocks, and your implementation (clock-based pubsub mechanism to manage the job processes?) seems like a nice variation within that spectrum.
@adammokan thanks for the lengthy and helpful reply! Web crawling/scraping usually is more complex than what one imagines in the first place. And the devil is in the details… I like your “clock” idea, yet, I “feel” there might be flaws in it that are not obvious.
I don’t disagree with that. I’m about a million crawls in without an error, so I’m hopeful - but we’ll see.
I’m currently (until we manage to push Elm in) doing JS dev on top of a [pretty horrific] C# CMS. To learn C# and keep actually interested at the same time (though Linq is really quite nice so far, quite impressed), I started a C#/Unity course. Then for each of the first few challenges, I’m also converting them to Elixir to cover things I’ve missed - first is a console application, so looking at escripts, second is a FSM, so going to use gen_fsm for one version then a decision tree using digraph.
Kinda stuck on the Exercism exercise about making a “zipper”. Can’t quite grok what it is or how to use it. Can anyone point to a good description, for those of us who didn’t encounter it in Data Structures class?
A zipper is a wrapper around another data structure that is able to walk that data structure back and forth one “cell” at a time, in constant time.
Usually they are implemented around three values:
- the value of the current cell
- a history, which contains all information necessary to reconstruct the previous structure on a step “backwards”
- the future, which holds more or less the remainder of the original data structure.
For a list, this could work as follows:
iex(1)> [1, 2, 3] |> to_zipper |> forward
%Zipper{value: 2, history: [1], future: [3]}
iex(2)> [1, 2, 3] |> to_zipper |> forward |> forward
%Zipper{value: 3, history: [2, 1], future: []}
iex(3)> (v(2) |> backward) == v(1)
true # at least it should ;) this session is virtual and never really happened
AFAIR Exercism asks for a zipper around a binary tree, so forward is replaced by left and right, and my backward is called up over there.
But I do hope that this explanation, as well as the contents of the hints folder, can help you make progress. If not, say a word and I will split the thread up.
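The three-value description above could be turned into a small list zipper roughly like this. It is only a sketch: the ListZipper module and its function names are made up here, it covers plain lists only (not the binary tree from the exercise), and stepping past either end simply raises a FunctionClauseError.

```elixir
defmodule ListZipper do
  # value: the current cell, history: cells already passed (nearest first),
  # future: cells still ahead.
  defstruct [:value, history: [], future: []]

  # Starts at the first element of a non-empty list.
  def from_list([first | rest]), do: %__MODULE__{value: first, future: rest}

  # One step forward: the current value is pushed onto the history.
  def forward(%__MODULE__{value: v, history: h, future: [next | rest]}),
    do: %__MODULE__{value: next, history: [v | h], future: rest}

  # One step backward: the current value is pushed back onto the future.
  def backward(%__MODULE__{value: v, history: [prev | h], future: f}),
    do: %__MODULE__{value: prev, history: h, future: [v | f]}

  # Rebuilds the original list from all three parts.
  def to_list(%__MODULE__{value: v, history: h, future: f}),
    do: Enum.reverse(h) ++ [v | f]
end
```

Both forward and backward only touch the head of a list, which is why each step is constant time.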
A simple way to see what a zipper is and how it works:
iex> zfrom_list = fn list -> {[], list} end
iex> zforward = fn {left, [cur|right]} -> {[cur|left], right} end
iex> zbackward = fn {[prev|left], right} -> {left, [prev|right]} end
iex> zget = fn {_left, [cur|_right]} -> cur end
iex> zupdate = fn {left, [_cur|right]}, cur -> {left, [cur|right]} end
iex> zinsert = fn {left, right}, cur -> {left, [cur|right]} end
iex> zto_list = fn {left, right} -> Enum.reverse(left) ++ right end
iex> zipper = 1..10 |> Enum.into([]) |> zfrom_list.()
{[], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
iex> zipper = zipper |> zforward.()
{[1], [2, 3, 4, 5, 6, 7, 8, 9, 10]}
iex> zipper = zipper |> zforward.() |> zforward.()
{[3, 2, 1], [4, 5, 6, 7, 8, 9, 10]}
iex> zipper |> zget.()
4
iex> zipper = zipper |> zupdate.(42)
{[3, 2, 1], [42, 5, 6, 7, 8, 9, 10]}
iex> zipper = zipper |> zbackward.()
{[2, 1], [3, 42, 5, 6, 7, 8, 9, 10]}
iex> zipper = zipper |> zinsert.(84)
{[2, 1], [84, 3, 42, 5, 6, 7, 8, 9, 10]}
iex> zipper |> zto_list.()
[1, 2, 84, 3, 42, 5, 6, 7, 8, 9, 10]
It is an efficient way to walk a structure (a list in the example above, but it can be a map, an array, a tree, etc.), update elements, and so on, all in a purely functional manner.
I did not put it as a tree because that would just give you the answer, but to implement it you ‘invert’ the tree as it is pushed on to the left. ^.^