Matt
Structuring an OTP application
I have a question about the structure of an OTP application. Using a fictitious employee time tracking application as my example, I am trying to reason about which application layout would make the most sense with OTP.
Note: This is just a fictitious example application, nothing I am working on. It is only meant to help me better understand OTP orchestration.
The fictitious employee time tracking application will have employees, schedules, time tracking (clock in/clock out), vacation requests, and overtime submissions. Following OTP, each one should be a process, a gen_server for example. Which layout makes the most sense?
Option Number One
- Each employee is a gen_server process.
  The employee process holds the employee’s name, id number, etc.
- Each schedule is a gen_server process.
  A schedule holds an employee’s schedule, which days they work, etc. Is the schedule process linked to an employee, or is it a worker process under an employee gen_server?
- Each time tracking event is a gen_server process.
  This process holds a clock-in or clock-out event. It is also linked to an employee, or supervised by an employee gen_server; I am not sure which is better. It does seem like too many processes may accumulate over time, though.
- Each vacation request is a gen_server process.
  This process holds information regarding an employee’s vacation request, dates, and approval information. It is also linked to, or supervised by, an employee gen_server process.
- Each overtime submission is a gen_server process.
  This process holds overtime information and who may have approved it. It is also linked to an employee, or supervised by an employee gen_server process.
This setup seems like it might spawn too many gen_server processes, especially for the clock-in/clock-out events. So would the second option be more appropriate?
Option Number Two
- Each employee is a gen_server process.
  The employee process holds the employee’s name, id number, etc.
- Each schedule is a gen_server process.
  A schedule holds an employee’s schedule, which days they work, etc. Is the schedule process linked to an employee, or is it a worker process under an employee gen_server?
- Each employee’s time events live in one gen_server process.
  This process holds a list of clock-in or clock-out events, one process for each employee. So, instead of one process for each clock-in or clock-out event, there is just one process holding a list of clock-in and clock-out events.
- Each employee’s vacation requests live in one gen_server process.
  Same as the above: one process per employee, containing a list of vacation requests instead of each request being a process itself.
- Each employee’s overtime submissions live in one gen_server process.
  Same as the above: one process per employee, containing a list of overtime requests instead of each request being a process itself.
I’m very curious to hear your thoughts. Please remember this application is not real, not a web app question and not a persistence question. Just simply curious about people’s opinion on structuring processes.
Most Liked Responses
ericmj
It’s a common misconception that you should use processes to structure your application. Instead use modules to organize your application and use processes when you need concurrency or shared state between processes.
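To make the "modules, not processes" point concrete, here is a minimal sketch in Elixir of time tracking written as a plain module with no processes at all. The module name `TimeTracking`, its function names, and the shape of the data (a map of employee id to a list of `{type, datetime}` tuples) are all assumptions made up for illustration; nothing here comes from the original question.

```elixir
defmodule TimeTracking do
  # Hypothetical pure module: plain data in, plain data out, no processes.

  # Record a clock event in a map of employee_id => [{type, datetime}].
  # Events are prepended, so each list is stored newest first.
  def clock(events, employee_id, type, %DateTime{} = at) when type in [:in, :out] do
    Map.update(events, employee_id, [{type, at}], fn existing -> [{type, at} | existing] end)
  end

  # Total seconds worked, pairing each :in with the :out that follows it.
  # Unpaired or out-of-order events are ignored in this sketch.
  def seconds_worked(events, employee_id) do
    events
    |> Map.get(employee_id, [])
    |> Enum.reverse()
    |> Enum.chunk_every(2)
    |> Enum.reduce(0, fn
      [{:in, start}, {:out, stop}], acc -> acc + DateTime.diff(stop, start)
      _unpaired, acc -> acc
    end)
  end
end

# Usage: one employee clocks in at 09:00 and out at 17:00 (8 hours).
{:ok, t1, _} = DateTime.from_iso8601("2024-01-01T09:00:00Z")
{:ok, t2, _} = DateTime.from_iso8601("2024-01-01T17:00:00Z")

events =
  %{}
  |> TimeTracking.clock(1, :in, t1)
  |> TimeTracking.clock(1, :out, t2)

IO.inspect(TimeTracking.seconds_worked(events, 1))
```

Because the logic is just functions over data, it can later be called from one process, from many, or from none, without changing the module.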
I think you should start by considering the external interface of your application. Right now it’s a black box, so you can design it however you want. But if you add an HTTP interface you probably want to handle requests in parallel, so the web server will start a process per connection or request. Now you need shared state between these processes, which can be solved as you proposed, but your solutions are overly complex. Why a process per employee? Why not a single process for all employees, or a single process for your whole database?
You should also be careful about using processes to store information that is not transient. You don’t want to lose employee information because the employee process crashed.
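As a rough sketch of the "single process for all employees" alternative, the following uses Elixir's `GenServer` (the Elixir wrapper around gen_server). The module name `Tracker`, its API, and the state shape are hypothetical. State lives only in memory here, so the caveat above applies in full: if this process crashes, everything it holds is gone.

```elixir
defmodule Tracker do
  use GenServer

  # Client API (these functions run in the caller's process).
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)
  def add_employee(pid, id, name), do: GenServer.call(pid, {:add_employee, id, name})
  def clock(pid, id, type, at), do: GenServer.call(pid, {:clock, id, type, at})
  def get(pid, id), do: GenServer.call(pid, {:get, id})

  # Server callbacks (these run in the single Tracker process).
  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:add_employee, id, name}, _from, state) do
    # put_new keeps an existing employee untouched if the id is already taken.
    {:reply, :ok, Map.put_new(state, id, %{name: name, events: []})}
  end

  def handle_call({:clock, id, type, at}, _from, state) do
    case Map.fetch(state, id) do
      {:ok, employee} ->
        employee = %{employee | events: [{type, at} | employee.events]}
        {:reply, :ok, Map.put(state, id, employee)}

      :error ->
        {:reply, {:error, :unknown_employee}, state}
    end
  end

  def handle_call({:get, id}, _from, state), do: {:reply, Map.fetch(state, id), state}
end

# Usage: many caller processes could share this one pid.
{:ok, pid} = Tracker.start_link()
:ok = Tracker.add_employee(pid, 1, "Ada")
:ok = Tracker.clock(pid, 1, :in, ~U[2024-01-01 09:00:00Z])
IO.inspect(Tracker.get(pid, 1))
```

Since every call is handled one at a time by the same process, each update is applied atomically with respect to the others, with no extra synchronization.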
peerreynders
Employees, timecard events, schedules, vacations, and overtime may be important concepts for structuring data, and possibly even parts of application state, but they really don’t inform much in terms of application behaviour.
Alan Kay: “I’m sorry that I long ago coined the term ‘objects’ for this topic because it gets many people to focus on the lesser idea. The big idea is ‘messaging’ …”
This comment highlights how people like to focus on (static) “objects” because that is comparatively easy, when in fact the application value is derived primarily from the (dynamic) “collaborations” that implement application behaviour.
Similarly in a BEAM application the ideal process structure is influenced much more heavily by the behaviour the application is meant to exhibit rather than the structure of the data it is managing or transforming.
This Erlang developer puts a different spin on it:
Lambda Days 2015 - Torben Hoffmann - Thinking like an Erlanger
Processes as the building blocks for protocols - so the big idea is designing protocols realized through communicating processes to implement application behaviour.
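To illustrate what "a protocol realized through communicating processes" can look like, here is a small sketch in Elixir using raw `spawn_link`, `send`, and `receive`. The module `VacationProtocol`, the message shapes, and the approval rule (approve anything up to `max_days`) are invented for this example and are not taken from the talk.

```elixir
defmodule VacationProtocol do
  # Hypothetical message protocol between two processes:
  #   requester -> approver: {:vacation_request, from_pid, ref, days}
  #   approver -> requester: {:vacation_reply, ref, :approved | :denied}

  # Start an approver process that approves requests of up to max_days.
  def start_approver(max_days) do
    spawn_link(fn -> approver_loop(max_days) end)
  end

  defp approver_loop(max_days) do
    receive do
      {:vacation_request, from, ref, days} ->
        decision = if days <= max_days, do: :approved, else: :denied
        send(from, {:vacation_reply, ref, decision})
        approver_loop(max_days)
    end
  end

  # Send a request and wait for the matching reply.
  # The unique ref ties the reply to this particular request.
  def request(approver, days, timeout \\ 1_000) do
    ref = make_ref()
    send(approver, {:vacation_request, self(), ref, days})

    receive do
      {:vacation_reply, ^ref, decision} -> {:ok, decision}
    after
      timeout -> {:error, :timeout}
    end
  end
end

# Usage
approver = VacationProtocol.start_approver(10)
IO.inspect(VacationProtocol.request(approver, 5))
IO.inspect(VacationProtocol.request(approver, 15))
```

The design starts from the conversation (who asks, who answers, what happens on timeout) rather than from the data being stored, which is the shift in emphasis described above.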
ericmj
This is impossible to answer. If we take your employees as an example, using one process per employee is likely wrong because it’s hard to argue why you should have one process per employee instead of a single process holding a list of employees. You also want to store schedules, should they be stored separately from employees? Possibly, but what would you gain from that?
Eventually you can consider storing the whole database in one process. What are the benefits and downsides of this? One benefit is less complexity: one process is simpler than multiple. Another is that you can more easily implement transactions, because you don’t need synchronization between multiple processes.
One downside might appear to be code organization, since all your database code would be in one location, but this is the misconception. Code is organized with modules, so you can have multiple modules for one process. Another downside can be scalability: you may want to shard your database across multiple processes so you can use all your cores. This is when you should start considering multiple processes.
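For the sharding case, one possible sketch in Elixir is below. It starts a fixed number of `Agent` processes and routes each key to one of them with `:erlang.phash2/2`. The module name `ShardedStore` and its functions are hypothetical, and a real system would normally start the shards under a supervisor rather than linking them to the caller.

```elixir
defmodule ShardedStore do
  # Hypothetical sketch: N Agent processes, each owning a slice of the keys.

  # Start shard_count agents and return them as a tuple for fast lookup.
  def start_shards(shard_count) when shard_count > 0 do
    shards =
      for _ <- 1..shard_count do
        {:ok, pid} = Agent.start_link(fn -> %{} end)
        pid
      end

    List.to_tuple(shards)
  end

  # phash2(key, n) returns an integer in 0..n-1, so the same key
  # is always routed to the same shard.
  defp shard_for(shards, key) do
    elem(shards, :erlang.phash2(key, tuple_size(shards)))
  end

  def put(shards, key, value) do
    Agent.update(shard_for(shards, key), &Map.put(&1, key, value))
  end

  def get(shards, key) do
    Agent.get(shard_for(shards, key), &Map.get(&1, key))
  end
end

# Usage
shards = ShardedStore.start_shards(4)
:ok = ShardedStore.put(shards, {:employee, 1}, %{name: "Ada"})
IO.inspect(ShardedStore.get(shards, {:employee, 1}))
```

The trade-off is the one described above: different keys can now be served in parallel, but an update that touches keys on two different shards is no longer atomic without extra coordination.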
So, to summarize: it is hard to answer this question because you are asking how to structure an application using processes when we are saying that you shouldn’t. I would suggest that you start writing the application without processes, or with a single process. Then, when you hit a roadblock that you think processes will solve, come back and ask a question that is specific to your problem and less abstract and hypothetical.









