Is using an Agent to load data into memory at application startup the correct approach?

magnetic · March 16, 2020, 9:49pm

Hi all,

I am quite new to Elixir, and working on a side project to get a better understanding of it.
It is a Phoenix app that serves a list of static items. Those items are fetched from a third-party API.

Since the data is quite static, I just need to fetch it once and serve it to requests from memory, which means I need to load it into memory at startup.

Application startup
- Get list of items
- Get sub-items for each item
User requests index page => send list of items (stored in the app's memory).
User requests children of one item => send list of sub-items (also from app memory).

So it seems this is a good fit for Agents: on startup of the Phoenix app, I will make a request to the third-party API to get all items first, then spawn a new Agent for each item to get its sub-items (children) and put them in memory.

However, I am not sure if that’s the right approach in the Elixir/OTP way.
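
For concreteness, the approach described above could be sketched roughly as follows. `FakeApi` is a hypothetical stand-in for the third-party API client, and `ItemCache` and its function names are made up for illustration. The `{:global, term}` naming is just one simple way to use an item id as a process name, not necessarily the best one.

```elixir
defmodule FakeApi do
  # Hypothetical stand-in for the third-party API client.
  def list_items, do: [%{id: 1, name: "a"}, %{id: 2, name: "b"}]
  def list_subitems(item_id), do: [%{parent: item_id, name: "child of #{item_id}"}]
end

defmodule ItemCache do
  # Starts one Agent holding the item list, then one Agent per item
  # holding that item's sub-items. Names use :global so that any term
  # (here a tuple containing the item id) can serve as a process name.
  def load do
    {:ok, _pid} = Agent.start_link(&FakeApi.list_items/0, name: __MODULE__)

    for %{id: id} <- items() do
      {:ok, _pid} =
        Agent.start_link(fn -> FakeApi.list_subitems(id) end,
          name: {:global, {:subitems, id}}
        )
    end

    :ok
  end

  # Read the cached item list.
  def items, do: Agent.get(__MODULE__, & &1)

  # Read the cached sub-items of one item, addressed by item id.
  def subitems(id), do: Agent.get({:global, {:subitems, id}}, & &1)
end
```

Note that `Agent.start_link/2` blocks until the initial function returns, so in this sketch every API call happens while `load/0` is running, and the Agents are linked to whichever process calls `load/0` rather than to a supervisor.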

kokolegorille · March 16, 2020, 10:04pm

If You have fetching logic, it’s probably better to use a GenServer than an Agent.
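
A minimal sketch of what that could look like, assuming the fetch is wrapped in a zero-arity function (`fetch_fun` and the module name `ItemServer` are hypothetical, not from any library):

```elixir
defmodule ItemServer do
  use GenServer

  # `fetch_fun` is a hypothetical zero-arity function wrapping the API call.
  def start_link(fetch_fun) when is_function(fetch_fun, 0) do
    GenServer.start_link(__MODULE__, fetch_fun, name: __MODULE__)
  end

  # Returns the items currently held in the server state.
  def items, do: GenServer.call(__MODULE__, :items)

  # Asks the server to fetch the items again.
  def refresh, do: GenServer.cast(__MODULE__, :refresh)

  @impl true
  def init(fetch_fun) do
    # Return immediately and load in handle_continue/2, so a slow API
    # does not block the rest of the supervision tree from starting.
    {:ok, %{fetch_fun: fetch_fun, items: []}, {:continue, :load}}
  end

  @impl true
  def handle_continue(:load, state), do: {:noreply, load(state)}

  @impl true
  def handle_call(:items, _from, state), do: {:reply, state.items, state}

  @impl true
  def handle_cast(:refresh, state), do: {:noreply, load(state)}

  defp load(state), do: %{state | items: state.fetch_fun.()}
end
```

The fetching logic lives in one place (`load/1`), which makes it easier to add retries or a periodic refresh later.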

If You do multiple requests to the same API, it’s good to have a mechanism to avoid flooding the third-party API.

There are some libraries for caching, for example Cachex.

Finally, when it comes to access speed, I like to use ETS.
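
As a rough illustration, a named ETS table keyed by item id might look like this. The module name `ItemTable` and the table name are assumptions for the example.

```elixir
defmodule ItemTable do
  @table :item_table

  # Creates a named, public table. The calling process owns the table,
  # and the table is deleted when that process exits.
  def create do
    :ets.new(@table, [:set, :named_table, :public, read_concurrency: true])
    :ok
  end

  # Stores the sub-items of one item under its id.
  def put_subitems(item_id, subitems) do
    true = :ets.insert(@table, {item_id, subitems})
    :ok
  end

  # Reads directly from the table, without going through a process.
  def subitems(item_id) do
    case :ets.lookup(@table, item_id) do
      [{^item_id, subitems}] -> {:ok, subitems}
      [] -> :error
    end
  end
end
```

Because reads go straight to the table, many request processes can read concurrently, whereas an Agent or GenServer handles one call at a time.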

magnetic · March 16, 2020, 10:19pm

Thanks for the reply, @kokolegorille.

I have a few questions to clarify.

Do I need to reach for a caching mechanism at this stage? I am not concerned about the speed of getting the data once it is in the memory of my app. I thought having it as the state of an Agent (or GenServer) should be enough.
Is the approach I described above correct from an Elixir development POV? Mainly:
- Spin up a process to get the list of items
- Spin up a new process for each item (and give the id of the item as the name of the process) to get its sub-items
- Access the state of the process by the id of the item to get the sub-items whenever needed (when a page is requested by a user)

kokolegorille · March 16, 2020, 10:27pm

Yes why not… as it is a learning project, You can choose whatever suits your interest.

Just remember it’s easy and cheap to spawn a process, but You should always know how to reach them (maybe use Registry?) and what to do in case a process crashes.
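
A small sketch of the Registry idea, assuming one Agent per item registered under its id (`SubitemAgents` and the registry name are hypothetical):

```elixir
defmodule SubitemAgents do
  @registry SubitemAgents.Registry

  # Starts the Registry; in a real app this would be a supervised child.
  def start_registry, do: Registry.start_link(keys: :unique, name: @registry)

  # Starts an Agent for one item and registers it under the item id.
  def start_agent(item_id, subitems) do
    Agent.start_link(fn -> subitems end, name: via(item_id))
  end

  # Looks the process up by item id and reads its state.
  def subitems(item_id) do
    case Registry.lookup(@registry, item_id) do
      [{pid, _value}] -> {:ok, Agent.get(pid, & &1)}
      [] -> :error
    end
  end

  defp via(item_id), do: {:via, Registry, {@registry, item_id}}
end
```

The Registry drops an entry automatically when its process dies, so a lookup returning `:error` is the signal that the Agent is gone. Restarting it is a separate concern, usually handled by a supervisor.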

Remember also to be nice (AKA rate limit requests) when querying a third-party API, because it’s easy to flood it.
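
One crude way to do that at startup is to cap concurrency and pause after each call. This is only a sketch and not a real rate limiter. `ThrottledFetch`, `fetch_fun`, and the option names `:max_concurrency` and `:delay_ms` are made up for the example.

```elixir
defmodule ThrottledFetch do
  # Calls `fetch_fun` for each id, at most `max_concurrency` at a time,
  # pausing `delay_ms` after each call. Results keep the order of `ids`.
  def fetch_all(ids, fetch_fun, opts \\ []) do
    max_concurrency = Keyword.get(opts, :max_concurrency, 2)
    delay_ms = Keyword.get(opts, :delay_ms, 200)

    ids
    |> Task.async_stream(
      fn id ->
        result = fetch_fun.(id)
        # Hold this worker slot for a while before it takes the next id.
        Process.sleep(delay_ms)
        {id, result}
      end,
      max_concurrency: max_concurrency,
      timeout: 30_000
    )
    |> Enum.map(fn {:ok, pair} -> pair end)
  end
end
```

For example, `ThrottledFetch.fetch_all(ids, &fetch_subitems/1)` would call a hypothetical `fetch_subitems/1` for each id, never running more than two requests at once.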

ityonemo · March 17, 2020, 5:02am

IMO, this is an excellent fit for Agents and a poor fit for GenServers, since it is simple state and doesn’t require anything more complex. You just need Agent.get/2 and Agent.start_link/2.

Just remember to write a test where you use Process.exit/2 to kill the agent and make sure you can still make a query afterwards.
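
Such a test might look roughly like this, assuming the Agent runs under a supervisor that restarts it. `ItemAgent` is a hypothetical module with a hard-coded list standing in for the fetched data.

```elixir
ExUnit.start()

defmodule ItemAgent do
  use Agent

  # The initial list is hard-coded here; a real app would fetch it.
  def start_link(_opts), do: Agent.start_link(fn -> ["a", "b"] end, name: __MODULE__)
  def items, do: Agent.get(__MODULE__, & &1)
end

defmodule ItemAgentTest do
  use ExUnit.Case

  test "items are still available after the agent is killed" do
    {:ok, _sup} = Supervisor.start_link([ItemAgent], strategy: :one_for_one)
    pid = Process.whereis(ItemAgent)
    ref = Process.monitor(pid)

    # Kill the agent and wait until it is really gone.
    Process.exit(pid, :kill)
    assert_receive {:DOWN, ^ref, :process, ^pid, :killed}

    # The supervisor restarts it asynchronously, so poll for the new pid.
    new_pid = wait_for_restart(pid)
    assert new_pid != pid
    assert ItemAgent.items() == ["a", "b"]
  end

  defp wait_for_restart(old_pid, attempts \\ 50) do
    case Process.whereis(ItemAgent) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ when attempts > 0 ->
        Process.sleep(10)
        wait_for_restart(old_pid, attempts - 1)

      _ ->
        flunk("agent was not restarted")
    end
  end
end
```

In a Mix project the `ExUnit.start()` call normally lives in `test/test_helper.exs` instead of the test file itself.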