GenServer design with steps

jc00ke · December 17, 2019, 12:01am

An API I’m working with uses an export model to get certain data. It’s survey data, and to get survey responses, I have to:

1. Create a response export.
2. Monitor the progress of that export.
3. When the progress is complete, receive another ID.
4. Download the actual data with that ID.

Here’s a sequence diagram too, in case you’re a visual type

I need to do more stuff with the actual data once I get it, so my thought was to wrap everything in a GenServer and have a client function that kicks off the process.

I’m having a hard time with the loop/break aspect though. I know of the tick method, and I’ve used it in other components in this application, but I need to give control back to the calling function. It seems like Task.async/await could be used for that, but I’m not sure if that’s the simplest way to do this.
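For reference, here is a minimal sketch of the Task.async/await idea, assuming the four steps can be chained as plain functions. The module name ExportSketch and the functions create_export/1, wait_for_file/1 and download/1 are hypothetical stand-ins for the real API calls, not part of any library.

```elixir
defmodule ExportSketch do
  # Hypothetical stand-ins for the real API calls; replace with actual HTTP requests.
  def create_export(survey_id), do: "export-" <> survey_id
  def wait_for_file(export_id), do: "file-for-" <> export_id
  def download(file_id), do: {:ok, "data from " <> file_id}

  # Runs the whole sequence in a separate process and returns a Task handle,
  # so the caller gets control back immediately.
  def start(survey_id) do
    Task.async(fn ->
      survey_id
      |> create_export()
      |> wait_for_file()
      |> download()
    end)
  end
end

task = ExportSketch.start("abc")
# The caller is free to do other work here before collecting the result.
{:ok, data} = Task.await(task, 60_000)
```

The second argument to Task.await/2 is a timeout in milliseconds; the 60 seconds here is an arbitrary choice and would need to fit how long an export realistically takes.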

Any and all advice would be greatly appreciated!

entone · December 17, 2019, 12:42am

Sounds like what you’re trying to design is a state machine; Erlang/OTP provides gen_statem for that.

I’ve built StatesLanguage based on gen_statem that allows you to describe your state machine in a JSON format, and provide the callbacks for your different states and their transitions.

Hexdocs

It also provides the ability to output your graph to different visualization engines. This is an example of outputting to graphviz, https://github.com/CityBaseInc/states_language/blob/master/README.md#mix-tasks

If you decide to use it, I’d love any feedback you have. There’s a forum post here

jc00ke · December 17, 2019, 9:32pm

Usually I reach for a state machine, though this time I did not. I’ll try modeling it that way, thanks for the suggestion.

StatesLanguage looks quite nice! If I had a more complex model, I’d consider using it.

xlphs · December 18, 2019, 3:12am

For long-running tasks, I typically start a new GenServer process and give it a unique id so the process can be looked up later using GenServer.whereis/1, then talk to the process to check progress or do something else. In your case, make an HTTP API to start the process and return the unique id, then make another API to check progress using that id, and another API to do more stuff with the obtained data. So the key is to name each process for easy lookup, and then you control what the process is doing later.

ityonemo · December 18, 2019, 4:27am

You should probably be using a Registry for that, which will let you have terms for names, instead of trying to carefully name things with atoms.
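A rough sketch of what that could look like, assuming a unique-key Registry and string ids. ExportRegistry, ExportWorker and the :progress call are names made up for this example; only Registry, GenServer and the {:via, Registry, {registry, key}} tuple come from Elixir itself.

```elixir
defmodule ExportWorker do
  use GenServer

  # ExportRegistry is a hypothetical registry name chosen for this sketch.
  # Any term can be the key, so no atoms have to be created per process.
  def via(export_id), do: {:via, Registry, {ExportRegistry, export_id}}

  def start_link(export_id) do
    GenServer.start_link(__MODULE__, export_id, name: via(export_id))
  end

  # Look the process up by its id and ask for its current status.
  def progress(export_id), do: GenServer.call(via(export_id), :progress)

  @impl true
  def init(export_id), do: {:ok, %{id: export_id, status: :in_progress}}

  @impl true
  def handle_call(:progress, _from, state), do: {:reply, state.status, state}
end

{:ok, _registry} = Registry.start_link(keys: :unique, name: ExportRegistry)
{:ok, pid} = ExportWorker.start_link("export-123")
status = ExportWorker.progress("export-123")
found = GenServer.whereis(ExportWorker.via("export-123"))
```

In a real application the Registry and the workers would normally sit under a supervisor rather than being started by hand like this.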

ityonemo · December 18, 2019, 4:34am

Having written a state machine library myself, I don’t think this is the job of a state machine. It’s a sequential series of events. I would say your gut instinct to use Task.async is correct.

IMHO the use case for a process-driven state machine is when you have one of the following:

1. You have cycles in your state graph, and timeouts (especially repetitive ones such as pings, or those which trigger state transitions) need to be autocancelled on transition to a new state.
2. It’s a stateful network protocol which needs to respond to active: true messages.
3. There is a theory-driven finite state machine (DFA or NFA) that is associated with a proof (e.g. Raft).

Edit:

Actually, looking back at your problem, why don’t you just do this:

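# Polls the progress endpoint once per second until the export is complete.
defp check_in(id) do
  # An empty receive with a timeout simply blocks this process for 1000 ms.
  receive do after 1000 -> :ok end
  case HTTPLibraryOfChoice.get("/progress/#{id}") do
    :in_progress -> check_in(id)        # not done yet, poll again
    {:complete, file_id} -> file_id     # done, return the ID of the file to download
  end
end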

In general, I don’t believe in using GenServers or StateMachines unless you really really have to.
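One caveat with a bare recursive poll is that it never gives up if the export stalls. Below is a sketch of the same loop with a cap on attempts, assuming the progress check is passed in as a zero-arity function. Poller, fetch and fake_fetch are hypothetical names for this example; the Agent only simulates an export that completes on the third check.

```elixir
defmodule Poller do
  # fetch is a zero-arity function standing in for the progress request.
  # Returns {:ok, file_id}, or {:error, :timeout} once attempts run out.
  def check_in(_fetch, _interval_ms, 0), do: {:error, :timeout}

  def check_in(fetch, interval_ms, attempts_left) do
    Process.sleep(interval_ms)

    case fetch.() do
      :in_progress -> check_in(fetch, interval_ms, attempts_left - 1)
      {:complete, file_id} -> {:ok, file_id}
    end
  end
end

# Fake progress source: reports :in_progress twice, then completion.
{:ok, counter} = Agent.start_link(fn -> 0 end)

fake_fetch = fn ->
  case Agent.get_and_update(counter, fn n -> {n, n + 1} end) do
    n when n < 2 -> :in_progress
    _ -> {:complete, "file-42"}
  end
end

result = Poller.check_in(fake_fetch, 10, 5)
```

Passing the fetch function in also makes the loop easy to exercise without touching the network.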

xlphs · December 18, 2019, 4:57am

You are right, Registry is better in this case. I’m too used to Erlang’s global name register, which allows for other things out of the box…

ityonemo · December 18, 2019, 5:40am

I think global is a bit dangerous as a process registry because it requires more coordination if you have a cluster. Because the BEAM performs a high-latency mutex/all-nodes query every time you spawn a process with a global name, it might not be correct for quite a few distributed Erlang use cases (though in many cases it won’t really matter).

mudasobwa · December 18, 2019, 5:56am

Also, trying to carefully name things with atoms might result in a DoS attack in the long run, since atoms are never garbage collected and the atom table is finite.

GenServer is nothing but a handy wrapper around receive do under the hood.
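To illustrate the idea only, here is a toy server built directly on spawn, send and receive. TinyServer and its message shapes are invented for this sketch; the real GenServer does considerably more (monitoring the callee during calls, system messages, supervision support), so treat this as a picture of the core loop rather than of the actual implementation.

```elixir
defmodule TinyServer do
  # Spawns a process that holds `state` and loops on receive.
  def start(state), do: spawn(fn -> loop(state) end)

  # Synchronous request: send a message tagged with a unique ref,
  # then block until the matching reply arrives.
  def call(pid, request) do
    ref = make_ref()
    send(pid, {:call, self(), ref, request})

    receive do
      {^ref, reply} -> reply
    after
      5_000 -> exit(:timeout)
    end
  end

  # The server loop: handle one message, then recurse with the new state.
  defp loop(state) do
    receive do
      {:call, from, ref, :get} ->
        send(from, {ref, state})
        loop(state)

      {:call, from, ref, {:put, new_state}} ->
        send(from, {ref, :ok})
        loop(new_state)
    end
  end
end

pid = TinyServer.start(0)
put_reply = TinyServer.call(pid, {:put, 41})
value = TinyServer.call(pid, :get)
```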

ityonemo · December 18, 2019, 6:09am

GenServer is nothing but a handy wrapper around receive do under the hood.

-_-

entone · December 18, 2019, 2:53pm

I think you proved my point: numbers 1 and 2 appear to be true for his use case here.

ityonemo · December 18, 2019, 3:45pm

There are three states in his linear state graph. A while loop within a state is not a cyclic state graph. I recommend looking up active: true in gen_tcp and gen_udp.

jc00ke · December 19, 2019, 12:29am

This feels dirty, but it is also the simplest way for me to make this work. Thanks!

ityonemo · December 19, 2019, 12:43am

Process.sleep exists too, if that makes you feel better

OvermindDL1 · December 19, 2019, 3:34am

It does the same thing actually, lol.

jc00ke · December 20, 2019, 6:41pm

I went with Process.sleep/1

Thanks all!

sorentwo · December 21, 2019, 1:04am

It is also how :timer.sleep works. That’s the only way to block a process in the BEAM that I’m aware of; everything else is built on that.
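A small sketch putting the three forms mentioned in this thread side by side. Each blocks the calling process for about 50 ms and returns :ok; the 50 ms value and the variable names are arbitrary choices for the example.

```elixir
started = System.monotonic_time(:millisecond)

# Elixir's convenience function.
sleep_result = Process.sleep(50)

# Erlang's equivalent, callable from Elixir.
timer_result = :timer.sleep(50)

# The underlying primitive: a receive with no clauses and only a timeout.
receive_result =
  receive do
  after
    50 -> :ok
  end

elapsed = System.monotonic_time(:millisecond) - started
```

One difference worth knowing: a receive that does have message clauses can be woken early by a matching message, while the clause-free form above always waits out the timeout.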