How to return data from many pages in API calls

Hi! I am pretty new to Elixir and FP in general and am wondering what the proper way to do this would be.
I am making calls to an API whose results are spread over multiple pages; only a few results come back on the first page. I can add ‘&page_number=’ to the request, so I’m using recursion to increment the page number each time around. It looks something like this:

      defmodule A do
        def fetch(current_page) do
          # http_get/1 is a placeholder for whatever HTTP client call you use
          response = http_get("www.api.com/example?page=#{current_page}")

          if response.total_pages == current_page do
            :ok
          else
            fetch(current_page + 1)
          end
        end
      end

This works great, but the ultimate goal is to have ALL of the results (from all pages) returned somehow. What is the best way to do this? I can make the info that comes back from each page a list, so should I create an empty list and join (++) the data from the next page onto it?
I’ve tried something like this, but since data is immutable I always end up with the results from a single page, and I am losing my mind lol because it seems like it should be a simple fix.

Any help would be appreciated,

Thanks!

Hey, you’ve got the right approach so far with the recursion, and you’re right about immutability: nothing is carried over from one call to the next, so each subsequent call loses the results of the previous one.

A way to overcome this is to pass along with each call an ‘accumulator’ which stores the values of all the previous calls.

So you could then refactor your example to look like:

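defmodule A do
  def fetch(current_page, acc \\ []) do
    # http_get/1 is a placeholder for your actual HTTP client call
    response = http_get("http://.....?#{current_page}")
    # Prepend this page's response to everything collected so far
    new_acc = [response | acc]

    case response.total_pages do
      # Last page: return everything that was accumulated
      ^current_page -> new_acc
      # More pages to go: recurse with the updated accumulator
      _ -> fetch(current_page + 1, new_acc)
    end
  end
end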

So here we are saying: get the current page and add its response to the head of the accumulator. If it is not the last page, call fetch again with the new accumulator. If it is the last page, just return the accumulator.

So for example, if there were 5 pages, we would get a result like the following:

[page5, page4, page3, page2, page1]

If you wanted them in the original order, you could simply call Enum.reverse(new_acc) in the last-page clause instead of returning new_acc directly.
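To make the accumulator idea easy to try in iex, here is a minimal sketch that runs without a real API. The module name PagedFetch, the fake_get/1 function and the shape of its return value (a map with total_pages and items) are all made up for illustration; swap in your own HTTP call. It also flattens the pages into a single list of results at the end, which may or may not be what you want.

```elixir
defmodule PagedFetch do
  # Hypothetical stand-in for the real HTTP call: 3 pages of 2 items each.
  def fake_get(page) do
    %{total_pages: 3, items: [page * 10 + 1, page * 10 + 2]}
  end

  # Collects the items of every page, starting at `page`.
  def fetch_all(page \\ 1, acc \\ []) do
    response = fake_get(page)
    # Prepend this page's items; this is O(1) per page, unlike acc ++ items.
    new_acc = [response.items | acc]

    if response.total_pages == page do
      # Pages were collected newest-first: reverse, then flatten one level.
      new_acc |> Enum.reverse() |> Enum.concat()
    else
      fetch_all(page + 1, new_acc)
    end
  end
end

IO.inspect(PagedFetch.fetch_all())
# prints [11, 12, 21, 22, 31, 32]
```

Prepending and reversing once at the end is usually preferred over `acc ++ items` on every page, because `++` has to copy the whole left-hand list each time.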

That’s so cool. I’m really loving Elixir, it’s so much fun.
Thanks a million.

I’ve seen this done a few different ways, but my favorite is using Stream.unfold. As a stream, each page of data only gets loaded when/if a caller consumes it. Conceptually you have 3 clauses in your unfold function: 1) fetching a page of data, which you store in your accumulator along with the next page to fetch; 2) taking one element off your list of data to return, and storing the shortened list in your accumulator; 3) terminating the stream when you’re on the last page and your list of data is empty.
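Here is a rough sketch of those three clauses, assuming a made-up fake_get/1 that returns a map with total_pages and items (the module name PagedStream and that return shape are hypothetical, not from any real API). The accumulator is a tuple of the items not yet emitted and the next page to fetch, with nil meaning there are no pages left.

```elixir
defmodule PagedStream do
  # Hypothetical stand-in for the real HTTP call: 3 pages of 2 items each.
  def fake_get(page) do
    %{total_pages: 3, items: [page * 10 + 1, page * 10 + 2]}
  end

  def stream do
    # Accumulator: {items not yet emitted, next page to fetch (nil when done)}
    Stream.unfold({[], 1}, &next/1)
  end

  # 2) Items are buffered: emit one and keep the rest.
  defp next({[item | rest], next_page}), do: {item, {rest, next_page}}
  # 3) Buffer is empty and no pages are left: terminate the stream.
  defp next({[], nil}), do: nil
  # 1) Buffer is empty: fetch the next page, then retry with its items.
  defp next({[], page}) do
    response = fake_get(page)
    next_page = if page < response.total_pages, do: page + 1, else: nil
    next({response.items, next_page})
  end
end

PagedStream.stream() |> Enum.take(3) |> IO.inspect()
# prints [11, 12, 21]
```

Because the stream is lazy, taking 3 items here only fetches pages 1 and 2; page 3 is never requested.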

I’ve implemented exactly this type of paginated request, returning an Elixir Stream, for a Strava API wrapper. It’s in the Strava.Paginator module and uses Stream.resource to construct the stream.

Example usage to create a stream from a request:

Strava.Paginator.stream(fn %{page: page, per_page: per_page} -> 
  "/api/segments/starred?page=#{page}&per_page=#{per_page}"
  |> Strava.request(client, as: [%Strava.Segment{}])
  |> Enum.map(&Strava.Segment.parse/1)
end)

The function provided to Paginator.stream is called for each requested page. You provide a different function for each paginated API resource.
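If you want to roll something similar yourself, a generic paginator built on Stream.resource could look roughly like the sketch below. To be clear, this is not the actual Strava.Paginator code: the PageResource module, the fake_get/1 function and the assumption that a page comes back as a map with items and total_pages are all made up for illustration.

```elixir
defmodule PageResource do
  # Hypothetical stand-in for the real HTTP call: 3 pages of 2 items each.
  def fake_get(page) do
    %{total_pages: 3, items: [page * 10 + 1, page * 10 + 2]}
  end

  # `fetch_page` is any 1-arity function returning
  # %{items: list, total_pages: integer} (an assumed shape).
  def stream(fetch_page) do
    Stream.resource(
      # Start state: the first page to request.
      fn -> 1 end,
      fn
        # No pages left: stop the stream.
        :done ->
          {:halt, :done}

        # Fetch one page, emit all of its items, and work out the next state.
        page ->
          response = fetch_page.(page)
          next = if page < response.total_pages, do: page + 1, else: :done
          {response.items, next}
      end,
      # Nothing to clean up in this sketch.
      fn _state -> :ok end
    )
  end
end

PageResource.stream(&PageResource.fake_get/1) |> Enum.to_list() |> IO.inspect()
# prints [11, 12, 21, 22, 31, 32]
```

Compared to Stream.unfold, Stream.resource lets the step function emit a whole page of items at once, so there is no need to buffer leftover items in the accumulator.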