Still new to Phoenix, I wonder what's the best way to do pagination (or, even better, infinite scrolling).
Can you recommend a library that works with Phoenix 1.4-RC3 (which uses Ecto 3, if I'm correct) and is easy enough for a beginner to use? Is there a de facto standard library in the Elixir world?
@Nefcairon My favorite way is to use Repo.stream/2 over a WebSocket connection (see the Channels guide). Since you would be using WebSockets you get two-way communication, which means you can query anything and return the results one by one in really small, fast messages whenever you want. This solution also scales really well: no matter whether the query returns 100 or 100_000_000 entries, you still fetch and send exactly the same number of entries at a time (only the number of server messages changes).
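Here is a minimal sketch of the core idea (MyApp.Repo and MyApp.Post are placeholder names, and socket is assumed to be the already-joined channel socket from handle_in; note that Repo.stream/2 is lazy and has to be consumed inside Repo.transaction/2):

```elixir
import Ecto.Query, only: [from: 2]

alias MyApp.{Repo, Post}

# Stream rows out of the database one at a time and push each one to the
# client over the already-joined channel socket.
Repo.transaction(fn ->
  from(p in Post, order_by: [asc: p.id])
  |> Repo.stream()
  |> Stream.each(fn post ->
    Phoenix.Channel.push(socket, "post", %{id: post.id, title: post.title})
  end)
  |> Stream.run()
end)
```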
If you want to paginate a query you can use Ecto's Query API, but before you decide to use Ecto.Query.limit/3, make sure you have read why it's considered bad practice in some cases:
The number does not matter, since you are using a Stream.
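For reference, this is roughly what plain limit/offset pagination with the Query API looks like (the module names and the inserted_at ordering are assumptions):

```elixir
import Ecto.Query, only: [from: 2]

# Classic page-number pagination: fetch page 3 with 20 posts per page.
page = 3
per_page = 20

posts =
  from(p in MyApp.Post,
    order_by: [desc: p.inserted_at],
    limit: ^per_page,
    offset: ^((page - 1) * per_page)
  )
  |> MyApp.Repo.all()
```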
Here we have a few assumptions:
WebSocket (or another bi-directional transport) is required, as well as the Phoenix Channel API (or an equivalent)
Since we are using Channels, we can assign arbitrary data to each WebSocket connection
Since we are using Ecto.Repo.stream/2, we only have one database row at a time in memory
Each WebSocket connection has its own process
From this we can create a scenario like:
Client requests GET /posts
Server sends the basic HTML DOM with JavaScript code
/posts is a list, so we automatically append a table element
Client requests all Post records from the database
Server generates a per-client query id (for simplicity let's say it's a counter); example return: {"id": "query1", "status": "OK"} (which means that the query with id query1 is valid and will therefore be sent)
Client sets the table's id attribute to query1 and its data-model attribute to post
Server calls Ecto.Repo.stream/2 in the background and in Stream.each/2 sends data like: {"data": {…}, "id": "query1", "type": "append"}
Client receives such a response and creates a tr DOM element for it with id: query1-#{response.data.id}
If any Post is created, deleted or updated, the server could send something like: {"data": {…}, "model": "post", "type": "delete|update"}
Client accepts such an event and looks for table[data-model="post"] > tbody > tr[id^="query"][id$="-#{response.data.id}"].
With that we always send the same amount of data per client request. Later it's possible to limit the number of requests per client (using Channel assigns). A rough server-side sketch of this scenario follows below.
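To make the scenario more concrete, here is one way the server side could look as a Phoenix channel. The module, topic and event names ("posts:lobby", "list_posts", "post") are made up for illustration, and the streaming is done from handle_info so the reply carrying the query id goes out first:

```elixir
defmodule MyAppWeb.PostChannel do
  use Phoenix.Channel

  import Ecto.Query, only: [from: 2]

  alias MyApp.{Repo, Post}

  def join("posts:lobby", _params, socket) do
    {:ok, assign(socket, :query_counter, 0)}
  end

  # Client asks for all posts; server replies with a per-client query id.
  def handle_in("list_posts", _params, socket) do
    counter = socket.assigns.query_counter + 1
    query_id = "query#{counter}"

    # Kick off the streaming after the reply has been sent, still in this process.
    send(self(), {:stream_posts, query_id})

    {:reply, {:ok, %{id: query_id, status: "OK"}}, assign(socket, :query_counter, counter)}
  end

  # Stream the rows and push each one as a small "append" message.
  def handle_info({:stream_posts, query_id}, socket) do
    Repo.transaction(
      fn ->
        from(p in Post, select: map(p, [:id, :title]))
        |> Repo.stream()
        |> Stream.each(fn post ->
          push(socket, "post", %{data: post, id: query_id, type: "append"})
        end)
        |> Stream.run()
      end,
      timeout: :infinity
    )

    {:noreply, socket}
  end
end
```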
I see that you're mixing pagination with live updates in order to provide a single convenient interface for the client via Channels. This approach and transport are great, but I'm wondering whether Repo.stream/2 is suited to pagination.
If I got it right, Repo.stream/2 uses the database's internal cursors to efficiently select millions of records one by one or in chunks, but at the price of taking up a connection from the pool until it's done. So the more records and clients you have, the faster you run out of available connections. Since clients don't need all those millions of records, you have to apply a limit with Enum.take/2 and keep the last record's unique identifier, AKA an external cursor (typically a combination of ID + timestamp), on the client side or in the Channel's state for the next page request.
A plain query with Query.limit/3 is simpler and cheaper than Repo.stream/2 for this case, isn’t it?
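For illustration, a keyset/cursor-style query along those lines might look something like this (MyApp.Posts and the id-only cursor are simplifications; a real cursor would usually combine id and a timestamp as mentioned above):

```elixir
defmodule MyApp.Posts do
  import Ecto.Query, only: [from: 2]

  alias MyApp.{Repo, Post}

  # Fetch one page of posts strictly after the given cursor id.
  # The id of the last row returned becomes the cursor for the next page.
  def page_after(cursor_id, per_page \\ 30) do
    from(p in Post,
      where: p.id > ^cursor_id,
      order_by: [asc: p.id],
      limit: ^per_page
    )
    |> Repo.all()
  end
end
```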
That's why I said it's an example scenario. Instead of sending one row right after another, you can request 30 rows at a time and wait until the client sends a request like: {"id": "query1", "type": "next", "value": 30}, and then you just send back the next 30 rows, and so on. Lots of scenarios could be built on top of such an API, depending on the use case.
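One possible shape of that "next" handler, assuming it lives in the same channel module sketched above. Note that this swaps the held-open Repo.stream/2 for a fresh keyset query per batch (to avoid keeping the transaction and its pool connection open between client messages), with the last sent id remembered in the Channel assigns:

```elixir
# Client asks for the next batch of rows for a given query id.
def handle_in("next", %{"id" => query_id, "value" => count}, socket) do
  last_id = Map.get(socket.assigns, :last_id, 0)

  # Fetch the next `count` rows after the last one we already sent.
  rows =
    from(p in Post,
      where: p.id > ^last_id,
      order_by: [asc: p.id],
      limit: ^count,
      select: map(p, [:id, :title])
    )
    |> Repo.all()

  Enum.each(rows, fn post ->
    push(socket, "post", %{data: post, id: query_id, type: "append"})
  end)

  # Remember where we stopped so the next "next" request continues from there.
  last_id = if rows == [], do: last_id, else: List.last(rows).id

  {:noreply, assign(socket, :last_id, last_id)}
end
```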