joaquinalcerro

joaquinalcerro

What is the best approach for fetching large amount of records from postgresql with ecto

Hi there,

I am working with Ecto-Postgresql and I need to call all of the records from a specific table but the table has 40,000 records approximately (15MB).

Each record in the table has a email message id with which I have to call Google Gmail API to get the message.

So I am thinking about memory efficient. What is the most efficient way to call all the records with the minimum impact in memory or not loading the hole 15MB at once just to make API calls?

I found Ecto Stream but it works inside a transaction and I don’t use transactions. And… is Ecto Stream the best way to go?

Thanks for any comments.

Regards.

Most Liked

josevalim

josevalim

Creator of Elixir

IMO stream is the way to go. You need to run inside a transaction but that should not be a problem. You should be able to do neat things such as:

Repo.stream(...)
|> Task.async_stream(&deliver/1, max_concurrency: 10)
|> Stream.run

and that will get emails in batches of 10 and invoke the google email api concurrently. It requires Elixir v1.4.

27
Post #6
michalmuskala

michalmuskala

You can increase the timeout for the transaction where streaming happens:

Repo.transaction(fn -> ... end, timeout: some_huge_number)

You need to remember that LIMIT + OFFSET pagination is prone to race conditions - it may happen that you’ll miss some rows or see rows that you shouldn’t. Database-level cursor gives a consistent view of data.

11
Post #9
brightball

brightball

Detailed explanation there: We need tool support for keyset pagination

Short version, when iterating over the records get the max range and the min range to make sure it’s used in your where clause. This trims the result set BEFORE sorting. Using LIMIT and OFFSET means the entire possible result has to be loaded and sorted to calculate which chunk of them is coming back.

Where Next?

Popular in Questions Top

sergio
In Ruby, I can go: User.find_by(email: "foobar@email.com").update(email: "hello@email.com") How can I do something similar in Elixir? ...
New
_russellb
I want to try my hand at web scraping. What tools/libraries do I need to use. I’m hoping to turn this into something professional so don’...
New
tduccuong
Hi, is there any work on GUI with Elixir, that is similar to Electron/Javascript? My idea is to bundle Phoenix and BEAM into a single se...
New
mcarvalho
What is the difference between System.get_env and Application.get_env? For example, what are best practices to use one versus another.
New
gshaw
What is the idiomatic way of matching for not nil in Elixir? E.g., First way: defp halt_if_not_signed_in(conn, signed_in_account) when...
New
stefanchrobot
What’s the safe way to decode a JSON string into a struct? I want to avoid calling String.to_atom. Jason.decode can give me a map with st...
New
belgoros
I’m not a pro in using Regex and can’t figure out why the following behaviour happens, especially if we take into account the difference ...
New
RisingFromAshes
I’ve read in another post that it may be possible with a router helper - but I couldn’t find an appropriate one, and tbh, I’m still just ...
New
ashish173
I am using Ecto timestamps with postgres, I can see the timestamps() use the :naive_dateime but for my use case I wanted to store the ti...
New
shijith.k
I am trying to start a new phoenix project with elixir 1.9, but mix phx.new does not work. It says that ** (Mix) The task "phx.new" could...
New

Other popular topics Top

malloryerik
Hi, this is for people who, like me, have had some friction using .html.heex templates in VSCode. The solution seems to be, in a hyphena...
New
Darmani72
If I have a post route which an argument: post /my_post_route/:my_param1, MyController.my_post_handler How would get the post params ...
New
skosch
To my knowledge, put_in, Map.update etc. all have the one limitation of not automatically creating intermediate keys when needed (for exa...
New
msaraiva
Surface is an experimental library built on top of Phoenix LiveView and its new LiveComponent API that aims to provide a more declarative...
564 43622 214
New
alice
Hey, Just curious what are the main benefits of Elixir compared to Clojure? When is Elixir more useful than Clojure and vice versa? Th...
New
chrismccord
This release brings a number of exciting features, including integration with the new Phoenix LiveDashboard and Phoenix LiveView. There h...
New
AstonJ
Please see the new poll here: Which code editor or IDE do you use? (Poll) (2022 Edition) It’s been a while since we first asked this, I...
208 31142 143
New
dogweather
I wrote this comment on r/haskell, and it’s not popular there. :wink: But I think I’m on to something… Haskell reminds me of Java, and e...
New
hariharasudhan94
Lets say i have map like this fetching from my database %{"_id" => #BSON.ObjectId<58eb1a7a9ad169198c3dXXXX>, "email" => "XXX...
New
vonH
In asking this question I am more interested about the expressiveness of the language itself and less concerned about the availability of...
New

We're in Beta

About us Mission Statement