Ecto, tds_ecto and loading large dataset

jedeleh · December 4, 2017, 5:36pm

I’ve got an issue with a 3 stage GenStage pipeline I’ve implemented. In the producer I do a select out of a MS SqlServer database with ecto as such:

  C3poRepo.all(
   from t in C3poEvent,
      where: t.completed == 0,
      order_by: [desc: :occurred_at],
      limit: ^demand,
      offset: ^offset
  )

seems to make sense. The issue I’m having is that is that the database is preparing this statement and executing it, and then the next time data is demanded it prepares a new statement and executes it. The DBA I’m working with says that it should prepare the statement once and then execute it X times rather than re-preparing it again. The issue that I’m seeing because of this is that the sql adaptor runs out of handles after about 1200 demands.

My DBA suggested using raw sql but I really don’t like that idea for a long term solution.

This situation is so arcane that I’m not even sure how to ask this question. The end result that I’m seeing is that after roughly 63K rows the application crashes. I can restart it and it will crash again after the next 63K rows. I’m working against 24 million so this is not a good thing, obviously.

peerreynders · December 4, 2017, 6:31pm

While waiting for a more definitive answer I recommend that you put together a bare-bones elixir/ecto project:

Start retrieving rows with purely sequential code, i.e. stay within a single process - and see whether you can reliably get beyond the 63K mark.
If it still fails, it’s time to try using “hand made SQL” i.e. get as close to the driver as possible - and if that still fails there may be a driver issue.
If the single process works - refactor towards an example with a GenStage producer and a simple consumer. If things start breaking here, post your code - as your producer implementation may be missing something.

Also which version of Ecto and which adapter are you using?

jedeleh · December 4, 2017, 6:44pm

good suggestions. I’ll go down that path, thanks

And here are the versions I’ve got these things locked at:

ecto 2.2.6
tds_ecto 2.0.3
tds 1.0.3

not sure if this one matters but here it is:

poolboy 1.5.1

jedeleh · December 5, 2017, 2:43pm

So I went back and tested it this way:

defmodule Foo do
  def bare_bones do
    EventBridge.start(nil, nil)

    counter = 0
    for a <- 1..3000 do
      new_events = get(1000, a)
      IO.puts 50 * a
    end
  end
end

and was able to reproduce the issue. I should note that I’m running this code from inside iex -S mix:

mymachine ] iex -S mix
Interactive Elixir (1.5.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Foo.bare_bones
...

This completely removes GenStage from the equation, so I believe I can eliminate that from my concern.

I guess this is related to one of three things:

ecto (I’d be hugely surprised if this was the issue, but I call it out for the sake of completeness)
tds_ecto
the way I run my test. (by calling the function from inside iex -S mix

peerreynders · December 5, 2017, 4:07pm

Do the logs identify what ultimately causes the crash? (Keeping in mind the crash is likely happening much further downstream than the actual cause).

jedeleh · December 5, 2017, 10:56pm

It ended up being the version of tds_ecto I was using. It was a known issue and once I upgraded to the latest, the entire thing went away. I guess I should have tried that first.