Ecto connection times out when using TDS adapter to query MSSQL database

I have a table with several columns of floating point numbers in a MSSQL 2019 database.
When I try to naively query all the rows in the table, the connection times out with an error:

Tds.Protocol (#PID<0.621.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.677.0> timed out because it queued and checked out the connection for longer than 5000ms

Eventually the data is returned, but I am concerned that it takes 2000x longer with the Tds adapter than with the Postgres adapter.

I have created a demo repo that seeds some data and can be used to easily see the performance difference:
[GitHub - janhendrik-rust/slow_going](https://github.com/janhendrik-rust/slow_going)

The postgres and odbc connections complete the query and return in under 50ms; the Tds adapter takes over 90 seconds on the same-size dataset.

Is there a better way to query medium sized datasets using Ecto connecting to a MSSQL server? Or do I have something missing in my configuration?

Hey @janhendrik-rust welcome! I’m trying to understand what databases you have set up. Are you connecting to the MSSQL 2019 database via the postgres adapter? If so, how? I was not aware that MSSQL had a postgres compatible interface.

Wha :open_mouth: , is this a thing?

Pretty sure it doesn’t; unless it’s third party… and even then I’m not sure how (or why) you’d do it. I’m just about to start a gig helping with a MS SQL Server migration to PostgreSQL which includes a temporary integration piece. Connectivity mismatches are among the issues which led to them bringing me in. Lots of differences, from authentication techniques to data types, etc. Not sure it would be feasible to make it work, let alone any reason/benefit at that level.

While the person you’re responding to would need to answer to be sure (naturally), I read their comment as, “2000x longer with Tds connecting to MSSQL server than the PostgreSQL adapter connecting to PostgreSQL.” (with against the same data implied).

If I get some time, and since I just happen to have a MSSQL Server setup on my workstation now due to this other project, I might try the repo the questioner posted to see if I can help; weekend stuff though, I’m afraid.


Couple of things to consider:

1.) Are the MSSQL DB and the Postgres DB both on the same machine? If the Postgres is local and the MSSQL DB isn’t that’d explain the speed difference. Try setting them both up on the same machine and see if that impacts your performance at all.

2.) I’d check the DB parameters mentioned here: Ecto.Adapters.Tds — Ecto SQL v3.11.1 to ensure you’re matching everything.

3.) There’s a built-in adapter for TDS in ecto_sql. What happens if you take out the separate Tds hex package? I’d definitely try the ecto_sql version, because if there were any changes made to support better performance they’re likely to have been made to that driver.
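For point 2.), a baseline Tds repo setup looks roughly like this (app name, credentials, and pool size are placeholders, not taken from the question; cross-check the keys against the Ecto.Adapters.Tds docs):

```elixir
# config/config.exs -- placeholder values throughout
import Config

config :my_app, MyApp.Repo,
  hostname: "localhost",
  port: 1433,
  username: "sa",
  password: "...",
  database: "my_app_dev",
  pool_size: 10
```

with the adapter itself selected in the repo module via `use Ecto.Repo, otp_app: :my_app, adapter: Ecto.Adapters.Tds`.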

It’s hard to imagine a case where you could be assured of an “apples to apples” comparison across DB’s. I mean there are so many things which would factor in to performance. Are all the tables indexed the same? Are all the field types the same? etc. etc.

Oh sure… but sometimes something simple can be informative, even if all you get are vague hints or suggestions of what kinds of problems might be present in the actual scenario. The key is to understand the limits of what it is you’re looking at, if you aren’t looking at the “real thing”.


Can you show the results using a schema instead of a raw query?
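If the project doesn’t have one yet, a minimal schema might look like this (module and field names guessed from the linked repo, with only a few of the columns shown):

```elixir
defmodule SlowGoing.Platform.FloatsTest do
  use Ecto.Schema

  # Field names guessed from the repo's migration; the remaining
  # float_* and int_* columns follow the same pattern.
  schema "floats_test" do
    field :float_one, :float
    field :float_two, :float
    field :float_three, :float
    field :int_one, :integer
    field :int_two, :integer
    field :int_three, :integer

    timestamps()
  end
end
```

and then time `from(f in SlowGoing.Platform.FloatsTest, limit: 10_000) |> SlowGoing.Repo.all()` the same way.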

Thanks for everyone’s interest.

I’m really not trying to do an “apples to apples” comparison - this is not a database showdown :).

Up until recently I’ve exclusively used Ecto with the Ecto.Adapters.Postgres adapter connecting to a local Postgres 15 container. The query performance has never been an issue. Then I switched the database over to a local MSSQL 2019 Express container and the Ecto.Adapters.Tds adapter.

When I perform the exact same query using the ecto_sql Tds adapter ({:tds, "~> 2.3"}) it is up to 2000 times slower than when using the Postgres adapter ({:postgrex, "~> 0.17"}) on the seeded dataset. Have a look at the repo I posted for the migration and seed script I used.

At first I thought I must be doing something wrong, or there must be something misconfigured on the MSSQL side or in my code - and there still might be. But performing the same queries in DBeaver or the sqlcmd/psql utilities showed no discernible speed difference between the two local databases.

If I do small queries - returning less than 1000 rows - the Tds adapter performs admirably. As the result set increases, the time the query takes grows explosively.

Tds Adapter to local MSSQL2019 Express:

Result set size    Query time (ms)
10                       3.861
100                      5.083
1_000                   53.342
2_000                  178.746
4_000                3_467.661
8_000               20_511.550
16_000              88_519.569

The iex code snippet I used to query the MSSQL Ecto Repo:

{time, result} = :timer.tc(Ecto.Adapters.SQL, :query!, [MyApp.Repo, "Select top 10000 float_one, float_two, float_three, float_four, float_five, float_six, float_seven, float_eight, float_nine, int_one, int_two, int_three from floats_test", []])

Postgrex Adapter to local Postgres 15:

Result set size    Query time (ms)
10                       2.902
100                      1.699
1_000                    5.838
2_000                    8.308
4_000                   10.226
8_000                   28.109
16_000                  78.704

The iex code snippet I used to query the Postgres Ecto Repo:

{time, result} = :timer.tc(Ecto.Adapters.SQL, :query!, [MyApp.Repo, "Select float_one, float_two, float_three, float_four, float_five, float_six, float_seven, float_eight, float_nine, int_one, int_two, int_three from floats_test limit 10000", []])

This gives me the impression that the serialization of the dataset might be to blame, but I am completely out of my depth as to how to troubleshoot this issue.

I get similar performance when using Ecto.Query queries - so the issue seems to be in the Tds adapter implementation.

Great, this is the bit that was missing then. The indication that there is an issue is that other queries by other clients are fast, not that the queries against postgres are fast.

The TDS response time does look a bit N^2, so perhaps there is a bug inside its implementation? If you turn on debug logging, what log output do you get from Ecto? It should tell you how long the query took vs how long parsing the responses took. That may not be a smoking gun depending on where the bug is, but it might help.
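A sketch of the config to enable that (the sample app may already set most of this; the app/repo names come from the linked repo):

```elixir
# config/dev.exs
import Config

# Let :debug-level messages through to the console
config :logger, level: :debug

# Per-query log level for the repo (set to false to silence)
config :slow_going, SlowGoing.Repo,
  log: :debug
```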

First of all I would ensure that you are using the latest versions of the dependencies. Check the mix.lock and cross-reference with the releases or do a mix deps.update --all to upgrade all of the dependencies to their latest versions.

If I turn on debug logging, I get the following in an iex session:

iex> from(f in SlowGoing.Platform.FloatsTest, limit: 10000) |> SlowGoing.Repo.all()
[error] Tds.Protocol (#PID<0.450.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.513.0> timed out because it queued and checked out the connection for longer than 5000ms
#PID<0.513.0> was at location:

(tds 2.3.4) lib/tds/tokens.ex:547: Tds.Tokens.decode_row_columns/3
(tds 2.3.4) lib/tds/tokens.ex:199: Tds.Tokens.decode_row/2
(tds 2.3.4) lib/tds/tokens.ex:65: Tds.Tokens.decode_tokens/2
(tds 2.3.4) lib/tds/tokens.ex:72: Tds.Tokens.decode_tokens/2
(tds 2.3.4) lib/tds/messages.ex:153: Tds.Messages.parse/3
(tds 2.3.4) lib/tds/protocol.ex:483: Tds.Protocol.decode/2
(tds 2.3.4) lib/tds/protocol.ex:544: Tds.Protocol.send_query/2
(tds 2.3.4) lib/tds/protocol.ex:175: Tds.Protocol.handle_execute/4
(db_connection 2.6.0) lib/db_connection/holder.ex:354: DBConnection.Holder.holder_apply/4
(db_connection 2.6.0) lib/db_connection.ex:1512: DBConnection.run_execute/5
(db_connection 2.6.0) lib/db_connection.ex:1607: DBConnection.run/6
(db_connection 2.6.0) lib/db_connection.ex:743: DBConnection.parsed_prepare_execute/5
(db_connection 2.6.0) lib/db_connection.ex:735: DBConnection.prepare_execute/4
(ecto_sql 3.11.1) lib/ecto/adapters/tds/connection.ex:52: Ecto.Adapters.Tds.Connection.execute/4
(ecto_sql 3.11.1) lib/ecto/adapters/sql.ex:996: Ecto.Adapters.SQL.execute!/5
(ecto_sql 3.11.1) lib/ecto/adapters/sql.ex:952: Ecto.Adapters.SQL.execute/6
(ecto 3.11.1) lib/ecto/repo/queryable.ex:232: Ecto.Repo.Queryable.execute/4
(ecto 3.11.1) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3
(elixir 1.14.3) src/elixir.erl:309: anonymous fn/4 in :elixir.eval_external_handler/1
(stdlib 4.2) erl_eval.erl:748: :erl_eval.do_apply/7

[debug] QUERY OK source="floats_test" db=46034.3ms decode=0.3ms queue=1.1ms idle=1936.2ms SELECT TOP(10000) f0.[id], f0.[float_eight], f0.[float_five], f0.[float_four], f0.[float_nine], f0.[float_one], f0.[float_seven], f0.[float_six], f0.[float_three], f0.[float_two], f0.[int_one], f0.[int_three], f0.[int_two], f0.[inserted_at], f0.[updated_at] FROM [floats_test] AS f0 [] ↳ anonymous fn/4 in :elixir.eval_external_handler/1, at: src/elixir.erl:309

Which seems to indicate that it spent 46 seconds in db, 2 seconds in idle, and almost no time in decode and queue.

If I’m reading this right Tds.Tokens.decode_row_columns eventually calls Tds.Types.decode_data – and this is where I get completely lost.

Is it worth considering the MSSQL product version? For example, are any of the Express limitations affecting this: Editions and supported features of SQL Server 2019 - SQL Server | Microsoft Learn

We’re connecting to an Azure hosted MSSQL DB for one of our systems using TDS, albeit quite small - I think it’s ~8000+ rows, but off the top of my head I’m not sure what kind of response times we’re seeing.

If you copy that exact query into another client do you get a faster response time? Does the database log anything? Finally, is there an explain equivalent for TDS?


I don’t know the TDS protocol well enough to be sure this is the issue, but one innocuous-seeming line in the type-conversion code is a performance black-hole:

tds/lib/tds/types.ex at f15b86871c41d004345d20130e0db215f03dc78a · elixir-ecto/tds · GitHub

Reattaching data to tail is potentially VERY expensive if tail is the next 10k rows!

I wrote a microbenchmark to demonstrate this:

defmodule Bench do
  def from_tds(<<>>), do: nil
  def from_tds(<<size::unsigned-8, data::binary-size(size), tail::binary>>) do
    data = data <> tail
    <<val::little-float-size(64), _::binary>> = data
    {val, tail}
  end

  def shorter_version(<<>>), do: nil
  def shorter_version(<<size::unsigned-8, data::binary-size(size), tail::binary>>) do
    <<val::little-float-size(64)>> = data
    {val, tail}
  end

  def big_binary(size) do
    <<8, 0, 0, 0, 0, 0, 0, 0, 0>>
    |> List.duplicate(size)
    |> Enum.join()
  end

  def run_example(size, fun) do
    big_binary(size)
    |> Stream.unfold(fun)
    |> Stream.run()
  end
end

[100, 1000, 10_000, 100_000, 1_000_000]
|> Enum.each(fn size ->
  {from_tds_time, _} = :timer.tc(&Bench.run_example/2, [size, &Bench.from_tds/1])
  {shorter_version_time, _} = :timer.tc(&Bench.run_example/2, [size, &Bench.shorter_version/1])

  IO.puts("#{size}:\t#{from_tds_time}\t#{shorter_version_time}\t#{from_tds_time / shorter_version_time}")
end)

Here are the results from my local machine (Elixir 1.13, 2015 MBP):

100:		6250		22		284.09090909090907
1000:		1583		147		10.768707482993197
10000:		28594		1791	15.965382467895031
100000:		37877514	28739	1317.983019590104
1000000:	(I got bored)

That seems to echo the way you’re seeing query times scale with size.
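To put numbers on it: if decoding each value re-copies the remaining tail (as the `data <> tail` line does), total bytes copied grow quadratically with the value count. A quick back-of-envelope model, assuming the 9-bytes-per-value layout from the microbenchmark above:

```elixir
defmodule CopyCost do
  # 1 length byte + 8 bytes of float payload per value,
  # matching big_binary/1 above (assumption)
  @value_bytes 9

  # Decoding value i re-copies the (n - 1 - i) values still in the
  # tail, so total copied bytes are @value_bytes * n * (n - 1) / 2
  def total_copied(n), do: Enum.sum(0..(n - 1)) * @value_bytes
end

CopyCost.total_copied(1_000)
# => 4_495_500 (about 4.5 MB copied for 1_000 values)

CopyCost.total_copied(16_000)
# => 1_151_928_000 (over 1 GB copied for 16_000 values)
```

Which lines up with small result sets being fine and 16_000 rows being catastrophic.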


As to fixing the issue, I don’t know the TDS protocol at all. Are there situations where length (from the column metadata) and size (from the binary stream) would disagree?


That’s a good point.

When I profile the following Ecto query on the sample project I posted (thanks @mjaric for showing me how):

from(f in FloatsTest, select: [f.float_one,f.float_two,f.float_three,f.float_four,f.float_five,f.float_six,f.float_seven,f.float_eight,f.float_nine,f.int_one,f.int_two,f.int_three], limit: 10000) |> Repo.all()

I get the following profiling results with this mix command:

mix profile.fprof -e SlowGoing.Run.fetch_all

                                                                  CNT    ACC (ms)    OWN (ms)     
Total                                                          1322773   52977.343   51457.080     
:fprof.apply_start_stop/4                                            0   52977.343       0.036     
anonymous fn/0 in :elixir_compiler_1.__FILE__/1                      1   52977.307       0.011     
SlowGoing.Run.fetch_all/0                                            1   52977.296       0.029     
SlowGoing.Repo.all/1                                                 1   52977.256       0.004     
SlowGoing.Repo.all/2                                                 1   52977.252       0.023     
Ecto.Repo.Queryable.all/3                                            1   52977.049       0.022     
Ecto.Repo.Queryable.execute/4                                        1   52976.993       0.057     
Ecto.Adapters.Tds.execute/5                                          1   47053.187       0.007     
Ecto.Adapters.SQL.execute/6                                          1   47053.180       0.016     
Ecto.Adapters.SQL.execute!/5                                         1   47053.159       0.012     
Ecto.Adapters.SQL.sql_call/5                                         1   47053.147       0.030     
Ecto.Adapters.Tds.Connection.execute/4                               1   47053.085       0.028     
DBConnection.prepare_execute/4                                       1   47052.907       0.021     
DBConnection.parsed_prepare_execute/5                                1   47046.775       0.014     
DBConnection.run/6                                                   1   46895.986       0.020     
DBConnection.run_prepare_execute/5                                   1   46895.485       0.016     
DBConnection.Holder.handle/4                                         2   46895.321       0.013     
DBConnection.Holder.handle_or_cleanup/5                              2   46895.308       0.036     
DBConnection.Holder.holder_apply/4                                   2   46895.248       0.041     
Tds.Protocol.msg_send/2                                              3   46894.854       0.122     
DBConnection.run_execute/5                                           1   46890.259       0.014     
Tds.Protocol.handle_execute/4                                        1   46890.163       0.039     
Tds.Protocol.send_query/2                                            1   46889.206       0.010     
Tds.Protocol.decode/2                                                2   46870.585       0.036     
Tds.Messages.parse/3                                                 2   46870.453       0.048     
Tds.Tokens.decode_tokens/1                                           2   46715.996       0.010     
Tds.Tokens.decode_tokens/2                                       10009   46715.986     157.969     
Tds.Tokens.decode_row/2                                          10000   46551.343      90.867     
Tds.Tokens.decode_row_columns/2                                  10000   46458.639      54.273     
Tds.Tokens.decode_row_columns/3                                 130000   46403.798    1857.341     
Tds.Tokens.decode_row_column/2                                  120000   43247.636     717.061     
Tds.Types.decode_data/2                                         120001   42486.278   30996.609     
:garbage_collect                                                 68406   10206.122   10206.122     
:lists.member/2                                                  90079    8701.587    1029.836     
Enum.map/2                                                           5    5922.734       0.032     
Enum."-map/2-lists^map/1-0-"/2                                   20005    5922.702     287.541     
Ecto.Repo.Preloader.query/6                                          1    5685.277       0.010     
anonymous fn/3 in Ecto.Repo.Queryable.postprocessor/4            10000    5532.761      92.117     
Ecto.Repo.Queryable.process/4                                   130000    5438.703    1159.499     
Ecto.Repo.Queryable.process_args/4                               10000    5381.206      56.798     
Enum.map_reduce/3                                                10007    5323.765      56.391     
Enum."-map_reduce/3-lists^mapfoldl/2-0-"/3                      130007    5266.946    1753.240     
anonymous fn/4 in Ecto.Repo.Queryable.process_args/4            120000    3471.793     672.895     
Ecto.Type.adapter_load/3                                        120000    1663.026    1108.097     
:suspend                                                         10812    1520.263       0.000     
Enum.reverse/1                                                   10010    1252.745     105.045     
:lists.reverse/2                                                 10019    1142.267     128.622     
Ecto.Type.of_base_type?/2                                       120000     514.021     505.688     
Ecto.Repo.Assoc.query/4                                              1     237.447       0.007     
Enum.reduce/3                                                        9     155.168       0.056     
Enum."-reduce/3-lists^foldl/2-0-"/3                              10035     155.134     106.856     
DBConnection.decode/4                                                1     150.775       0.016     
DBConnection.Query.decode/3                                          1     150.743       0.011     
DBConnection.Query.Tds.Query.decode/3                                1     150.718       0.019     
DBConnection.Query.Tds.Query.do_decode/3                         10001     150.680     102.344     
anonymous fn/3 in Ecto.Repo.Queryable.preprocessor/3             10000      95.536      53.035     
anonymous fn/1 in DBConnection.Query.Tds.Query.decode/3          10000      46.726      40.455     
anonymous fn/2 in Tds.Messages.parse/3                           10002      45.736      44.056     
Ecto.Repo.Queryable.preprocess/4                                 10000      40.233      40.233     
Tds.Protocol.msg_recv/1                                              2      20.777       0.045     
Tds.Protocol.msg_recv/2                                            102      18.742       2.853     
:gen_tcp.recv/2                                                    102      14.366       1.225     
:inet_tcp.recv/2                                                   102      11.560       0.580     
:prim_inet.recv/2                                                  102      10.980       0.566     
:prim_inet.recv0/3                                                 102      10.414       1.052     
:prim_inet.async_recv/3                                            102       7.556       1.456

... the rest of the profiler output is available at (https://github.com/janhendrik-rust/slow_going/wiki/Result-of-selecting-10000-Rows-by-12-columns) ...

Most of the time is spent in Tds.Types.decode_data/2, probably as a result of the issue @al2o3cr outlined above,

and a fair amount of time is spent in GC.

This is also related to the fact that decode_data/2 is called once per column per row in the result set; in this example decode_data/2 gets called 120_000 times (12 columns x 10_000 rows), so the problem gets progressively worse as you add columns.


Can you perform a similar experiment but only selecting integer columns? The slow path in decode_data is only for floats, so it shouldn’t be encountered.

That is exactly right. When I profile an integers-only query

from(f in FloatsTest, select: [f.int_one, f.int_two, f.int_three, f.int_four, f.int_five, f.int_six, f.int_seven, f.int_eight, f.int_nine], limit: 10000) |> Repo.all()

the query returns in 7 seconds instead of 52 seconds:

                                                                   CNT    ACC (ms)    OWN (ms)     
Total                                                           913951    7469.975    7287.849     
:fprof.apply_start_stop/4                                            0    7469.975       0.032     
anonymous fn/0 in :elixir_compiler_1.__FILE__/1                      1    7469.943       0.008     
SlowGoing.Run.fetch_all_ints/0                                       1    7469.935       0.022     
SlowGoing.Repo.all/1                                                 1    7469.867       0.005     
SlowGoing.Repo.all/2                                                 1    7469.862       0.022     
Ecto.Repo.Queryable.all/3                                            1    7469.682       0.023     
Ecto.Repo.Queryable.execute/4                                        1    7469.627       0.065     
Enum.map/2                                                           5    4632.802       0.026     
Enum."-map/2-lists^map/1-0-"/2                                   20005    4632.776     286.250     
Ecto.Repo.Preloader.query/6                                          1    4388.941       0.012     
anonymous fn/3 in Ecto.Repo.Queryable.postprocessor/4            10000    4242.289      91.559     
Ecto.Repo.Queryable.process/4                                   100000    4148.128     881.984     
Ecto.Repo.Queryable.process_args/4                               10000    4091.856      56.506     
Enum.map_reduce/3                                                10007    4034.502      56.159     
Enum."-map_reduce/3-lists^mapfoldl/2-0-"/3                      100007    3977.243    1324.443     
Ecto.Adapters.Tds.execute/5                                          1    2835.755       0.006     
Ecto.Adapters.SQL.execute/6                                          1    2835.749       0.014     
Ecto.Adapters.SQL.execute!/5                                         1    2835.730       0.009     
Ecto.Adapters.SQL.sql_call/5                                         1    2835.721       0.031     
Ecto.Adapters.Tds.Connection.execute/4                               1    2835.655       0.024     
DBConnection.prepare_execute/4                                       1    2835.503       0.024     
DBConnection.parsed_prepare_execute/5                                1    2830.201       0.017     
DBConnection.run/6                                                   1    2690.386       0.023     
DBConnection.run_prepare_execute/5                                   1    2689.441       0.016     
DBConnection.Holder.handle/4                                         2    2689.278       0.013     
DBConnection.Holder.handle_or_cleanup/5                              2    2689.265       0.029     
DBConnection.Holder.holder_apply/4                                   2    2689.215       0.038     
Tds.Protocol.msg_send/2                                              3    2687.875       0.136     
DBConnection.run_execute/5                                           1    2685.452       0.014     
Tds.Protocol.handle_execute/4                                        1    2684.381       0.037     
Tds.Protocol.send_query/2                                            1    2682.710       0.008     
Tds.Protocol.decode/2                                                3    2671.407       0.046     
Tds.Messages.parse/3                                                 3    2671.237       0.071     
anonymous fn/4 in Ecto.Repo.Queryable.process_args/4             90000    2602.208     503.440     
Tds.Tokens.decode_tokens/1                                           3    2526.041       0.016     
Tds.Tokens.decode_tokens/2                                       10012    2526.025     141.649     
Tds.Tokens.decode_row/2                                          10000    2379.666      92.614     
Tds.Tokens.decode_row_columns/2                                  10000    2282.176      55.985     
Tds.Tokens.decode_row_columns/3                                 100000    2224.965    1056.610     
Ecto.Type.adapter_load/3                                         90000    1229.861     819.005     
Tds.Tokens.decode_row_column/2                                   90000    1004.601     505.789     
Tds.Types.decode_data/2                                          90001     465.631     404.840     
Ecto.Type.of_base_type?/2                                        90000     385.967     378.347     
Ecto.Repo.Assoc.query/4                                              1     243.853       0.006     
:suspend                                                           728     182.126       0.000     
Enum.reduce/3                                                       10     145.852       0.055     
Enum."-reduce/3-lists^foldl/2-0-"/3                              10038     145.817     100.787     
Enum.reverse/1                                                   10012     141.788      93.154     
DBConnection.decode/4                                                1     139.798       0.019     
DBConnection.Query.decode/3                                          1     139.760       0.014     
DBConnection.Query.Tds.Query.decode/3                                1     139.729       0.023     
DBConnection.Query.Tds.Query.do_decode/3                         10001     139.685      97.943     
anonymous fn/3 in Ecto.Repo.Queryable.preprocessor/3             10000      95.830      54.176     
:garbage_collect                                                   340      95.473      95.473     
:lists.reverse/2                                                 10019      45.129      43.750     
anonymous fn/2 in Tds.Messages.parse/3                           10004      43.369      40.733     
Ecto.Repo.Queryable.preprocess/4                                 10000      41.153      40.356     
anonymous fn/1 in DBConnection.Query.Tds.Query.decode/3          10000      40.575      39.629     
Tds.Protocol.msg_recv/1                                              3      13.435       0.071     
Tds.Protocol.msg_recv/2                                             76      11.236       1.577     
:gen_tcp.recv/2                                                     76      10.032       0.814     
:inet_tcp.recv/2                                                    76       8.064       0.407     
:prim_inet.recv/2                                                   76       7.657       0.385     
:prim_inet.recv0/3                                                  76       7.272       0.714     
DBConnection.log/4                                                   1       5.217       0.006     
DBConnection.log/5                                                   1       5.211       0.022     
DBConnection.log/2                                                   1       5.044       0.008

... the rest of the profiler output is available at (https://github.com/janhendrik-rust/slow_going/wiki/Result-of-selecting-10000-Rows-by-9-columns-of-ints) ...


So it is definitely the floating-point code path that has the performance problem.


The issue is at line 508 in the Tds.Types module; just comment out the concatenation and it should fly :wink:

        data_type_code == @tds_data_type_floatn ->
          data = data <> tail 
          len = length * 8
          <<val::little-float-size(len), _::binary>> = data
          val

should be

        data_type_code == @tds_data_type_floatn ->
          len = length * 8
          <<val::little-float-size(len)>> = data
          val
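For anyone applying this by hand, a self-contained sketch showing the two variants decode the same value (module and function names are mine, not from Tds.Types; the 64-bit float case is assumed):

```elixir
defmodule FloatDecodeDemo do
  # Old path: glues the rest of the packet back onto `data` before
  # matching, copying the whole tail for every float decoded
  def decode_old(data, tail) do
    data = data <> tail
    <<val::little-float-size(64), _::binary>> = data
    val
  end

  # Fixed path: `data` already holds exactly the float's bytes,
  # so no concatenation is needed
  def decode_new(data) do
    <<val::little-float-size(64)>> = data
    val
  end
end

data = <<3.14::little-float-size(64)>>
tail = :binary.copy(<<0>>, 10_000)

FloatDecodeDemo.decode_old(data, tail) == FloatDecodeDemo.decode_new(data)
# => true (same result, without the per-value tail copy)
```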