Ecto embedded_schemas really really slow (4sec for 150 DB Entries) and its not the DB

Hey Elixir Community,

We use Elixir in a couple of Microservices. One of them Stores Information about Reservations placed for different stores. Fetching all reservations for a store in a certain timeframe used to be really fast (with little reservations) but with 150 reservations returned the performance decreases to a steady 4 seconds… For a Team that prides themselves in time in service durations of on average less than 60ms, that is tough…
I took all day to narrow it down and it seems to me that the bottleneck is the result mapping from the Database to the final Elixir structs. The Postgres DB Queries are always below 10ms but what happens afterwards is a mystery to me…

Here some insights into the Schemas involved. The DB Schema for reservation events looks like:

  schema "reservation_events" do
    field(:reservation_id, :string)
    field(:status, :string)
    field(:start, :utc_datetime_usec)
    field(:store_id, :integer)
    field(:hash, :string)
    field(:customer, EncryptedCustomerReservation)
    embeds_many(:products, Product)
    embeds_many(:reco, Product)
    field(:confirmation_timestamp, :utc_datetime_usec)
    field(:confirmation_store_timestamp, :utc_datetime_usec)
    field(:locale, :string)
    field(:mandator_id, :string)
    timestamps()
  end

The EncryptedCustomerReservation implements the Ecto.Type behaviour and maps a Customer (Map consisting of Name, Email, Telephone, …) to a JSON String and encrypts that at rest (dump/1) and decrypts it at need/read (load/1). Always done in under 1ms.
The encrypt function called on dump for the Customer:

  def encrypt(reservation) do
    :crypto.crypto_one_time(
      :aes_256_ctr,
      @key,
      @init_vector,
      reservation |> Jason.encode!() |> Base.encode64() |> to_charlist,
      true
    )
    |> to_string
  end

Decrypting is simplyfied the reverse of the above.

The Product part is an embedded_schema (JSONB) but I also went the way to implement the Ecto.Type behaviour to get a better introspection. I implemented the load callback as follows:

def load(data) when is_map(data) do
    start = DateTime.utc_now
    data =
      for {key, val} <- data do
        {String.to_existing_atom(key), val}
      end

    {:ok, struct!(Product, data)}
  end

Probably not the most beautiful way but the performance stayed pretty much the same. Every Product load is again finished in under 1ms.

The Product schema:

  use Ecto.Schema
  embedded_schema do
    field :url
    field :image
    field :text
    field :price
    field :currency
    field :reasonCanceled
    field :status
  end

Sinde all reservations currently only have one product and all mappings from db to Elixir structs take less than 1ms for 160 entries this should never ever take 4 seconds. Probably not even 160ms.

Am I missing something obvious?
Has someone else encountered similar performance issues using Ecto and Postgres?

I am desperately searching for a solution as the speed of our Elixir services has always been incredible
:heart: elixir

I am grateful for every suggestion or workaround

I’m not sure about your original issue, but it’s worth noting that you generally shouldn’t use AES CTR mode with a reused key+IV pair. Doing so means that every encrypted value in your application is XORed against the same stream of values…

Here’s some things that I would try:

  • Make sure I have a reproducible function call,
  • Log the queries that are executed,
  • Run the exact same queries (and fetch the data) from the same machine using a different SQL client (e.g. psql),
  • Run the exact same queries (and fetch the data) from a different machine using a different SQL client (e.g. psql),
  • Query the same data without the encrypted fields and embeds (create a new schema e.g. ReservationEventSimple on top of the same table "reservation_events" but without the encrypted field and the embeds; use the same code path as much as possible),
  • Add the encrypted field and the embeds to see if they are the root cause.

Try to replicate the production setup as much as possible: run the same code paths, run the code on production or bring production data to local machine, measure each step. Also avoid using IEx to benchmark code as it skews the results.

1 Like

Thank you very much for your replies. I did pretty much what you recommended! I narrowed the issue down even further and it seems to be the mapping from JSONB Arrays from Postgres into Elixir datastructures by ecto (field(:reco, {:array, :map})). My Workaround is to query the DB directly without ecto and do the mapping myself.

result = Stores.Repo.query!(
        "select * from reservation_events where store_id = '#{store_id}' and timestamp '#{date.year}-#{date.month}-#{date.day}' >= start and timestamp '#{start_date.year}-#{start_date.month}-#{start_date.day}' <= start;"
result.rows
|> Enum.map(&map_db_to_output(result.columns, &1, store))
def map_db_to_output(row, columns, store) do
    Stream.zip(row, columns)
    |> Enum.into(%{})
    |> format_event_data!(store)
  end
def format_event_data!(
        %{
          "customer" => customer,
          "products" => products,
          "reco" => reco,
          "reservation_id" => reservation_id,
          "start" => start,
          "status" => status,
          "store_id" => store_id
        },
        city
      ) do
    %{
      "reservationId" => reservation_id,
      "status" => status,
      "start" => start,
      "storeId" => store_id,
      "storeName" => city,
      "customer" => Stores.Encryption.EncryptedCustomerReservation.decrypt(customer),
      "products" => products,
      "reco" => reco
    }
  end

Maybe not beautiful but it takes less than 100ms now (actually mostly under 30ms). To me this seems to be an issue in Ecto. At the moment I do not have the time to validate it in another, reproducible setup.

Ok, thank you, then I probably misused crypto_one_time. My intention was to have a static encryption key to encrypt and decrypt my data quickly but still have it stored in DB in an ‘unusable’ fashion in case of data loss.
How should :crypto.crypto_one_time be used?

@seb5law Would you mind opening an issue in Ecto so that somebody could look into this further?

We’re using Cloak for encrypting fields at rest. I think you might find it useful.

1 Like

Obviously… Thanks a lot!

1 Like
2 Likes