How can I prevent duplicate inserts on duplicate calls to an endpoint

Hi,

I have an api endpoint that receives JSON posts from a 3-rd party.

In the controller I check for the presence of a key_field in the JSON. If the fields is not nil, I do a Repo.get_by(DataStructure, key_field: key_field). If I get a nil I do an insert, if I get a record, I do an update.

The trouble is that sometimes, the third party sends two posts with the same data in quick succession and this causes the same data to be inserted twice. This then leads to Ecto.MultipleResultsError on subsequent calls for the same key.

How can I prevent the duplicate insert.

How about using upsert?

Hi,

Try to set on_conflict option to :nothing or whatever is appropriate for your use case.
Read more here -> https://hexdocs.pm/ecto/Ecto.Repo.html#c:insert/2-upserts

Trouble is that the column is not unique. There are multiple rows where it can be null.

Is there a constrain that I can put on that column that would make it allow null but only unique values for non-null? (in that case I imagine that on_conflict would work.

How do you know if you should update or insert right now?

I use Repo.get_by(DataStructure, key_field: key_field) . If it returns nil this means that the key_field value is not in the database and I do an insert, otherwise update.

The main problem is cause by another call that happens in parallel and happens before the Repo.insert has the chance to finish.

The way I’m currently able to reproduce this is using a helper like this:

  defp post_to_endpoint() do
    body = %{
      "first_name" => "John",
      "last_name" => "Doe", 
      "key_field" => "123456" 
    }

    url = "#{Application.get_env(:my_app, :endpoint_url)}/api/endpoint"

    HTTPoison.post(url, Jason.encode!(body), ["Content-Type": "application/json"])

    Logger.debug("posted #{url}")
  end

  def test_endpoint() do
    Task.start(fn -> post_to_endpoint() end)
    Task.start(fn -> post_to_endpoint() end)
  end

This approximates the concurrent calls I receive from the third party server.

Given this constraint key_field should be unique. Because if a value exists you’ll update it.

As I said above, I don’t know how to make the column unique and allow multiple rows with null .

Is there any other way?

A unique constraint does allow multiple NULLs because in SQL NULL is equal to nothing, not even another NULL.

1 Like

uniqueness does allow for multiple rows with NULL. In sql NULL != NULL, so those rows won’t be seen as an uniqueness violation.

2 Likes

I did not know that NULL != NULL. Thanks! I will try this approach.