Optimal Azure AI translator API call batching

Can someone give me ideas on how I could create optimal batches for Azure AI Translator calls?

The limits are a maximum of 50000 characters per batch and a maximum of 1000 strings per batch. Then there can be any number of destination languages, and each one multiplies the character count. So a string with 100 characters takes up the space of 200 characters if, for example, you use two destination languages. If you have one string with a length of 50000, you can only have one destination language. These destination languages are defined per batch, so for example if I had 6 destination languages I could split them into two batches of 3 languages each, and so on.

What exactly are you asking? If you have a string of 50_000 characters and need 2 languages, is it that your code should maybe split it in two and then call the API, or what?

How to make optimal batches with a limit of 50000 characters/letters and 1000 strings per batch.
Making it optimal with one language wouldn’t be that bad, but there can be multiple languages, and every language you add multiplies the length of every string added to the batch. And because of the 50000 character limit, you can’t, for example, put a 30000-character string into a batch with two languages, because that would go over the 50000 total character limit per batch, as that one string is counted as a 60000-character string.
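
In other words (a minimal sketch of the counting rule, using that 30000-character example):

# Effective size of one batch = total characters of its strings
# multiplied by the number of destination languages of that call.
string_lengths = [30_000]            # one 30_000-character string
destination_languages = ~w(de fr)    # two target languages
effective_size = Enum.sum(string_lengths) * length(destination_languages)
# => 60_000, over the 50_000 per-batch limit
fits? = effective_size <= 50_000 and length(string_lengths) <= 1_000
# => false, so this batch is not allowed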

In some situations we have over 10 destination languages, but the number of destination languages depends on the situation and can be pretty much anything from 1 to 20.

I tried to think about how I could make optimal batches but my brain started to hurt badly, so I came here to ask for help :wink:

Edit:

So for example if I have 10 destination languages, 2500 strings, and those strings can be any length under 50000: how do I create optimal batches, meaning the minimum number of API calls, when the limit is 50000 total characters/letters and 1000 strings per batch, and every string’s character count is multiplied by the number of destination languages used per API call (batch)?

Yep, I get that, but what’s still missing from your question is: how would you do it manually?

Would you:

  • Split the 30_000 string into 25_000 + 5_000 and do two requests, each with two languages requested for translation, or
  • Split the 30_000 string into 15_000 + 15_000 and do the same?

Maybe I am misunderstanding you but it looks like trivial division would do the trick?

Or are you saying that e.g. if you have 2 x 5_000 length strings then you can still batch some other strings inside the same request?

Strings can’t be split, because they go to the AI translator. I mean, if you split a string, how it’s translated would change. I’m saying that if I had, for example, a 5_000-character string, I could translate it into 10 languages in one request. But one API call can only have a specific set of destination languages, so you can’t define destination languages per string.

So for example if I have a string and 10 destination languages, I can either use a single API call with that string and all 10 destination languages to translate it into 10 languages, or I could do 10 API calls with a single destination language each. For example, if the string had a length of 25_000, I could only use two destination languages in one API call, because with two languages it would be counted as 50_000 characters, which is the API’s total character limit.
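
So for a single string, the cap per call is just integer division, roughly:

text_length = 25_000
# How many destination languages one call can carry for this single string:
max_languages_per_call = div(50_000, text_length)
# => 2 (and div(50_000, 5_000) would be 10, matching the examples above)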

Yes, I can batch multiple strings of any length into the same request, up to 1000 strings, but in total they can have up to 50_000 characters. And every new destination language you add multiplies the length of all strings against that 50_000 limit. So with one language every string’s length is counted normally against 50_000, with two languages every string’s length is doubled against that 50_000, and with three every string’s length is tripled against that 50_000 limit.
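
The naive way I can think of doing it is a greedy pass with one fixed language set for every batch, roughly like the sketch below. It keeps strings whole, assumes every string fits on its own (length × number of languages <= 50_000), and is surely not the minimum number of calls, which is exactly what I’d like to improve on:

defmodule GreedyBaseline do
  @char_limit 50_000
  @string_limit 1_000

  # strings: list of {key, length_in_characters} tuples
  # languages: one fixed list of destination language codes used by every batch
  def batches(strings, languages) do
    multiplier = length(languages)

    Enum.chunk_while(
      strings,
      {0, 0, []},
      fn {_key, len} = string, {chars, count, batch} ->
        cost = len * multiplier

        if chars + cost <= @char_limit and count < @string_limit do
          # Still room in the current batch: keep accumulating.
          {:cont, {chars + cost, count + 1, [string | batch]}}
        else
          # Close the current batch and start a new one with this string.
          {:cont, Enum.reverse(batch), {cost, 1, [string]}}
        end
      end,
      fn
        {_chars, _count, []} -> {:cont, []}
        {_chars, _count, batch} -> {:cont, Enum.reverse(batch), []}
      end
    )
  end
end

With two languages, GreedyBaseline.batches([{"a", 10_000}, {"b", 20_000}], ~w(de fr)) puts the two strings into two separate batches, because 10_000 × 2 + 20_000 × 2 = 60_000 would exceed the limit.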

So maybe this?

defmodule AzureTranslate do
  def demo() do
    # {text key, text length in characters, destination languages};
    # lengths stand in for the actual texts here.
    data = [
      {"text 1", 16_000, ~w(en ja za ru)},
      {"text 2", 25_000, ~w(de es)},
      {"text 3", 10_000, ~w(bg ru de sp es za)}
    ]

    requests_for_multiple_texts(50_000, data)
  end

  # Plans the requests for every text independently, using the faster
  # reduce-based implementation for each one.
  def requests_for_multiple_texts(max_request_length, texts) do
    texts
    |> Enum.map(fn {text_key, text_length, languages} ->
      {text_key, requests_for_single_text_via_reduce(max_request_length, text_length, languages)}
    end)
  end

  # Variant 1: greedily packs as many destination languages as fit under the
  # character limit into each request, using Enum.chunk_while.
  def requests_for_single_text_via_chunk(max_request_length, text_length, languages) do
    languages
    |> Enum.chunk_while(
      {max_request_length, []},
      fn lang, {remaining_bytes, chunk} ->
        if remaining_bytes >= text_length do
          {:cont, {remaining_bytes - text_length, [{text_length, lang} | chunk]}}
        else
          {:cont, Enum.reverse(chunk), {max_request_length - text_length, [{text_length, lang}]}}
        end
      end,
      fn {_remaining_bytes, list} ->
        {:cont, Enum.reverse(list), {}}
      end
    )
  end

  # Variant 2: the same greedy packing, implemented with Enum.reduce
  # (benchmarks faster, see benchmark/0 below).
  def requests_for_single_text_via_reduce(max_request_length, text_length, languages) do
    {_remaining_bytes, chunk, final_result} =
      languages
      |> Enum.reduce(
        {max_request_length, [], []},
        fn lang, {remaining_bytes, chunk, final_result} ->
          if remaining_bytes >= text_length do
            {remaining_bytes - text_length, [{text_length, lang} | chunk], final_result}
          else
            {max_request_length - text_length, [{text_length, lang}],
             [Enum.reverse(chunk) | final_result]}
          end
        end
      )

    Enum.reverse([Enum.reverse(chunk) | final_result])
  end

  def benchmark() do
    Benchee.run(%{
      "Enum.chunk_while" => fn ->
        requests_for_single_text_via_chunk(50_000, 16_000, ~w(de fr es it pt))
      end,
      "Enum.reduce" => fn ->
        requests_for_single_text_via_reduce(50_000, 16_000, ~w(de fr es it pt))
      end
    })
  end
end

Included:

  • Two alternative implementations of an algorithm that takes a single text and spreads it across several requests;
  • One function that uses the faster of the two implementations (requests_for_multiple_texts);
  • Related to the above: run the benchmark function (make a small new Elixir project and just include benchee in it, or use Mix.install in a single .exs file, as in the snippet after this list) and you’ll see for yourself which of the two is faster. SPOILERS: it’s the one using Enum.reduce. I could probably make an even faster one but wasn’t in the mood to write one that uses only pure recursion and nothing else;
  • Demo data + demo function that, ahem, demonstrates its correctness with an example.
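
For reference, a minimal single-file sketch for running it (the file name and the bare :benchee dependency are just what I’d reach for, adjust as needed):

# run.exs
Mix.install([:benchee])

# ...paste the AzureTranslate module from above here...

AzureTranslate.demo() |> IO.inspect()
AzureTranslate.benchmark()

Run it with elixir run.exs.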

The output might be slightly cryptic, so let me explain it:

You get a list of tuples: the first element is the text key (or the text itself), the second one is a list of lists of tuples.

Each inner list of tuples represents a single request, which might have e.g. the same text with 5 languages. Each text here is represented by its length, not by the key / text itself. Got tired and didn’t want to bloat the functions further with one more piece of data. :person_shrugging:

The outer list that wraps those per-request lists is the list of requests that must be made for this single text to get fully translated.

Not sure if the code is good but it gets the job done. With the demo data above the result is:

[
  {"text 1", [[{16000, "en"}, {16000, "ja"}, {16000, "za"}], [{16000, "ru"}]]},
  {"text 2", [[{25000, "de"}, {25000, "es"}]]},
  {"text 3",
   [
     [{10000, "bg"}, {10000, "ru"}, {10000, "de"}, {10000, "sp"}, {10000, "es"}],
     [{10000, "za"}]
   ]}
]

…which means:

  • "text 1" gets to do 2 requests: one with 3 languages and one with 1 language;
  • "text 2" gets to do 1 request with 2 languages;
  • "text 3" gets to do 2 requests: one with 5 languages and one with 1 language.
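
And if it helps, consuming that structure to actually fire the requests could look roughly like this; fetch_text/1 and call_azure_translate/2 are hypothetical placeholders for looking up the real text by its key and doing one Translator request with a fixed set of destination languages:

plan = AzureTranslate.demo()

for {text_key, requests} <- plan,
    request <- requests do
  # Each request is a list of {length, language} tuples for the same text;
  # only the language codes are needed to build the actual API call.
  languages = Enum.map(request, fn {_length, lang} -> lang end)

  # Hypothetical helpers: fetch_text/1 returns the full text for the key,
  # call_azure_translate/2 makes one HTTP call with those destination languages.
  call_azure_translate(fetch_text(text_key), languages)
end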

Thank you!