Optimal Azure AI translator API call batching

Can someone give me ideas on how I could create optimal batches for Azure AI Translator calls?

The limits are a maximum of 50000 characters per batch and a maximum of 1000 strings per batch. Then there can be any number of destination languages, and each one multiplies the character count. So a string with 100 characters takes up the space of 200 characters if, for example, you use two destination languages. If you have one string with a length of 50000, you can only have one destination language. These destination languages are defined per batch, so for example if I had 6 destination languages I could split them into two batches of 3 languages each, and so on.

What exactly are you asking? If you have a string of 50_000 characters and need 2 languages, is it that your code should maybe split it in two and then call the API, or what?

How to make optimal batches with a limit of 50000 characters/letters and 1000 strings per batch.
Making it optimal with one language wouldn’t be that bad, but there can be multiple languages, and every language you add multiplies the length of every string added to the batch. And because of the 50000 character limit, you can’t, for example, put a 30000-character string into a batch with two languages, because that would go over the 50000 total character limit per batch, as that one string is counted as a 60000-character string.
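
In other words (a minimal sketch of the counting rule, using that 30000-character example):

# Effective size of one batch = total characters of its strings
# multiplied by the number of destination languages of that call.
string_lengths = [30_000]            # one 30_000-character string
destination_languages = ~w(de fr)    # two target languages
effective_size = Enum.sum(string_lengths) * length(destination_languages)
# => 60_000, over the 50_000 per-batch limit
fits? = effective_size <= 50_000 and length(string_lengths) <= 1_000
# => false, so this batch is not allowed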

In some situations we have over 10 destination languages, but the number of destination languages depends on the situation and can be pretty much anything from 1 to 20.

I tried to think about how I could make optimal batches but my brain started to hurt badly, so I came here to ask for help :wink:

Edit:

So for example if I have 10 destination languages, 2500 strings, and those strings can be any length under 50000: how do I create optimal batches, meaning the minimum number of API calls, when the limit is 50000 total characters/letters and 1000 strings per batch, and every string’s character count is multiplied by the number of destination languages used per API call (batch)?

Yep, I get that, but what’s still missing from your question is: how would you do it manually?

Would you:

  • Split the 30_000 string into 25_000 + 5_000 and do two requests, each with two languages requested for translation, or
  • Split the 30_000 string into 15_000 + 15_000 and do the same?

Maybe I am misunderstanding you but it looks like trivial division would do the trick?

Or are you saying that e.g. if you have 2 x 5_000 length strings then you can still batch some other strings inside the same request?

Strings can’t be split, because they go to the AI translator. I mean, if you split a string, how it’s translated would change. I’m saying that if I had, for example, a 5_000-character string, I could translate it into 10 languages in one request. But one API call can only have a specific set of destination languages, so you can’t define destination languages per string.

So for example if I have a string and 10 destination languages, I can either use a single API call with that string and all 10 destination languages to translate it into 10 languages, or I could do 10 API calls with a single destination language each. For example, if the string had a length of 25_000, I could only use two destination languages in one API call, because with two languages it would be counted as 50_000 characters, which is the API’s total character limit.
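
So for a single string, the cap per call is just integer division, roughly:

text_length = 25_000
# How many destination languages one call can carry for this single string:
max_languages_per_call = div(50_000, text_length)
# => 2 (and div(50_000, 5_000) would be 10, matching the examples above)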

Yes, I can batch multiple strings of any length into the same request, up to 1000 strings, but in total they can have up to 50_000 characters. And every new destination language you add multiplies the length of all strings against that 50_000 limit. So with one language every string’s length is counted normally against 50_000, with two languages every string’s length is doubled against that 50_000, and with three every string’s length is tripled against that 50_000 limit.
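
The naive way I can think of doing it is a greedy pass with one fixed language set for every batch, roughly like the sketch below. It keeps strings whole, assumes every string fits on its own (length × number of languages <= 50_000), and is surely not the minimum number of calls, which is exactly what I’d like to improve on:

defmodule GreedyBaseline do
  @char_limit 50_000
  @string_limit 1_000

  # strings: list of {key, length_in_characters} tuples
  # languages: one fixed list of destination language codes used by every batch
  def batches(strings, languages) do
    multiplier = length(languages)

    Enum.chunk_while(
      strings,
      {0, 0, []},
      fn {_key, len} = string, {chars, count, batch} ->
        cost = len * multiplier

        if chars + cost <= @char_limit and count < @string_limit do
          # Still room in the current batch: keep accumulating.
          {:cont, {chars + cost, count + 1, [string | batch]}}
        else
          # Close the current batch and start a new one with this string.
          {:cont, Enum.reverse(batch), {cost, 1, [string]}}
        end
      end,
      fn
        {_chars, _count, []} -> {:cont, []}
        {_chars, _count, batch} -> {:cont, Enum.reverse(batch), []}
      end
    )
  end
end

With two languages, GreedyBaseline.batches([{"a", 10_000}, {"b", 20_000}], ~w(de fr)) puts the two strings into two separate batches, because 10_000 × 2 + 20_000 × 2 = 60_000 would exceed the limit.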

So maybe this?

defmodule AzureTranslate do
  def demo() do
    # {text key, text length in characters, destination languages};
    # lengths stand in for the actual texts here.
    data = [
      {"text 1", 16_000, ~w(en ja za ru)},
      {"text 2", 25_000, ~w(de es)},
      {"text 3", 10_000, ~w(bg ru de sp es za)}
    ]

    requests_for_multiple_texts(50_000, data)
  end

  # Plans the requests for every text independently, using the faster
  # reduce-based implementation for each one.
  def requests_for_multiple_texts(max_request_length, texts) do
    texts
    |> Enum.map(fn {text_key, text_length, languages} ->
      {text_key, requests_for_single_text_via_reduce(max_request_length, text_length, languages)}
    end)
  end

  # Variant 1: greedily packs as many destination languages as fit under the
  # character limit into each request, using Enum.chunk_while.
  def requests_for_single_text_via_chunk(max_request_length, text_length, languages) do
    languages
    |> Enum.chunk_while(
      {max_request_length, []},
      fn lang, {remaining_bytes, chunk} ->
        if remaining_bytes >= text_length do
          {:cont, {remaining_bytes - text_length, [{text_length, lang} | chunk]}}
        else
          {:cont, Enum.reverse(chunk), {max_request_length - text_length, [{text_length, lang}]}}
        end
      end,
      fn {_remaining_bytes, list} ->
        {:cont, Enum.reverse(list), {}}
      end
    )
  end

  # Variant 2: the same greedy packing, implemented with Enum.reduce
  # (benchmarks faster, see benchmark/0 below).
  def requests_for_single_text_via_reduce(max_request_length, text_length, languages) do
    {_remaining_bytes, chunk, final_result} =
      languages
      |> Enum.reduce(
        {max_request_length, [], []},
        fn lang, {remaining_bytes, chunk, final_result} ->
          if remaining_bytes >= text_length do
            {remaining_bytes - text_length, [{text_length, lang} | chunk], final_result}
          else
            {max_request_length - text_length, [{text_length, lang}],
             [Enum.reverse(chunk) | final_result]}
          end
        end
      )

    Enum.reverse([Enum.reverse(chunk) | final_result])
  end

  def benchmark() do
    Benchee.run(%{
      "Enum.chunk_while" => fn ->
        requests_for_single_text_via_chunk(50_000, 16_000, ~w(de fr es it pt))
      end,
      "Enum.reduce" => fn ->
        requests_for_single_text_via_reduce(50_000, 16_000, ~w(de fr es it pt))
      end
    })
  end
end

Included:

  • Two alternative implementations of an algorithm that takes a single text and spreads it across several requests;
  • One function that uses the faster of the two implementations (requests_for_multiple_texts);
  • Related to the above: run the benchmark function (make a small new Elixir project and just include benchee in it, or use Mix.install in a single .exs file, as in the snippet after this list) and you’ll see for yourself which of the two is faster. SPOILERS: it’s the one using Enum.reduce. I could probably make an even faster one but wasn’t in the mood to write one that uses only pure recursion and nothing else;
  • Demo data + demo function that, ahem, demonstrates its correctness with an example.
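
For reference, a minimal single-file sketch for running it (the file name and the bare :benchee dependency are just what I’d reach for, adjust as needed):

# run.exs
Mix.install([:benchee])

# ...paste the AzureTranslate module from above here...

AzureTranslate.demo() |> IO.inspect()
AzureTranslate.benchmark()

Run it with elixir run.exs.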

The output might be slightly cryptic, so let me explain it:

You get a list of tuples: the first element is the text key (or the text itself), the second one is a list of lists of tuples.

Each inner list of tuples represents a single request, which might have e.g. the same text with 5 languages. Each text here is represented by its length, not by the key / text itself. Got tired and didn’t want to bloat the functions further with one more piece of data. :person_shrugging:

The outer list that wraps those per-request lists is the list of requests that must be made for this single text to get fully translated.

Not sure if the code is good but it gets the job done. With the demo data above the result is:

[
  {"text 1", [[{16000, "en"}, {16000, "ja"}, {16000, "za"}], [{16000, "ru"}]]},
  {"text 2", [[{25000, "de"}, {25000, "es"}]]},
  {"text 3",
   [
     [{10000, "bg"}, {10000, "ru"}, {10000, "de"}, {10000, "sp"}, {10000, "es"}],
     [{10000, "za"}]
   ]}
]

…which means:

  • "text 1" gets to do 2 requests: one with 3 languages and one with 1 language;
  • "text 2" gets to do 1 request with 2 languages;
  • "text 3" gets to do 2 requests: one with 5 languages and one with 1 language.
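
And if it helps, consuming that structure to actually fire the requests could look roughly like this; fetch_text/1 and call_azure_translate/2 are hypothetical placeholders for looking up the real text by its key and doing one Translator request with a fixed set of destination languages:

plan = AzureTranslate.demo()

for {text_key, requests} <- plan,
    request <- requests do
  # Each request is a list of {length, language} tuples for the same text;
  # only the language codes are needed to build the actual API call.
  languages = Enum.map(request, fn {_length, lang} -> lang end)

  # Hypothetical helpers: fetch_text/1 returns the full text for the key,
  # call_azure_translate/2 makes one HTTP call with those destination languages.
  call_azure_translate(fetch_text(text_key), languages)
end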

Thank you!