How can we maintain persistent Python processes within Elixir using FLAME and Ports to avoid reloading AI models for each inference task?

We have integrated our Elixir and Python codebases. Initially, the Python code ran on a separate server, exposing AI model inference endpoints via API. Now, we aim to unify the codebases, running Python alongside Elixir and enabling internal communication.

We plan to use FLAME to offload CPU-intensive tasks like loading AI models and running inferences to worker pools, avoiding overloading the main server. We’ll use Elixir Ports for communication between Elixir and Python processes.

Our challenge is keeping Python processes alive to avoid reloading AI models. How can this be achieved?

defp do_something(...) do
  FLAME.call(PoolModule, fn ->
    # Run some logic
    ...

    # Prepare payload
    payload = ...

    # Decide which function to call in the Python code
    # (these were previously separate API endpoints)
    function_name = ...

    port = Port.open({:spawn, "<path to python file here>"}, [:binary, :exit_status])

    send(
      port,
      {self(), {:command, Jason.encode!(%{function: function_name, data: payload}) <> "\n"}}
    )

    receive do
      {^port, {:data, result}} ->
        case Jason.decode(result) do
          {:ok, response} -> {:ok, response}
          {:error, error} -> {:error, error}
        end

      {^port, {:exit_status, status}} when status != 0 ->
        {:error, "Python script exited with status #{status}"}
    end
  end)
end

You most likely want to start the process managing the port in your Application.start callback; then your FLAME calls simply interact with the already running port and utilize max_concurrency across calls. Your other option is FLAME.place_child, but it sounds like you want to start this conditionally in app start (you can use FLAME.Parent.get() to check whether you're running as a FLAME child or not).
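To expand on that, here is a minimal sketch of that idea. The module name, script-path option, line-length limit, and newline-delimited JSON protocol are my own assumptions, not from your post; adapt them to your setup. The key point is that the port is opened once in init/1, so the Python process (and the model it loads at startup) stays alive across calls:

```elixir
defmodule MyApp.PythonWorker do
  # Hypothetical module owning one long-lived port to the Python process.
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Call this from inside FLAME.call/2 instead of opening a new port per request.
  def call(function_name, payload, timeout \\ 30_000) do
    GenServer.call(__MODULE__, {:call, function_name, payload}, timeout)
  end

  @impl true
  def init(opts) do
    # The Python script is expected to load the model once at startup and then
    # loop, reading newline-delimited JSON requests from stdin.
    port =
      Port.open({:spawn, Keyword.fetch!(opts, :script_path)}, [
        :binary,
        :exit_status,
        # Deliver stdout line by line; the max line length here is arbitrary.
        {:line, 1_048_576}
      ])

    {:ok, %{port: port}}
  end

  @impl true
  def handle_call({:call, function_name, payload}, _from, %{port: port} = state) do
    Port.command(port, Jason.encode!(%{function: function_name, data: payload}) <> "\n")

    receive do
      {^port, {:data, {:eol, line}}} ->
        {:reply, Jason.decode(line), state}

      {^port, {:exit_status, status}} ->
        {:stop, :port_exited, {:error, "Python script exited with status #{status}"}, state}
    end
  end
end
```

In your Application.start/2 children you would start this worker only when running inside a FLAME runner, e.g. by branching on FLAME.Parent.get(): it returns a parent struct on a runner and nil on the main node, so the main node never loads the model.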
