Suggestions for logging RAM high-water mark while benchmarking

I’m trying to performance-tune some big data processing code and am looking for a good way to monitor the high-water mark of RAM usage while running a function. I’ve been using the ExProf library to understand running times in different parts of my code, but this doesn’t tell me how much RAM is getting used.

If I’m not careful, running this code at scale tends to overwhelm the available RAM of my machine, at which point the BEAM seems to unceremoniously reboot. I’m trying to understand the maximum amount of RAM my code consumes while running, so that I can choose an appropriate VM size.

Any suggestions on how to do this monitoring?

The easiest way would be to use memsup from the os_mon application; it seems to cover your specific use case. From its documentation:

Periodically performs a memory check:

If more than a certain amount of available system memory is allocated, as reported by the underlying operating system, the alarm {system_memory_high_watermark, []} is set.

If any Erlang process Pid in the system has allocated more than a certain amount of total system memory, the alarm {process_memory_high_watermark, Pid} is set.
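As a rough sketch of what querying memsup directly looks like from Elixir (assuming os_mon is available in your runtime and has not been started yet; the variable names are just for illustration):

```elixir
# os_mon ships with Erlang/OTP but is not started by default.
{:ok, _started} = Application.ensure_all_started(:os_mon)

# System-wide figures as reported by the OS, returned as a keyword list.
# Which keys are present depends on the platform.
system = :memsup.get_system_memory_data()
IO.inspect(system, label: "system memory")

# {total_bytes, allocated_bytes, worst}, where worst is {pid, bytes} for the
# Erlang process using the most memory, or :undefined.
{total, allocated, worst} = :memsup.get_memory_data()
IO.inspect({total, allocated, worst}, label: "memsup summary")

# Threshold (integer percent) above which system_memory_high_watermark is set.
watermark = :memsup.get_sysmem_high_watermark()
IO.inspect(watermark, label: "sysmem high watermark (%)")

# Alarms currently set, as seen by the SASL alarm handler.
alarms = :alarm_handler.get_alarms()
IO.inspect(alarms, label: "active alarms")
```

Note that these numbers describe the whole machine, not just your function, so anything else running on the box will show up in them too.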

Awesome, thank you! Is the snippet below kind of the right idea for how to use it?

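defmodule TestErlangMemSup do
  # Placeholder for the workload being measured.
  def long_running_function() do
    File.read!("/my_huge_json_file.json")
    |> Jason.decode!()
  end

  # Runs the workload in a separate process and samples system memory
  # every `interval` milliseconds until that process exits.
  def start_and_monitor(interval \\ 1000) do
    # :memsup is part of :os_mon, which must be running before it is called.
    {:ok, _} = Application.ensure_all_started(:os_mon)
    pid = spawn(fn -> long_running_function() end)
    log_and_repeat(pid, [], interval)
  end

  defp log_and_repeat(pid, log, interval) do
    if Process.alive?(pid) do
      log = [:memsup.get_system_memory_data() | log]
      Process.sleep(interval)
      log_and_repeat(pid, log, interval)
    else
      # Samples were prepended, so reverse to get chronological order.
      Enum.reverse(log)
    end
  end
end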

(I’m just trying to run this interactively in a Livebook for now.) Maybe there’s a different way that people typically use :memsup; this is just my first crack at it :smile:
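For comparison, here is a minimal sketch of a variant that tracks only the peak rather than keeping every sample, and that reads the BEAM's own allocation via :erlang.memory(:total) instead of system-wide memory. The module name PeakMemory and the function measure/2 are hypothetical, not part of any library. The figure it reports is what the emulator has allocated, which is not necessarily the same as what the OS reports for the process, and spikes shorter than the polling interval can be missed.

```elixir
defmodule PeakMemory do
  # Hypothetical helper: runs `fun` in a Task and polls :erlang.memory(:total)
  # roughly every `interval_ms` milliseconds until the task finishes.
  # Returns {result, peak_bytes}.
  def measure(fun, interval_ms \\ 100) when is_function(fun, 0) do
    task = Task.async(fun)
    poll(task, interval_ms, :erlang.memory(:total))
  end

  defp poll(task, interval_ms, peak) do
    # Task.yield/2 waits up to interval_ms for the reply, so it doubles as
    # the sleep between samples.
    case Task.yield(task, interval_ms) do
      {:ok, result} ->
        {result, max(peak, :erlang.memory(:total))}

      {:exit, reason} ->
        # Only reachable when the caller traps exits.
        exit(reason)

      nil ->
        poll(task, interval_ms, max(peak, :erlang.memory(:total)))
    end
  end
end

# Usage: wrap the function being benchmarked.
{_list, peak_bytes} = PeakMemory.measure(fn -> Enum.to_list(1..1_000_000) end, 10)
IO.puts("peak BEAM memory: #{Float.round(peak_bytes / 1_048_576, 1)} MiB")
```

Because the task is linked to the caller, a crash in the measured function also takes down the calling process, which in a Livebook cell shows up as the cell failing.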