How do you profile an entire application

There are plenty of guides that show how to profile a specific function, but I can’t figure out how to profile an entire application (run with --no-halt, if that changes anything). Any pointers?


What are you trying to measure? What are you hoping to be able to identify?

There are so many potential things “profile an application” can mean, and even more possible reasons to do those things … so it is a bit hard (at least for me …) to answer this usefully without understanding a bit more about what you need to accomplish …

If you have a Phoenix web app, using something like AppSignal (or Scout) could be a good starting point. I’m personally using AppSignal and if you follow the basic setup, you get data on DB queries, Phoenix Controllers and Views out of the box.

I’m looking to profile both CPU and memory; the exact details vary based on the tools. I see tools like ExProf which seem to do what I want (for CPU, anyway), but all the examples seem to be profiling a single function. It might be trivial, but I’m not sure which function to wrap the profile macro around for a whole app. From what I can tell, an application’s exposed entry point is its start function, but that returns as soon as the app is started.
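For reference, every example I can find looks something like this, a single block wrapped in ExProf’s profile macro (I’m sketching this from memory of ExProf’s README; the module name and the body are made up):

defmodule ProfDemo do
  import ExProf.Macro

  # profiles only whatever runs inside this block, not the running app
  def analyze_one_function do
    profile do
      1..100_000 |> Enum.to_list() |> Enum.sort()
    end
  end
end

That’s fine for a hot function you already know about, but it doesn’t tell me where a long-running application spends its time.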

For comparison, if this were Go, I’d stick:

// assumes "os", "runtime/pprof", and "time" are imported
f, _ := os.Create("cpu.prof") // error handling skipped for brevity
pprof.StartCPUProfile(f)
go func() {
   // sample for five minutes, then stop and flush the profile
   time.Sleep(time.Minute * 5)
   pprof.StopCPUProfile()
}()

At the start of my application, and I’d get a useful profile file. Doing memory would be equally simple.

FWIW, I think “profiling an elixir app” is pretty specific. If you google “profiling a go app” or “profiling a java app”, the top hits are germane and helpful. I’m not trying to split hairs, and I appreciate the help, but so far, to me, this seems like an area where either the tooling or the documentation is lacking. “The Road to 2 Million Websocket Connections in Phoenix” illustrated this well: it explains the steps that were taken to improve things (like converting a bag to a duplicate_bag), but not how those steps were identified as bottlenecks.

I think there’s also a bit of a culture difference: if you search for “elixir metrics” instead, you’ll find quite a number of resources; for “elixir profiling”, not nearly as much. This probably reflects some aspects of the culture around Elixir, and the cultural heritage that has gone into it. Aaaanyways …

Here’s a decent tool that looks like it may provide what you are after: Exometer, along with an Elixir wrapper for it (which I haven’t yet tried myself … so ymmv?). There are a good number of blog entries out there that talk about using these tools from Elixir.
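Just to give a flavour of the level Exometer works at: the underlying Erlang API (exometer_core) can be called straight from Elixir, roughly like this (the metric name is made up):

# define a counter, bump it, read it back
:exometer.new([:myapp, :requests], :counter)
:exometer.update([:myapp, :requests], 1)
:exometer.get_value([:myapp, :requests])
# => {:ok, [value: 1, ms_since_reset: _]} or similar

Presumably the Elixir wrapper mostly puts a friendlier face on calls like these.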

There are several other instrumentation-based libraries for gathering and tracking metrics in Elixir. That includes a couple of third-party services which will host the resulting data and which have nice native Elixir libs, but there are also fully self-hosted solutions that a google search for “elixir metrics” will turn up :slight_smile:

Some random thoughts circling in my head about the whole “profiling has a common meaning in other languages / environments” point: wearing my C/C++ (or most-other-languages-I-use) hat, yeah, I would agree with that. “Profiling” has a fairly fixed meaning in those contexts.

Elixir’s, or rather the BEAM’s, inherent preemptive code execution (aka processes) and async message passing make a number of traditional profiling techniques and ideas (like “watch how much CPU my app uses, please”) much less clear cut. On most other platforms I’ve worked with, the question “hey, can you tell me how many concurrent code paths there are?” is relatively exotic; even in heavily multi-threaded applications I’ve written in C++, I haven’t seen anywhere near the level of dynamic behaviour that a typical Elixir application ends up exhibiting, which is what makes tracking / counting / monitoring processes “a thing” that is actually interesting.

The preemption can also have interesting effects on more basic measurement techniques in real-world environments. That can be controlled for in a test environment, but then you’re into synthetic micro-benchmarking, which can be useful but isn’t always what you need to demonstrate shippability: artifacts like message bottlenecks will only show up under load.
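To make the process-watching point a bit more concrete, here’s the kind of quick look you can take at a running node using nothing but the standard library (no app-specific names involved):

# how many processes are alive right now?
IO.inspect(length(Process.list()), label: "process count")

# which processes have messages piling up in their mailboxes?
Process.list()
|> Enum.map(fn pid -> {pid, Process.info(pid, :message_queue_len)} end)
|> Enum.filter(fn
  {_pid, {:message_queue_len, len}} -> len > 0
  {_pid, nil} -> false
end)
|> IO.inspect(label: "non-empty mailboxes")

Mailbox growth, in particular, is often the first hint of the message bottlenecks mentioned above.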


Think I figured it out.

Right at the top of my application’s start function, I added:

:fprof.trace([:start, verbose: true, procs: :all])

spawn fn ->
  :timer.sleep(10_000)
  :fprof.trace(:stop)
  :fprof.profile()
  :fprof.analyse(totals: false, dest: 'prof.analysis')
end

The procs: :all option makes it profile all processes. I stop the trace after 10 seconds. This generated a 1.5GB fprof.trace file (so, beware!). The calls to profile/analyse generate a very verbose prof.analysis file (37MB). This file can be explored, but it’s noisy.
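For context, here’s roughly where that sits in my application module (module names are just placeholders and the child list is left empty here):

defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    start_profiling()

    children = []
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # trace everything for 10 seconds, then profile and write the analysis to disk
  defp start_profiling do
    :fprof.trace([:start, verbose: true, procs: :all])

    spawn(fn ->
      :timer.sleep(10_000)
      :fprof.trace(:stop)
      :fprof.profile()
      :fprof.analyse(totals: false, dest: 'prof.analysis')
    end)
  end
end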

I used https://github.com/isacssouza/erlgrind to convert the file to callgrind format, which then lets you use that family of tools (like http://kcachegrind.sourceforge.net/html/Home.html).


Yep, fprof is quite comprehensive; it ships with Erlang, so it’s quite well supported (and has been around the block a few times), but … as you found … it produces some pretty gnarly output. I wasn’t actually aware of erlgrind, though, that’s awesome :slight_smile: Definitely will have to use that next time …

p.s. kcachegrind ftw … have used and loved that app for … gosh … I don’t know how many years now.
