This simple code eats system memory and kills the VM - is there a way to protect against such memory leaks?

I accidentally wrote the module below. I thought the first ‘perform(url)’ function, when called, would simply call the second one. Instead, calling QueeJob.Url.perform("test") eats the entire system memory and kills the VM. I know the code is wrong, and the crash dump points to the garbage collector going crazy over it. This raised the question of how one can fight potential memory leaks in an app. We have supervisors that help us when something crashes, but is there a way to protect against memory leaks like this, which kill the entire VM?

defmodule QueeJob.Url do
  def perform(url) do
    perform({url, 0})
  end

  def perform({url, sleep}) do
    :timer.sleep(sleep)
    IO.puts("URL processed: " <> url)
  end
end
2 Likes

No, there is no way to prevent this kind of bug. If you need a highly reliable system, you need at least two machines anyway. The only good advice here is: don't write this sort of bug, and test thoroughly.

Oops… Does that mean you can crash the whole VM with properly forged parameters?

I could use Dialyzer (I don't like the syntax) or guard clauses, but there must be a way to limit recursion depth/memory per process or function?

1 Like

This code doesn’t look wrong to me, and I’m not getting the point. What’s wrong with it?
It froze my computer and I had to press the power button to shut it down.

Would somebody please explain it to me?

@pillaiindu: The first clause perform(url) always wins; the second one is never matched.

Adding a when is_binary(url) guard to the first clause, or simply swapping the two clauses, would fix the issue.
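
A minimal sketch of the guard variant, reusing the names from the original module:

defmodule QueeJob.Url do
  # Only matches when the argument is a plain binary URL.
  def perform(url) when is_binary(url) do
    perform({url, 0})
  end

  # Matches the {url, sleep} tuple built by the clause above.
  def perform({url, sleep}) do
    :timer.sleep(sleep)
    IO.puts("URL processed: " <> url)
  end
end

Now perform("test") only matches the first clause once, and the recursive call falls through to the tuple clause instead of looping forever.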

2 Likes

You can set a max_heap_size per process:
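
A hedged sketch of what that can look like; the 1_000_000-word limit is just an illustrative number, not a recommendation:

# Kill only this process (not the whole VM) once its heap exceeds the limit.
Process.flag(:max_heap_size, %{size: 1_000_000, kill: true, error_logger: true})

# Or set it at spawn time via the raw Erlang spawn options:
:erlang.spawn_opt(
  fn -> QueeJob.Url.perform("test") end,
  [{:max_heap_size, %{size: 1_000_000, kill: true, error_logger: true}}]
)

With kill: true the runaway process is terminated by the VM once its heap passes the limit, instead of the whole node running out of memory. There is also an emulator flag (+hmax, if I recall correctly) for setting a VM-wide default.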

I’m still not getting the point.

I think the first function, whenever called, will call the second function, and the second function will do its job.

Why will the first function always win, if it is calling the second function explicitly?

The first function takes one argument: url
The second function takes two arguments: url and sleep
And {url, sleep} is one argument, not two.

2 Likes

There is no such thing as “explicit call.” Both clauses have arity 1 and the former accepts literally everything.

2 Likes

perform(url) will accept perform({whatever, whatever_else}).

They compile as separate clauses of the same function.

1 Like

Yeah, now I got the point!

That made the point clear.

Nope, the second clause takes one argument: {url, sleep}.

The first function clause accepts any call with one parameter. The second function clause also expects one parameter: a tuple containing two elements. However, it’s never called, since the first clause always matches first.

A good rule of thumb is to always place more specific pattern matches above less specific clauses.
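
A minimal sketch of that reordering, keeping the original module's intent but putting the most specific clause first:

defmodule QueeJob.Url do
  # Most specific clause first: only matches a two-element tuple.
  def perform({url, sleep}) do
    :timer.sleep(sleep)
    IO.puts("URL processed: " <> url)
  end

  # Catch-all clause last: wraps a bare URL and recurses into the tuple clause.
  def perform(url) do
    perform({url, 0})
  end
end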

3 Likes

True… I missed that :smiley: That’s why this code is so tricky :wink:

2 Likes

He meant one argument (the tuple) with two values (the url and the sleep).

Edit: tuple, not struct

This is a tuple, not a struct. And it does not really matter how many elements it has.

I actually meant two arguments, but I’m glad you understand why this code does not work the way the person who wrote it would like it to work.

2 Likes

Well, the compiler does warn you that the second clause will never match, because the previous clause always matches.

I’d use a URL struct or a different function name.

A good idea is to start with the most specific match and put less specific matches after it (which is why the compiler warns you about that :).
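
A rough sketch of the struct idea; the struct and its field names here are only an illustration, not from the original thread:

defmodule QueeJob.Url do
  defstruct [:url, sleep: 0]

  # Dispatching on the struct means a bare binary can never be confused with a job.
  def perform(%__MODULE__{url: url, sleep: sleep}) do
    :timer.sleep(sleep)
    IO.puts("URL processed: " <> url)
  end

  def perform(url) when is_binary(url) do
    perform(%__MODULE__{url: url})
  end
end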

1 Like

I’ll give the two functions different arities instead of using a guard clause.
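
A sketch of that arity-based variant; perform/1 and perform/2 are then separate functions, so neither can shadow the other:

defmodule QueeJob.Url do
  # perform/1 delegates to perform/2 with a default sleep of 0.
  def perform(url), do: perform(url, 0)

  # perform/2 does the actual work.
  def perform(url, sleep) do
    :timer.sleep(sleep)
    IO.puts("URL processed: " <> url)
  end
end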

Thanks, something to explore.
But how would you enforce these memory constraints on the code above?
I’ve also found an interesting thread; not sure if it still applies, though:

From Java Monitor - The Latest Java News

There are no mechanisms in the Erlang VM to curb the growth of the memory. The VM will happily allocate so much memory that the system shoots into swap, or that the virtual memory is exhausted. These may cause the machine to become unresponsive even to KVM console access. In the past we have had to power cycle machines to get access to them again.

The queue-based programming model that makes Erlang so much fun to write code for is also its Achilles heel in production. Every queue in Erlang is unbounded. The VM will not throw exceptions or limit the number of messages in a queue. Sometimes a process stops processing due to a bug, or a process fails to keep up with the flow of messages being sent to it. In that case, Erlang will simply allow the queue for that process to grow until either the VM is killed or the machine locks up, whichever comes first.

This means that when you run large Erlang VM’s in a production environment you need to have OS-level checks that will kill the process if memory use skyrockets. Remote hands for the machine, or remote access cards is a must-have for machines that run large Erlang VM’s.

For people using the BEAM in production, does this still apply?
If so, is there a way to circumvent that kind of behaviour?

2 Likes