:ftp.start_service problem

Hi, I have an Elixir application that is behaving in a very strange way. It's a very simple Phoenix app (not of any interest, just to give you some context). It opens an FTP connection to a public FTP server and must download a file, nothing more. The problem is that very often it gets completely stuck on this call:

:ftp.start_service(host)

executed after

:ftp.start()

which is executed correctly.

This causes a BEAM crash and dump, which makes it very hard to debug (there is absolutely no chance of getting a log).

Every now and then it works. I have noticed that this morning failures were VERY infrequent, while around noon and in the evening it gets much worse. And on an Amazon EC2 instance, the same Docker image is unable to establish even ONE connection: 100% failures.

I am using elixir:1.8.1-alpine to build the release and I am running it on alpine. The build uses a multi-stage Dockerfile, which is the reason for the two images.

Any suggestions on what to debug, and how?

Can you please share the value of host? And perhaps properly describe what you mean by “it gets completely stuck”?

Have you checked if perhaps the returned value is not as expected and therefore your application crashes?
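To make that check concrete, here is a minimal sketch of how the call could be wrapped so that both the return value and a hang become visible. `FtpProbe`, `with_timeout/2` and `connect/2` are hypothetical names, not part of your app, and I am assuming OTP 21 or later, where `:ftp` is its own application and `:ftp.start_service/1` takes an option list with a charlist `:host`. Note that a timeout on the caller side would not by itself stop a runaway allocation inside the FTP client process; it only tells you whether the call returned at all.

```elixir
defmodule FtpProbe do
  # Hypothetical helper module, only meant for debugging.

  # Runs `fun` in a task and gives up after `timeout_ms` milliseconds,
  # so a hanging call cannot block the caller indefinitely.
  def with_timeout(fun, timeout_ms) do
    task = Task.async(fun)

    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> result
      # Only reachable when the caller traps exits; otherwise a crash
      # in the linked task also takes the caller down.
      {:exit, reason} -> {:error, {:exit, reason}}
      nil -> {:error, :timeout}
    end
  end

  # Assumption: OTP 21+, where :ftp is a separate OTP application.
  def connect(host, timeout_ms \\ 15_000) do
    {:ok, _apps} = Application.ensure_all_started(:ftp)

    # The Erlang client expects a charlist, not an Elixir binary.
    opts = [host: to_charlist(host), timeout: timeout_ms, verbose: true]

    # Allow a little slack on top of the client's own connection timeout.
    case with_timeout(fn -> :ftp.start_service(opts) end, timeout_ms + 1_000) do
      {:ok, pid} -> {:ok, pid}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Calling `FtpProbe.connect("cddis.gsfc.nasa.gov")` and logging the result should show whether you get `{:ok, pid}`, an `{:error, reason}` tuple, or `{:error, :timeout}`.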


Hi @NobbZ, thanks for the reply. The host is the NASA FTP server, 'cddis.gsfc.nasa.gov', and by stuck I mean stuck: it hangs and apparently consumes all the resources of the BEAM VM, making it crash. I don't have any logs other than the dump:

eheap_alloc: Cannot allocate 212907632 bytes of memory (of type "heap").

If I followed the Slack conversation correctly, then you are hit by this bug in the Erlang FTP client?

https://bugs.erlang.org/browse/ERL-909?jql=text%20~%20"%3Aftp"


Yeah, I discovered that after posting, and after 24 hours of madness :confused:
