jeromedoyle
Is Elixir suitable for downloading multiple files simultaneously?
I have an app that downloads lots of large files from S3 throughout the day. I’m using ibrowse and streaming the files to disk. This works well, but CPU usage is pretty high when 5+ files are downloading at once. I know Elixir isn’t the best choice for computation-heavy tasks, but am I running into the same limitation with file downloads, in the sense that the constant stream of messages overloads the CPU?
Most Liked
xlphs
What are you using to download? I use gen_tcp directly and can easily ingest 100MB/s on a low-end PC, although I buffer the IO a lot. It’s definitely something in your download logic.
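One possible reading of "buffer the IO a lot" is to let the file layer batch small writes instead of hitting the disk for every chunk. The sketch below is only an assumption about what that could look like, not xlphs's actual code: the module name, function name, buffer size (1 MiB), and flush delay (2 seconds) are all made up for illustration. It relies on the `{:delayed_write, size, delay}` mode accepted by `File.open`.

```elixir
defmodule BufferedWriteSketch do
  # Hypothetical helper: writes an enumerable of binary chunks to `path`.
  # Writes are buffered until 1 MiB has accumulated or 2 seconds have passed
  # (both values are illustrative, not tuned).
  def write_chunks(path, chunks) do
    File.open(path, [:write, :binary, {:delayed_write, 1_048_576, 2_000}], fn file ->
      Enum.each(chunks, fn chunk -> IO.binwrite(file, chunk) end)
    end)
  end
end
```

`File.open/3` with a function closes the file when the function returns, which also flushes any data still sitting in the buffer.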
dimitarvp
I haven’t used ibrowse either, but I’ve used httpotion several times with Task.async_stream and have been able to download 200 files simultaneously for hours at a time, storing them to an NFS volume, all on a small VPS with 256MB of RAM. When a colleague and I watched it remotely with :observer, the CPU was only getting slightly excited, at 7-8% with rare spikes to 15% (I am guessing the garbage collector kicking in).
But I agree with @xlphs: when in doubt about whether the network is causing you problems, always reach for :gen_tcp first. It gives you a 99% clear picture, and if everything works well in that code then you either keep it and use it, or start making another module that uses a higher-level library and gradually isolate the problem.
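For reference, the Task.async_stream approach mentioned in this post could be sketched roughly as follows. This is not dimitarvp's code: the module name, the `fetch` callback (standing in for whatever HTTP client call does the actual download), and the concurrency and timeout values are assumptions for illustration.

```elixir
defmodule DownloadSketch do
  # `fetch` is a hypothetical one-arity function that downloads a single URL
  # and returns its result; the real HTTP call is deliberately left out.
  def download_all(urls, fetch, max_concurrency \\ 5) do
    urls
    |> Task.async_stream(fetch,
      max_concurrency: max_concurrency,
      timeout: 60_000,
      # Kill a task that exceeds the timeout instead of crashing the caller.
      on_timeout: :kill_task,
      ordered: false
    )
    |> Enum.map(fn
      {:ok, result} -> result
      {:exit, reason} -> {:error, reason}
    end)
  end
end
```

`max_concurrency` caps how many downloads run at once, so raising it from 5 to 200 is a one-argument change.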
jeromedoyle
I found what was causing my issues. One of the async responses I was getting was {:ibrowse_async_response, id, {:error, :req_timedout}}, which matched the {:ibrowse_async_response, ^id, chunk} clause and thus called IO.binwrite with {:error, :req_timedout}. Once I added a clause to handle this and return an error tuple instead of calling IO.binwrite, CPU usage dropped drastically. Thanks for the help everyone!
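A minimal sketch of the fix described above, assuming a plain receive loop. The message tuples are the ones quoted in this post plus ibrowse's headers and end-of-response messages; the module name, `collect/3`, the `write` callback, and the timeout are hypothetical. The important part is clause order: the error clause has to appear before the generic chunk clause, otherwise the error tuple is treated as data.

```elixir
defmodule StreamSketch do
  # `write` is a hypothetical one-arity function that persists a chunk,
  # for example fn chunk -> IO.binwrite(file, chunk) end.
  def collect(id, write, timeout \\ 5_000) do
    receive do
      {:ibrowse_async_headers, ^id, _status, _headers} ->
        collect(id, write, timeout)

      # Must come first: an error tuple would also match the `chunk` clause.
      {:ibrowse_async_response, ^id, {:error, reason}} ->
        {:error, reason}

      {:ibrowse_async_response, ^id, chunk} ->
        write.(chunk)
        collect(id, write, timeout)

      {:ibrowse_async_response_end, ^id} ->
        :ok
    after
      timeout -> {:error, :timeout}
    end
  end
end
```

The function returns `:ok` once the end-of-response message arrives and `{:error, reason}` as soon as an error message shows up, so the caller can decide whether to retry or delete the partial file.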








