Secure file transfer servers in Elixir?

Well a while back I wanted to transfer some files so …

:slight_smile:

6 Likes

Well you have sockets, and something to compute an MD5 checksum - which is actually all you need.

Forget all the generic this and that and make a simple socket client and server (homework - find out the smallest socket client and server code)

Making a client/server over a socket is virtually the first thing I do in any language when learning - very instructive

Repeat in C, TCL, JS, Java, Ruby, Python, Perl, C#

Seriously, the round trip client -> server -> client, with a check that you transfer all data and lose nothing, is an essential programming technique - learn to do this first, THEN add JSON/XML (whatever) on top.

Once you can get two programs in different languages communicating raw bytes over a socket then you can start having fun. Use the low level socket libraries and NOT fancy frameworks - this is the slow way to program, BUT in the long term the best way. Understanding is the key to programming. Libraries which hide what you are really doing should only be used once you understand what’s really happening.
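For the homework above, a minimal sketch of the round trip in Elixir could look like the following. It uses only `:gen_tcp` and `:crypto` from the standard Erlang distribution. The module name `RoundTrip` and its function names are made up for illustration, and the `packet: 4` option (a 4-byte length prefix added by `:gen_tcp` itself) is an assumption chosen to keep framing out of the way.

```elixir
defmodule RoundTrip do
  # Hypothetical module name; a loopback echo over a plain TCP socket.

  # Opens a listening socket on an OS-chosen port and returns {listen_socket, port}.
  def listen do
    {:ok, lsock} = :gen_tcp.listen(0, [:binary, packet: 4, active: false, reuseaddr: true])
    {:ok, port} = :inet.port(lsock)
    {lsock, port}
  end

  # Accepts one connection, echoes one frame back, then closes the connection.
  def serve_once(lsock) do
    {:ok, sock} = :gen_tcp.accept(lsock)
    {:ok, data} = :gen_tcp.recv(sock, 0)
    :ok = :gen_tcp.send(sock, data)
    :gen_tcp.close(sock)
  end

  # Sends `payload`, reads the echo, and compares the MD5 digests of both.
  def round_trip(port, payload) do
    {:ok, sock} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, packet: 4, active: false])
    :ok = :gen_tcp.send(sock, payload)
    {:ok, echoed} = :gen_tcp.recv(sock, 0)
    :gen_tcp.close(sock)
    :crypto.hash(:md5, payload) == :crypto.hash(:md5, echoed)
  end
end
```

To try it, call `RoundTrip.listen/0`, run `RoundTrip.serve_once/1` in a separate process (a `Task` works), and call `RoundTrip.round_trip/2` with the returned port and any binary. It should return `true` when nothing was lost on the way.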

Cheers

20 Likes

Possibly even DPDK…

1 Like

I will follow your “slow” way in Elixir to learn it with the help of your article “Why I often implement things from scratch”.
Thanks for your advice ! :wink:

24 posts were split to a new topic: Secure Files Transfers Servers in various languages Discussion

Very exciting debate on language comparisons… Even if it takes me a bit far from my original concern ! :roll_eyes::grinning:
Nevertheless, to set some boundaries around my initial request:

  1. I posted my request on Elixir Forum and nowhere else because my benchmarks focus on Elixir only, as that is what I have to evaluate. So Rust, Go, OCaml and other (surely) fine and powerful languages are, for the moment at least, out of my scope,
  2. As I mentioned in the original post, my tiny “file exchange” servers will be located on several computers/VMs. This is indeed one of the main reasons that led me to test the Elixir/Erlang/OTP system, as its distributed aspect seems to be praised everywhere,
  3. I’m only asking for practical and clear advice to guide an “Elixir n00b” like me through the jungle of Elixir concepts (GenServers, GenStages, Supervisors, Tasks, Agents and the like) in order to achieve my goals, while avoiding as much as possible the time wasted exploring inefficient dead ends (NB: I do this experimentation in addition to my regular job, partly outside of my working hours). Just like Joe Armstrong or Scott Thompson did in this thread, for instance…:+1:
  4. I also hope that the advice given could help other beginners facing problems similar to mine…
2 Likes

Don’t forget that Elixir is a fantastic ‘glue’ language between languages as well; its Ports and NIFs are not to be discounted. :slight_smile:

Well the first decision is what protocol to use, like do you want to use a standard protocol like HTTPS or SFTP, or do you want some custom protocol inside an SSL connection? If you just want something secure and fast then the beam has a built-in SFTP module, the server documentation of which is at: Erlang -- ssh_sftpd :slight_smile:

If using https then cowboy/plug/phoenix (depending on what extra functionality and features you want, and what kind of clients will be pulling it) will work well.

Though more information and details can be given if this can be filled out. :slight_smile:

  1. What client(s) do you want pulling from it (something prebuilt and if so what, or will you make it yourself)?
  2. What protocol(s) do you want to use (something pre-defined, or making something custom)?
  3. What do you want the BEAM server to do that something like an SFTP server itself cannot do?
  4. Etc… anything else that might be relevant. :slight_smile:
1 Like

First, thanks for your help.

I will make it myself, as the client will (only) be the server. BTW, each server should be able to send and receive files, and these files should only be transferred from one server (node ?) to another server (no other clients).

I haven’t firmly decided on that for the moment. My specs are that the files should be encrypted during the transfer (so SFTP would fit) but also that integrity should be verified at the end of each transfer (I don’t think that SFTP achieves that) by a checksum or some other means.
A plus would be that these servers could also resume partial transfers, but this is optional…

The integrity thing mentioned above ?

I’d like this server to be as efficient as possible (big file transfers and simultaneous transfers of numerous small files), as we will run a benchmark and compare the Elixir one to equivalent servers written in C++ by another team.
(I am actually trying to promote Elixir inside my company ! :wink: )

Surprised no one’s mentioned BitTorrent yet. By definition, a way to efficiently distribute files across peers. Found this slide deck about the details: http://www.erlang-factory.com/static/upload/media/1474809327462227martingausbyimplementingbittorrentinelixireuc2016.pdf

1 Like

Perhaps because implementing a whole P2P protocol such as BitTorrent seems to be, as stated in the slides, a bit of a “hard challenge” (at least for me) for a goal that, at first glance, didn’t seem so complicated to me (but I’m certainly wrong): securely transferring files between (only) two servers/nodes with integrity checks ?
Anyway, thanks for your advice.

If it’s just two servers, yea, BitTorrent is overkill.

2 Likes

If it’s purely a BEAM mesh network then you can transfer a file as easily as sending a message, you can even access a file on a remote node from another node directly. :slight_smile:

Well there is a check-file extension in the sftp protocol for that but it’s an optional extension so I don’t know if the system supports it. Might be good to emulate it by just having each end expose a md5/sha1/whatever prefixed file of the same name that when accessed will hash the file and return the hash as that would be pretty simple to set up.

You can pack a CRC/hash along with it, just send something like a tuple or a map with fields for the hash, filename, and content. :slight_smile:
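As a rough illustration of that idea, the sender could wrap the content in a map and the receiver could re-hash it on arrival. The module name `Envelope`, the field names, and the choice of SHA-256 are assumptions made for this sketch, not anything prescribed in the thread.

```elixir
defmodule Envelope do
  # Hypothetical helper: wraps file content with its name and SHA-256 digest.
  def pack(filename, content) when is_binary(content) do
    %{filename: filename, content: content, sha256: :crypto.hash(:sha256, content)}
  end

  # Returns {:ok, content} if the digest matches the content,
  # {:error, :checksum_mismatch} otherwise.
  def unpack(%{content: content, sha256: digest}) do
    if :crypto.hash(:sha256, content) == digest do
      {:ok, content}
    else
      {:error, :checksum_mismatch}
    end
  end
end
```

The packed map is an ordinary term, so it can be passed to `send/2` towards a process on another node like any other message. This only suits files that fit comfortably in memory, since the whole content travels as one message.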

I/O is what the beam does very well, especially with sendfile calls too if you don’t mind setting up transfer-specific sockets between the servers, but even the distribution network is fine for that until it gets too loaded down and messages get delayed. :slight_smile:

1 Like

I am now experimenting with two different approaches:

  • The first one uses Michael Dorman’s Sftp_ex library to implement a kind of simplified SFTP client/server, which seems to be OK for the moment,

  • The second one simply uses the node mechanism

For the “node” experiment, everything is OK when I read a file on one node and save it on a remote one, but I’m stuck when I try to stream a file from one node to the remote one (for huge files, streaming seems to be a better idea than read/save), as the stream struct does not allow that.
If someone has a smart idea to achieve this… :roll_eyes:

For the stream you’ll need to send each result from the stream via a message to a receiver to stuff back into a receiving stream. A simple hoist. :slight_smile:
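A minimal sketch of such a hoist, assuming the receiver simply appends each chunk to a file. The module name `Hoist` and the message shapes `{:chunk, data}`, `:eof` and `{:done, pid}` are invented for this example. There is no backpressure here, so a fast sender can fill the receiver's mailbox.

```elixir
defmodule Hoist do
  # Hypothetical module: forwards a stream chunk by chunk to a receiver process.

  # Starts a receiver that writes chunks to `dest` and notifies `owner` when done.
  def start_receiver(dest, owner) do
    spawn(fn ->
      file = File.open!(dest, [:write, :binary])
      loop(file, owner)
    end)
  end

  defp loop(file, owner) do
    receive do
      {:chunk, data} ->
        IO.binwrite(file, data)
        loop(file, owner)

      :eof ->
        File.close(file)
        send(owner, {:done, self()})
    end
  end

  # Sends every element of `stream` to `receiver`, then signals end of file.
  def push(stream, receiver) do
    Enum.each(stream, fn chunk -> send(receiver, {:chunk, chunk}) end)
    send(receiver, :eof)
    :ok
  end
end
```

Here the receiver is spawned locally, but a pid works the same across connected nodes, so the same loop could be started on the remote side with `Node.spawn/2` and fed from a file stream on the sending side. Messages between two given processes arrive in the order they were sent, which is what keeps the chunks in sequence.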

1 Like

Yes, I found that, for instance by using the Stream.resource/3 function (that’s what Michael Dorman does in his Sftp_ex library for streams, BTW), but as I’m as lazy as a Stream, I wondered whether someone had already done that so I could cut and paste and so on… :wink::innocent:
But I have now given up on the “node solution” (it seems that nodes and Erlang clusters are not meant to be used for my distributed file server problem) and am falling back on my SFTP solution, using… Sftp_ex !

1 Like

Some news:
My “SFTP Server/Client” in Elixir now (nearly) works, except for the “resume / reget” stuff :rage:. The checksum is OK, with several encoding choices, thanks to streaming.
The server part is inspired by (but does not use as a dependency) Exsftpd from Codenaut, and the client part by Sftp_Ex from Mike Dorman or SFTP Client from Tobias Casper (I have two versions).
Many thanks to these people and to those who put me on the Elixir/Erlang track here and there ! :+1::+1::+1:

1 Like

My “SFTP Server/Client” is now functional with “resume” (reget/reput to resume incomplete transfers), retries with timing (retry transfers n times, every x seconds), and integrity checks (checksums).
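The actual code is not public, so the following is only a guess at what “retry n times, every x seconds” can look like. The module name `Retry` and the convention that the wrapped function returns `:ok`, `{:ok, value}` or `{:error, reason}` are assumptions for this sketch.

```elixir
defmodule Retry do
  # Hypothetical helper: runs `fun` until it returns :ok or {:ok, _},
  # at most `attempts` times, sleeping `delay_ms` between tries.
  def run(fun, attempts, delay_ms) when attempts > 0 do
    case fun.() do
      :ok ->
        :ok

      {:ok, _} = ok ->
        ok

      # Last attempt failed: give the error back to the caller.
      {:error, _} = error when attempts == 1 ->
        error

      {:error, _} ->
        Process.sleep(delay_ms)
        run(fun, attempts - 1, delay_ms)
    end
  end
end
```

A transfer would then be wrapped as something like `Retry.run(fn -> transfer.() end, 5, 10_000)`, where `transfer` stands for whatever function performs one attempt.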
Each transfer uses streams to avoid excessive memory consumption and to speed up the checksum checks (the checksum is calculated at the same time as the transfer, using pipes).
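One way to compute a checksum while the data flows, sketched here on a local copy rather than a real SFTP transfer. The module name `HashedCopy`, the 64 KiB chunk size, and the use of SHA-256 are assumptions made for the example.

```elixir
defmodule HashedCopy do
  # Hypothetical helper: copies `src` to `dest` in fixed-size chunks and
  # returns the lowercase hex SHA-256 of the bytes written.
  @chunk 65_536

  def copy(src, dest) do
    input = File.open!(src, [:read, :binary])
    output = File.open!(dest, [:write, :binary])

    digest =
      input
      |> IO.binstream(@chunk)
      |> Enum.reduce(:crypto.hash_init(:sha256), fn chunk, ctx ->
        # Write the chunk and feed the same bytes to the running hash.
        IO.binwrite(output, chunk)
        :crypto.hash_update(ctx, chunk)
      end)
      |> :crypto.hash_final()

    File.close(input)
    File.close(output)
    Base.encode16(digest, case: :lower)
  end
end
```

Only one chunk is held in memory at a time, and the file is read once instead of once for the transfer and once more for the checksum.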
I got rid of all the SFTP client dependencies mentioned above (but they greatly inspired me), and the SSH/SFTP functions used are now Erlang :ssh_sftp calls.
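Since the real code is not available, here is only a guess at how a “reget” could be built on plain `:ssh_sftp` calls: seek the remote handle to the size of the partial local file and append the rest. The module name `Reget` and the chunk size are invented, and `channel` is assumed to be a channel pid obtained beforehand from `:ssh_sftp.start_channel/3`.

```elixir
defmodule Reget do
  # Hypothetical module. `channel` is assumed to come from :ssh_sftp.start_channel/3.
  @chunk 65_536

  # Appends the missing tail of `remote` to `local`, starting at local's current size.
  def resume(channel, remote, local) do
    offset =
      case File.stat(local) do
        {:ok, %File.Stat{size: size}} -> size
        {:error, :enoent} -> 0
      end

    {:ok, handle} = :ssh_sftp.open(channel, to_charlist(remote), [:read, :binary])
    {:ok, _pos} = :ssh_sftp.position(channel, handle, {:bof, offset})
    out = File.open!(local, [:append, :binary])
    result = pull(channel, handle, out)
    File.close(out)
    :ssh_sftp.close(channel, handle)
    result
  end

  # Reads chunks until the remote side reports end of file.
  defp pull(channel, handle, out) do
    case :ssh_sftp.read(channel, handle, @chunk) do
      {:ok, data} ->
        IO.binwrite(out, data)
        pull(channel, handle, out)

      :eof ->
        :ok

      {:error, _} = error ->
        error
    end
  end
end
```

This sketch trusts that the partial local file is a correct prefix of the remote one. A stricter version would verify a checksum of the whole file once the transfer completes.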
I only kept the elixir_uuid library (needed to generate unique ID for each transfer) and the logger_file_backend library to log to a file.
I use ETS/DETS to keep track of failed transfers.
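For readers wondering what that bookkeeping might look like, here is a small ETS-only sketch. The table name, the tuple layout `{id, path, offset}` and the module name `FailedTransfers` are all assumptions. A DETS table would be used in a similar way when the records have to survive a restart.

```elixir
defmodule FailedTransfers do
  # Hypothetical module: an ETS table keyed by transfer id.
  @table :failed_transfers

  # Creates the named table; it is owned by the calling process.
  def init do
    :ets.new(@table, [:set, :public, :named_table])
    :ok
  end

  # Records (or overwrites) a failed transfer with the byte offset reached.
  def record(id, path, offset) do
    true = :ets.insert(@table, {id, path, offset})
    :ok
  end

  # Returns {:ok, %{path: ..., offset: ...}} or :error if the id is unknown.
  def lookup(id) do
    case :ets.lookup(@table, id) do
      [{^id, path, offset}] -> {:ok, %{path: path, offset: offset}}
      [] -> :error
    end
  end

  # Forgets a transfer, for example once a retry has succeeded.
  def clear(id) do
    true = :ets.delete(@table, id)
    :ok
  end
end
```

Because an ETS table disappears when its owner process exits, `init/0` would normally be called from a long-lived process such as a GenServer under a supervisor.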
Unfortunately, I can’t publish my code on Github as it is part of a corporate application.:disappointed:

1 Like

Can’t you just put it into a library and sell it to the management as “giving back to the community”? Companies these days mostly use OSS software so it’s only fair to give back every now and then. It’s also great (and basically free) marketing, plus you get improvements and bug reports from the community.

2 Likes

I will ask, but there is little chance…
Anyway I would have to seriously clean my code beforehand as it’s a “learning” one: full of Elixir n00b’s “scories” ! :smiley:

1 Like

I’ve received official confirmation from my managers that I’m not allowed (at least for the moment) to publish my code. :disappointed:

Anyway, I will remain at the disposal (in this thread) of anyone who struggles with such SFTP problems with Elixir…

1 Like