File management sample code?

Hi everyone!

My name is César and I’m learning how to use Elixir. My goal is to create some kind of web crawler to locate historical data. Right now, though, I’m working on a more mundane task.

I’m trying to create some programs in Elixir that help me sort my files. For example, I’d like to sort my files by extension or order my photos by modification date. I thought it would be an easy task, but there are not many examples of how to traverse directories and move files. I’ve checked Exercism and several online books, but I am always taken back to the reference manual.

I’ve seen the File.rename/2 documentation, but given my current Elixir level, I’m not sure how to combine it with safeguards so the results are not disastrous. (For example, I might have several source files called IMG_0001.jpg and I don’t want them to be silently overwritten.)

Do you know of any source that has this kind of file management example? Any cookbook that covers this? Most resources flow like Install |> Strings and binaries |> Flow control |> OTP BeamVM, skipping the file chapter :slight_smile:

Thanks for your support!


This might not be the answer you are looking for – but I’d reach for sqlite3. According to SQLite’s own benchmarks, it can also be up to 35% faster than the filesystem when reading and writing small blobs.

Trouble is, current state of the art of sqlite3 in Elixir does not support Ecto 3 (latest version of the de facto DB persistence library); only version 2 which is now quite old. I am working on bringing Ecto 3 to sqlite3 but it definitely is not going to be ready tomorrow.

Using sqlite3 will rid you of the potential file overwriting problem as well since there the ID is… you know, the database ID, not the filename itself. What’s more, an embedded DB like sqlite3 gives you the ability to sort and filter out of the box.


If that doesn’t sound tempting to you then I’d be happy to help you with exactly the File / IO API. However, you should keep in mind that functions like File.stat can behave differently on different OS-es. Which is all the more reason to opt for a database.
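
For the "order my photos by modification date" case, a minimal sketch of reading modification times with File.stat!/2 could look like the following. The module name `MtimeSketch` and the function `group_by_mdate/1` are made up for illustration, and the dates are computed in UTC, which may not be what you want for photos taken around midnight.

```elixir
defmodule MtimeSketch do
  # Hypothetical helper: returns a map of %{Date => [filename]}
  # for the regular files directly inside `dir`.
  def group_by_mdate(dir) do
    dir
    |> File.ls!()
    # File.ls!/1 returns bare names, so join them with `dir` first
    |> Enum.filter(&File.regular?(Path.join(dir, &1)))
    |> Enum.group_by(fn name ->
      # `time: :posix` makes mtime an integer (seconds since the Unix epoch)
      %File.Stat{mtime: mtime} = File.stat!(Path.join(dir, name), time: :posix)
      # Convert to a UTC date (assumption: UTC is good enough here)
      mtime |> DateTime.from_unix!() |> DateTime.to_date()
    end)
  end
end
```

Each key of the resulting map could then become a folder name such as `Date.to_iso8601(date)`.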

Any particular scenario you would like help with?

If you would like something that basically manages uploads to your app then Waffle might be exactly what you are looking for (it can put all stored files into Amazon’s S3 or in your local filesystem).


Thanks for your response @dimitarvp! I believe managing files using sqlite3 can be very beneficial for particular scenarios. I could create hashes from files and then work with those IDs, but my scenarios are rather more mundane. Let me walk through some of them.

Example 1: I regularly dump all my videos and pictures from different sources into a folder. I want to sort them into different folders based on the extension. So I would have a folder for JPG files, another one for PNG files, etc. My files are named IMG_XXXX.jpg. The names roll over every 10,000 pictures, so I have to take care not to overwrite any existing files.

So, in pseudo code it would be something like this:

  1. List all the files in given directory
  2. For each file:
    a) Check if there is a folder with the same name as the file extension. If it doesn’t exist, create the folder.
    b) Check if there is a file with the same name as the current file in the extension directory. If it does exist, rename existing file adding _1 to the end of the name (or _2,_3) to prevent overwrites.
    c) Move file to its extension directory.

Reviewing the File docs, my initial take would be something like this (note: this doesn’t work, it is pseudo-Elixir).

defmodule sort_files_by_extension do
  def sort_files do
    File.ls!("./test")
    |> Enum.each(fn filename -> [name | extension] = String.split(filename, ".") end)
    |> Enum.each(&check_extension_dir/1)
    |> Enum.each(&check_duplicate_files/2)
    |> Enum.each(&move_extension_directory/2)

  defp check_extension_dir extension do
      File.exists?("./#{extension}")
      File.dir?("./#{extension}")
      # If both checks are false, then create the dir
  end

  defp check_duplicate_files name, extension do
    File.exists?("#{extension}/#{filename}")
    # If file exists, append _n to name, checking for duplicates again    
  end

  defp move_to_extension_directory name, extension do
    File.rename!("./#{filename}", "./#{extension}/#{name}"   
  end
end

I’ve been reading the Pragmatic Programming book, and on the topic of if there is this advice: try to work with functions instead of resorting to classical imperative if/then flows. So right now I feel a bit stuck in the middle ground: I don’t know enough about the functional style yet, and I’m trying to avoid imperative structures. Any comments or improvements to make this work are very welcome!

defmodule SortFilesByExtension do
  def sort_files do
    File.ls!("./test")
    # Skip directories and files without an extension
    |> Enum.filter(&(File.regular?("./test/#{&1}") and Path.extname(&1) != ""))
    |> Enum.map(fn filename ->
         extname = Path.extname(filename)  #=> ".jpg"
         basename = Path.basename(filename, extname)  #=> "IMG_XXXX"
         {basename, extname, filename}
       end)
    |> Enum.map(&make_extension_dir/1)
    |> Enum.map(&dedup/1)
    |> Enum.each(&move_to_extension_directory/1)
  end

  defp make_extension_dir({_, "." <> extension, _} = arg) do
    File.mkdir_p!("./#{extension}")
    arg  # just return the argument for the next `Enum.map`
  end

  # Function head: declares the default suffix for the clauses below
  defp dedup(arg, suffix \\ 0)

  # Suffix 0 means "try the original file name first"
  defp dedup({_basename, "." <> extension, filename} = arg, 0) do
    if File.exists?("#{extension}/#{filename}") do
      dedup(arg, 1)
    else
      arg
    end
  end

  # Otherwise try basename_1, basename_2, ... until a free name is found
  defp dedup({basename, "." <> extension = extname, filename} = arg, suffix) do
    if File.exists?("#{extension}/#{basename}_#{suffix}.#{extension}") do
      dedup(arg, suffix + 1)
    else
      {"#{basename}_#{suffix}", extname, filename}
    end
  end

  defp move_to_extension_directory({basename, "." <> extension, filename}) do
    File.rename!("./test/#{filename}", "./#{extension}/#{basename}.#{extension}")
  end
end

A few suggestions:

  1. Enum.each is only for side effects (e.g. file system operations). It always returns :ok, so it can’t be chained. Use Enum.map for the intermediate steps instead.
  2. You can wrap all the information in a tuple, and pattern match on it in the function parameters.
  3. You can trust the file system to do the right thing (e.g. File.mkdir_p! succeeds when the directory already exists, so there is no need to check first)
  4. Hail recursion!
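
To make points 1 and 2 concrete, here is a small sketch that touches no files. The file names and variable names are only examples.

```elixir
files = ["IMG_0001.jpg", "notes.txt"]

# Enum.each/2 runs the function for its side effects and always returns :ok,
# so there is nothing useful to pipe into a next step.
each_result = Enum.each(files, &Path.extname/1)

# Enum.map/2 returns the transformed list, so the steps can be chained.
# Each element is a tuple that the next step pattern matches on.
map_result =
  files
  |> Enum.map(&{Path.rootname(&1), Path.extname(&1)})
  |> Enum.map(fn {name, ext} -> {String.downcase(name), ext} end)

IO.inspect(each_result)
IO.inspect(map_result)
```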

I’d probably do this:

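  def sort_files(path) do
    files_by_dir =
      File.ls!(path)
      # File.ls!/1 returns bare names, so join with `path` before checking
      |> Enum.reject(&File.dir?(Path.join(path, &1)))
      # Leave files without an extension where they are
      |> Enum.reject(&(Path.extname(&1) == ""))
      |> Enum.group_by(&Path.extname/1)

    Enum.each(files_by_dir, fn {"." <> ext, files} ->
      extpath = Path.join(path, ext)
      File.mkdir_p!(extpath)

      Enum.each(files, fn file ->
        filepath = Path.join(extpath, file)
        # Keep only a max of 9, could easily make this unbounded though, but eh useful feature to add
        # (the explicit step in 9..2//-1 needs Elixir 1.12 or later)
        Enum.each(9..2//-1, &File.rename("#{filepath}_#{&1 - 1}", "#{filepath}_#{&1}"))
        # Shift an existing file out of the way; errors are ignored when there is none
        File.rename(filepath, "#{filepath}_1")
        File.rename!(Path.join(path, file), filepath)
      end)
    end)
  end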

Could use some error reporting, but eh.
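
As for the error reporting, one possible shape (only a sketch, and `MoveReport.move_all/1` is a made-up name) is to use the non-raising File.rename/2, which returns :ok or {:error, reason}, and collect the failures:

```elixir
defmodule MoveReport do
  # Hypothetical helper: moves each {source, destination} pair and returns
  # the failures as a list of {source, reason} tuples instead of raising.
  def move_all(pairs) do
    Enum.flat_map(pairs, fn {source, destination} ->
      case File.rename(source, destination) do
        :ok -> []
        {:error, reason} -> [{source, reason}]
      end
    end)
  end
end
```

The reason is a POSIX atom such as :enoent. If you want a readable message, Erlang’s :file.format_error/1 should be able to turn it into one.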


Thanks for your responses! I will review them during the weekend to learn about the different approaches. I really appreciate your suggestions and tips to improve my skills :slight_smile:

P.S. While I was looking for other interesting examples, I stumbled upon this Ruby script by Brett Terpstra that sorts files based on tags. I was so surprised that I could understand the syntax!!