How to validate files uploaded by user effectively?

Hey I am using arc to manage my file uploads, and I am using the file extension whitelist, as it is in the examples, but is that safe enough?
Is it possible to have different checking on file(images, other types of files as well like docx, zip, etc) uploads?
I found this: https://hex.pm/packages/file_info
but it deosn’t want to work yet

How is it not wanting to work yet? What error did you get?

Hey I am getting this for this file upload:

%Plug.Upload{
  content_type: "application/pdf",
  filename: "random-grid.pdf",
  path: "/tmp/plug-1569/multipart-1569287514-743406783659766-2"
}

and I am trying to do this:

FileInfo.get_info(attachment.filename) 

I also tried to append the filename at the end of the path given here but I am getting the same result

error:

** (MimetypeParser.Exception) illegal characters "`"
        (mimetype_parser) lib/mimetype_parser.ex:14: MimetypeParser.parse!/2
        (file_info) lib/file_info/mime.ex:25: FileInfo.Mime.parse!/1
        (file_info) lib/file_info.ex:39: FileInfo.to_tuple/1
        (elixir) lib/stream.ex:565: anonymous fn/4 in Stream.map/2
        (elixir) lib/enum.ex:3317: Enumerable.List.reduce/3
        (elixir) lib/stream.ex:1568: Enumerable.Stream.do_each/4
        (elixir) lib/enum.ex:3015: Enum.reverse/1
        (elixir) lib/enum.ex:2647: Enum.to_list/1
        (elixir) lib/map.ex:181: Map.new_from_enum/1

I am getting hte same result for every single file.
The setup is the following, I am using ubuntu, and the project is running in a docker container if I run the file command inside the container it works.
If i run the project in iex mode and try to call

FileInfo.get_info/1

if I call it with files inside that very folder that i am in it works, I can get results like

%{"mix.lock" => %FileInfo.Mime{arguments: %{}, subtype: "plain", type: "text"}}

but with other files, outside this folder I can only get that same error from earlier

This for sure will not work… How do You append the file and path?

FileInfo.get_info("#{attachment.path}#{attachment.filename}")

You miss a slash in between, as I thought… the path is not valid without.

still doesnt want to work:

%Plug.Upload{
  content_type: "application/pdf",
  filename: "random-grid.pdf",
  path: "/tmp/plug-1569/multipart-1569289758-138513881763554-3"
}
FileInfo.get_info("#{attachment.path}/#{attachment.filename}")

getting the same error, but it did miss the slash so thank you for that

If i do just the underlying command:

System.cmd("file", ["--mime-type" | ["#{attachment.path}/#{attachment.filename}"]])

then the result is this:

{“/tmp/plug-1569/multipart-1569290249-648297072043659-1/random-grid.pdf: cannot open `/tmp/plug-1569/multipart-1569290249-648297072043659-1/random-grid.pdf’ (Not a directory)\n”,
0}

It is working fine on my pc with a pdf file…

iex> FileInfo.get_info "/Users/koko/downloads/MarcAurelePensees.pdf"
%{
  "/Users/koko/downloads/MarcAurelePensees.pdf" => %FileInfo.Mime{
    arguments: %{},
    subtype: "pdf",
    type: "application"
  }
}

But your error seems to be a non existant file… You should check first if the file exists.

File.exists? ...

ok, that says it doesn’t exist, but then how can plug and arc give a path to it and upload it?

From docs

Note: This file is temporary, and Plug will remove it from the directory as the request completes. If we need to do anything with this file, we need to do it before then.

but am i not before the request completes?
these logs are from before a response is sent, in the window i think it should work

If the file does not exist anymore… I would say You are not. But it’s hard to tell What/How You are doing in controller, without code.

  def post_attachment(conn, %{"id" => id, "attachment" => attachment}) do
    IO.inspect(attachment)

    IO.inspect(File.exists?("#{attachment.path}/#{attachment.filename}"))
    x = System.cmd("file", ["--mime-type" | ["#{attachment.path}/#{attachment.filename}"]])
    # x = FileInfo.get_info(["#{attachment.path}/#{attachment.filename}"])
    IO.inspect(x)
    caption = conn.params["caption"]

    type =
      if attachment.content_type =~ "image" do
        "photo"
      else
        "document"
      end

    with(
      topic = %Topic{} <- Topics.get_topic_by_hash_id(id),
      user = %User{} <- user_by_rsvp_or_group_token_or_full_user(conn),
      user_topic = %UserTopic{} <- UserTopics.get_user_topic(user.id, topic.id),
      {:ok} <- user_topic_status_check(user_topic),
      {:ok, post} <-
        PostAttachments.create_post_attachment(user, user_topic, %{
          "attachment" => attachment,
          "caption" => caption,
          "type" => type
        })
    ) do
      # Notifier
      if caption != nil && caption != "" do
        Notifications.notify_users_of_new_topic_post(user, topic, caption, type)
      else
        Notifications.notify_users_of_new_topic_post(user, topic, type)
      end

      # Update group updated_at
      if topic.reference == "group" do
        Groups.update_group_latest_changes(Groups.get_group_by_id(topic.reference_id))
      end

      conn
      |> put_status(200)
      |> put_view(TopicView)
      |> render("200+hash_id.json",
        message: "Attachment on topic was succesful.",
        id: post.post_with_hash.hash_id
      )
    end
  end

the logs:

%Plug.Upload{
content_type: “application/pdf”,
filename: “random-grid.pdf”,
path: “/tmp/plug-1569/multipart-1569291658-659131657529395-2”
}
false
{“/tmp/plug-1569/multipart-1569291658-659131657529395-2/random-grid.pdf: cannot open `/tmp/plug-1569/multipart-1569291658-659131657529395-2/random-grid.pdf’ (Not a directory)\n”,
0}
%Arc.File{
binary: nil,
file_name: “random-grid.pdf”,
path: “/tmp/plug-1569/multipart-1569291658-659131657529395-2”
}

the last piece is form the current arc validator, where it checks the file extension
to me this would mean that “somehow” it doesnt exist when im logging it out, but it exists for arc?

oh wait… what about using path only? I am not sure the filename should be included.

from docs…

"multipart-558399-917557-1" is the name Plug gave to our uploaded file.

Filename is just the real name of the file, but the tmp file is called multipart…

4 Likes

wow :sweat_smile::crazy_face::clap::vulcan_salute:
thanks

Additionally, never do this:

"#{path}/#{filename}"

Instead, use standard API:

Path.join(path, filename)
6 Likes

Thanks a lot! I found your post after banging my head against the wall for too many hours. :sweat_smile: