Elixir + files on Heroku

I have a system where a tar file is received by an Elixir/Phoenix app, downloaded, and extracted, yielding a LaTeX file and a directory of image files. pdflatex is run twice and the location of the resulting PDF file is shipped back to the client. The client can then request the file. All this works on my laptop, but not on Heroku. I do get a message from the server indicating that extraction and pdflatex have succeeded (see code at end). Also, the log below tells me that the tar archive has been saved.

However, when the client requests the file, it is not found.

I know that Heroku storage is ephemeral. However, for the short time I need these files, that should generally be OK.

HEROKU LOG:

2019-01-12T22:27:44.217500+00:00 app[web.1]: params for 'process': %{"filename" => "bras_and_kets"}
2019-01-12T22:27:44.217507+00:00 app[web.1]: 22:27:44.212 request_id=bfb4cecb-7738-44f9-8838-1b27f968898c [info] POST /api/print/pdf/bras_and_kets
2019-01-12T22:27:44.217509+00:00 app[web.1]: BODY: <<98, 114, 97, 115, 95, 97, 110, 100, 95, 107, 101, 116, 115, 46, 116, 101, 120,
2019-01-12T22:27:44.217511+00:00 app[web.1]: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2019-01-12T22:27:44.217513+00:00 app[web.1]: 0, 0, 0, 0, 0, 0, 0, ...>>
2019-01-12T22:27:44.217730+00:00 app[web.1]: PATH: printfiles/bras_and_kets/bras_and_kets.tar
2019-01-12T22:27:44.237525+00:00 app[web.1]: XX, FILE EXISTS: printfiles/bras_and_kets/bras_and_kets.tar
2019-01-12T22:27:44.338354+00:00 app[web.1]: 22:27:44.337 request_id=bfb4cecb-7738-44f9-8838-1b27f968898c [info] Sent 200 in 125ms
2019-01-12T22:28:14.005012+00:00 heroku[router]: at=info method=GET path="/print/pdf/bras_and_kets" host=nshost.herokuapp.com request_id=4c2b7a3a-6acb-431b-9d73-6571833ff720 fwd="24.92.138.170" dyno=web.1 connect=1ms service=38ms status=200 bytes=611 protocol=https
2019-01-12T22:28:14.005494+00:00 app[web.1]: 22:28:14.001 request_id=4c2b7a3a-6acb-431b-9d73-6571833ff720 [info] GET /print/pdf/bras_and_kets
2019-01-12T22:28:14.006072+00:00 app[web.1]: 22:28:14.005 request_id=4c2b7a3a-6acb-431b-9d73-6571833ff720 [info] Sent 200 in 4ms

CONTROLLER FUNCTION 1: handle POST request sending tar archive

def process(conn, params) do

    IO.inspect params, label: "params for 'process'"
    {:ok, body, conn} = Plug.Conn.read_body(conn, length: 3_000_000)
    IO.inspect body, label: "BODY"

    bare_filename = params["filename"]
    tarfile = "#{bare_filename}.tar"
    texfile = params["filename"] <> ".tex"
    prefix = "printfiles/#{params["filename"]}"
    {:ok, cwd} = File.cwd
    File.mkdir_p prefix
    tar_path = "#{prefix}/#{tarfile}"
    IO.puts "PATH: " <> tar_path
    {:ok, file} = File.open tar_path, [:write]
    IO.binwrite file, body
    File.close file

    case File.read(tar_path) do
      {:ok, body} -> IO.puts "XX, FILE EXISTS: #{tar_path}"
      {:error, reason} -> IO.puts "XX,  NO SUCH FILE: #{tar_path}"
    end

    # System.cmd("tar", ["xvf", path])
    System.cmd("tar", ["-xf", tar_path, "-C", prefix ])
    File.cd prefix
    System.cmd("pdflatex", ["-interaction=nonstopmode", texfile])
    System.cmd("pdflatex", ["-interaction=nonstopmode", texfile])
    File.cd cwd

    conn |> render("pdf.json", url: bare_filename)
  end

CONTROLLER FUNCTION 2: get the pdf file:

  def display_pdf_file(conn, %{"filename" => filename}) do
    path = "printfiles/#{filename}/#{filename}.pdf"
    case File.read(path) do
      {:ok, body} -> Plug.Conn.send_file(conn, 200, path)
      {:error, reason} -> conn |> render("pdf_error.html", path: "Sorry, couldn't find the PDF file.")
    end
  end

Also: I see no directory printfiles at the root level of the Heroku directory, where I expect it to be.

Is this app running on more than a single dyno on Heroku? That would be the most likely cause imo. Each dyno has its own file system, and files cannot be shared across them.

S3 is cheap and I’d recommend using that here to ensure your files stick around.

Just one for now, but that is an excellent point.

The problem seems to be in creating the PDF file … so far it is a mystery why this fails on Heroku. If I run pdflatex --version after heroku run bash, I get the expected output.

2019-01-12T23:29:10.644936+00:00 app[web.1]: params for 'process': %{"filename" => "hydrogen_atom"}
2019-01-12T23:29:10.663650+00:00 app[web.1]: BODY: <<104, 121, 100, 114, 111, 103, 101, 110, 95, 97, 116, 111, 109, 46, 116, 101,
2019-01-12T23:29:10.663654+00:00 app[web.1]: 120, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2019-01-12T23:29:10.663656+00:00 app[web.1]: 0, 0, 0, 0, 0, 0, 0, 0, 0, ...>>
2019-01-12T23:29:10.665158+00:00 app[web.1]: CWD: /app
2019-01-12T23:29:10.665911+00:00 app[web.1]: PATH: printfiles/hydrogen_atom/hydrogen_atom.tar
2019-01-12T23:29:10.667216+00:00 app[web.1]: XX, TAR FILE EXISTS: printfiles/hydrogen_atom/hydrogen_atom.tar
2019-01-12T23:29:10.723437+00:00 app[web.1]: CWD, @prefix: /app/printfiles/hydrogen_atom
2019-01-12T23:29:10.723779+00:00 app[web.1]: XX, TEX FILE EXISTS: hydrogen_atom.tex
2019-01-12T23:29:10.723976+00:00 app[web.1]: Running pdflatex (1) ...
2019-01-12T23:29:10.975010+00:00 heroku[router]: at=info method=POST path="/api/print/pdf/hydrogen_atom" host=nshost.herokuapp.com request_id=7aa02e6f-28e7-473d-b694-d3e08701fd7e fwd="24.92.138.170" dyno=web.1 connect=0ms service=438ms status=200 bytes=383 protocol=https
2019-01-12T23:29:10.854274+00:00 app[web.1]: Running pdflatex (2) ...
2019-01-12T23:29:10.945277+00:00 app[web.1]: XX,  NO SUCH PDF FILE: hydrogen_atom.pdf

SOURCE:

def process(conn, params) do

    IO.inspect params, label: "params for 'process'"
    {:ok, body, conn} = Plug.Conn.read_body(conn, length: 40_000_000)
    IO.inspect body, label: "BODY"

    bare_filename = params["filename"]
    tarfile = "#{bare_filename}.tar"
    texfile = params["filename"] <> ".tex"
    pdffile = params["filename"] <> ".pdf"
    prefix = "printfiles/#{params["filename"]}"
    {:ok, cwd} = File.cwd
    IO.puts "CWD: #{cwd}"
    File.mkdir_p prefix
    tar_path = "#{prefix}/#{tarfile}"
    IO.puts "PATH: " <> tar_path
    {:ok, file} = File.open tar_path, [:write]
    IO.binwrite file, body
    File.close file

    case File.read(tar_path) do
      {:ok, body} -> IO.puts "XX, TAR FILE EXISTS: #{tar_path}"
      {:error, reason} -> IO.puts "XX,  NO SUCH TAR FILE: #{tar_path}"
    end

    # System.cmd("tar", ["xvf", path])
    System.cmd("tar", ["-xf", tar_path, "-C", prefix ])
    File.cd prefix
    {:ok, cwd} = File.cwd
    IO.puts "CWD, @prefix: #{cwd}"
    
    case File.read(texfile) do
      {:ok, body} -> IO.puts "XX, TEX FILE EXISTS: #{texfile}"
      {:error, reason} -> IO.puts "XX,  NO SUCH TEX FILE: #{texfile}"
    end

    IO.puts "Running pdflatex (1) ..."
    System.cmd("pdflatex", ["-interaction=nonstopmode", texfile])
    IO.puts "Running pdflatex (2) ..."
    System.cmd("pdflatex", ["-interaction=nonstopmode", texfile])

    case File.read(pdffile) do
      {:ok, body} -> IO.puts "XX, PDF FILE EXISTS: #{pdffile}"
      {:error, reason} -> IO.puts "XX,  NO SUCH PDF FILE: #{pdffile}"
    end
    File.cd cwd

    conn |> render("pdf.json", url: bare_filename)
  end

Have you checked if the latex run is done without an error?

{_, 0} = System.cmd("pdflatex", args, stderr_to_stdout: true)

Perhaps even use the first element of the tuple to collect combined stdout/stderr and check what the error was (if there was one).
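As a minimal sketch of that idea (the module and function names here are hypothetical, not part of the original app), a small wrapper can return the combined output together with the exit status, so the controller can log what pdflatex actually said. It also passes the `cd:` option to `System.cmd/3` instead of calling `File.cd`, on the assumption that changing the working directory of the whole VM is undesirable in a web server that handles concurrent requests.

```elixir
defmodule LatexRunner do
  # Hypothetical helper, not from the original app.

  # Runs `cmd` with `args` inside `dir`, merging stderr into stdout.
  # Returns {:ok, output} on exit status 0, otherwise {:error, status, output}.
  # Note: System.cmd/3 raises if the executable cannot be found at all.
  def run(cmd, args, dir) do
    case System.cmd(cmd, args, cd: dir, stderr_to_stdout: true) do
      {output, 0} -> {:ok, output}
      {output, status} -> {:error, status, output}
    end
  end

  # Runs pdflatex twice (as the controller does) and stops at the first failure.
  # On failure the {:error, status, output} tuple from `run/3` is returned as-is.
  def build_pdf(texfile, dir) do
    args = ["-interaction=nonstopmode", texfile]

    with {:ok, _first_pass} <- run("pdflatex", args, dir),
         {:ok, output} <- run("pdflatex", args, dir) do
      {:ok, output}
    end
  end
end
```

In the controller this could be used as `case LatexRunner.build_pdf(texfile, prefix) do ... end`, printing `output` in the `{:error, status, output}` branch so that the reason shows up in the Heroku log.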


edit

Is pdflatex even installed on Heroku?


I checked with pdflatex --version (I used a buildpack to install it). I am using an option to run pdflatex in a mode where it ignores errors. But I should check as you suggest. Thanks!

Well, it won’t ask you how to deal with errors, but if there are errors that prevent a PDF from being generated, then LaTeX can’t do anything.

Yep, agreed, trying to log errors now and see what’s going on.

Using your method to capture pdflatex errors, I’ve found the source of the problem. My .tex files use some .sty files that are not in the TeX Live distribution defined by the buildpack. I am trying to figure out how to resolve these.
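For anyone hitting the same thing, here is a rough sketch of how the missing files could be pulled out of the captured output. It assumes the usual LaTeX message format, ! LaTeX Error: File `name.sty' not found., and the module name is hypothetical.

```elixir
defmodule LatexLog do
  # Hypothetical helper: scans captured pdflatex output for lines such as
  #   ! LaTeX Error: File `wrapfig.sty' not found.
  # and returns the unique file names that were reported as missing.
  def missing_files(output) do
    ~r/! LaTeX Error: File `([^']+)' not found/
    |> Regex.scan(output, capture: :all_but_first)
    |> List.flatten()
    |> Enum.uniq()
  end
end
```

The base name of a missing .sty file is often, though not always, the name of the TeX Live package that provides it, so the result is a reasonable starting point for working out what to install.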

Thanks so much for your help!


Previously I’ve created *.deb files and installed them with the apt-buildpack to supplement the install. Might help you in this situation.


Thanks! I will look into that. Can you point me to a reference on *.deb files?

Those are package files for Debian-based Linux systems.

This is what I used when I started making packages for Heroku.

I then hosted them on S3, and added the URL to Aptfile.


Thank you! Looks very good … will try to set it up tomorrow.


It turns out that there is an elegant solution to the problem of adding packages. One makes a file texlive.packages at the root of the app directory with contents like this:

wrapfig
xcolor

When the repo is pushed to Heroku, the indicated packages are downloaded and added to the system. One can also add entries like

collection-bibtexextra
collection-fontsextra
collection-langgerman
collection-xetex

See https://elements.heroku.com/buildpacks/pubpub/heroku-buildpack-tex

I thank you all for your help!
