I have an Elixir/Phoenix server. Based on user requirements, the server generates Markdown and LaTeX files, then runs System.cmd("pandoc", [relevant commands]) to generate PDF files.
Everything works fine on my local system, where the dependencies (pandoc and LaTeX) are installed locally.
Now I’m trying to dockerize the project.
I tried installing pandoc and LaTeX in the phoenix_server container and it worked, but the final Docker image grew to 8.5GB because texlive alone takes 7.5GB, so that’s not an option.
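For readers following along, the pandoc call described above might look roughly like the sketch below. The module name PdfExport, its function names, and the choice of xelatex as the PDF engine are assumptions made for illustration; the original post does not show the actual arguments.

```elixir
defmodule PdfExport do
  @moduledoc "Hypothetical helper; module and function names are illustrative only."

  @doc "Builds the pandoc argument list for converting `input` into a PDF at `output`."
  def pandoc_args(input, output, engine \\ "xelatex") do
    # --output and --pdf-engine are standard pandoc options;
    # the engine choice here is an assumption.
    [input, "--output", output, "--pdf-engine=" <> engine]
  end

  @doc "Runs pandoc and returns {:ok, output} or {:error, {exit_code, log}}."
  def render(input, output) do
    # stderr_to_stdout: true captures LaTeX error messages in the returned log.
    case System.cmd("pandoc", pandoc_args(input, output), stderr_to_stdout: true) do
      {_log, 0} -> {:ok, output}
      {log, exit_code} -> {:error, {exit_code, log}}
    end
  end
end
```

Note that System.cmd/3 raises if the pandoc executable is not on the PATH, which is exactly the situation once pandoc lives in a different container than the Phoenix app.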
I found the pandoc/latex:2.18 image, so my idea is to create 3 Docker containers and run docker-compose up:
container_1: Phoenix_server
container_2: postgres:14
container_3: pandoc/latex:2.18
But it didn’t work.
Challenges:
1. Sharing the server-generated files with the pandoc/latex container. For this I’m thinking of using the Docker volume option.
2. I could not figure out how to run CLI commands from the Phoenix container in the pandoc/latex container.
Any help is greatly appreciated.
Thank you
I’m not following the logic at the end there: what’s the problem with a large image?
The idea is to split all the systems into separate containers (Phoenix, pandoc, Postgres) and host them on a Kubernetes setup. I don’t know the actual reason; my boss wanted it done this way, so I’m doing it.
Maybe it’s about hosting cost.
If you’re eventually deploying to k8s, you don’t want to start out using Docker-specific features like volume.
One approach that works cleanly across all deployment methods is to wrap a network service around the pandoc CLI: something that speaks your favorite network protocol (HTTP + JSON, gRPC, whatever) and accepts arguments + directions for where to put the output. Given that TeX can take a while, it probably needs to implement some kind of status tracking (“is my run of pandoc complete yet?”) or similar.
I’ve built something similar at a previous job. We needed to combine PDFs using a library only available on the JVM, so we wrote a small HTTP interface using Sinatra and JRuby (our team had a lot of folks who knew Ruby as well as Elixir). It accepted S3 URLs for the input PDFs and a webhook URL to notify when the output was available on S3.
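The status-tracking part of such a wrapper could be sketched as below. This is only an illustration under assumptions: the module name PandocJobs and the functions start_job/2 and status/1 are hypothetical, the HTTP or gRPC layer that would call them is not shown, and job state lives in memory, so it is lost when the service restarts.

```elixir
defmodule PandocJobs do
  @moduledoc "Hypothetical in-memory job tracker for a service wrapping the pandoc CLI."
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, Keyword.put_new(opts, :name, __MODULE__))
  end

  @doc "Starts `command` with `args` in the background and returns a job id."
  def start_job(args, command \\ "pandoc") do
    GenServer.call(__MODULE__, {:start_job, command, args})
  end

  @doc "Returns :running, {:done, output}, {:failed, reason, output} or :unknown."
  def status(job_id), do: GenServer.call(__MODULE__, {:status, job_id})

  @impl true
  def init(jobs), do: {:ok, jobs}

  @impl true
  def handle_call({:start_job, command, args}, _from, jobs) do
    job_id = System.unique_integer([:positive])
    server = self()

    # Run the CLI in a separate process so status/1 stays responsive
    # while TeX is still working.
    Task.start(fn ->
      result =
        try do
          case System.cmd(command, args, stderr_to_stdout: true) do
            {output, 0} -> {:done, output}
            {output, code} -> {:failed, code, output}
          end
        rescue
          # System.cmd/3 raises when the executable cannot be found.
          error -> {:failed, :not_started, Exception.message(error)}
        end

      send(server, {:finished, job_id, result})
    end)

    {:reply, job_id, Map.put(jobs, job_id, :running)}
  end

  def handle_call({:status, job_id}, _from, jobs) do
    {:reply, Map.get(jobs, job_id, :unknown), jobs}
  end

  @impl true
  def handle_info({:finished, job_id, result}, jobs) do
    {:noreply, Map.put(jobs, job_id, result)}
  end
end
```

A caller would start the tracker once with PandocJobs.start_link(), submit work with PandocJobs.start_job(["in.md", "--output", "out.pdf"]), and poll PandocJobs.status(id) until it stops returning :running. A webhook notification like the one described above could be sent from the handle_info/2 clause instead of polling.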
I figured out how to reduce the Docker image size while installing all dependencies in the same Phoenix container. After a lot of trial and error, the image is now 2.23GB, down from 8.5GB.
The Dockerfile now contains this:
FROM hexpm/elixir:1.13.0-erlang-24.0.5-ubuntu-focal-20210325
RUN apt-get update && \
    apt-get install -y \
    postgresql-client inotify-tools
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y \
    build-essential xorg libssl-dev libxrender-dev git wget gdebi xvfb \
    # Install wkhtml to pdf
    wkhtmltopdf \
    # Install pandoc
    pandoc \
    # Install latex
    # texlive-latex-base \
    # texlive-latex-recommended \
    # texlive-pictures \
    # texlive \
    texlive-fonts-recommended \
    texlive-plain-generic \
    texlive-latex-extra \
    texlive-xetex
RUN echo "xvfb-run -a -s \"-screen 0 640x480x16\" wkhtmltopdf \"\$@\"" >/usr/local/bin/wkhtmltopdf-wrapper && chmod +x /usr/local/bin/wkhtmltopdf-wrapper
WORKDIR /app
COPY config /app/config
COPY lib /app/lib
COPY mix.exs mix.lock /app/
COPY priv priv
COPY assets assets
COPY entrypoint.sh /app/
RUN mix local.hex --force && \
    mix local.rebar --force && \
    mix deps.get
RUN mix do compile
CMD ["bash", "/app/entrypoint.sh"]
The final image size may shrink a little more if I use mix release, so I’m settling on the above Dockerfile setup.
I also found this useful link while searching for a relevant solution; I’m posting it here for future reference:
How to execute command from one docker container to another