Packmatic — On-the-fly Zip Generation

Packmatic generates Zip Streams by aggregating File or URL Sources.

Hex / GitHub

By using a Stream, the caller can compose it within the confines of Plug’s request/response model and serve the content of the resultant Zip archive in a streaming fashion. This allows fast delivery of a Zip archive consisting of many disparate parts hosted in different places, without having to first spool all of them to disk. The generated archive uses Zip64, and works with individual files that are larger than 4 GB.
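A minimal sketch of that flow, following the usage shown in the project README as best I recall it. The file names, the URL, and the version requirement are placeholders, and `Mix.install/1` needs Elixir 1.12+ and network access. Check the entry options and function names against the current documentation before relying on them.

```elixir
# Sketch only: assumes the packmatic package is available and that
# Packmatic.build_stream/1 accepts entries in the README's keyword form.
Mix.install([{:packmatic, "~> 1.1"}])

# Placeholder input so the script runs on its own.
source_path = Path.join(System.tmp_dir!(), "packmatic_hello.txt")
File.write!(source_path, "hello")

entries = [
  [source: {:file, source_path}, path: "hello.txt"]
  # A remote part would be declared the same way (placeholder URL):
  # [source: {:url, "https://example.com/foo.pdf"}, path: "foo/bar.pdf"]
]

# The archive is produced lazily, as the stream is consumed.
stream = Packmatic.build_stream(entries)

# Consume the stream into a file on disk...
target_path = Path.join(System.tmp_dir!(), "packmatic_download.zip")
stream |> Stream.into(File.stream!(target_path, [:write])) |> Stream.run()

# ...or, inside a Plug or Phoenix action that has a `conn`, send it chunked:
# Packmatic.Conn.send_chunked(stream, conn, "download.zip")
```

The file variant is used here only because it runs outside a web request; the chunked variant is the one that matches the streaming delivery described above.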

The Documentation contains a more detailed explanation of why it was built.

This looks great, thanks for releasing it.

Packmatic 1.0.0 is now released. This release includes ahead-of-time verification enhancements for the Manifest, among other things. The changelog has been updated.

I am just wondering if you were aware of https://github.com/ananthakumaran/zstream when you started the project. There seems to be a major overlap in features between the two.

Yes. It does not do Unicode names, Zip64, or provide easy Plug integration. Nor does it seem to have support for URL-based Sources or any Custom Source, support for which was recently added. I have seen a few other libraries like these as well.

The reason for building Packmatic was to solve a specific implementation issue outlined in the rationale section of the README. Nevertheless, the availability of these libraries made the task of implementing Packmatic easier. I have noticed that your library is not acknowledged in the README and I will do so in the next release.

So Packmatic can do Unicode filenames on win/mac? I recently found out :zip doesn’t do so.

Thanks for releasing it! I am currently working on a project which requires some zip file processing. :zip works fine at the moment, but in the near future we will be generating zip files on-the-fly from URL-based remote files.

I feel your library is such a perfect fit. So glad to know it and definitely will give it a try. Thank you for sharing!

I believe it does if I have interpreted the specification correctly. Haven’t had any issues so far with our own use cases.

See this: https://hexdocs.pm/packmatic/Packmatic.Field.Local.FileHeader.html (first item in Notes) and this: https://hexdocs.pm/packmatic/Packmatic.Field.Central.FileHeader.html (also first item).

Zip64 is a known issue, but I was assuming it does support UTF-8 filenames (bit 11 of the general purpose flag is set).
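For anyone who wants to check that flag on a real archive, a small sketch follows. It reads the general purpose bit flag from the start of a local file header and tests bit 11 (0x0800), which the Zip specification defines as the UTF-8 marker for file names and comments. The module name and the hand-built header prefixes are made up for illustration.

```elixir
defmodule ZipFlagSketch do
  import Bitwise

  # Bit 11 of the general purpose bit flag: names and comments are UTF-8.
  @utf8_flag 0x0800

  # A local file header starts with the signature "PK\x03\x04", then
  # "version needed to extract" (2 bytes), then the flag (2 bytes),
  # both little-endian.
  def utf8_names?(<<"PK", 3, 4, _version::little-16, flags::little-16, _rest::binary>>) do
    (flags &&& @utf8_flag) != 0
  end

  # Anything that does not look like a local file header.
  def utf8_names?(_other), do: false
end

# Hand-built header prefixes, for illustration only.
with_flag = <<"PK", 3, 4, 20::little-16, 0x0800::little-16>>
without_flag = <<"PK", 3, 4, 20::little-16, 0::little-16>>

IO.inspect(ZipFlagSketch.utf8_names?(with_flag))     # true
IO.inspect(ZipFlagSketch.utf8_names?(without_flag))  # false
```

To check an actual file, pass the first bytes of the archive, for example the result of `File.read!/1` on a small zip.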

I had some similar requirements a few months ago:

  1. UTF8 file names
  2. Stream file input
  3. Stream archive output

For our specific use case we needed to generate very large zips of video data without worrying about disk space. The goal was to create a multipart upload in S3 as the zip was created and to avoid disk writes altogether, while limiting memory use to that of the largest video file. This was an unusual enough use case that I created a new library for it: zap.

Due to the specifics of the use case it doesn’t compress at all (though it could quite easily). Through the use of streams it accumulates inputs and periodically emits chunks of output suitable for S3. Since switching to zap we haven’t had any disk space issues.

Edit: I’m sharing in this thread in case others find themselves in a similar situation. Zap has a specific use case that is much narrower than packmatic, but we probably could have made packmatic work for our needs.

I shall add yours to the list as well. For S3 Multipart Uploads my preference would be to use a separate component to accumulate/buffer chunks as there is a limit on number of chunks, and another on size of chunks.
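A rough sketch of such a separate buffering component is below. It is not part of Packmatic or zap; the module and function names are made up. It re-chunks any stream of iodata into parts of at least a minimum size, since S3 requires every part except the last to be at least 5 MiB and allows at most 10,000 parts per upload.

```elixir
defmodule PartBufferSketch do
  # S3 requires every part except the last to be at least 5 MiB.
  @min_part_size 5 * 1024 * 1024

  # Re-chunks a stream of iodata into binaries of at least `min_size`
  # bytes; only the final part may be smaller.
  def parts(stream, min_size \\ @min_part_size) do
    Stream.chunk_while(stream, {[], 0}, &collect(&1, &2, min_size), &flush/1)
  end

  # Append the chunk to the buffer and emit a part once it is big enough.
  defp collect(chunk, {buffer, size}, min_size) do
    buffer = [buffer, chunk]
    size = size + IO.iodata_length(chunk)

    if size >= min_size do
      {:cont, IO.iodata_to_binary(buffer), {[], 0}}
    else
      {:cont, {buffer, size}}
    end
  end

  # At the end of the input, emit whatever is left as the last part.
  defp flush({_buffer, 0}), do: {:cont, {[], 0}}
  defp flush({buffer, _size}), do: {:cont, IO.iodata_to_binary(buffer), {[], 0}}
end

# Tiny threshold so the behaviour is visible.
["ab", ["c", "d"], "efgh", "i"]
|> PartBufferSketch.parts(4)
|> Enum.to_list()
|> IO.inspect()
# ["abcd", "efgh", "i"]
```

For very large archives the minimum part size would need to be raised so that the total stays under the part-count limit; that calculation is left out here.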

Further, within Packmatic, the URL Source reads in chunks as well (powered by ibrowse), so it could theoretically process source files that do not fit on the host.

The design rationale was driven by user experience. I wanted the download to start instantaneously, and this could only be achieved by not buffering anything at all prior to vending of the stream. Once the download starts the user will wait.

Which versions of Erlang/OTP does the issue of :zip not handling Unicode filenames on win/mac apply to?

Up until very recently :zip just couldn’t handle Unicode filenames, period, but I think the fix for that has been released as of at least Erlang/OTP 22.2. Is there an additional problem that is win/mac specific?

You can, by calling :erlang.binary_to_list(name), which should work if the VM is in Unicode filename mode (based on the docs). But it basically just dumps the bytes as they are, so on Mac it works, because the filesystem uses UTF-8, but on Windows it doesn’t.
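To make "dumps the bytes as they are" concrete, here is a small illustration with a made-up file name. It compares the byte list produced by `:erlang.binary_to_list/1` with the code point list produced by `String.to_charlist/1`.

```elixir
# "\u00E9" is "é"; written as an escape so the encoding is unambiguous.
name = "\u00E9.txt"

# Raw UTF-8 bytes, one list element per byte: "é" becomes 195, 169.
bytes = :erlang.binary_to_list(name)

# Unicode code points, one list element per character: "é" becomes 233.
codepoints = String.to_charlist(name)

IO.inspect(bytes, charlists: :as_lists)
# [195, 169, 46, 116, 120, 116]
IO.inspect(codepoints, charlists: :as_lists)
# [233, 46, 116, 120, 116]
```

The first list is only meaningful to a consumer that interprets the name as UTF-8, which fits the Mac versus Windows difference described above.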

Packmatic 1.1.0 has been released:

  • Added support for custom Sources.
    • Any module which implements Packmatic.Source can be used as a Source.
  • Added support for Encoder Events.
    • Added the on_event option to the Encoder which can be used to receive events.
    • See documentation for Packmatic.Event.
  • Revised Packmatic.Source.
    • Added callback validate/1 for entry validation.
  • Revised Packmatic.Manifest.Entry.
    • Moved validation of Initialisation Arguments to Sources.
  • Revised Packmatic.Source.File.
    • Added explicit cleanup logic.
  • Revised Packmatic.Source.URL.
    • Added explicit cleanup logic.
  • Revised Packmatic.Encoder.
    • Fixed acceptance of IO Lists, in case of custom Sources returning these instead of binaries.

An upcoming version will add the ability to consume IO Lists from Streams.

Packmatic 1.1.1 has been released. Now it is possible to construct your archive with streams…

Added

  • Added Packmatic.Source.Stream.
    • Added support for Streams that output IO Lists.
  • Updated Packmatic.Source.
    • Added ability for any Source to use any data type as its Source State.
    • Added ability for any Source to return an updated Source State with new data.

Changed

  • Revised Packmatic.Source.Dynamic.
    • Removed custom resolver; any entry notation now accepted.
  • Updated development & test dependencies.

Hi all, Packmatic 1.2.0 is out, replacing ibrowse with hackney, and tested against more modern OTP versions.