Simple way to stream by line from compressed .gz file

Hello,

I have a text file whose lines are separated by newlines (“\n”).

I need to stream this file line by line so I can apply my own function to each line, and I am doing it like this:

iex(30)> "file.txt" |> File.stream!() |> Stream.map(&(my_whatever_function_here(&1))) |> Enum.take(50)

Now the same file is gzip-compressed, and I would like to stream it line by line and get the same result as with the code above.

I am doing it like this:

iex(38)> "file.gz" |> File.stream!([], 2024*2024) |> StreamGzip.gunzip() |> WHAT_I_NEED_TO_PUT_HERE |> Stream.map(&(my_whatever_function_here(&1))) |> Enum.take(50) 

The question is: what function(s) do I need to put in place of the “WHAT_I_NEED_TO_PUT_HERE” placeholder?

I tried lots of things, such as splitting on “\n” and applying flat_map, but I cannot get the data back as one line per element after StreamGzip.gunzip().

It seems this simple requirement is not trivial to implement.
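
To make the missing step concrete, here is a minimal sketch of the kind of function that could stand in for the placeholder. It assumes the gunzip step emits plain binaries in arbitrary-sized chunks (not verified against StreamGzip); the module name LineChunker and the :eof marker are made up for illustration. It buffers partial lines across chunk boundaries, which a plain flat_map over String.split cannot do:

```elixir
defmodule LineChunker do
  # Hypothetical helper: turns a stream of arbitrary binary chunks into a
  # stream of lines. Each line keeps its trailing "\n", like File.stream!/1.
  def lines(chunks) do
    chunks
    # Append a marker so the last, possibly unterminated, line gets flushed.
    |> Stream.concat([:eof])
    |> Stream.transform("", fn
      :eof, "" ->
        {[], ""}

      :eof, buffer ->
        {[buffer], ""}

      chunk, buffer ->
        # Everything except the last piece is a complete line; the last piece
        # is carried over as the buffer for the next chunk.
        {complete, [rest]} =
          (buffer <> chunk)
          |> String.split("\n")
          |> Enum.split(-1)

        {Enum.map(complete, &(&1 <> "\n")), rest}
    end)
  end
end

# Chunk boundaries deliberately fall in the middle of lines.
["ab\ncd", "e\n", "f"]
|> LineChunker.lines()
|> Enum.to_list()
|> IO.inspect()
```

Under those assumptions, LineChunker.lines() would go where the placeholder is, right after StreamGzip.gunzip().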

Thank you for your time.


I’ve been thinking about this lately too, as I want to use Elixir to read my nginx access logs. I would have assumed you need to decompress the whole file before you can stream it. That said, I hope I’m wrong and you find a solution :)

One of my colleagues suggested this:

"file.gz" |> File.stream!([:compressed]) |> Stream.map(&(my_whatever_function_here(&1))) |> Enum.take(50)

I tried it with my test file and it seems to work out of the box. I need to do more testing, but I can hardly believe it is that simple; I struggled so much with gz files in Elixir in the past …
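
As a quick way to check this without a real data file, here is a small round-trip sketch. The temp file name and the sample lines are made up for illustration; it relies on File.write!/3 and File.stream!/2 both accepting the :compressed mode:

```elixir
# Sketch only: hypothetical temp file and sample content.
path = Path.join(System.tmp_dir!(), "stream_gz_demo.gz")

# With :compressed, File.write!/3 writes gzip data.
File.write!(path, "first\nsecond\nthird\n", [:compressed])

# With :compressed, File.stream!/2 decompresses while reading and, by default,
# yields one line per element (each line still ends with "\n").
lines =
  path
  |> File.stream!([:compressed])
  |> Stream.map(&String.trim_trailing(&1, "\n"))
  |> Enum.take(2)

IO.inspect(lines)

# Clean up the temp file.
File.rm!(path)
```

Because the stream is lazy, Enum.take(2) should stop reading after the second line rather than decompressing the whole file first.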
