Stream CSV file from a remote zip on S3

I have a zip file, containing CSVs, on a remote S3-compatible storage (Cellar).

Using Unzip, I can get a stream of the CSV file that I would like to decode, like so:

    aws_s3_config =
      ExAws.Config.new(:s3,
        access_key_id: ["xxx", :instance_role],
        secret_access_key: ["xxx", :instance_role]
      )

    file = new(zip_name, bucket_name, aws_s3_config)
    {:ok, unzip} = Unzip.new(file)
    stream = Unzip.file_stream!(unzip, file_name)

as explained in the doc.

Now I would like to consume that stream by reading it with CSV. So I try:

    stream |> CSV.decode() |> Enum.take(1)

and get an error:

    ** (FunctionClauseError) no function clause matching in CSV.Decoding.Preprocessing.Lines.starts_sequence?/5

If I write the content of my CSV to disk and then read it back, it works fine:

    # write the file on disk
    stream |> Stream.into(File.stream!("stops.txt")) |> Stream.run()
    # then read and decode it
    File.stream!("stops.txt") |> CSV.decode() |> Enum.take(1)

I get the desired result, the first row of the CSV file:

    [ok: ["\uFEFFstop_id", "stop_name", "stop_lat", "stop_lon", "location_type"]]

The difference I see is that Unzip.file_stream! and File.stream!("stops.txt") do not stream the file the same way: Unzip seems to emit chunks of about 65 kB, while File.stream! streams line by line.

How can I solve this, without writing the file to disk as an intermediary step?
Thanks!


Hello,

I don’t know if it will help, as I’m not using Unzip but StreamGzip, in combination with NimbleCSV, for this purpose. But maybe it will give you some hints?

I have the following function that returns a stream for an object downloaded from S3:

  defp get_object_stream(object) do
    {:ok, io_pid} = StringIO.open(object)

    io_pid
    |> IO.binstream(4096)
    |> StreamGzip.gunzip()
    |> NimbleCSV.RFC4180.to_line_stream()
  end

In my case, the trick was to use to_line_stream.
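To illustrate why: to_line_stream re-chunks arbitrary binary fragments into one element per line, which is what parse_stream expects. Here is a rough sketch of that re-chunking in plain Elixir (not NimbleCSV’s actual implementation; it ignores quoted fields spanning lines, and drops a trailing line that lacks a final newline):

```elixir
defmodule LineChunker do
  # Re-chunk a stream of arbitrary binary fragments into a stream of
  # newline-terminated lines. Only a sketch: NimbleCSV.RFC4180.to_line_stream/1
  # does this properly (including quoted fields that span lines).
  def to_lines(stream) do
    Stream.transform(stream, "", fn chunk, acc ->
      # Prepend whatever was left over from the previous chunk,
      # then split on newlines; the last part may be an incomplete line.
      parts = String.split(acc <> chunk, "\n")
      {complete, [rest]} = Enum.split(parts, -1)
      {Enum.map(complete, &(&1 <> "\n")), rest}
    end)
  end
end

["stop_id,stop_na", "me\n1,Central\n2,No", "rth\n"]
|> LineChunker.to_lines()
|> Enum.to_list()
# => ["stop_id,stop_name\n", "1,Central\n", "2,North\n"]
```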

I can then use this stream like this:

    object
    |> get_object_stream()
    |> NimbleCSV.RFC4180.parse_stream()

As you can see, I’m not streaming directly from S3, since I first download the object into memory. But if you already have something able to stream from S3, you would just replace the part that constructs the stream from the in-memory string with your S3 stream.


Echoing what @ahamez has already mentioned, the issue seems to be that CSV.decode expects a stream of lines, but Unzip.file_stream! returns a stream of binary chunks. You can convert the stream of chunks to a stream of lines yourself, or you can use NimbleCSV as already mentioned:

    Unzip.file_stream!(unzip, file_name)
    |> NimbleCSV.RFC4180.to_line_stream()
    |> NimbleCSV.RFC4180.parse_stream()

Thanks for your replies.

Unfortunately, applying to_line_stream and parse_stream yields an error:

    stream
    |> NimbleCSV.RFC4180.to_line_stream()
    |> NimbleCSV.RFC4180.parse_stream()
    |> Stream.run()

    ** (ArgumentError) errors were found at the given arguments:

      * 1st argument: not a bitstring
        :erlang.bit_size(["\uFEFFservice_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date\r\nE1-5-1-127,1,1,1,1,1,1,1,20220115,20220115\r\nH2-0-1-1,1,0,0,0,0,0,0,20220103,20220415\r\nH2-0 (...)
        (nimble_csv 1.2.0) lib/nimble_csv.ex:393: NimbleCSV.RFC4180.to_line_stream_chunk_fun/3
        (elixir 1.12.2) lib/stream.ex:264: anonymous fn/4 in Stream.chunk_while_fun/2
        (elixir 1.12.2) lib/enum.ex:4280: Enumerable.List.reduce/3
        (elixir 1.12.2) lib/stream.ex:931: Stream.do_list_transform/7
        (elixir 1.12.2) lib/stream.ex:1719: Enumerable.Stream.do_each/4
        (elixir 1.12.2) lib/stream.ex:880: Stream.do_transform/5
        (elixir 1.12.2) lib/stream.ex:649: Stream.run/1

What I don’t understand is the return structure of Unzip.file_stream!. I would expect the stream to yield data chunk by chunk: with Unzip.file_stream! |> Enum.to_list(), I thought I would get something like ["some binary data", "some other binary data", "..."]. Instead I get a nested list of data that looks like this:

    [
      [
        [
          ["some data"],
          "some other data"
        ],
        "data again"
      ]
    ]

That’s why to_line_stream fails: it expects an enumerable of binaries.
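As an aside, joining such a nested structure back into a single binary is straightforward (a minimal illustration with made-up data):

```elixir
# Flatten the nested lists, then join the fragments into one binary.
chunk = [[["some data, "], "some other data, "], "data again"]
chunk |> List.flatten() |> Enum.join("")
# => "some data, some other data, data again"
```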

OK thanks, I needed to adapt the stream coming from Unzip like so:

    Unzip.file_stream!(unzip, file_name)
    |> Stream.map(fn c -> List.flatten(c) |> Enum.join("") end)
    |> NimbleCSV.RFC4180.to_line_stream()
    |> NimbleCSV.RFC4180.parse_stream()
    |> Enum.to_list()

Thank you all for your kind assistance and for giving me good tips!


List.flatten(c) |> Enum.join("") would probably be better replaced with IO.iodata_to_binary/1.
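Putting that suggestion together with the pipeline above would give something like this (a sketch, assuming the unzip and file_name variables from the original post, and Unzip plus NimbleCSV as dependencies):

```elixir
Unzip.file_stream!(unzip, file_name)
# Collapse each iodata chunk (nested lists of binaries) into one binary
|> Stream.map(&IO.iodata_to_binary/1)
|> NimbleCSV.RFC4180.to_line_stream()
|> NimbleCSV.RFC4180.parse_stream()
|> Enum.to_list()
```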


Nice, thanks @LostKobrakai :+1:

I wrote a blog post on the subject, if that can be useful to someone.