I have a CSV file and I want to read its rows in chunks of 100. To do that I'm using Elixir's Stream functions together with the CSV library:
def read_csv_file do
  File.stream!("annual.csv")
  # each decoded element is an {:ok, row_map} or {:error, message} tuple
  |> CSV.decode(headers: true, strip_fields: true)
  # intended: group the rows into chunks of 100
  |> Stream.chunk_every(100)
  # intended: unwrap the row maps from the result tuples
  |> Enum.map(fn {_, data} -> data end)
end
The decoded data looks like this:
[%{
"Industry_aggregation_NZSIOC" => "Level 1",
"Industry_code_ANZSIC06" => "ANZSIC06 division A",
"Industry_code_NZSIOC" => "AA",
"Industry_name_NZSIOC" => "Agriculture, Forestry and Fishing",
...
},
%{
"Industry_aggregation_NZSIOC" => "Level 1",
"Industry_code_ANZSIC06" => "ANZSIC06 division A",
"Industry_code_NZSIOC" => "AA",
...
},
%{
"Industry_aggregation_NZSIOC" => "Level 1",
"Industry_code_ANZSIC06" => "ANZSIC06 division A",
...
},
%{"Industry_aggregation_NZSIOC" => "Level 1", ...},
%{...},
...
]
This code takes a long time to read all the records. Also, Stream.chunk_every/2 doesn't work the way I expected here: after chunking, each element of the stream is a list of up to 100 result tuples, so the {_, data} pattern in Enum.map no longer matches and the pipeline crashes. If I understand it right, I'd have to unwrap the rows before chunking, something like the sketch below.
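This is an untested reordering (the function name read_csv_in_chunks is just a placeholder, and the {:ok, row} clause would crash on any {:error, _} row):

def read_csv_in_chunks do
  File.stream!("annual.csv")
  |> CSV.decode(headers: true, strip_fields: true)
  # unwrap each {:ok, row} result into a plain row map
  |> Stream.map(fn {:ok, row} -> row end)
  # now each chunk is a list of up to 100 row maps
  |> Stream.chunk_every(100)
end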
That still leaves the performance problem, though. I want to understand how Stream.resource/3 would be useful here. Or can we improve the performance without using it?
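From the docs, I imagine a Stream.resource/3 version would look roughly like the sketch below (untested; the function name stream_lines and the mode list are my guesses). As far as I can tell this is essentially what File.stream!/1 already does, which is why I'm unsure it would help:

def stream_lines(path) do
  Stream.resource(
    # start: open a file handle
    fn -> File.open!(path, [:read]) end,
    # next: emit one line per step, halt on :eof or error
    fn file ->
      case IO.read(file, :line) do
        line when is_binary(line) -> {[line], file}
        _ -> {:halt, file}
      end
    end,
    # after: always close the handle
    fn file -> File.close(file) end
  )
end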