Background
I have to read a CSV file. Currently this happens at compile time, because the pipeline is evaluated inside a module attribute:
# Imagine this CSV file has 3 columns: "sport, country, league"
@csv_sports_data :my_app
                 |> :code.priv_dir()
                 |> Path.join("awesome_csv.csv")
                 |> File.stream!()
                 |> CSV.decode!(headers: false, separator: ?;)
                 |> Stream.map(&List.to_tuple/1)
                 |> Enum.uniq()
Because this runs at compile time (IIRC), I now have a module attribute holding the data I need as a list of tuples. So far so good.
Problem
The problem comes when I need to do the same thing, multiple times, with small variations:
# The duplication, IT BURNS !!!
# Imagine this CSV file has 3 columns: "sport, country, league"
@csv_sports_data :my_app
                 |> :code.priv_dir()
                 |> Path.join("awesome_csv.csv")
                 |> File.stream!()
                 |> CSV.decode!(headers: false, separator: ?;)
                 |> Stream.map(&List.to_tuple/1)
                 |> Enum.uniq()

@sports :my_app
        |> :code.priv_dir()
        |> Path.join("awesome_csv.csv")
        |> File.stream!()
        |> CSV.decode!(headers: false, separator: ?;)
        |> Stream.map(&List.to_tuple/1)
        |> Stream.uniq()
        # Always trim data from pesky users!
        |> Stream.map(fn {sport, country, league} ->
          {String.trim(sport), String.trim(country), String.trim(league)}
        end)
        |> Stream.map(fn {sport, _country, _league} -> sport end)
        # No empty sports!
        |> Enum.filter(fn sport -> sport != "" end)

@countries :my_app
           |> :code.priv_dir()
           |> Path.join("awesome_csv.csv")
           |> File.stream!()
           |> CSV.decode!(headers: false, separator: ?;)
           |> Stream.map(&List.to_tuple/1)
           |> Stream.uniq()
           # Always trim data from pesky users!
           |> Stream.map(fn {sport, country, league} ->
             {String.trim(sport), String.trim(country), String.trim(league)}
           end)
           # Enum.map (not Stream.map) so the attribute holds a list, not a lazy stream
           |> Enum.map(fn {_sport, country, _league} -> country end)
           # We allow empty countries to make the example interesting
As you can see, I have a lot of duplicated code. At the very least I could place
:my_app
|> :code.priv_dir()
|> Path.join("awesome_csv.csv")
|> File.stream!()
|> CSV.decode!(headers: false, separator: ?;)
|> Stream.map(&List.to_tuple/1)
|> Stream.uniq()
into a function or variable and then reuse it in @sports and @countries. The trimming function is another candidate. And then there are the small differences between @sports and @countries, where I select only the values I want.
Things I tried
So, my first try was to use @csv_sports_data inside the @sports and @countries attributes. Obviously this didn't work, as I can't use something that was not yet compiled in an attribute that is itself being generated at compile time.
# This won't work
@sports
@csv_sports_data
# Always trim data from pesky users!
|> Stream.map(
fn {sport, country, league} ->
{String.trim(sport), String.trim(country), String.trim(league)}
end)
|> Stream.map(fn {sport, _country, _league} -> sport end)
#No empty sports!
|> Enum.filter(fn sport -> sport != "" end)
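One thing worth double-checking here: a module attribute that was set earlier in the module body can normally be read while defining a later attribute, as long as the attribute name and the start of its value sit on the same line (a bare `@sports` on its own line is a read, not a definition). Below is a minimal sketch of that, not a confirmed fix for the code above. The module name `SportsDemo` and the inline rows are hypothetical stand-ins for the decoded CSV.

```elixir
defmodule SportsDemo do
  # Hypothetical inline rows standing in for the decoded CSV data.
  @csv_sports_data [
    {" Football ", "UK", "Premier League"},
    {"Tennis", " ", "ATP"},
    {"", "Spain", "La Liga"}
  ]

  # The value starts on the same line as the attribute name, so this
  # defines @sports from the previously set @csv_sports_data.
  @sports @csv_sports_data
          |> Stream.map(fn {sport, _country, _league} -> String.trim(sport) end)
          # No empty sports!
          |> Enum.filter(fn sport -> sport != "" end)

  # Expose the compile-time value at runtime.
  def sports, do: @sports
end
```

Ending the pipeline with an `Enum` function matters in this sketch: the attribute then holds a plain list, which can be injected into a function body, whereas a lazy stream contains anonymous functions and cannot.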
My second try was to consider Macros. According to my understanding, I could create a Macro that reads the CSV file at compile time and then have @sports and @countries use it. However, I personally am a believer in the saying:
"The first rule about Macros - don't use Macros"
And I feel that using a Macro in this specific situation would be overkill, so I would like to avoid it.
And then there is also the trim function:
Stream.map(
fn {sport, country, league} ->
# Always trim data from pesky users!
{String.trim(sport), String.trim(country), String.trim(league)}
end)
Which I cannot place inside a def or defp for the sake of reuse.
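The limitation is specific to functions defined in the same module: they are not callable yet while that module's body is still being compiled. A function in a separate module that has already been compiled can be called from a module attribute. The following is a minimal sketch under that assumption. `CsvHelpers`, `SportsData`, and the inline rows are hypothetical names and data, not part of the original code.

```elixir
# Hypothetical helper module; it must be compiled before the module that uses it.
defmodule CsvHelpers do
  # Always trim data from pesky users!
  def trim_row({sport, country, league}) do
    {String.trim(sport), String.trim(country), String.trim(league)}
  end
end

defmodule SportsData do
  # Hypothetical inline rows standing in for the decoded CSV data.
  @rows [
    {" Football ", " UK", "Premier League "},
    {"Tennis", " ", "ATP"}
  ]

  # Calls into the already compiled helper module at compile time.
  @trimmed Enum.map(@rows, &CsvHelpers.trim_row/1)

  # Empty countries are allowed, as in the original example.
  @countries Enum.map(@trimmed, fn {_sport, country, _league} -> country end)

  def trimmed, do: @trimmed
  def countries, do: @countries
end
```

In a Mix project the two modules would typically live in separate files, and the compiler would work out the dependency between them. In a single script they only need to appear in this order.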
What now?
Surely I am missing something. Perhaps the solution I was given to work with the CSV is flawed, or perhaps I am forgetting some mechanism that would reduce the amount of duplicated code I have.
- How can I remove all the duplication?