How to set the decimal separator when reading with CSV.decode() or Explorer.DataFrame.load_csv()

Hi,
the CSV file I want to read comes from the DE region (Germany), where a comma is used as the decimal separator for floating-point numbers, e.g. 4,58.

Unfortunately, I cannot find a parameter in the documentation to specify a different decimal separator for these functions.

Because of the decimal separator (in my case "," instead of "."), an error is raised when I define the column data types while reading the file with CSV.decode(…) or DataFrame.load_csv(…).

Example:
df = DF.load_csv!(content, dtypes: [{"myFloatCol", :f64}])

→ throws the expected error:
RuntimeError{message: "Polars Error: could not parse \"4,58\" as dtype f64 at column 'myFloatCol' (column number 4)

What would be a good practice?

Thanks for your advice
Gordian

You could use the field_transform option of CSV.decode to accomplish most of this, by passing it a function that applies String.replace to each field:

String.replace("-123,45", ~r/^(-?)(\d+),(\d+)$/, "\\1\\2.\\3")

(adjust further if you’re expecting scientific notation too)
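
To make that a bit more concrete, here is a minimal sketch of the replacement wrapped in a function that could be handed to field_transform. The module name DecimalComma and the function normalize/1 are made up for illustration, and the commented-out CSV.decode! call assumes the csv package, a semicolon-separated file, and a file called data.csv, none of which come from the original post.

```elixir
defmodule DecimalComma do
  # Hypothetical helper; the module and function names are illustrative only.

  # Rewrites a field that consists solely of a comma-decimal number
  # ("4,58", "-123,45") to dot notation. Any other field is returned unchanged.
  def normalize(field) when is_binary(field) do
    String.replace(field, ~r/^(-?)(\d+),(\d+)$/, "\\1\\2.\\3")
  end
end

DecimalComma.normalize("4,58")        # => "4.58"
DecimalComma.normalize("-123,45")     # => "-123.45"
DecimalComma.normalize("Berlin, DE")  # => "Berlin, DE" (no match, untouched)

# Assumed usage with the csv package (not run here; file name and separator are guesses):
#
#   File.stream!("data.csv")
#   |> CSV.decode!(separator: ?;, headers: true, field_transform: &DecimalComma.normalize/1)
#   |> Enum.to_list()
```

Because the pattern is anchored at both ends, text fields that merely contain a comma are left alone. Values with thousands separators such as 1.234,56 do not match either, so they would need an extra rule.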
