Explorer.DataFrame and Livebook help

Hello!

I am a BEGINNER and apologize in advance for the elementary questions.

I want to do some data transformations to a dataframe in Livebook, but currently one of the dataframe’s columns/series representing a :utc_datetime from ecto has a :string dtype, and I cannot find a simple/automated way to convert this from a :string to a :utc_datetime and then to an Explorer.Series :date.

I followed this procedure to set up the livebook:

  1. ran an ecto.query on my postgresql database, where my select statement returns a map with atom keys
  2. passed the query as an argument to Ecto.Repo.all()
  3. converted the map from Repo.all() to a csv file
  4. so far have been doing useful data transforms and visualization using Explorer and Vega_Lite in Livebook, HOWEVER: in livebook the function Explorer.DataFrame.from_csv!/1 interpreted a :utc_datetime column as a :string series, and I don’t know how to convert the :string back to any useful/date-related Explorer.Series dtype.

At this exact moment I would appreciate a quick fix to this conversion struggle, but in general I hope to soon use livebook for more rigorous machine learning algorithms and data manipulations on data from my db, so any and all ways to streamline the above clunky process would be GREATLY appreciated.

Thanks! Y’all are awesome!!!

The following is Claude’s LLM answer to your query - hope it helps you!

Here’s how you can handle this:

  1. When loading the CSV, you can use the parse_dates option to automatically parse date/datetime strings:
df = Explorer.DataFrame.from_csv!("your_file.csv", parse_dates: true)

If that doesn’t work automatically, you can manually convert the column using mutate. Here’s how:

require Explorer.DataFrame

df = Explorer.DataFrame.mutate(
  df,
  datetime_col: cast(string_datetime_col, :datetime)
)

For more control over the parsing, you can use Explorer’s datetime casting functions. Here’s a complete example:

require Explorer.DataFrame, as: DF

# Starting with your string datetime column
df = DF.from_csv!("your_file.csv")

# Convert string to datetime
df = DF.mutate(
  df,
  datetime_col: strptime(string_datetime_col, "%Y-%m-%d %H:%M:%S", unit: :microsecond)
)

# If you specifically need just the date portion:
df = DF.mutate(
  df,
  date_col: dt_date(datetime_col)
)

For a more streamlined workflow from Ecto to Explorer, consider:

  1. Use Ecto’s select_merge to format dates in the query itself
  2. Or use Explorer’s ADBC functionality to read directly from your database:
{:ok, df} = Explorer.DataFrame.from_query(
  conn,  # ADBC connection
  "SELECT * FROM your_table",
  []
)

This would give you properly typed columns directly from your database without needing the CSV intermediate step.

The key functions to remember for datetime manipulation in Explorer are:

  • strptime - Parse string to datetime
  • cast - General type conversion
  • dt_date - Extract date from datetime
  • parse_dates option in from_csv
1 Like