How to replace positive and negative infinity with NaN in all columns of a dataframe?

I have a big dataframe with 74 columns and 756270 rows, and I would like to make sure there are no infinities in any of its rows. I tried using mutate, but I got stuck trying to check the type of each column.

    DF.mutate(
      df,
      for col <- across(0..73) do
        {
          col.name,
          cond do
            Series.and(Series.equal(Series.dtype(col), {:f, 64}), Series.is_infinite(col)) ->
              :nan
            true ->
              col
          end
        }
      end
    )

Hi @Fire-Hound, welcome to the forum!

A few points:

  1. You can filter on the dtype as part of the comprehension, and you don’t need 0..73, since across() with no arguments covers every column:

    for col <- across(), col.dtype == {:f, 64} do
    
  2. When you say you want to make sure there are no infinities, do you want to check? Or do you want to replace (impute) them?

    If you just want to check, this will work:

    require Explorer.DataFrame, as: DF
    
    df = DF.new(a: [:infinity, 1.0, 2.0], b: [0.0, 1.0, 2.0], c: ["x", "y", "z"])
    
    DF.summarise(df,
      for col <- across(), col.dtype == {:f, 64} do
        {col.name, any?(is_infinite(col))}
      end
    )
    
    # #Explorer.DataFrame<
    #   Polars[1 x 2]
    #   a boolean [true]
    #   b boolean [false]
    # >
    

    If you want to impute, you’ll want this:

    DF.mutate(df,
      for col <- across(), col.dtype == {:f, 64} do
        {col.name, if(is_infinite(col), do: :nan, else: col)}
      end
    )
    
    # #Explorer.DataFrame<
    #   Polars[3 x 3]
    #   a f64 [NaN, 1.0, 2.0]
    #   b f64 [0.0, 1.0, 2.0]
    #   c string ["x", "y", "z"]
    # >
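
If it helps, here is a minimal sketch that chains the two snippets above: impute first, then re-run the check to confirm no infinities are left. The sample data adds `:neg_infinity`, since the question asks about both signs and `is_infinite` should flag either one. The names `cleaned` and `check` and the unpinned `Mix.install([:explorer])` call are illustrative assumptions; in Livebook or a Mix project that already depends on Explorer, drop that line.

```elixir
# Assumption: running as a standalone script, so fetch Explorer here.
Mix.install([:explorer])

require Explorer.DataFrame, as: DF

# Sample frame with both positive and negative infinity in column "a".
df = DF.new(a: [:infinity, 1.0, :neg_infinity], b: [0.0, 1.0, 2.0], c: ["x", "y", "z"])

# Step 1: replace infinities with NaN in every f64 column.
cleaned =
  DF.mutate(df,
    for col <- across(), col.dtype == {:f, 64} do
      {col.name, if(is_infinite(col), do: :nan, else: col)}
    end
  )

# Step 2: one boolean per f64 column, true if any infinity remains.
check =
  DF.summarise(cleaned,
    for col <- across(), col.dtype == {:f, 64} do
      {col.name, any?(is_infinite(col))}
    end
  )

IO.inspect(DF.to_columns(check))
```

Both columns should come back `false` here. Note that the `{:f, 64}` filter skips columns of any other dtype, so if your real data has `{:f, 32}` columns you would need to widen the condition.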
    

Thank you! This is really helpful :heart:
