Explorer supports creating a subset of dataframe rows:
df = Explorer.DataFrame.new(a: ["a", "b", "c"], b: [1, 2, 3])
Explorer.DataFrame.filter(df, Explorer.Series.greater(df["b"], 1))
#Explorer.DataFrame<
Polars[2 x 2]
a string ["b", "c"]
b integer [2, 3]
>
And it supports multiple filters:
df = Explorer.DataFrame.new(a: ["a", "b", "c"], b: [1, 2, 3])
b_gt = Explorer.Series.greater(df["b"], 1)
a_eq = Explorer.Series.equal(df["a"], "b")
Explorer.DataFrame.filter(df, Explorer.Series.and(a_eq, b_gt))
#Explorer.DataFrame<
Polars[1 x 2]
a string ["b"]
b integer [2]
>
With that in mind, is there a simple way to select the rows where a column takes any one of several values? For example, something like:
df = Explorer.DataFrame.new(a: ["a", "b", "c"], b: [1, 2, 3])
Explorer.DataFrame.filter(df, Explorer.Series.in(df["b"], [1, 2]))
#Explorer.DataFrame<
Polars[2 x 2]
a string ["a", "b"]
b integer [1, 2]
>
I think you could do this by chaining a bunch of Series.equal/2 and Series.or/2 calls, but at some point that becomes pretty tedious.
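For reference, the chaining approach I have in mind could be sketched roughly as below. This is only a sketch: it assumes the same Explorer version as the examples above, where Explorer.DataFrame.filter/2 accepts a boolean series, and the names values and mask are just illustrative.

```elixir
df = Explorer.DataFrame.new(a: ["a", "b", "c"], b: [1, 2, 3])

# The values column "b" is allowed to take (illustrative name).
values = [1, 2]

# Build one boolean series per allowed value, then OR them all together.
mask =
  values
  |> Enum.map(fn value -> Explorer.Series.equal(df["b"], value) end)
  |> Enum.reduce(fn eq, acc -> Explorer.Series.or(acc, eq) end)

# Keep only the rows where the combined mask is true.
Explorer.DataFrame.filter(df, mask)
```

This should keep the rows where b is 1 or 2, but it builds one intermediate series per value, which is why I am hoping there is something more direct.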