Hello,
I’m trying to load a JSON API response with Explorer.DataFrame.
My API returns a list like the one below:
resp = [%{"idade" => "8", "nome" => "Mirela"}, %{"idade" => "40", "nome" => "Sergio"}]
When I try to run the following code:
require Explorer.DataFrame, as: DF
DF.load_ndjson!(Jason.encode!(resp), lazy: true)
I get an error:
** (ErlangError) Erlang error: :nif_panicked
(explorer 0.5.7) Explorer.PolarsBackend.Native.df_load_ndjson("...
But if I change it to read only the first element, it works:
require Explorer.DataFrame, as: DF
DF.load_ndjson!(Jason.encode!(List.first(resp)))
How can I read the entire list?
Thank you.
The ndjson parser expects JSON objects separated by newlines - for instance, here’s an example from Explorer’s tests:
assert ndjson == """
{"sepal_length":5.1,"sepal_width":3.5,"petal_length":1.4,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":4.9,"sepal_width":3.0,"petal_length":1.4,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":4.7,"sepal_width":3.2,"petal_length":1.3,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":4.6,"sepal_width":3.1,"petal_length":1.5,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":5.0,"sepal_width":3.6,"petal_length":1.4,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":5.4,"sepal_width":3.9,"petal_length":1.7,"petal_width":0.4,"species":"Iris-setosa"}
{"sepal_length":4.6,"sepal_width":3.4,"petal_length":1.4,"petal_width":0.3,"species":"Iris-setosa"}
{"sepal_length":5.0,"sepal_width":3.4,"petal_length":1.5,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":4.4,"sepal_width":2.9,"petal_length":1.4,"petal_width":0.2,"species":"Iris-setosa"}
{"sepal_length":4.9,"sepal_width":3.1,"petal_length":1.5,"petal_width":0.1,"species":"Iris-setosa"}
"""
That is not what Jason.encode!(resp) will produce: that result will have [ at the beginning and , between the objects. You’d need to encode each item in resp separately.
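As a rough sketch of what encoding each item separately could look like, using the resp list from the question (this assumes Jason and Explorer are both installed, and I have not checked it against Explorer 0.5.7 specifically):

```elixir
# Assumes :jason and :explorer are dependencies of the project
# (or installed with Mix.install/1 in a standalone script).
require Explorer.DataFrame, as: DF

resp = [%{"idade" => "8", "nome" => "Mirela"}, %{"idade" => "40", "nome" => "Sergio"}]

# Encode each map on its own and join the results with newlines,
# instead of encoding the whole list as a single JSON array.
ndjson = Enum.map_join(resp, "\n", &Jason.encode!/1)

# The string now holds one JSON object per line, which is the shape
# the ndjson parser expects.
df = DF.load_ndjson!(ndjson)
```

The only change from the original call is how the string is built: one object per line, with no surrounding [ ] and no commas between objects.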
HOWEVER
In general, re-encoding like that should make you suspect that you’re not using the best API for your situation.
The better choice is Explorer.DataFrame.new, in particular the last example labeled “From row data:”.
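A minimal sketch of that approach with the data from the question, assuming the row-data form accepts maps with string keys the same way it accepts the maps in the docs example. The integer conversion is my own addition, since idade arrives as a string; drop it if a string column is fine:

```elixir
# Assumes :explorer is a dependency of the project.
require Explorer.DataFrame, as: DF

resp = [%{"idade" => "8", "nome" => "Mirela"}, %{"idade" => "40", "nome" => "Sergio"}]

# Optional step (not part of the original question): turn "idade"
# into an integer so the column is numeric rather than a string.
rows = Enum.map(resp, fn row -> Map.update!(row, "idade", &String.to_integer/1) end)

# Build the data frame directly from the list of maps, one map per row,
# with no JSON round trip.
df = DF.new(rows)
```

This skips Jason entirely, so there is nothing for the ndjson parser to choke on.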