How to parse results from the Google BigQuery API

I am using google_api to query a BigQuery dataset, but I find it very difficult to parse the data (TableRow/TableFieldSchema) contained in the response.

It looks like there is no built-in way to simply convert this to a Map or anything else. I found this solution that works pretty well, but it seems really strange to me that there is no simpler way to do it.
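To illustrate the kind of thing I mean, here is a simplified sketch of what converting the response to maps ends up looking like (assuming the generated structs mirror the REST response shape, i.e. values in rows[].f[].v and column names in schema.fields[].name):

```elixir
defmodule BQRows do
  # Zip each row's cell values (rows[].f[].v) with the column names taken
  # from schema.fields[].name, mirroring the BigQuery REST response shape.
  def to_maps(%GoogleApi.BigQuery.V2.Model.QueryResponse{rows: rows, schema: schema}) do
    names = Enum.map(schema.fields, & &1.name)

    Enum.map(rows || [], fn row ->
      values = Enum.map(row.f, & &1.v)
      Map.new(Enum.zip(names, values))
    end)
  end
end
```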


Yeah, what you are bumping into can be a chore with the semi-official library, but once you've used it for a while, it's not terrible.

LogFlare has a solution that is similar to one I use in my production systems for turning that output into a list of maps with the correct field names. That is here. Basically, they pull the schema data from the response and coerce the rows into the correct shapes and types, snake-casing the attribute names, etc.
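For the snake-casing part, Macro.underscore/1 from Elixir itself covers most of it; a quick example:

```elixir
# e.g. snake-casing the attribute names that come back in the schema
Enum.map(["userId", "createdAt"], &Macro.underscore/1)
#=> ["user_id", "created_at"]
```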

It is also worth looking at the BQ-related stuff on the Livebook side of the community, specifically the req/req_bigquery library. It just hits the REST API the same as the Google client you point to (which uses Tesla under the hood). I'm not saying you should use it for your project, but it takes another approach to handling that output, using the Table.Reader protocol, as is becoming common for Livebook-related efforts. There are YouTube videos showing how to use this for grabbing data in a Livebook.
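For reference, usage looks roughly like this (a sketch from memory, so option names like :goth and :default_dataset_id, and the MyApp.Goth name, are placeholders to double-check against the req_bigquery README); since the result implements Table.Reader, the table package can turn it back into plain maps:

```elixir
# Sketch of req_bigquery usage; option names are from memory, verify against
# the library's README. Assumes a Goth process is running as MyApp.Goth.
req =
  Req.new()
  |> ReqBigQuery.attach(
    goth: MyApp.Goth,
    project_id: "my-gcp-project",
    default_dataset_id: "my_dataset"
  )

result = Req.post!(req, bigquery: "SELECT id, name FROM my_table LIMIT 10").body

# The result implements the Table.Reader protocol, so the `table` package
# (or Explorer) can consume it directly:
rows = result |> Table.to_rows() |> Enum.to_list()
#=> [%{"id" => 1, "name" => "Ada"}, ...]
```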

As for why the main library doesn't do this for you: the official Google Cloud hex packages (that I am aware of) are built by some sort of code generator based on the API specs. I think it is just a person or two at Google who sponsor those efforts, so I understand why it is as automated as possible. The downside is that they lack any sort of normal polish and don't feel like Elixir libraries. As someone who has written a ton of Google Cloud-related Elixir code over the years, I've learned to keep the Google REST API docs open while trying to decipher these libraries. That is much easier than following the long namespaces around in hexdocs. The last point is that all of the Google libraries are just hitting these same REST API endpoints (no matter the language).

It is also worth hitting up hex.pm and tossing BigQuery into the search. I know there have been a few efforts in various states over the years. Most people I know just end up building a module to clean up the output and reusing it across many projects.

If I have some free cycles next week, I'll try to revisit this thread and toss up a gist of my variation on this if needed.


@zatae After looking more closely at my earlier response, I realized that the LogFlare stuff I linked to was more related to writing to BQ.

My solution for handling the output from BQ is generally based on the same StackOverflow link you posted.

FWIW, here is a gist of a single module I extracted from some production systems in my world. I did just toss in handling for the "JSON" data type, but haven't tested it. The only requirement outside of the BQ library is jason.

I'm not suggesting this is any better than the StackOverflow solution, but it has a few opinionated things going on to meet my needs.
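The core idea is just a per-field coercion step keyed off the schema's types; roughly this shape (a simplified sketch rather than the exact gist, and the JSON branch is the untested part):

```elixir
defmodule BQCoerce do
  # Simplified sketch: zip each row's cells with the schema fields and convert
  # the string values based on the type names BigQuery's REST API reports
  # (e.g. "INTEGER", "FLOAT", "BOOLEAN", "JSON").
  def row_to_map(row, schema_fields) do
    schema_fields
    |> Enum.zip(row.f)
    |> Map.new(fn {field, cell} ->
      {Macro.underscore(field.name), coerce(field.type, cell.v)}
    end)
  end

  defp coerce(_type, nil), do: nil
  defp coerce("INTEGER", v), do: String.to_integer(v)
  defp coerce("FLOAT", v), do: v |> Float.parse() |> elem(0)
  defp coerce("BOOLEAN", v), do: v == "true"
  # The untested bit: JSON columns come back as a JSON-encoded string.
  defp coerce("JSON", v), do: Jason.decode!(v)
  defp coerce(_other, v), do: v
end
```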


Thank you very much for all the useful information.

I have the same feeling, but it's true that once you are comfortable with reading the in-code documentation and the Google API docs together, their API wrapper feels pretty easy to use.

I will give it a try! Thanks.

Thank you for your service. This gist has saved me a significant amount of time :smile:
