Handling of TzWorld data on a Pi Zero

Context

I’m writing custom firmware for the Mood Light - Pi Zero WH Project Kit to turn it into a lamp that follows the sun cycle for a given location. From the hardware point of view, this is a fairly straightforward Nerves project running on a Raspberry Pi Zero, with the LED module controlled via the blinkchain library. One can also point a browser at moodlight.local and see the complete cycle for the given location.

The problem

The core of the logic is simple: given a location (lat, lng, elevation), get timestamps representing dawn, sunrise, sunset, and dusk. To do that with sufficient precision, it’s necessary to know the timezone of the location (so that the times are correctly reported in local time).

The astro library has all the needed functionality, and from my tests it produces results that are accurate enough for the use case (thanks @kip).
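
Roughly, the calls look like this (a minimal sketch of how I use Astro; the {lng, lat} tuple location and the :solar_elevation option used for dawn/dusk are assumptions worth double-checking against the hexdocs):

```elixir
# Sunrise/sunset for a given location and date. The location tuple is
# {lng, lat}; the :solar_elevation values for dawn/dusk are assumptions.
location = {9.19, 45.46}   # Milan, as an example
date = Date.utc_today()

{:ok, sunrise} = Astro.sunrise(location, date)
{:ok, sunset}  = Astro.sunset(location, date)

# Dawn and dusk are the same calculation at civil solar elevation
{:ok, dawn} = Astro.sunrise(location, date, solar_elevation: :civil)
{:ok, dusk} = Astro.sunset(location, date, solar_elevation: :civil)
```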

Astro depends on tz_world, which provides the necessary {lat, lng} → timezone resolution using a fairly large dataset that gets updated over time.

This is where things start to get tricky:

  1. At this point in time, the source GeoJSON file used by TzWorld to produce its database clocks in at around 45MB compressed.
  2. TzWorld processes the file and produces a few other files that get persisted to disk. The one that matters most is a dets database around 760MB in size.
  3. TzWorld lets the developer choose among different backends for timezone resolution, each with different tradeoffs between memory usage and speed. The RPi0 has a single CPU core and 512MB of RAM, so the only backend that works in a reasonable time is the dets backend with an in-memory index cache, which all in all lets the Pi sit at around 80MB of used memory (about half of which is the index cache); see the lookup sketch right after this list.
  4. To get to the point where the device has both the necessary processed data and a warm index, I wrote a separate TzWorld backend which can operate within the memory constraints of the Pi (AKA stream all the things). This means the Pi can boot and, once connected to the WiFi, download the source data, import it into the dets database, and warm the index cache. The trouble is that the process takes around 25-30 minutes: populating the dets database is super slow (because of the CPU and the MicroSD card), and computing the index is also slow and non-parallelizable (single-core CPU again).
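
For reference, a lookup with that backend looks more or less like this (a sketch assuming tz_world’s documented API; the backend module name is taken from the stock backends, and the backend process needs to be running):

```elixir
# Resolving {lng, lat} -> timezone with the DETS + in-memory index cache
# backend (module name assumed from tz_world's stock backends; it must
# already be started, e.g. in the application's supervision tree).
backend = TzWorld.Backend.DetsWithIndexCache
point = %Geo.Point{coordinates: {9.19, 45.46}}  # {lng, lat}

case TzWorld.timezone_at(point, backend) do
  {:ok, time_zone} -> time_zone                # e.g. "Europe/Rome"
  {:error, :time_zone_not_found} -> "Etc/UTC"  # e.g. a point in the ocean
end
```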

I haven’t looked into batch-writing items in the dets table to see if that could help the first half of the process.
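
If it turns out to be worth trying: my understanding is that :dets.insert/2 also accepts a list of objects, so the import loop could write in chunks instead of one record at a time. A sketch (the table name and record shape are placeholders, not tz_world’s actual format):

```elixir
# Chunked writes into a DETS table; the records are placeholders standing
# in for the parsed GeoJSON shapes.
path = String.to_charlist("/data/tz_world_sketch.dets")
{:ok, table} = :dets.open_file(:tz_world_sketch, file: path)

records = for i <- 1..10_000, do: {"Zone/#{i}", :polygon_placeholder}

records
|> Stream.chunk_every(500)                                     # 500 records per insert
|> Enum.each(fn chunk -> :ok = :dets.insert(table, chunk) end)

:ok = :dets.sync(table)                                        # flush to the MicroSD card
:ok = :dets.close(table)
```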

Where to go from here

I’m now trying to brainstorm where to go from here to accelerate the process.

  1. On the one side, I can accept this cost as a one-off that gets paid infrequently (first setup, and whenever a new tz_world database is released). As long as the process doesn’t block the device, I can massage it into something I can monitor and that shows visible progress.
  2. To speed up rebooting in case of failure, I can also write the index cache to disk, using the same version number as the tz_world database.

I’m also wondering whether there’s a way to re-architect the solution. One option I’d like to look into is producing a SQLite database from the initial GeoJSON file and making the firmware download that instead. I don’t have experience with geospatial data though (in any shape or form), so I would need to learn how to store it, query it, and so on. Not a problem, it just represents work to do.

What I don’t want to do is rely on an external HTTP API to resolve location → timezone. I really want that part to remain private as it currently is, and the whole thing to bootstrap on its own once the firmware boots.

What I’m looking for now is some feedback or stories from anyone who has solved similar problems, or any suggestions on what could or couldn’t work. Thanks!

Love the use case, and I really appreciate the clear and detailed description - and thank you for experimenting with a new tz_world backend too. I’m also the maintainer of tz_world, so I guess all roads end up with me for this one :slight_smile:

Would there be value in making the DETS file a downloadable asset of the library? I didn’t do that initially because I was worried about version mismatch between the data structure and the library, but it’s been a couple of years now and the format hasn’t changed (although clearly the data does change over time).

That way you can just download the DETS file which should save quite a bit of time as you point out.
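
On the device side the bootstrap would then shrink to roughly this (a sketch; the URL, path and table name are placeholders):

```elixir
# Download a pre-built DETS artefact and open it directly, skipping the
# GeoJSON import entirely. URL, path and table name are placeholders.
{:ok, _} = Application.ensure_all_started(:inets)
{:ok, _} = Application.ensure_all_started(:ssl)

url = ~c"https://example.com/tz_world/timezones-geodata.dets"
path = "/data/timezones-geodata.dets"

{:ok, {{_, 200, _}, _headers, body}} =
  :httpc.request(:get, {url, []}, [], body_format: :binary)

File.write!(path, body)

true = :dets.is_dets_file(String.to_charlist(path))
{:ok, _table} = :dets.open_file(:tz_world, file: String.to_charlist(path))
```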

The next question would be whether the bounding boxes (aka “the index”) can also be generated as an artefact for download. I expect that’s probably true as well, but I’ll need to refresh myself on how the index is built and accessed.

I’m very open to ideas and suggestions and more than happy to collaborate on a solution that is a better fit for embedded devices.

A SpatiaLite backend implementation would also be a very interesting thing to explore, and I’m up for collaborating on that as well. It may well hold the best chance of a more manageable and performant solution, at the small compromise of requiring another dependency. That could also presumably be a release artefact, since a SQLite database is just a file.
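
For the curious, the lookup itself would boil down to a point-in-polygon query along these lines (a sketch: the table and column names are invented, and the SpatiaLite extension would still need to be loaded into whatever SQLite client the app uses):

```elixir
# Hypothetical SpatiaLite point-in-polygon lookup; "timezones", "geometry"
# and "tzid" are invented names. ?1/?2 are longitude and latitude.
lookup_sql = """
SELECT tzid
FROM timezones
WHERE ST_Contains(geometry, MakePoint(?1, ?2))
  AND timezones.rowid IN (
    SELECT rowid
    FROM SpatialIndex
    WHERE f_table_name = 'timezones'
      AND search_frame = MakePoint(?1, ?2)
  )
"""
```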

Somewhat off-target question: how are you getting the location?

I don’t see a GPS sensor in the project kit, and presumably you aren’t relying on IP Geolocation since it’s an external dependency.

If the user is entering the location, it seems like you could just ask for the desired timezone too.

“How to fit tz_world data into resource-constrained environments” is still a good problem to solve, but sometimes the cheapest way to do a one-off calculation like this is to ask the human :stuck_out_tongue:

Thanks @kip!

My only concern there is the size of the download, but it would work. From the OTP side, the version of the table format should be stable enough.

That would also work - the cached data is a list of tuples, so it can be converted to an Erlang binary and written to disk - this is what I had in mind when I mentioned writing the index cache to disk.
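
Concretely, something along these lines (a sketch; the cache contents and the data version in the filename are placeholders):

```elixir
# Persist the warmed index cache so a reboot can skip the warm-up.
cache_path = "/data/tz_index_cache-2021c.etf"
index_cache = [{"Europe/Rome", {6.6, 36.6, 18.5, 47.1}}]  # placeholder bounding box

# After warming the cache, write it out as a compressed Erlang term
File.write!(cache_path, :erlang.term_to_binary(index_cache, [:compressed]))

# On the next boot, read it back instead of recomputing it
^index_cache =
  cache_path
  |> File.read!()
  |> :erlang.binary_to_term()
```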

I did come across this while researching, and indeed it would provide the necessary APIs. I don’t know how one would go about getting it installed and working in the context of an Elixir application (and Nerves in particular), but I’m happy to try and figure that out.

One thing I noticed while working on the problem is that both Astro and TzWorld could be made more extensible to sidestep some of these issues. For example, Astro doesn’t allow injecting a dependency to resolve location → timezone, while TzWorld doesn’t allow custom backends (I forked it to provide mine).

Both good points, and I’ll do as you suggest - those two changes have been in the back of my mind for a while.

In principle yeah, it would sidestep the problem completely - provided some modifications to Astro so that it doesn’t rely on TzWorld for that functionality.

I’ll tackle that as the first step; it shouldn’t take too long to do.

Thank you! Happy to provide feedback on the API if you wish.

I’ve pushed a commit that makes :tz_world optional and adds a :time_zone_resolver option to Astro.sunrise/3 and Astro.sunset/3. I suggest we move the discussion to GitHub - and of course you’re very welcome to try it out by configuring {:astro, github: "kipcole9/astro"}.
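
Usage would then look something like this (a sketch; the exact contract of the resolver function is an assumption to verify against the commit):

```elixir
# The :time_zone_resolver option with a custom resolver; the argument
# shape and return value of the resolver are assumptions.
resolver = fn {_lng, _lat} -> {:ok, "Europe/Rome"} end

{:ok, sunrise} =
  Astro.sunrise({9.19, 45.46}, Date.utc_today(), time_zone_resolver: resolver)
```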

I’ve just now pushed a commit to tz_world that makes it easier to configure a custom backend as the default. This should fix the issue you were having with your custom backend, since there isn’t a way to propagate the optional backend parameter from Astro to TzWorld. As before, it’s better to continue this conversation on GitHub.

Thank you! Appreciate the speed.

I’ll try and test later today and report on GH.

I’ve published Astro 1.1.0, which provides a strategy to improve resource utilisation on resource-constrained and embedded devices. See the release announcement for more information.