Context
I’m writing a custom firmware for the Mood Light - Pi Zero WH Project Kit to turn it into a lamp that follows the sun cycle for a given location. From the hardware point of view, this is a fairly straightforward Nerves project running on a Raspberry Pi Zero, with the LED module controlled via the blinkchain library. One can also point a browser at moodlight.local and see the complete cycle for the given location.
The problem
The core of the logic is simple: given a location (lat, lng, elevation), get the timestamps for dawn, sunrise, sunset, and dusk. To do that with sufficient precision, it’s necessary to know the timezone of the location, so that the times are correctly reported in local time.
The astro library has all the needed functionality, and from my tests it produces accurate enough results for this use case (thanks @kip).
Astro depends on tz_world, which provides the necessary {lat, lng} → timezone logic using a fairly large dataset that gets updated over time.
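For reference, the core lookup is roughly this; a minimal sketch based on my reading of the Astro docs (the `:solar_elevation` option names are from memory, so worth double-checking), with the coordinates just an example:

```elixir
# Sketch: compute dawn/sunrise/sunset/dusk for one location and date.
# Astro resolves the timezone via TzWorld internally, so the returned
# DateTimes are already in local time for the location.
defmodule MoodLight.SunCycle do
  @location {9.19, 45.46} # {lng, lat} tuple, example coordinates

  def events(date \\ Date.utc_today()) do
    with {:ok, dawn} <- Astro.sunrise(@location, date, solar_elevation: :civil),
         {:ok, sunrise} <- Astro.sunrise(@location, date),
         {:ok, sunset} <- Astro.sunset(@location, date),
         {:ok, dusk} <- Astro.sunset(@location, date, solar_elevation: :civil) do
      {:ok, %{dawn: dawn, sunrise: sunrise, sunset: sunset, dusk: dusk}}
    end
  end
end
```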
This is where things start to get tricky:
- At this point in time, the source GeoJSON file used by TzWorld to produce its database clocks in at around 45MB compressed.
- TzWorld processes the file and produces a few other files that get persisted to disk. The one that matters the most is a `dets` database of 760MB in size.
- TzWorld lets the developer choose among different backends for timezone resolution, each of them with different tradeoffs between memory usage and speed. The RPi0 has one CPU core and 512MB of RAM, so the only backend that works in a reasonable time is the `dets` backend with an in-memory index cache, which all in all lets the Pi sit at 80MB of used memory (where half of it is the index cache). See the configuration sketch after this list.
- To get the device to the point where it has both the processed data and a warm index, I wrote a separate TzWorld backend that can operate within the memory constraints of the Pi (AKA stream all the things). With it, the Pi can boot and, once connected to the WiFi, download the source data, import it into the `dets` database, and warm the index cache. The trouble is that the whole process takes around 25-30 minutes: populating the `dets` database is super slow (because of the CPU and the MicroSD card), and computing the index is also slow and non-parallelizable (because of the single-core CPU anyway).
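For completeness, this is the shape of the backend wiring when using the stock backends; a minimal sketch assuming TzWorld's `DetsWithIndexCache` backend and the documented `timezone_at/2` call (my custom streaming backend slots into the same place):

```elixir
# Sketch: start one of tz_world's stock backends under the application
# supervisor; lookups then go through it explicitly.
children = [
  # dets-backed storage plus an in-memory index cache, the only combination
  # that fits comfortably in the RPi0's 512MB of RAM.
  TzWorld.Backend.DetsWithIndexCache
]

Supervisor.start_link(children, strategy: :one_for_one)

# A lookup (note the {lng, lat} ordering inside Geo.Point):
TzWorld.timezone_at(
  %Geo.Point{coordinates: {9.19, 45.46}},
  TzWorld.Backend.DetsWithIndexCache
)
#=> {:ok, "Europe/Rome"}
```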
I haven’t looked into batch-writing items into the `dets` table to see if that could help speed up the first half of the process.
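If I do try that, the idea would be something along these lines; a rough sketch where the table name, the stream source, and `decode_row/1` are all placeholders, relying on the fact that `:dets.insert/2` accepts a list of objects:

```elixir
# Sketch: decode rows lazily and write them to dets in chunks instead of one
# :dets.insert/2 call per row. The chunk size is a guess to be tuned against
# the MicroSD card's behaviour.
defmodule MoodLight.BatchImport do
  def run(source_stream, table \\ :tz_world_shapes) do
    source_stream
    |> Stream.map(&decode_row/1)   # placeholder for the GeoJSON decoding step
    |> Stream.chunk_every(500)     # batch rows to amortise per-write overhead
    |> Enum.each(fn chunk -> :ok = :dets.insert(table, chunk) end)

    # Flush once at the end rather than relying on dets auto-save.
    :ok = :dets.sync(table)
  end

  # Placeholder: must return a tuple whose first element is the dets key.
  defp decode_row(row), do: row
end
```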
Where to go from here
I’m now trying to brainstorm where to go from here to accelerate the process.
- On the one hand, I can accept this cost as a one-off that gets paid infrequently (on first setup, and when a new tz_world database is released). As long as the process doesn’t block the device, I can massage it into something I can monitor and that provides visible progress.
- To speed up rebooting in case of failure, I can also write the index cache to disk, using the same version number as the tz_world database.
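That second point should be cheap to implement; a minimal sketch, where the cache path, the filename scheme, and the shape of the index term are all my own assumptions:

```elixir
# Sketch: persist the warmed index cache alongside the dets file, keyed by the
# tz_world dataset version, so a reboot can skip the expensive warm-up.
defmodule MoodLight.IndexCache do
  @cache_dir "/data/tz_world" # writable partition on a Nerves device

  def save(index, version) do
    File.mkdir_p!(@cache_dir)
    File.write!(path(version), :erlang.term_to_binary(index, [:compressed]))
  end

  def load(version) do
    case File.read(path(version)) do
      {:ok, binary} -> {:ok, :erlang.binary_to_term(binary)}
      {:error, _} -> :miss
    end
  end

  defp path(version), do: Path.join(@cache_dir, "index_cache_#{version}.etf")
end
```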
I’m also wondering if there’s a way to re-architect the solution. One option I would like to look into is producing a SQLite database from the initial GeoJSON file and making the firmware download that instead. I don’t have experience with geospatial data though (in any shape or form), so I would need to learn how to store it, query it, and so on. Not a problem, it just represents work to do.
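If I go that route, my current (naive, to be validated) mental model is a two-step lookup: an R-tree virtual table to prefilter candidate timezone polygons by bounding box, then a point-in-polygon check on the few candidates. A sketch of the query shape only; `query/2`, `contains_point?/2`, and the schema are entirely made up:

```elixir
# Hypothetical schema:
#   CREATE VIRTUAL TABLE zone_bbox USING rtree(id, min_lng, max_lng, min_lat, max_lat);
#   CREATE TABLE zone (id INTEGER PRIMARY KEY, tzid TEXT, geometry BLOB);
defmodule MoodLight.TzSqlite do
  # Step 1: bounding-box prefilter via the R-tree index (cheap, a handful of rows).
  @candidates_sql """
  SELECT zone.tzid, zone.geometry
  FROM zone_bbox
  JOIN zone ON zone.id = zone_bbox.id
  WHERE ? BETWEEN min_lng AND max_lng
    AND ? BETWEEN min_lat AND max_lat
  """

  def timezone_at(lng, lat) do
    @candidates_sql
    |> query([lng, lat])            # placeholder for the SQLite driver call
    |> Enum.find_value(fn {tzid, geometry} ->
      # Step 2: exact point-in-polygon test on the decoded geometry.
      if contains_point?(geometry, {lng, lat}), do: tzid
    end)
  end

  # Placeholders: a real implementation needs a SQLite driver and a
  # point-in-polygon routine (e.g. ray casting) over the stored geometry.
  defp query(_sql, _params), do: []
  defp contains_point?(_geometry, _point), do: false
end
```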
What I don’t want to do is rely on an external HTTP API to resolve location → timezone. I really want that part to remain private as it currently is, and the whole thing to bootstrap on its own once the firmware boots.
What I’m looking for now is some feedback or stories from anyone who has solved similar problems, or any suggestions on what could or couldn’t work. Thanks!