Tz_world / timezone_boundary_builder update

tz_world is a library that maps a location (lat, lng) to a timezone.

From time to time the base data in timezone_boundary_builder is updated to reflect updates to the IANA timezone database.

With the IANA timezone database being updated to version 2020a on April 23rd, timezone_boundary_builder has now also updated its release.

Current users of tz_world can update the data source in two ways:

  1. Run mix tz_world.update
  2. In a running system, call TzWorld.reload_timezone_data/0. Application restart is not required - the new data is downloaded and installed in the running system.

Yeah, I can finally replace my manually imported world map in my db without update logic :slight_smile:

It seems the data is loaded into memory on startup. I’m wondering how much memory it takes to load such a db? The geojson file I dealt with was quite big iirc.

I’ll look into that and revert. Thoughts on a better way to dynamically load required data instead of all data and still keep good performance?

Not really. I’ve skipped the problem and put the data in postgres and let it deal with it. But first we’ll need to know if the data you’re using is actually so big it’s worthwhile to think about it. I’m only using it once on registration of users (a.k.a. rarely), so it’s also likely dependent on the use case.

Edit:

@kip I just looked at it. The file added to priv/ is 78 MB, which is fine, but loading the data into memory swallows 1GB of memory. This might be fine for bigger machines, but not so much for smaller ones.


This happens when stopping the app.

Yep, turns out it’s really big (and this is the version not including geo data for the oceans).

I will, for the next release:

  1. Turn the storage and access mechanisms into a behaviour
  2. Implement a :dets backend (first priority)
  3. Implement a PostGIS backend (which was already written by the original author of tz_world)
  4. Implement a non-PostGIS backend for filtering by bounding box

Probably going to take a couple of weeks to get this done.

Thanks for the vigilance @LostKobrakai.


I have refactored tz_world to separate different backend strategies. Please feel free to give it a try. I plan to release a new version on hex later this week.

New Backend Access modules

  • TzWorld.Backend.Memory which retains all data in memory for fastest performance at the expense of using approximately 1 GB of memory
  • TzWorld.Backend.Dets which uses Erlang’s :dets data store. This uses negligible memory at the expense of slow access times (approximately 500ms in testing)
  • TzWorld.Backend.DetsWithIndexCache which balances memory usage and performance. This backend is recommended in most situations since its performance is similar to TzWorld.Backend.Memory (about 5% slower in testing) and uses about 25 MB of memory
  • TzWorld.Backend.Ets which uses :ets for storage. With the default setting of :compressed for the :ets table its memory consumption is about 512 MB, but access is over 20 times slower than TzWorld.Backend.DetsWithIndexCache

Most interesting is that by caching the bounding boxes in memory, the backend TzWorld.Backend.DetsWithIndexCache looks to be a good balance of memory utilisation and performance. It uses only ~20 MB of memory to store the bounding boxes and has performance similar to the in-memory backend.
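The idea of keeping only the bounding boxes in memory can be sketched as below. This is a minimal illustration, not tz_world's implementation: the module name, zone names and coordinates are all hypothetical, and rectangles stand in for real timezone polygons. The cheap in-memory box test narrows the search to a few candidates, and only those would then need their full geometry fetched from slower storage such as :dets.

```elixir
defmodule BoundingBoxSketch do
  # Hypothetical in-memory index: {zone_name, {min_lng, min_lat, max_lng, max_lat}}.
  # Only these small tuples are kept in memory; full polygons would live elsewhere.
  @index [
    {"Zone/A", {0.0, 40.0, 10.0, 50.0}},
    {"Zone/B", {5.0, 45.0, 20.0, 60.0}},
    {"Zone/C", {-80.0, 30.0, -70.0, 40.0}}
  ]

  # Cheap test: does the box contain the {lng, lat} point?
  def in_box?({lng, lat}, {min_lng, min_lat, max_lng, max_lat}) do
    lng >= min_lng and lng <= max_lng and lat >= min_lat and lat <= max_lat
  end

  # Names of the zones whose bounding box contains the point.
  # An exact point-in-polygon check would only be needed for these candidates.
  def candidates(point) do
    for {name, box} <- @index, in_box?(point, box), do: name
  end
end

IO.inspect(BoundingBoxSketch.candidates({3.2, 45.32}))
```

Most points fall inside very few boxes, so the expensive storage read and geometry test run a handful of times rather than once per zone.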

Basic Benchmark

Name                                 ips        average  deviation         median         99th %
Backend Memory                     52.34       19.10 ms    ±14.86%       18.34 ms       30.62 ms
Backend DetsWithIndexCache         50.10       19.96 ms    ±13.15%       20.79 ms       24.66 ms
Backend Ets                         2.33      428.53 ms     ±2.51%      432.73 ms      441.58 ms
Backend Dets                        0.59     1693.80 ms     ±9.62%     1646.47 ms     2018.27 ms

Comparison: 
Backend Memory                     52.34
Backend DetsWithIndexCache         50.10 - 1.04x slower +0.85 ms
Backend Ets                         2.33 - 22.43x slower +409.43 ms
Backend Dets                        0.59 - 88.66x slower +1674.70 ms

I’m wondering how you’re dealing with overlapping timezones? Seems like only a single timezone is returned, so at least it should be documented how this is resolved.

Fair call. Currently I just return the first match in order to simplify the API for a consumer. Perhaps I should also add all_timezones_at/2, which would always return a list. Open to suggestions on how to handle the overlapping areas.
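The difference between the two approaches can be sketched as follows. This is an illustration only, under the assumption that zones are checked in a fixed order; the module, function names and rectangular zones are hypothetical and are not tz_world's code. It shows why a first-match result depends on the order of the data, while an all-matches result does not hide the overlap.

```elixir
defmodule OverlapSketch do
  # Hypothetical regions as {name, {min_lng, min_lat, max_lng, max_lat}}.
  # The two boxes overlap between lng 5..10 and lat 45..50.
  def zones do
    [
      {"Zone/A", {0.0, 40.0, 10.0, 50.0}},
      {"Zone/B", {5.0, 45.0, 20.0, 60.0}}
    ]
  end

  defp contains?({min_lng, min_lat, max_lng, max_lat}, {lng, lat}) do
    lng >= min_lng and lng <= max_lng and lat >= min_lat and lat <= max_lat
  end

  # First match only: the answer depends on the order of `zones`.
  def first_zone_at(point, zones) do
    case Enum.find(zones, fn {_name, box} -> contains?(box, point) end) do
      {name, _box} -> {:ok, name}
      nil -> {:error, :no_zone}
    end
  end

  # Every match, possibly an empty list.
  def all_zones_at(point, zones) do
    for {name, box} <- zones, contains?(box, point), do: name
  end
end

zones = OverlapSketch.zones()
IO.inspect(OverlapSketch.first_zone_at({7.0, 47.0}, zones))
IO.inspect(OverlapSketch.all_zones_at({7.0, 47.0}, zones))
```

Reversing the list changes the first-match answer for a point in the overlap, which is the behaviour worth documenting.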


Fixed a few bugs, updated the documentation to reflect @lostkobrakai’s suggestion and added an even faster backend, TzWorld.Backend.EtsWithIndexCache, which is about 40% faster than TzWorld.Backend.Memory although it does take about 512 MB. Results:

Name                                 ips        average  deviation         median         99th %
Backend EtsWithIndexCache          73.37       13.63 ms    ±20.09%       13.81 ms       19.01 ms
Backend Memory                     51.67       19.35 ms    ±14.90%       18.50 ms       29.64 ms
Backend DetsWithIndexCache         49.23       20.31 ms    ±13.60%       21.18 ms       25.44 ms
Backend Ets                         2.25      444.57 ms     ±2.05%      445.87 ms      456.08 ms
Backend Dets                        0.54     1853.16 ms    ±23.09%     1684.00 ms     2726.35 ms

Comparison: 
Backend EtsWithIndexCache          73.37
Backend Memory                     51.67 - 1.42x slower +5.72 ms
Backend DetsWithIndexCache         49.23 - 1.49x slower +6.68 ms
Backend Ets                         2.25 - 32.62x slower +430.94 ms
Backend Dets                        0.54 - 135.97x slower +1839.53 ms

Other improvements:

  • Dialyzer is now also happy.
  • Removed a compile-time expression that would break releases
  • Consolidated the documentation in the TzWorld module

Unless any nasty bugs pop up I’ll release to hex in about 24 hours.


Version 0.4.0 has been published on hex. From the changelog:

Breaking change

  • When passing lng, lat coordinates to TzWorld.timezone_at/2 they must be wrapped in a tuple, for example TzWorld.timezone_at({3.2, 45.32}). This makes it consistent with the Geo.Point and Geo.PointZ strategies.
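As a short illustration of the two call styles this change aligns, the snippet below uses the example coordinates from the changelog entry. It assumes tz_world and the geo package are dependencies and that a backend is already running; return values are deliberately not shown because they depend on the installed data and library version.

```elixir
# Coordinates are given as {lng, lat}, wrapped in a tuple.
TzWorld.timezone_at({3.2, 45.32})

# The same point expressed as a Geo.Point struct from the geo package.
TzWorld.timezone_at(%Geo.Point{coordinates: {3.2, 45.32}})
```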

Configurable backends

  • TzWorld.Backend.Memory which retains all data in memory for fast (but not fastest) performance at the expense of using approximately 1 GB of memory. Generally not recommended.
  • TzWorld.Backend.Dets which uses Erlang’s :dets data store. This uses negligible memory at the expense of slow access times (approximately 500ms in testing)
  • TzWorld.Backend.DetsWithIndexCache which balances memory usage and performance. This backend is recommended in most situations since its performance is similar to TzWorld.Backend.Memory (about 5% slower in testing) and uses about 25 MB of memory
  • TzWorld.Backend.Ets which uses :ets for storage. With the default setting of :compressed for the :ets table its memory consumption is about 512 MB, but access is over 20 times slower than TzWorld.Backend.DetsWithIndexCache
  • TzWorld.Backend.EtsWithIndexCache which uses :ets for storage with an additional in-memory cache of the bounding boxes. This still uses about 512 MB but is faster than any of the other backends by about 40%
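A sketch of how one of these backends might be selected in an application follows. It rests on an assumption not stated in this thread: that each backend module can be started as a supervised child process. MyApp is a hypothetical application name, so check the tz_world documentation for the version in use before relying on this.

```elixir
defmodule MyApp.Application do
  # MyApp is a hypothetical application name used for illustration.
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Assumption: the backend module can be started as a supervised child.
      # DetsWithIndexCache is the backend recommended above for most situations.
      TzWorld.Backend.DetsWithIndexCache
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

Swapping in another backend would then be a one-line change in the children list, which keeps the memory versus speed trade-off a deployment decision.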

Enhancements

  • Add TzWorld.all_timezones_at/2 to return all timezones for a given location. In rare cases, usually disputed territory, multiple timezones may be declared for overlapping regions. TzWorld.all_timezones_at/2 returns a (potentially empty) list of all time zones known for a given point. Further testing of this function is required and will be completed before version 1.0.

The timezone-boundary-builder has updated the data used by tz_world.

Any user of tz_world can update the data with the following command. The new data will be used the next time the application is restarted.

mix tz_world.update

To update the data in a running system (i.e. without restarting the application), first download the data using the step above. Then call TzWorld.reload_timezone_data in your application by whichever means is appropriate.

Tz_world version 0.7.1 is also published. The only change is to remove tests that tested against a hardcoded data version.
