With the IANA timezone database being updated to version 2020a on April 23rd, timezone_boundary_builder has now also updated its release.
Current users of tz_world can update the data in two ways:
Run mix tz_world.update
In a running system, call TzWorld.reload_timezone_data/0. Application restart is not required - the new data is downloaded and installed in the running system.
It seems the data is loaded into memory on startup. I’m wondering how much memory it takes to load such a db? The geojson file I dealt with was quite big iirc.
Not really. I’ve skipped the problem and put the data in postgres and let it deal with it. But first we’ll need to know if the data you’re using is actually so big it’s worthwhile to think about it. I’m only using it once on registration of users (a.k.a. rarely), so it’s also likely dependent on the use case.
Edit:
@kip I just looked at it. The file added to priv/ is 78 MB, which is fine, but loading the data into memory swallows 1GB of memory. This might be fine for bigger machines, but not so much for smaller ones.
I have refactored tz_world to separate different backend strategies. Please feel free to give it a try, I plan to release a new version on hex later this week.
New Backend Access modules
TzWorld.Backend.Memory which retains all data in memory for fastest performance at the expense of using approximately 1 GB of memory
TzWorld.Backend.Dets which uses Erlang’s :dets data store. This uses negligible memory at the expense of slow access times (approximately 500ms in testing)
TzWorld.Backend.DetsWithIndexCache which balances memory usage and performance. This backend is recommended in most situations since its performance is similar to TzWorld.Backend.Memory (about 5% slower in testing) and it uses about 25 MB of memory
TzWorld.Backend.Ets which uses :ets for storage. With the default setting of :compressed for the :ets table, memory consumption is about 512 MB, but access is over 20 times slower than TzWorld.Backend.DetsWithIndexCache
Most interesting is that by caching the bounding boxes in memory, the backend TzWorld.Backend.DetsWithIndexCache looks to be a good balance of memory utilisation and performance. It uses only ~20 MB of memory to store the bounding boxes and performs about as well as the in-memory backend.
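To make the bounding-box idea concrete, here is a minimal sketch of the general technique. It is not tz_world's code. It assumes a small in-memory index of {name, bounding box} pairs and a slower store holding the full polygons, with a plain map standing in for something like :dets. The module name BoundingBoxSketch and the data shapes are made up for illustration.

```elixir
defmodule BoundingBoxSketch do
  @moduledoc """
  Illustrative only, not tz_world's implementation.
  `index` is a list of {name, {min_lng, min_lat, max_lng, max_lat}} kept in memory.
  `store` maps name => polygon ring (list of {lng, lat}); a stand-in for a slow store.
  """

  # Cheap test: is the point inside the rectangle?
  def in_bbox?({lng, lat}, {min_lng, min_lat, max_lng, max_lat}) do
    lng >= min_lng and lng <= max_lng and lat >= min_lat and lat <= max_lat
  end

  # Expensive test: ray casting. Count how many polygon edges a horizontal
  # ray from the point crosses; an odd count means the point is inside.
  def in_ring?({x, y}, ring) do
    edges = Enum.zip(ring, tl(ring) ++ [hd(ring)])

    crossings =
      Enum.count(edges, fn {{x1, y1}, {x2, y2}} ->
        (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1
      end)

    rem(crossings, 2) == 1
  end

  # Filter on the cached boxes first, then fetch and test the full
  # polygon only for the few candidates that survive.
  def lookup(point, index, store) do
    index
    |> Enum.filter(fn {_name, bbox} -> in_bbox?(point, bbox) end)
    |> Enum.find(fn {name, _bbox} -> in_ring?(point, Map.fetch!(store, name)) end)
    |> case do
      nil -> {:error, :not_found}
      {name, _bbox} -> {:ok, name}
    end
  end
end
```

The point of the split is that the rectangle test touches only the small index, so the large geometries are read from the slow store only for the handful of zones whose box contains the point.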
Basic Benchmark
Name                          ips     average  deviation      median      99th %
Backend Memory              52.34    19.10 ms    ±14.86%    18.34 ms    30.62 ms
Backend DetsWithIndexCache  50.10    19.96 ms    ±13.15%    20.79 ms    24.66 ms
Backend Ets                  2.33   428.53 ms     ±2.51%   432.73 ms   441.58 ms
Backend Dets                 0.59  1693.80 ms     ±9.62%  1646.47 ms  2018.27 ms

Comparison:
Backend Memory              52.34
Backend DetsWithIndexCache  50.10 - 1.04x slower +0.85 ms
Backend Ets                  2.33 - 22.43x slower +409.43 ms
Backend Dets                 0.59 - 88.66x slower +1674.70 ms
I’m wondering how you’re dealing with overlapping timezones? Seems like only a single timezone is returned, so at least it should be documented how this is resolved.
Fair call, currently I just return the first match in order to simplify the API for a consumer. Perhaps I should also add all_timezones_at/2 as well which would always return a list. Open to suggestions on how to handle the overlapping areas.
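For anyone weighing the two API shapes, here is a small sketch of the difference between "first match" and "all matches". It is not tz_world's code: the regions are simplified to rectangles, and the module, function names and error atom are hypothetical.

```elixir
defmodule OverlapSketch do
  @moduledoc "Illustrative only; regions are simplified to overlapping rectangles."

  # Two made-up regions that overlap in the square from (5, 5) to (10, 10).
  @regions [
    {"Zone/A", {0.0, 0.0, 10.0, 10.0}},
    {"Zone/B", {5.0, 5.0, 15.0, 15.0}}
  ]

  defp contains?({min_lng, min_lat, max_lng, max_lat}, {lng, lat}) do
    lng >= min_lng and lng <= max_lng and lat >= min_lat and lat <= max_lat
  end

  # Always returns a list, which may be empty or hold several names.
  def all_zones_at(point) do
    for {name, box} <- @regions, contains?(box, point), do: name
  end

  # Returns only the first match, so the answer in an overlapping
  # area depends on the order in which regions are stored.
  def first_zone_at(point) do
    case all_zones_at(point) do
      [first | _] -> {:ok, first}
      [] -> {:error, :not_found}
    end
  end
end
```

With a first-match API the caller never learns that a second zone also claimed the point, which is why the resolution rule is worth documenting.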
Fixed a few bugs, updated the documentation to reflect @lostkobrakai’s suggestion and added an even faster backend, TzWorld.Backend.EtsWithIndexCache, which is about 40% faster than TzWorld.Backend.Memory although it does take about 512 MB. Results:
Name                          ips     average  deviation      median      99th %
Backend EtsWithIndexCache   73.37    13.63 ms    ±20.09%    13.81 ms    19.01 ms
Backend Memory              51.67    19.35 ms    ±14.90%    18.50 ms    29.64 ms
Backend DetsWithIndexCache  49.23    20.31 ms    ±13.60%    21.18 ms    25.44 ms
Backend Ets                  2.25   444.57 ms     ±2.05%   445.87 ms   456.08 ms
Backend Dets                 0.54  1853.16 ms    ±23.09%  1684.00 ms  2726.35 ms

Comparison:
Backend EtsWithIndexCache   73.37
Backend Memory              51.67 - 1.42x slower +5.72 ms
Backend DetsWithIndexCache  49.23 - 1.49x slower +6.68 ms
Backend Ets                  2.25 - 32.62x slower +430.94 ms
Backend Dets                 0.54 - 135.97x slower +1839.53 ms
Other improvements:
Dialyzer is now also happy.
Removed a compile-time expression that would break releases
Consolidated the documentation in the TzWorld module
Unless any nasty bugs pop up I’ll release to hex in about 24 hours.
Version 0.4.0 has been published on hex. From the changelog:
Breaking change
When passing a lng, lat pair to TzWorld.timezone_at/2, the coordinates must now be wrapped in a tuple, for example TzWorld.timezone_at({3.2, 45.32}). This makes the call consistent with the Geo.Point and Geo.PointZ strategies.
Configurable backends
TzWorld.Backend.Memory which retains all data in memory for fast (but not fastest) performance at the expense of using approximately 1 GB of memory. Generally not recommended.
TzWorld.Backend.Dets which uses Erlang’s :dets data store. This uses negligible memory at the expense of slow access times (approximately 500ms in testing)
TzWorld.Backend.DetsWithIndexCache which balances memory usage and performance. This backend is recommended in most situations since its performance is similar to TzWorld.Backend.Memory (about 5% slower in testing) and it uses about 25 MB of memory
TzWorld.Backend.Ets which uses :ets for storage. With the default setting of :compressed for the :ets table, memory consumption is about 512 MB, but access is over 20 times slower than TzWorld.Backend.DetsWithIndexCache
TzWorld.Backend.EtsWithIndexCache which uses :ets for storage with an additional in-memory cache of the bounding boxes. This still uses about 512 MB but is faster than any of the other backends by about 40%
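As a rough sketch of how a backend might be wired in: to my understanding each backend is a process that can be started from the application's supervision tree, but that is an assumption here, so check the README for the version you install. MyApp.Application and MyApp.Supervisor are placeholder names.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Assumption: the backend module can be listed directly as a child.
      # Swap in another backend module to trade memory for speed.
      TzWorld.Backend.DetsWithIndexCache
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

Once a backend is running, a lookup such as TzWorld.timezone_at({3.2, 45.32}) should be served by it.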
Enhancements
Add TzWorld.all_timezones_at/2 to return all timezones for a given location. In rare cases, usually disputed territory, multiple timezones may be declared for overlapping regions. TzWorld.all_timezones_at/2 returns a (potentially empty) list of all time zones known for a given point. Further testing of this function is required and will be completed before version 1.0.
Any user of tz_world can update the data with the following command. The new data will be used the next time the application is restarted.
mix tz_world.update
To update the data in a running system (i.e. without restarting the application), first download the data using the step above. Then call TzWorld.reload_timezone_data/0 from within your application by whichever means is appropriate.
Tz_world version 0.7.1 is also published. The only change is to remove tests that tested against a hardcoded data version.
Heya, just wanted to drop a note and say thanks for this lib! It’s exactly what I was looking for. Sometimes devs just snatch up open source libs off the shelf and never acknowledge the hard work that people put into them. So thanks! I starred the repo as well.
I’ve just published tz_world version 1.3.3 that fixes the one warning that was occurring on Elixir 1.17.
I’m thinking about an update that simplifies the library overall in the following ways and would greatly appreciate feedback from any current or potential users:
Remove all the backend options except TzWorld.Backend.EtsWithIndexCache and TzWorld.Backend.DetsWithIndexCache. This would remove the backend options TzWorld.Backend.Memory, TzWorld.Backend.Ets and TzWorld.Backend.Dets.
Make the --include-oceans option to mix tz_world.update the standard behaviour and remove the option. The download is about 10% larger than without oceans, but it gives global coverage for timezone resolution.