Handling dates, times and timezones correctly

egeersoz · September 5, 2019, 9:20pm

I’m working on an app that can potentially be used across multiple time zones (at the very least, if users go on vacation and access the app from another timezone). I’ve never tackled this issue before, but I know it’s pretty complex, so I wanted to post my design here with the hopes of getting some feedback.

I use Postgres.

For example purposes, I want to talk about the To-Do Item context, since it’s a fairly common thing that a lot of people have experience with.

The desired behavior from the perspective of the user:

When creating/editing a to-do item, the user can provide a due date, a due date and due time, or neither.
If the user is located on the East Coast when they create the to-do item, someone who views that item on the West Coast (including potentially the same user) should see the due date and time in their own timezone. They should also see the timezone the item was created from, if it is different from their current timezone. Microsoft Outlook has a good example:

The timezone snippet displays for me because I’m currently on Central Time.

Anyways, here’s the design I came up with:

When the user first signs up, I capture the name of their current timezone (e.g. “America/Chicago”) using JavaScript and store it in a default_timezone text column.
Whenever the user logs into the app from that point forward, if the app detects a discrepancy between the account’s default_timezone and the browser’s timezone, it displays a warning:

They can ignore this (which makes the prompt go away for a week) or go to their account settings and pick the new timezone name from a dropdown.

Now, let me talk about the to_do_items table and associated context. Along with the typical fields (event name, description, etc.) it has three columns:

due_date (type: date)
due_time (type: time without timezone)
due_time_tz (type: text)

I originally had the first two columns as a single datetime column, but since the time field is optional, it introduced some complexity when doing the conversions while inserting/updating so I split them up. I noticed this is different from how many calendar apps handle it - for example, in Outlook if you don’t enter times, they default to 12:00 AM.

Anyways, when the user is creating a to-do item, the browser also sends the user’s current timezone name, which I stick into the due_date_tz column. Then, when the item is being viewed, I use moment.js and moment-timezone.js to look at the date, time and timezone on the object and display their combined result in the user’s current timezone. For example, “9-5-2019 1:00 AM - America/Chicago” is displayed as “9-4-2019 11:00 PM” when viewed from America/Los_Angeles.

Based on some cursory testing, this seems to work well. But cursory testing is not a substitute for real usage, especially over long periods when timezones can change! My question is, is this design usable, or is it totally and irreparably stupid?

I thought about a few ways in which it might be made better:

Combine the due_date and due_time back into a single due_date_time field, and make it a timestamp with time zone column.
If the user is submitting an item without a due time, automatically default the time to 12:00 AM.
Take the submitted due date and due time, along with the user’s current timezone, and convert them to UTC on the server (e.g. using Timex) before storing in the due_date_time field. If a timezone is not provided during submission (for whatever reason), fall back to the default_timezone on the user’s account.

Thoughts?

LostKobrakai · September 6, 2019, 5:54am

There are a few things:

You want to store wall datetimes and utc datetimes for future points in time, as the timezone definition could change between now and the event:

http://www.creativedeletion.com/2015/03/19/persisting_future_datetimes.html

Dates by themself don‘t really have timezones. You could have an interval of datetimes, which span the time of a certain date in a timezone though.
*_tz columns in postgres don‘t actually do much when using ecto. They don‘t store timezone information, but try to make sure a dt is converted to utc, which ecto already enforces. Unless you use other systems as well to use the db I‘d stick to non *_tz columns.
Using separate date and time columns for a single datetime is imo a really bad idea. If you want datetime or date to be possible use a datetime and a date column.

egeersoz · September 6, 2019, 5:05pm

Can you explain why having date and time columns separate is a bad idea?

You want to store wall datetimes and utc datetimes for future points in time, as the timezone definition could change between now and the event: http://www.creativedeletion.com/2015/03/19/persisting_future_datetimes.html

If I am understanding this article correctly, the problem that is described was caused by the fact that the software was converting the provided wall time to UTC, and three months later was displaying the incorrect time because the timezone rules had changed, and the conversion from UTC back to wall time was performed using the offset in the new ruleset.

In my case, I store 2019-12-18 13:00:00 (timestamp without time zone, i.e. wall time) in America/Chicago. No UTC conversion. On December 18th, 2019, this time would still get displayed correctly, because it is the time that the user actually intended.

The article seems to suggest storing the offset anyway to cover for rare scenarios such as when a particular wall time might occur twice.

So it seems the “ideal” solution is:

Store datetime in a “timestamp without time zone” column, along with the name of the timezone, and the offset at the time of data insert/update. Avoid “timestamp with time zone” columns, or converting to UTC explicitly.

LostKobrakai · September 6, 2019, 5:26pm

Without converting to UTC you give up the ability to compare datetimes in the db or compare to a datetime like “now”.

In the db there’s not much difference between putting together a date and time or spliting out the time or date from a datetime, I just feel the latter is the cleaner approach, but in ecto it’s safer/simpler to work with datetimes instead of a date and time field, which is not aware of any timezone concerns at all.

egeersoz · September 6, 2019, 6:03pm

You said:

*_tz columns in postgres don‘t actually do much when using ecto. They don‘t store timezone information, but try to make sure a dt is converted to utc, which ecto already enforces. Unless you use other systems as well to use the db I‘d stick to non *_tz columns.

Then you said:

Without converting to UTC you give up the ability to compare datetimes in the db or compare to a datetime like “now”.

So, is it better to use a *_tz column (so that any provided time is automatically converted to UTC), or not? Or should both be done?

LostKobrakai · September 6, 2019, 6:12pm

In ecto using utc_datetime has utc in it’s prefix not by accident. It’ll make sure you only ever store datetimes with UTC timezone. It’ll raise if you try to persist a datetime with a different timezone. No need to rely on some server/postgres timezone config to convert anything and therefore no need for *_tz columns with ecto. This way you’ll also only ever use the timezone database of elixir, so you don’t run into issues of different timezone databases diverging.

If you additionally to UTC store the timezone and offset used to calculate the UTC datetime you should be able to calculate expected wall time and walltime given the current timezone database. If there’s a difference you can notify the user about it.

egeersoz · September 6, 2019, 10:01pm

Okay, I think I’m making progress on this. What I’ve gathered so far is that one should avoid the *_tz columns in Postgres altogether and allow Ecto to perform the conversion based on the field type in the schema. If it is :naive_datetime, the value will be stored as is (i.e. wall time). If it is :utc_datetime then Ecto will convert it to UTC before storing (and raise an error if one attempts to insert a non-UTC datetime).

I think I also have a somewhat better grasp of the problem domain more generally, but wanted to confirm.

The original link you provided suggests the following as the solution to the problem of accurately handling and displaying future dates:

Instead of saving the time in UTC along with the time zone, developers can save what the user expects us to save: the wall time. Ie. what the clock on the wall will say. In the example that would be 10:00. And we also save the timezone (Santiago/Chile). This way we can convert back to UTC or any other timezone.

If I’m reading the above paragraph correctly, based on what I understand about Ecto types, the wall time should be saved as :naive_datetime, along with the name of the timezone. This way Ecto will not convert it to UTC before storing.

Then, further along, you said:

Without converting to UTC you give up the ability to compare datetimes in the db or compare to a datetime like “now”.

So it sounds like, in addition to the wall time :naive_datetime column, one should also save the same datetime value in a :utc_datetime column? So the schema would look like this:

field :due_date, :naive_datetime
field :due_date_tz, :string
field :due_date_utc, :utc_datetime

The due_date and due_date_tz columns would allow me to display the correct time to the user, and let them manipulate it, based on their current timezone, using moment-timezone JavaScript library (which comes with its own timezone database):

var wallTime = "2019-12-12 17:00:00"
var timeZone = "America/Los_Angeles"
moment.tz(wallTime, timeZone).local().format('MMMM Do YYYY, H:mm:ss');
# December 12th 2019, 19:00:00 pm

(moment-timezone can also deal with ambiguous times, such as those that occur twice)

Whereas the due_date_utc column would be used for doing date comparisons in the DB itself (i.e. raw SQL, for example using Postgres functions like NOW()) without having to rely on Ecto/Elixir (or another programming language that can use a timezone database to calculate values before interpolating them into the SQL statement). To me, this sounds like it would really only matter if I expect to write SQL scripts in the future to run against the database. Is that accurate?

markmark206 · September 9, 2022, 3:13am

(Not a direct answer, but I thought this might be a useful data point to ponder in this context, so recording my experience here.)

FWIW, I found it simple / unambiguous / bug-proof to:

persist “point-in-time” timestamps as “unix seconds”, and
at render time, convert those “unix seconds” timestamps into tz-aware DateTimes, in the timezone relevant to the time / place of rendering.

This

keeps my schemas simple (the timestamp is just an integer), and
makes certain types of timezone-related bugs hard to introduce.
(Because the timezone (such as it is, in the case of unix time;) of the persisted data is crystal clear, the [devs implementing the] code paths that write and interpret that data are unlikely to be confused.)

Here is an example of using this approach for inserted_at / updated_at timestamps.

(This is probably not the best example, but this is what I have handy atm, and this approach works just as well for “business logic” values… assuming the times you are persisting fall within the time period covered by unix timestamps, ofc).

defmodule MyApp.Schema.Base do
  @moduledoc false
  defmacro __using__(_) do
    quote do
      use Ecto.Schema

      @timestamps_opts [
        type: :integer,
        autogenerate: {System, :os_time, [:second]}
      ]

    end
  end
end

defmodule MyApp.Schema.Items do
  @moduledoc false
  use MyApp.Schema.Base

  schema "items" do
    ...
    timestamps()
  end
  ...
end

defmodule MyApp.Repo.Migrations.AddingItems do
  use Ecto.Migration

  def change do
    create table(:items, ...) do
       ...
      timestamps(type: :bigint)
    end
    ...
  end
   ...
end

It is usually easy to have a db query return these timestamps in a human-friendly format. Here is an example in postgres:

myapp_test=# select inserted_at, to_timestamp(inserted_at) from items order by inserted_at desc limit 2;
 inserted_at |      to_timestamp
-------------+------------------------
  1662692701 | 2022-09-09 03:05:01+00
  1662692701 | 2022-09-09 03:05:01+00
(2 rows)