Proposal: strftime-based calendar/datetime formatting

After a second look I realized that this is not enough, because some locales count the hours from 1 to 12 while others use 0 for midnight and 12 for noon. So I have made hours_in_am_pm fully localizable and removed the :midnight and :noon return values.
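As a rough illustration of why the hour mapping needs to be localizable, here is a minimal sketch of two possible conventions. The module and function names are hypothetical and are not part of the proposal; they only show that the same 24-hour value can map to different displayed hours.

```elixir
defmodule HoursSketch do
  # Hypothetical helper: 1-12 convention, midnight and noon both display as 12.
  def one_to_twelve(hour) when hour in 0..23 do
    period = if hour < 12, do: :am, else: :pm

    case rem(hour, 12) do
      0 -> {12, period}
      h -> {h, period}
    end
  end

  # Hypothetical helper: midnight displays as 0, noon displays as 12.
  def zero_for_midnight(hour) when hour in 0..23 do
    period = if hour < 12, do: :am, else: :pm
    displayed = if hour == 12, do: 12, else: rem(hour, 12)
    {displayed, period}
  end
end

HoursSketch.one_to_twelve(0)
# {12, :am}
HoursSketch.zero_for_midnight(0)
# {0, :am}
```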

Yes, this we will have for sure.

Since it is a user-supplied function, it is up to them to provide the relevant error message. We can formalize this later if necessary.

I've been thinking about this a lot on a long plane flight so here goes:

Calendar module responsibilities

This part is not about formatting but separation and encapsulation of concerns. But if there is agreement with this, and my proposal about augmenting the format flags, then it greatly simplifies the contract between strftime and a calendar.

I think a calendar should encapsulate everything relevant to calculating days, weeks, months, years etc. That is, any configuration should be baked into the module that implements the Calendar behaviour.

This seems obvious when we talk about the ISO calendar versus the Balinese, Ancient Egyptian or Islamic calendars. But it's less obvious if we consider the myriad of calendars derived from the Gregorian.

For example, in the US, a company may have a fiscal calendar that starts on an arbitrary day of the year. All months, quarters, weeks and days are calculated from this date. Cisco Systems has a financial year calendar that ends on the last Saturday of July. There is a whole class of calendars called 445 calendars, and their cousins the 454 and 544, which are common in retail and are designed to make it easier to compare this year's periods to the prior year's.
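To make the "last Saturday of July" rule concrete, here is a minimal sketch of how such a year-end date could be computed with the standard Date module. FiscalSketch is a hypothetical name, and Date.new!/3 assumes Elixir 1.11 or later.

```elixir
defmodule FiscalSketch do
  # Returns the last Saturday of July for the given year.
  def last_saturday_of_july(year) do
    last_day = Date.new!(year, 7, 31)
    # Date.day_of_week/1 returns 1 (Monday) through 7 (Sunday); Saturday is 6.
    days_back = Integer.mod(Date.day_of_week(last_day) - 6, 7)
    Date.add(last_day, -days_back)
  end
end

FiscalSketch.last_saturday_of_july(2019)
# ~D[2019-07-27]
```

A full fiscal calendar would then derive its weeks, months and quarters from this anchor date, which is the part that varies between 445, 454 and 544 layouts.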

Although I started out thinking this would be best done as configuration passed around in each structurally compatible Date, Time or DateTime struct or map, the variations are too many. I now feel it is better that this detail be baked into a module that implements the Calendar behaviour. This doesn't make implementing a calendar any more complex. It does potentially have the side effect of having to generate calendar modules at runtime for some use cases.

Calendar Behaviour

If you accept that a Calendar module encapsulates all the configuration and implementation required for the behaviour, then implementing strftime comes back to mostly calling the behaviour functions. And week_of_year should definitely stay: it is used a lot in all of those business calendars, and the notion of a week is, I think we would all agree, a very common case.

If you got this far then the missing pieces of the Calendar behaviour that would be required to support strftime would be day_name, month_name (short and long) and the am/pm functionality described by @josevalim above. Since these will need to be implemented anyway it feels as if they should be part of the Calendar behaviour.
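A minimal sketch of what those additional callbacks might look like follows. The behaviour name, callback names and arities are assumptions for illustration only; they are not part of Elixir's actual Calendar behaviour.

```elixir
defmodule CalendarNamesSketch do
  # Hypothetical callbacks, shown separately from the real Calendar behaviour.
  @type style() :: :short | :long

  @callback day_name(day_of_week :: 1..7, style()) :: String.t()
  @callback month_name(month :: 1..12, style()) :: String.t()
  @callback am_pm(hour :: 0..23) :: String.t()
end

defmodule EnglishNamesSketch do
  @behaviour CalendarNamesSketch

  @days ~w(Monday Tuesday Wednesday Thursday Friday Saturday Sunday)
  @months ~w(January February March April May June July
             August September October November December)

  @impl true
  def day_name(day_of_week, :long), do: Enum.at(@days, day_of_week - 1)
  def day_name(day_of_week, :short), do: day_of_week |> day_name(:long) |> String.slice(0, 3)

  @impl true
  def month_name(month, :long), do: Enum.at(@months, month - 1)
  def month_name(month, :short), do: month |> month_name(:long) |> String.slice(0, 3)

  @impl true
  def am_pm(hour) when hour in 0..11, do: "am"
  def am_pm(hour) when hour in 12..23, do: "pm"
end
```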

Formatting

Formatting for strftime should not be the calendar's concern (in my opinion). That's a representational issue and, other than names for days and months, formatting should be the strftime format's responsibility. In examining the Ruby and Python implementations I see that the Python version is quite sparse but the Ruby version includes an approach that I think is pragmatic.

A Proposal

@spec strftime(date_or_time_or_datetime, format :: String.t(), options :: map()) ::
    String.t() | {:error, String.t()}

format is a format string as described in the original proposal; however, it includes an optional width that is used to determine padding. This is similar to how printf and friends in C define padding.

Format encoding: %<flags><width><conversion>.

Flags (based upon the Ruby implementation)

Flag   Description
-      don't pad a numerical output
_      use spaces for padding
0      use zeros for padding
^      upcase the result string
#      change case
:      use colons for %z

Using this approach allows a calendar implementor to use the %x directive to return a format string that includes the required padding. That is, %x is really just an interpolation of a format defined in the calendar, and that format can include the appropriate padding. The same would apply for %X and %c. The format for these "preferred formats" can then also vary based upon locale without too much difficulty in any locale-aware calendar.

Examples

Format   Output
%m       1, 12
%2m      01, 12
%A       Monday
%^A      MONDAY
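The %<flags><width><conversion> encoding can be sketched with a small regex-driven formatter. This is only an illustration under stated assumptions: FormatSketch is a hypothetical module, it handles just %m, %d and %A, it supports only the -, _, 0 and ^ flags, and it follows the rule in the examples above that numbers are unpadded unless a width is given.

```elixir
defmodule FormatSketch do
  # Hypothetical formatter for a tiny subset of directives.
  def format(%Date{} = date, format) when is_binary(format) do
    # Captures: optional flag, optional width, conversion character.
    Regex.replace(~r/%([-_0^]?)(\d*)([mdA])/, format, fn _whole, flag, width, conversion ->
      width = if width == "", do: 0, else: String.to_integer(width)
      date |> convert(conversion) |> apply_flag(flag, width)
    end)
  end

  defp convert(date, "m"), do: date.month
  defp convert(date, "d"), do: date.day

  defp convert(date, "A") do
    names = ~w(Monday Tuesday Wednesday Thursday Friday Saturday Sunday)
    Enum.at(names, Date.day_of_week(date) - 1)
  end

  # "-" disables padding, "_" pads with spaces, "^" upcases.
  defp apply_flag(value, "-", _width), do: to_string(value)
  defp apply_flag(value, "_", width), do: value |> to_string() |> String.pad_leading(width, " ")
  defp apply_flag(value, "^", width), do: value |> to_string() |> String.upcase() |> String.pad_leading(width, " ")

  # No flag or "0": numbers pad with zeros, names pad with spaces.
  defp apply_flag(value, _flag, width) when is_integer(value),
    do: value |> Integer.to_string() |> String.pad_leading(width, "0")

  defp apply_flag(value, _flag, width), do: String.pad_leading(value, width, " ")
end

FormatSketch.format(~D[2019-01-07], "%m / %2m / %A / %^A")
# "1 / 01 / Monday / MONDAY"
```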

Summary

This proposal intends to:

  1. Reinforce the Calendar behaviour and add to it the missing pieces that support formatting, which are primarily the day_name and month_name (short and long) functions.

  2. Simplify the interaction between strftime and a Calendar module by reverting to the original plan of simply calling the relevant functions in the calendar module.

  3. Make formatting decisions part of the format string (i.e. flags and width modifiers).

Options

I have included an options argument to strftime above because, for localisation, a locale name needs to be specified. In Python a locale is part of the stdlib and runtime environment, but that is not the case in Elixir. Therefore it should be passed in to the calendar. It could be argued that since no other part of Elixir embodies the idea of localisation, this is not the place to start. I admit my bias given my work on Cldr, but I think it should be possible to specify a locale.

A Calendar should feel free to return {:error, "unknown locale"} if a locale is requested that it doesn't support. Or, if a looser contract is preferred, treat the locale as entirely optional.


I am not sure I agree with this. :slight_smile:

The instants are the same and they are represented by the same year, month, day, hour, minute and second, so it does not feel to me they should be different calendars. To me, it is the same calendar, but it wants to be traversed or viewed in different ways. My suggestion would be to explore a WeekTraverser or WeekBuilder or some sort of behaviour that focuses on exploring a given calendar through different interpretations of its weeks. It may be even something that can be implemented on top of iso days (and therefore calendar agnostic)?

I agree with this but the calendar has to provide a default behaviour. So the recent proposal has a default week_of_year but you can customize it via the week_of_year option. The calendar also provides the default values for preferred time and so on and I don't think we can escape this.

While I really like this idea, it means we are no longer implementing strftime, so it needs to be a very conscious choice. The biggest benefit IMO is that it gets rid of all the padding options and, as you said, removes formatting concerns from the calendar.

Those flags are all officially supported by strftime but I decided to go only with three related to padding for now for simplicity. We can always add more later.

Isn't this then forcing localization, and therefore a formatting concern, into the calendar? I would prefer that the calendar does not receive a locale, ever. With the latest proposal, you can localize everything like this:

Calendar.format(datetime, "%format", MyApp.Calendar.format_config(:pt_br))

So I believe we are going with the loose contract you mentioned.
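A minimal sketch of what such a format_config function could return is shown below. It uses Calendar.strftime/3 and its :month_names and :day_of_week_names options, which assumes Elixir 1.11 or later; MyApp.FormatConfigSketch and its contents are hypothetical and only illustrate keeping the locale out of the calendar.

```elixir
defmodule MyApp.FormatConfigSketch do
  # Hypothetical locale data; a real application would load this elsewhere.
  @months_pt_br {"janeiro", "fevereiro", "março", "abril", "maio", "junho",
                 "julho", "agosto", "setembro", "outubro", "novembro", "dezembro"}
  @days_pt_br {"segunda-feira", "terça-feira", "quarta-feira", "quinta-feira",
               "sexta-feira", "sábado", "domingo"}

  # Returns formatting options only; the calendar never sees the locale.
  def format_config(:pt_br) do
    [
      month_names: fn month -> elem(@months_pt_br, month - 1) end,
      day_of_week_names: fn day -> elem(@days_pt_br, day - 1) end
    ]
  end
end

Calendar.strftime(
  ~D[2019-11-03],
  "%A, %d de %B de %Y",
  MyApp.FormatConfigSketch.format_config(:pt_br)
)
# "domingo, 03 de novembro de 2019"
```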

@kip assuming that we will merge the flags and width syntax that you proposed, are there any other changes you would like to see based on my reply?

Thank you.

@josevalim Looks good. I understand the reasoning behind the MyApp.Calendar.format_config(:pt_br) approach. And it's easy to integrate with my Cldr project so that Cldr users can adopt Calendar.format easily.

Totally ok with not agreeing. I just can't reconcile how to do what you suggest and still use the Date, Time and DateTime standard structs which have a :calendar field only. Also every calendar is basically just a different representation of the same moments in time so by extension aren't you saying there is only one calendar and multiple different representations that model year, month, day in alternate ways? Anyway, off topic for this thread.

I have nothing else to add to the proposal and I think a lot of users will benefit from it.


Good call, I copied the discussion to a new thread so we can talk more about it: How to support multiple week calendars in Elixir?


In German-speaking countries we have the concept of a calendar week, which differs from a simple week_of_year number. I am unsure what value week_of_year would return in the currently planned implementation.

  1. A week_of_year can be based on 1JAN of the year, where any date from 1JAN to 7JAN (both inclusive) is interpreted as WeekNr=1, any date from 8JAN to 14JAN (inclusive) is interpreted as WeekNr=2, etc. This could be called week_of_year.

  2. Another way to define a week_of_year is to use a relevant first_day_of_week, e.g. MON, and then count how many days of the new year fall in the first week. If the number of days is >= 4 (for a 7-day calendar) then this is the first week; if there are three or fewer, the week is counted as the last week of the previous year:
    https://en.wikipedia.org/wiki/Week#Week_numbering
    This could be called week_of_year_iso.

The calculation of the week_of_year might thus depend not only on the calendar and the specific date, but also on the defined first day of the week for the calendar, as well as the type of week_of_year number the user is looking for. Just having a week_of_year option might not suffice to differentiate what type of WeekNr ought to be returned, or the returned WeekNr might not be what a caller expects.
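The difference between the two definitions shows up right at the start of a year. The following is a minimal sketch, assuming the ISO calendar; WeekSketch is a hypothetical name, while Date.day_of_year/1 and Erlang's :calendar.iso_week_number/1 are existing functions.

```elixir
defmodule WeekSketch do
  # Definition 1: 1JAN to 7JAN is week 1, 8JAN to 14JAN is week 2, and so on.
  def simple_week(%Date{} = date) do
    div(Date.day_of_year(date) - 1, 7) + 1
  end

  # Definition 2: ISO 8601 week, returned as {week_year, week_number}.
  def iso_week(%Date{} = date) do
    :calendar.iso_week_number(Date.to_erl(date))
  end
end

# 2021-01-01 is a Friday, so only three days of 2021 fall in that week.
WeekSketch.simple_week(~D[2021-01-01])
# 1
WeekSketch.iso_week(~D[2021-01-01])
# {2020, 53}
```

Note that the ISO variant also has to return a year, because the week can belong to the previous or following year.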

@rudolfb the week_of_year would be customizable, so it would return whatever you want. But we have currently removed it from the proposal because we are still not sure that's the direction we want to go :slight_smile:

@kip we have decided to accept this extension but we will keep the default formatting based on ISO. The rationale is the following: if we keep the default without padding, then users of every calendar have to add padding.

However, if we keep the ISO convention, then the ISO users do not have to add explicit padding, but non-ISO users will have to add them exactly as before. This keeps it convenient for ISO users (the majority), it allows us to follow the strftime spec, it allows us to reduce knowledge of formatting in the calendar, while providing useful features to other calendars.

Finally, I would like to note that, by figuring out all of these trade-offs with strftime, we have actually solved the problems we had with ICU. However, we are planning to go with strftime because it has a much smaller surface API.

Do you have any final thoughts? I will update the proposal with those latest changes.


@josevalim No final thoughts. strftime makes more sense for most users, I think, since it's a familiar API across languages, and having the format default to the ISO calendar aligns well too.

Lastly, this approach still allows ex_cldr_dates_times to serve configuration very easily to strftime which I think will be helpful for some people.


Hi everyone, we are glad to say this proposal has been accepted. The next step is to develop it as a separate library for further validation. We will let you know once that happens.


This might be out of scope for the standard formatting, but I had to display hours with value >24h for late night schedules.

The idea is simple: instead of writing 2019-01-01T02:00 we have 2018-12-31T26:00.
This is quite common in Japan, but I'm sure it is used elsewhere. It is used mostly for event schedules (cinema, TV shows, conferences...). The idea is that it's the continuation of the day you woke up on.

I wonder how this kind of date could be formatted in elixir.

@kuon in those cases, you can always interpolate the value directly on the format string, mixed with the formatting syntax.
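A minimal sketch of that interpolation approach follows, assuming Elixir 1.11 or later for Calendar.strftime/2. LateNightSketch and the notion of a "service date" argument are hypothetical; the hour is computed by hand and interpolated into the format string, while the minutes still go through the normal %M directive.

```elixir
defmodule LateNightSketch do
  # Formats `datetime` relative to the day the schedule started (`service_date`).
  def format(%NaiveDateTime{} = datetime, %Date{} = service_date) do
    days = Date.diff(NaiveDateTime.to_date(datetime), service_date)

    hour =
      (datetime.hour + 24 * days)
      |> Integer.to_string()
      |> String.pad_leading(2, "0")

    # The date and hour are literal text in the format; %M is a real directive.
    Calendar.strftime(datetime, "#{Date.to_iso8601(service_date)}T#{hour}:%M")
  end
end

LateNightSketch.format(~N[2019-01-01 02:00:00], ~D[2018-12-31])
# "2018-12-31T26:00"
```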

I think there is more than enough rationale to justify something like this. To be honest, when I first got into Elixir, this was something I looked around for and just assumed it already existed.

Having used datetime formatters in a few languages, I think my favorite has been Moment.js (which isn't strftime, and is closer to LDML). I even created a version for Elixir (sorry, never shared it). I think the formatting tokens are more intuitive than strftime.

https://momentjs.com/docs/#/displaying/format/

For example, what I created looked like this:

DateTimeFormatter.format(datetime, "YYYY-MM-DD")
{:ok, "2018-12-22"}

DateTimeFormatter.format!(datetime, "M/D, h:mm a")
"12/22, 1:14 pm"

DateTimeFormatter.format(datetime, "[today is] dddd")
{:ok, "today is Saturday"}

One key difference is multiple characters for the token. Eg:

M              1 2 ... 11 12
Mo             1st 2nd ... 11th 12th
MM             01 02 ... 11 12
MMM            Jan Feb ... Nov Dec
MMMM           January February ... November December

That said, it sounds like you are proposing the ability to substitute formatters? So the default would be strftime, but someone could build a LDML formatter?

That someone can even be you! :smiley:

But yes, a formatter is just a module, so you can just define your own module if you want to.


Ha, well I'd be happy to give it a shot.

With my CLDR-based date/time formatter now finally out the door I look forward to serving configuration for strftime whenever it's ready for a road test...


We have finally implemented a library based on this proposal: https://github.com/plataformatec/nimble_strftime

Everyone, please do give it a try in your application! And @kip, let us know if it provides the necessary hooks for i18n/l10n.


@josevalim, all good. I think the only thing maybe missing is era, but I see that's not part of the substitutions anyway.

For those looking to localise formatting with NimbleStrftime, there is now an ex_cldr function to generate the options for NimbleStrftime.format/3. ex_cldr_dates_times is another option for those looking for CLDR-based formatting.

MyApp.Cldr.Calendar.strftime_options!/2 (where MyApp is any Cldr backend module you've defined) provides these options. You'll need to update to ex_cldr_calendars version 1.5.0.

It's a ! function because it will raise if the specified locale is unknown.

Examples

iex> NimbleStrftime.format(~D[2019-11-03], "%A, %b %d %Y", MyApp.Cldr.Calendar.strftime_options!())
"Sunday, Nov 03 2019"

iex> NimbleStrftime.format(~D[2019-11-03], "%A, %b %d %Y", MyApp.Cldr.Calendar.strftime_options!("fr"))
"dimanche, nov. 03 2019"

With Calendar.strftime/2 being merged into Elixir master (from the NimbleStrftime lib) I'd like to open a proposal on some additional formatting options and gain community feedback.

There are three sub-proposals and any and all feedback is welcome.

Weeks (yes, again with the weeks conversation :slight_smile: )

Linux strftime has formatting flags for weeks. These being:

   %U     The week number of the current year as a decimal number, range
          00 to 53, starting with the first Sunday as the first day of
          week 01.  See also %V and %W.  (Calculated from tm_yday and
          tm_wday.)

   %V     The ISO 8601 week number (see NOTES) of the current year as a
          decimal number, range 01 to 53, where week 1 is the first week
          that has at least 4 days in the new year.  See also %U and %W.
          (Calculated from tm_year, tm_yday, and tm_wday.)  (SU)

   %W     The week number of the current year as a decimal number, range
          00 to 53, starting with the first Monday as the first day of
          week 01.  (Calculated from tm_yday and tm_wday.)

Adding these formatting directives would require an update to the Calendar behaviour to provide support.
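For the ISO calendar, %U and %W can be derived from the day of the year and the day of the week, as the man page notes. The following is a minimal sketch of that arithmetic; WeekDirectiveSketch is a hypothetical module and is not how Calendar.strftime is implemented.

```elixir
defmodule WeekDirectiveSketch do
  # %U: weeks start on Sunday; days before the first Sunday are in week 0.
  def week_u(%Date{} = date) do
    yday = Date.day_of_year(date) - 1
    # Date.day_of_week/1 is 1 (Monday) to 7 (Sunday); map Sunday to 0.
    wday = rem(Date.day_of_week(date), 7)
    div(yday + 7 - wday, 7)
  end

  # %W: weeks start on Monday; days before the first Monday are in week 0.
  def week_w(%Date{} = date) do
    yday = Date.day_of_year(date) - 1
    # Map Monday to 0 and Sunday to 6.
    wday = Date.day_of_week(date) - 1
    div(yday + 7 - wday, 7)
  end
end

# 2019-11-03 is a Sunday.
WeekDirectiveSketch.week_u(~D[2019-11-03])
# 44
WeekDirectiveSketch.week_w(~D[2019-11-03])
# 43
```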

Options:

  1. No need, don't add
  2. Just add %V for the ISO week number and add Calendar.iso_week_of_year/1 to the Calendar behaviour
  3. Maximise compatibility, go with all three and add the callbacks

Add support for Era

Calendar.day_of_era/1 returns {day, era} but that's an integer, not a display format. The Calendar behaviour doesn't do any display format translation today. The new Calendar.strftime/2 does display format translation for AM and PM (and variants). So one approach is to simply enhance Calendar.strftime/2 to also map Calendar.ISO eras 0 -> display_name and 1 -> display_name. And there would need to be agreement on display name (AD versus CE and BC versus BCE).

Era isn't used in day-to-day formatting for Gregorian calendars except for dates before year 1. It is used as a standard part of formatting Japanese calendars and for some formats of Chinese calendars (and derivatives on the 60-year cycle).

strftime defines the following directives:

Specifier   Meaning
%Ec         Date/time for current era.
%EC         Era name.
%Ex         Date for current era.
%EX         Time for current era.
%Ey         Era year. This is the offset from the base year.
%EY         Year for current era.

These can be implemented without a change to the calendar behaviour if Calendar.strftime/2 treats era like it treats am/pm and provides an internal translation. It would perhaps be better if both of these (i.e. the am/pm and era translations) became behaviours, since that would also help international projects using Gettext or Cldr.
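An internal era translation for Calendar.ISO could be sketched as below. Date.year_of_era/1 is an existing function; EraSketch and the choice of "CE"/"BCE" as display names are assumptions, since the naming question above is still open.

```elixir
defmodule EraSketch do
  # Hypothetical display names; "AD"/"BC" would work the same way.
  def era_name(%Date{calendar: Calendar.ISO} = date) do
    {_year, era} = Date.year_of_era(date)

    case era do
      1 -> "CE"
      0 -> "BCE"
    end
  end

  # Roughly what an era-aware year directive might render.
  def year_with_era(%Date{} = date) do
    {year, _era} = Date.year_of_era(date)
    "#{year} #{era_name(date)}"
  end
end

EraSketch.year_with_era(~D[2019-11-03])
# "2019 CE"
EraSketch.year_with_era(~D[-0001-01-01])
# "2 BCE"
```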

Options

  1. Era? Who cares, forget about it
  2. Compatibility is a good idea and implementation seems simple. Do it.
  3. I care about calendars beyond Calendar.ISO so I need this (I'm not holding my breath on this one :slight_smile: )

Localised number systems (ie not Latin alphabet)

strftime supports localising the date format to use non-Latin alphabets. This is also supported in ex_cldr and friends. The default for these directives is to use the Latin alphabet, so implementing these directives is quite trivial. If no configuration is provided in options, fall back to the Latin characters.

The formatting directives are:

Specifier   Meaning
%Od         Represents the day of the month, using the locale's alternative numeric symbols, filled as needed with leading 0's if an alternative symbol for 0 exists. If an alternative symbol for 0 does not exist, the %Od modified conversion specifier uses leading space characters.
%Oe         Represents the day of the month, using the locale's alternative numeric symbols, filled as needed with leading 0's if an alternative symbol for 0 exists. If an alternative symbol for 0 does not exist, the %Oe modified conversion specifier uses leading space characters.
%OH         Represents the hour in 24-hour clock time, using the locale's alternative numeric symbols.
%OI         Represents the hour in 12-hour clock time, using the locale's alternative numeric symbols.
%Om         Represents the month, using the locale's alternative numeric symbols.
%OM         Represents the minutes, using the locale's alternative numeric symbols.
%OS         Represents the seconds, using the locale's alternative numeric symbols.
%Ou         Represents the weekday as a number using the locale's alternative numeric symbols.
%OU         Represents the week number of the year, using the locale's alternative numeric symbols. Sunday is considered the first day of the week. Use the rules corresponding to the %U conversion specifier.
%OV         Represents the week number of the year (Monday as the first day of the week, rules corresponding to %V) using the locale's alternative numeric symbols.
%Ow         Represents the number of the weekday (with Sunday equal to 0), using the locale's alternative numeric symbols.
%OW         Represents the week number of the year using the locale's alternative numeric symbols. Monday is considered the first day of the week. Use the rules corresponding to the %W conversion specifier.
%Oy         Represents the year (offset from %C) using the locale's alternative numeric symbols.

Options

  1. Who cares about countries and people that don't use the Latin alphabet. Forget it.
  2. Good idea - easy to implement, no impact if I'm not doing non-Latin alphabet code. Glad to see Elixir embracing global cultures (ok, I'm pitching I know :slight_smile: )

I believe we can add %V but I would make it calendar specific. So once you change calendars, it will return the week number of said calendars. The default is of course ISO. Does strftime specify a way to get the week of year? Without week of year, I don't see this being very useful.

About %U and %W, I am honestly not sure, as they seem to be based on Gregorian calendars. What would they return for a Japanese calendar, for example?

I agree with this proposal. day_of_era works but we will probably need to use year_of_era to implement all of the modifiers (except the %EC itself).

Regarding "(AD versus CE and BC versus BCE)", let's just pick whatever strftime uses/returns.

I like this proposal too but I wonder if it should simply be an option called :number_formatter. The number formatter is a function that receives the number as an integer and the padding, and it returns a string. You can replace the number formatter to use any numeric alphabet and padding that you want. This seems simpler overall. WDYT?
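To illustrate the idea, here is a minimal sketch of a function that could be passed as such a formatter. The :number_formatter option is only a suggestion in this thread, so the two-argument shape (integer and pad width) is an assumption, as is the NumberFormatterSketch module; the example maps ASCII digits to Arabic-Indic digits.

```elixir
defmodule NumberFormatterSketch do
  # Hypothetical formatter: zero-pads to `width`, then swaps each ASCII digit
  # for the matching Arabic-Indic digit (U+0660 to U+0669).
  def arabic_indic(number, width) when is_integer(number) and number >= 0 do
    ascii = number |> Integer.to_string() |> String.pad_leading(width, "0")

    for <<digit <- ascii>>, into: "" do
      <<(digit - ?0 + 0x0660)::utf8>>
    end
  end
end

NumberFormatterSketch.arabic_indic(3, 2)
# "٠٣"
```

Because the formatter owns both the digits and the padding, a caller could swap in any numeric system without new %O directives.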


Thanks for writing these down @kip. If you want, we can proceed with a PR for era while we discuss the remaining topics. :slight_smile:
