Proposal: Introduce help catalogs

josevalim · August 9, 2018, 11:54am

Note: this is a language proposal so please keep the discussion on topic. If you want to talk about related behaviour but not strictly part of the proposal, please start a new conversation.

Elixir focuses on good warning and error messages whenever possible. After all, an unclear warning/error should be a bug.

In some cases, however, to keep those messages as clear as possible, they end up spanning multiple lines:

iex(1)> defmodule Foo do
...(1)>   def bar(baz) do
...(1)>     if true do
...(1)>       baz = :other
...(1)>     end
...(1)>   end
...(1)> end
warning: variable "baz" is unused

Note variables defined inside case, cond, fn, if and similar do not leak. If you want to conditionally override an existing variable "baz", you will have to explicitly return the variable. For example:

    if some_condition? do
      atom = :one
    else
      atom = :two
    end

should be written as

    atom =
      if some_condition? do
        :one
      else
        :two
      end

Unused variable "baz" found at:
  iex:4

Here is another example:

iex(2)> defmodule Bar do
...(2)>   def foo(:a, b \\ :omg), do: :a
...(2)>   def foo(:b, b), do: b
...(2)> end
warning: def foo/2 has multiple clauses and also declares default values. In such cases, the default values should be defined in a header. Instead of:

    def foo(:first_clause, b \\ :default) do ... end
    def foo(:second_clause, b) do ... end

one should write:

    def foo(a, b \\ :default)
    def foo(:first_clause, b) do ... end
    def foo(:second_clause, b) do ... end

  iex:4

The downside of those messages is that, for experienced developers, they end up being too much noise. There is also a “scare” factor when you download a dependency and it ends up printing long, multi-line warnings.

This is aggravated by the fact that Elixir does not allow warnings to be disabled. We promise to keep all warnings relevant and worthy of your time, but, as a trade-off, we don’t allow you to disable them.

Note: this is NOT a discussion about disabling or removing some warnings. If there are warnings you don’t agree with, please open up a separate discussion.

In order to keep our promise of relevant warnings for new and experienced developers alike, we would like to introduce help catalogs. This proposal is broken into 3 parts. First, we introduce the idea of a help catalog. Then we propose a particular implementation. Finally, we discuss improvements in related areas.

Help catalogs

The idea behind help catalogs is very simple. Instead of long warnings, we will provide users with a mechanism to get more information about that warning. For example, the unused variable warning above could be written as:

warning: nested variable "baz" is unused (elixir --explain nested_var)

Once the user invokes the proposed command, they will get a detailed explanation about the warning and how to fix it:

$ elixir --explain nested_var

This warning yada yada yada yada yada yada
yada yada yada.

For the clauses one, we could say:

warning: def foo/2 has multiple clauses and also declares default values. Please define default values in a header (elixir --explain defaults_and_clauses)

Do not worry about the styling of the warnings and of the command for now. We will address them later.

I believe this will be an improvement for long warnings, but there is another reason why I believe this feature can be extremely useful. Let’s take a very simple warning, the unused variable warning, as seen below:

iex(3)> defmodule Baz do
...(3)>   def bar(used, unused), do: used
...(3)> end
warning: variable "unused" is unused
  iex:5

How do we address this warning? We change unused to _unused. But how would somebody in their first week with Elixir know this is the case? We could make the warning longer, but everyone would agree that is counter-productive. With a catalog, we can include detailed information even on simple warnings like the one above. A similar feature exists in languages like Rust and PureScript.
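As a minimal illustration of the fix mentioned above (a sketch that simply reuses the Baz module from the example), prefixing the ignored argument with an underscore silences the warning:

```elixir
defmodule Baz do
  # The leading underscore marks the second argument as intentionally unused,
  # so the compiler no longer warns about it.
  def bar(used, _unused), do: used
end
```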

Question 1: what do you think about the idea of supporting help catalogs in general? (regardless of the command structure, syntax, etc.)

Implementing catalogs

If we agree that help catalogs will be a good addition to the language, then it is time to talk about their implementation.

In the examples above, we have used the following syntax to invoke them: elixir --explain nested_var. Such syntax has two issues:

1. It seems specific to Elixir. However, since Elixir is an extensible language, it would be great if help catalogs were available to all libraries.
2. If we also want to introduce the explain functionality to IEx, a natural confusion between explain and help would arise. After all, what is the difference between help and explain? When to use one or the other?

Therefore, I propose the following syntax for help catalogs:

$ elixir -h elixir:nested_var
$ elixir --help elixir:nested_var

Then in IEx, you can read a catalog as:

iex> h "elixir:nested_var"

In other words, we are simply extending the help mechanism to support catalogs. The catalog is given in two parts, the application name (elixir) and the entry name (nested_var), separated by :.

Implementation-wise, it will work like this:

1. We will receive the entry name as “app:entry” and split it by “:”.
2. We get the application name and look for its .app file.
3. Inside the app file, we will look for an entry named help_catalog that points to a module.
4. The help_catalog module will then be loaded and it must export a function of zero arity with the same name as the entry. The function should return a string in markdown that will then be formatted and printed.
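The lookup steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: DemoCatalog and HelpCatalogSketch are hypothetical names, and the registry map passed to fetch/2 stands in for reading the help_catalog entry from each application's .app file, which is not implemented here.

```elixir
defmodule DemoCatalog do
  # Hypothetical catalog module: one zero-arity function per entry,
  # each returning a Markdown string.
  def nested_var do
    """
    ## nested_var

    Variables defined inside `case`, `cond`, `fn`, `if` and similar do not leak.
    Return the value from the construct and bind it outside instead.
    """
  end
end

defmodule HelpCatalogSketch do
  # `registry` is an assumption: a map of application name to catalog module,
  # standing in for the `help_catalog` entry of the .app file.
  def fetch(name, registry) when is_binary(name) do
    with [app, entry] <- String.split(name, ":", parts: 2),
         {:ok, module} <- lookup(registry, app),
         {:ok, fun} <- exported_entry(module, entry) do
      {:ok, apply(module, fun, [])}
    else
      _ -> :error
    end
  end

  # Compare as strings so user input is never converted into new atoms.
  defp lookup(registry, app) do
    Enum.find_value(registry, :error, fn {key, module} ->
      if Atom.to_string(key) == app, do: {:ok, module}
    end)
  end

  # Find a zero-arity exported function whose name matches the entry.
  defp exported_entry(module, entry) do
    Code.ensure_loaded(module)

    Enum.find_value(module.__info__(:functions), :error, fn
      {name, 0} -> if Atom.to_string(name) == entry, do: {:ok, name}
      _other -> nil
    end)
  end
end
```

With these definitions, HelpCatalogSketch.fetch("elixir:nested_var", %{elixir: DemoCatalog}) returns {:ok, markdown}, while an unknown application or entry returns :error.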

Question 2: what do you think about the suggested syntax for catalogs and its implementation?

Closing the gap

Now that we have introduced elixir -h "app:entry", does it make sense to close the gap between the command line and IEx? In other words, should we be allowed to run elixir -h String and show the documentation for the String module?

One reason to say yes is completeness. However, I personally open IEx multiple times only to retrieve the documentation or to open a module, so I would definitely use this feature too.

In particular, I propose to support all of --help, --type-help, --behaviour-help and --open in the elixir command line. I think being able to do elixir --open String and have the module open in my editor would be fantastic.

Implementation-wise, this is very straightforward, as all of those features already exist in IEx, and we are simply discussing an option to make them available in more places.

Note that we will also have to support those entries in mix run, as we also need them available in the context of a project.

Question 3: should we close the gap and make help and open generally available in the elixir and mix run commands?

Feedback

Feedback on the proposal is welcome. If you agree or disagree with the proposal, make sure to detail why, and remember to provide insight on the questions above. Thanks!

lackac · August 9, 2018, 12:45pm

Let’s do it

I like the idea of the help catalog. I think it hits a good balance for beginner and experienced programmers alike.

I think it’s a worthwhile goal to make the catalog extensible and to avoid confusion by extending the functionality of help.

What are the help catalog functions supposed to do? Write to stdout or stderr? Return a string with the message? Is the message markdown formatted or does it use ANSI codes?

The help catalog to me feels like something that should be part of the documentation. This means that ExDoc should be able to compile a page that lists these and could be linked to. This may be partly achieved if we follow your implementation plan and include the catalog module in the documentation. However, since the actual content is an implementation detail of the functions, we either lose that information in the docs or require an extra step while compiling docs: calling each of these functions and interpreting the output.

I wonder if we could make this part of the actual documentation. Two options I can think of:

make it part of documentation metadata on the module level:

defmodule Kernel do
  @moduledoc help_catalog: [
    nested_var: """
    Note variables defined inside `case`, `cond`, `fn`, ...
    """
  ]
end

In this case one could run elixir -h Kernel:nested_var or keep the help_catalog app entry part of the proposal and only change the lookup mechanics.

introduce a new module attribute @help_catalog keyword() with similar setup as in option 1. This would generate functions (probably with some special naming to avoid conflicts) with @doc(string). The main benefit of this is that now we don’t need any special handling of catalog entries on the client level. It can be just a simple help lookup and will work almost out of the box with ExDoc too.

Yes

LostKobrakai · August 9, 2018, 12:47pm

In general all 3 proposals sound reasonable, while I personally don’t particularly care about 3. But I’m wondering if or how the catalog would deal with context specific text. I’m not sure if we have that in the current warning texts though.

Another question I have: would it still be possible to have the current “verbose” warnings emitted by the compiler? I often just scroll through the list of compiler warnings and fix one after the other. Without another window open, it would be super tedious to look up every unfamiliar one of those short warnings, and even someone not new to Elixir might not have all the useful help information in their head just from reading a short warning. Sometimes a good example helps you spot a hard-to-catch typo or something like that. If the new help catalog included even more help information than the current warnings do, this might be a lot of work to maintain (short / compiler / help catalog versions).

victorolinasc · August 9, 2018, 12:55pm

Question 1: what do you think about the idea of supporting help catalogs in general? (regardless of the command structure, syntax, etc.)

I see only benefits for this idea so yes.

Question 2: what do you think about the suggested syntax for catalogs and its implementation?

I agree with @lackac about including this in the docs somehow. I personally think that keeping it in a module is better as it does not clutter other modules with more attributes (it will start to look like Java with annotations all over the place).

A new section on docs would make this even more discoverable.

Question 3: should we close the gap and make help and open generally available in the elixir and mix run commands?

Hell yeah!

lackac · August 9, 2018, 12:55pm

If we stick to the implementation proposed by @josevalim we could have functions with an arity of 1 that take the version as an argument (:short, :long, :detailed) and return the appropriate message. This would make it easy to maintain these from the same place. The compiler could then print :long for the first occurrence and :short for all the rest after that. In addition, there could be a command line flag to always print :short or :long.
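A rough sketch of how this could look, purely for illustration: the level names come from the suggestion above, while LeveledCatalog, WarningPrinterSketch and the message texts are hypothetical.

```elixir
defmodule LeveledCatalog do
  # Hypothetical entry with one clause per verbosity level.
  def nested_var(:short), do: "nested variable is unused"

  def nested_var(:long) do
    "nested variable is unused; variables defined inside if, case and similar do not leak"
  end

  def nested_var(:detailed) do
    """
    Variables defined inside `case`, `cond`, `fn`, `if` and similar do not leak.
    To conditionally override a variable, return the value and rebind it outside.
    """
  end
end

defmodule WarningPrinterSketch do
  # Returns one message per emitted warning: :long the first time an entry
  # is seen, :short for every later occurrence of the same entry.
  def render(entries, catalog) do
    {messages, _seen} =
      Enum.map_reduce(entries, MapSet.new(), fn entry, seen ->
        level = if MapSet.member?(seen, entry), do: :short, else: :long
        {apply(catalog, entry, [level]), MapSet.put(seen, entry)}
      end)

    messages
  end
end
```

Calling WarningPrinterSketch.render([:nested_var, :nested_var], LeveledCatalog) yields the long message followed by the short one, and all three variants stay in a single place.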

josevalim · August 9, 2018, 12:59pm

Good question. The function will return a string in markdown, I have updated the document to mention it.

I thought about using regular documentation but I would like this to be dynamic. For example, if you invoke elixir -h elixir:guards, we could get Kernel’s documentation metadata and build a list from that programmatically. So I think being at runtime gives us a bit more flexibility. We could still build a static list with ExDoc though. I think that’s a great idea!
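To make the "dynamic" idea concrete, here is a minimal sketch of what such an entry might do. It assumes that Kernel's compiled documentation is available through Code.fetch_docs/1 and that guards are marked with guard: true in their doc metadata; GuardsEntrySketch is a hypothetical name, not part of the proposal.

```elixir
defmodule GuardsEntrySketch do
  # Builds a Markdown list of everything in Kernel documented as a guard.
  # Falls back to an empty list when the docs chunk is not available.
  def guards do
    entries =
      case Code.fetch_docs(Kernel) do
        {:docs_v1, _anno, _language, _format, _module_doc, _metadata, docs} ->
          for {{kind, name, arity}, _anno, _sig, _doc, %{guard: true}} <- docs,
              kind in [:function, :macro] do
            "#{name}/#{arity}"
          end

        {:error, _reason} ->
          []
      end

    body =
      entries
      |> Enum.sort()
      |> Enum.map_join("\n", fn entry -> "* `#{entry}`" end)

    "## Guards\n\n" <> body
  end
end
```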

We don’t have context specific text currently. It is something that we could do in the future but it is harder to do in Elixir in general because compiling code is the same as running code and, by then, some of the contextual information is already gone.

josevalim · August 9, 2018, 1:03pm

I like this idea but then it means we need a different approach from the application-based one. Doing the application lookup is ok for one-off script calls, such as elixir -h elixir:unused_var, but I wouldn’t rely on it for emitting warnings in general.

We could use an approach based on module names, but then I think the line gets very blurry between asking the documentation of a function (elixir -h Kernel.is_atom) and asking for a warning (elixir -h Kernel:unused_var).

EDIT: there is another issue with this approach. For the warnings themselves, you usually have some contextual information, such as the name of the variable you skipped; for the detailed part, you wouldn’t have it anymore. So their APIs are not quite the same.

lackac · August 9, 2018, 1:14pm

I don’t think it would be necessary to support returning :short and :long from the command line or the console. It would be useful enough to only return :detailed. In this case the code emitting the warning could rely on knowing the module name without looking it up.

Would it be too far-fetched to add a second argument with opts \\ [] that makes this work? This could be later used to add contextual information to :detailed as well.
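For illustration only, such an entry could look something like the sketch below. ContextualCatalog, the :name option and the message texts are assumptions made up for this example.

```elixir
defmodule ContextualCatalog do
  # Function head carrying the default, since there are multiple clauses.
  def unused_var(level, opts \\ [])

  def unused_var(:short, opts) do
    "variable #{label(opts)} is unused"
  end

  def unused_var(:detailed, opts) do
    """
    Variable #{label(opts)} is bound but never read.
    Prefix its name with an underscore to mark it as intentionally unused.
    """
  end

  # Contextual information is optional: the compiler can pass the variable
  # name, while a command line lookup has no name to pass.
  defp label(opts) do
    case Keyword.fetch(opts, :name) do
      {:ok, name} -> inspect(to_string(name))
      :error -> "(name not available)"
    end
  end
end
```

The code emitting a warning could call ContextualCatalog.unused_var(:short, name: :baz), while a help lookup would call ContextualCatalog.unused_var(:detailed) without any context.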

Eiji · August 9, 2018, 1:19pm

@josevalim I would also like to comment on some parts of the whole post, so:

When reading this, something like the following just screamed at me:

$ mix deps.compile
Note: some dependencies may not be up to date with latest Elixir changes which could produce extra warnings while compiling. Use `--silent-deps` to filter them when printing to standard output.

Note: this is not a proposal, just a comment, i.e. what comes to my mind when seeing it. If anyone is interested, please create a new topic and quote this part.

Heh, when reading your introduction and examples, I thought about just something like that.

Definitely: YES

Hmm, personally, I find it more confusing to have all types of documentation and warning explanations in one function.

h explain

                               def explain(code, app \\ :elixir)

    @spec explain(atom(), atom()) :: String.t()

Explains warning (…)

For me this looks clear enough, but others could disagree with me. In short, this looks amazing, but for me it would be better separated.

This is an awesome idea. To be honest, I have not even tried to create a plug-in for any editor, but I think that it would be really helpful for plug-in developers.

Yes, please

EDIT

I just forgot one really useful case. @josevalim, how do you plan to display FunctionClauseError? I think that in this case the detailed information is really useful for everybody and shortening it could cause problems.

grych · August 9, 2018, 1:33pm

Ad. 1. I like the idea in general, but I also liked the long messages in the beginning with Elixir. It is extremely useful for beginners, and this (and great docs) is why Elixir is easier to learn than any other language.

At a very early stage of learning Elixir, if I try to reassign the value of a variable, I will get the warning that the variable is unused:

iex(1)> a = 1
1
iex(2)> if true, do: a = 2
warning: variable "a" is unused (elixir --explain nested_var)

Imagine you’ve never seen Elixir or Erlang before. Isn’t it confusing? And it is perfectly explained with the long help. Yes, you can always get the long explanation, but it is not available at first sight. And not everyone would understand that it is possible to get the longer help message.

What if we make short messages optional? Like, by default, show long messages (warning + help from the catalogue) and allow turning on short messages with a switch --short-messages or an entry in .elixir?

Ad. 3. YES!

blatyo · August 9, 2018, 1:42pm

I have mixed feelings.

My concern is that by doing this it will reduce the helpfulness of the warnings. Often in the warning you can tell the user the code they should have written. But if that is moved and there’s no way to pass context to the help, then it becomes generic. For example, in one of the projects I work on I have this warning in a macro:

IO.warn("""
Calling assert_message_publish/1 with a message pattern is deprecated. Replace with:

    assert_message_publish #{inspect(binding()[:name])}, #{unquote(message_code)}

Or

    assert_message_publish #{inspect(binding()[:name])}
""")

In the case of this particular code, the user would have to do extra work to figure out what that first argument should be, whereas, I’m able to tell the user in the warning.

If this does get added, I’m not opposed to this approach. I would like IEx to print out the detailed warnings by default though.

If this does get added, yes. Also, could we have --help and -h with no args or some other flag to show the detailed version when running?

AstonJ · August 9, 2018, 2:03pm

I agree with everything Tomek has said. Making short messages optional could be the best of both worlds?

An extra step to get the long explanation will probably be ignored by many, partly because over time they will subconsciously learn to ignore it (either because they needed to google for more info in the past, or because the long message did not help them and they needed to google anyway), and partly because it’s an extra step; people usually avoid extra steps unless they feel they are necessary or strongly likely to yield what they want.

However, if this is too much of a burden for the Core team then I support whatever is easiest for them.

I think it would be a nice addition (if made optional as per above). However I worry it may be adding more work for the core team.

Looks good!

I think it makes sense, yes

GregMefford · August 9, 2018, 2:13pm

We should do this.

This proposal is acceptable, but I wonder if it would be easier from inside iex if the thing you needed to type was just a module name, so that you’d get automatic tab-completion. The other nice thing here is that it’s just regular old doc strings that we already have today. No need to do anything fancy to find the catalog module because the module/function name is directly given to you.

For example:

iex> h Elixir.Help.nested_var

From the shell, you could do either:

$ elixir -h nested_var

If no module is specified, Elixir.Help would be assumed. For library-based help catalogs, you just need to be explicit:

$ elixir -h SomeLibrary.Help.some_catalog_entry

Yes

blatyo · August 9, 2018, 2:24pm

I’d be in favor of this.

jgonet · August 9, 2018, 2:53pm

The downside of those messages is that, for experienced developers, they end up being too much noise.

I wonder if embedding the catalog name of the warning and adding info on top would improve readability - changing:

warning: nested variable "baz" is unused (elixir --explain nested_var)

to:

View detailed explanation of warnings by elixir --explain <warning name>

[nested_var] warning: nested variable "baz" is unused
[another_warning] warning: this is another warning
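A small sketch of how that output format could be produced, just to make the layout concrete. TaggedWarnings is a hypothetical module, and the input is assumed to be a keyword list of entry name and message.

```elixir
defmodule TaggedWarnings do
  @banner "View detailed explanation of warnings by elixir --explain <warning name>"

  # Prints the hint once at the top, then one tagged line per warning.
  def format(warnings) do
    lines =
      for {entry, message} <- warnings do
        "[#{entry}] warning: #{message}"
      end

    Enum.join([@banner, "" | lines], "\n")
  end
end
```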

josevalim · August 9, 2018, 4:06pm

There have been some concerns regarding the loss of precision in warnings, which is an excellent concern. I will address them all together instead of individually.

We are not proposing to remove all long warnings and all long errors. Rather, this is a mechanism that would allow them to be shortened, if desired and if relevant.

@Eiji mentioned an excellent example: function clause errors. Those errors are long, and they will continue to be long, because everything they show is contextual information. @blatyo brought another good example. Both of those cases should continue as is. The help catalog is useful to store complementary and non-contextual information about warnings and errors. All contextual information should remain as part of the warning/error.

===

The other concern that seems to be common is about emitting detailed (long) warnings by default. This seems like a good idea based on the warnings we have today but I would like to remind everyone that, once the catalog exists, it is very likely that many warnings will provide a detailed variant.

For example, “unused variable x” may now have one or two paragraphs about prepending an underscore to the variable name. Similarly, when you use a non-guard function in a guard, I would like the detailed information to show all available guards. Therefore, if we are detailed by default, we may end up showing a lot more information than we do today. For those reasons, I think detailed by default is practical today, but not once the catalog is in place.

I would like to flip the question. Instead of discussing if we should have “shorter vs long” by default, what can we do to make sure that a newcomer will understand what elixir -h elixir:foobar in a warning/error means and make sure that they will be able to access the detailed information?

josevalim · August 9, 2018, 4:12pm

I thought about a module-based mechanism but it has a couple of issues:

1. Today help is_atom shows the documentation for the Kernel module. Do we want to keep this behaviour for the command line? If so, there is an ambiguity in what you propose.
2. The module names for libraries end up getting long.
3. I would like the catalog information to be possibly dynamic. For example, Phoenix could look into the user configuration and say “you can change this behaviour by setting X, it is currently set to false”. In Elixir, for example, I would like for elixir:guards to show all guards.

Thoughts?

This is hard to do in Elixir. Compiling code is the same as running code. We do not have explicit “compilation started” and “compilation ended” events. On the positive side, it means we can use the catalog for runtime warnings and errors too.

jgonet · August 9, 2018, 4:12pm

I’d go with longer warnings by default, as shorter ones may or may not be as descriptive. More advanced users will recognize a warning without needing a detailed description; newcomers, not really. I think if we place information about --short somewhere in the docs, it will be sufficient.
Maybe going with 3 levels of description is a good idea? Short as an option, medium by default and long in the catalog?

josevalim · August 9, 2018, 4:17pm

Apologies but this does not address my main concern: most warnings we have today will likely become much longer. Would you advocate for long warnings if every unused variable warning comes with one or two extra paragraphs? Or if errors in guards come with a list of all guards? It feels like people are advocating for long warnings based on their experience today, but this experience will no longer be true once we have catalogs. Today’s warnings were written to try to strike a balance between too little and too much information, which may end up pleasing neither newcomers nor experienced developers. Once we have catalogs, there is much more information, and showing it by default will be overly distracting.

What would be the difference between a short, medium or long warning? It seems there is an impression that a catalog means most warnings will become useless by default but that’s not the case. Even with a catalog, every warning will still provide enough information to be addressable on its own.