Running the new Elixir formatter

syntax

#1

I’ve been hearing much about the new formatter and it’s something I have been keen to try.
I find examples buy far the most illuminating way to try some new tech. To that end I though I would blindly run the new formatter on my largest open source Elixir code base, Ace.

I did no preparation before hand and in one swoop was able to format the whole project using

$ mix format mix.exs "lib/**/*.{ex,exs}" "test/**/*.{ex,exs}"

The full change commit

Results

  • My project is now larger. 1369 lines added but only 752 removed.
    A lot of these are empty lines. Every case clause is separated by a new line and so is every function head.
    It looks like my personal convention of separating different functions but not different function heads of one function is now over
  • Brackets are now applied automatically so no more dealing with those warnings. This is a big win.
  • Numbers lose underscores 65_535 -> 65535 :frowning:
  • Several cases where things are now indented to line up with something above rather than just two spaces, this is probably what I am least happy about but I can get over it for the consistency
  • Comments now always end up on a separate line.

In conclusion I think that is a great additions and I hope that the numbers will eventually be formatted with the underscores


Formatter observation
What's new in Elixir - Dec/17
#2

I agree with this point because I use a lot this kind of number format all over the project.


#3

I this the library you used?

I hope this is a bug.


#4

No, as far as I understood, he used mix format which is available in Elixir 1.6


#5

Yes I installed the latest version of Elixir from master.

It’s nice because I am running my project on 1.5, i.e. stable, in a docker container but having 1.6 on my machine means that when editing that project atom is using 1.6 to provide formatting.


#6

Oh, so it is this one in the main Elixir repo. Nice, next release will certainly be interesting.

It seems to have the code for inserting underscores in integers if the number of digits are >= 6. I wonder if we can change that to >= 4.


#7

Please don’t… I was born 1981, not 1_981…


#8

And our http-port number is 20_693, not 20693, sometimes the format depends on the context and not on the data type. :slight_smile:

That’s kind of what I meant when I said:

Nice, next release will certainly be interesting.

It will be impossible to satisfy everyone. But I agree, >= 4 might be too short.


#9

Looks good :023:

I would have normally done this:

  defp receive_data(packet, state = %{status: {:chunked_body, :response}}) do
    {chunk, rest} = HTTP1.pop_chunk(packet)
    case chunk do
      nil ->
        {:ok, {rest, state}}
      "" ->
        send(state.worker, {state.channel, Raxx.trailer([])})
        {:ok, {rest, state}}
      chunk ->
        fragment = Raxx.fragment(chunk, false)
        send(state.worker, {state.channel, fragment})
        {:ok, {rest, state}}
    end
  end

And the formatter does this:

  defp receive_data(packet, state = %{status: {:chunked_body, :response}}) do
    {chunk, rest} = HTTP1.pop_chunk(packet)

    case chunk do
      nil ->
        {:ok, {rest, state}}

      "" ->
        send(state.worker, {state.channel, Raxx.trailer([])})
        {:ok, {rest, state}}

      chunk ->
        fragment = Raxx.fragment(chunk, false)
        send(state.worker, {state.channel, fragment})
        {:ok, {rest, state}}
    end
  end

Which, looking at it now, is far more readable :003:


#10

Well I also have several pieces of code which have prices 120_00.


#11

For what it’s worth, my opinion on the changes. It’s mostly negative (warning, 100% subjective unless marked):

I like this convention too. Keeps together functions heads that belong to the same function, which directs my eyes to logical groups of clauses. But the new output doesn’t shock me, except for the problem of taking more space.

Like @AstonJ said above, this might increase readability a little, I don’t know. But it does take away more vertical screen space for sure, which is a bad thing. I’m not sure whether the negatives outweigh the positives here. The question of taking up more space is the only “objective” argument I want to bring to the table. Maybe the trade-off is worth it, and maybe it isn’t but wasting valuable vertical screen space is not just a matter of personal preference

No opinion on this one. Brackets or no brackets, any choice would be ok.

Please don’t… This is one of those places where the programmer’s intent is expressed through formatting. Sometimes, 12_000 is more readable than 12000, sometimes it’s the opposite. I think this should be left up to the programmer. Sometimes you want 1200, sometimes you may want 1_200 and sometimes 12_00 (value in cents, or something like that).

Or is it one of the goals of the formatter to make sure that two blocks of code that compile to the same AST (module line numbers) will always produce the same output? If this is a goal to aim for, then I think I can accept the disruption of the underscores.

This takes up more valuable screen space for very little benefit, I think. What’s the reasoning behind this rule?

There are some obvious improvements in the source that @Crowdhailer didn’t bother to mention, like minor whitespace issues, and for those things the formatter is great.

My main argument against this might be a vague personal distaste against prescriptive code formatters: everybody prefers to edit text instead of AST nodes directly, but then we make all this effort to make the code as uniform as possible to preserve consistency. It’s almost as if we should have been writing AST nodes all along? Or Lisp? You don’t get many discussions on how to format that…

Finally, despite what I’ve said above, having a code formatter is pretty cool, especially if it can convert from AST to code, because it opens up exciting possibilities for code transformation tools that are aware of the AST. This can be useful for project generators, because it would make it easier to add code to the middle of a module and even for refactoring tools. It would make changing a module name over the whole code base almost trivial, for example: It could just compile the modules, replace the appropriate alias and write the changes to disk. It might have some corner cases I’m not aware of, but it would certainly make this kind of thing much easier. It would produce nice diffs, of course, because the code would have been formatted already. For renaming a function it wouldn’t go as well:

variable = MyModuleAlias
new_variable = variable.function_i_want_to_rename(arg)

or even worse:

def f(my_dynamic_module_alias, arg) do
  my_dynamic_module_alias.function_i_want_to_rename(arg)
end

but with the help of mix xref or even access to the elixir manifests or the erlang core output it might make a passable job. I hope the formatter will be made available as a standalone tool that can be used on the AST for these purposes.

Having a code formatter that is well tested and guarantees it’ll keep the semantic meaning of the code is much better than not having one, no matter how much I disagree with the space-wasting “features”.


#12

I think it’s definitely worth it - it’s significantly more readable Imo :slight_smile:


#13

I’ve skimmed through the source of the formatter, which I assume has been run through itself. It’s quite readable (as expected), although it looks “too light”. In does waste a lot of vertical space on blank lines. Because Elixir functions tend to be quite short and most are referentially transparent, they can be understood in isolation and it’s not a disaster if you can’t see as much code at once.

It does lead to more scrolling in any case, but I think that’s something one can live with. Now I’m waiting for all the cool refactoring tools that will be made from this.


#14

I’m also in the “leave the numbers alone” boat. If I want to use my money (like in stripe) as cents, then it really helps to see it like 120_00 instead of 12000

so far, I like everything else.


#15

You can have all of them on the same line if you want to:

case foo do
  {:ok, bar} -> {:ok, bar}
  {:error, baz} -> {:error, baz}
end

But if you have newlines after ->, then the formatter will respect that and proceed to add an empty line between each clause.

Remember the goal of the formatter is to make the code consistent. The opposite of the current behaviour would be to remove lines between clauses, which would affect different areas of @Crowdhailer’s repository. I think users would very likely be more upset if we remove the empty lines they added to space out code.

Between the two below, one is definitely more readable than the other for me (although I agree it is subjective):

defp integer_to_algebra(text) do
    case text do
      [?0, ?x | rest] ->
        "0x" <> String.upcase(List.to_string(rest))
      [?0, base | _rest] = digits when base in [?b, ?o] ->
        List.to_string(digits)
      [?? | _rest] = char ->
        List.to_string(char)
      decimal ->
        List.to_string(insert_underscores(decimal))
    end
  end

and

defp integer_to_algebra(text) do
    case text do
      [?0, ?x | rest] ->
        "0x" <> String.upcase(List.to_string(rest))

      [?0, base | _rest] = digits when base in [?b, ?o] ->
        List.to_string(digits)

      [?? | _rest] = char ->
        List.to_string(char)

      decimal ->
        List.to_string(insert_underscores(decimal))
    end
  end

We will open up a discussion on that.


#16

Will it be possible to run the formatter in strict/opinionated mode? I.e. “There is exactly one correct formatting standard”?


#17

Yes, if one has to choose, it’s better to have the newlines every time than to never have newlines.


#18

I just have one request: Don’t force us to always use aliases.

We have some modules with names that only makes sense in the context of their parent module, like this

defmodule Foo.Api.Google do
end

defmodule Foo.Authentication.Google do
end

defmodule Foo.Api.Response do
end

defmodule Foo.HttpRequest.Response do
end

For example Credos basic configuration wanted us to use aliases for everything which meant we ended up with references to modules just called Response with no hint if it was a api response or a http request response. In credos case it was configurable by adding this the credos conf, {Credo.Check.Design.AliasUsage, priority: :low, if_nested_deeper_than: 2}.

Some of the standard libs have the same “problem”, what is more readable?

  • Utils.list()
  • Conn.Utils.list()

Now, I don’t know if you even wanted to correct module / aliases usage with the formatter, but I figured I might aswell raise the issue while nothing is set in stone. :slight_smile:


#19

I’m not sure, but I think the idiomatic approach here is to have modules with names like Foo.HttpResponse and Foo.ApiResponse.


#20

Generally yes, but for me atleast it is easier to parse Conn.Utils instead of ConnUtils.

Naming the module Foo.Authenticator.Google instead of Foo.AuthenticatorGoogle (or Foo.GoogleAuthenticator) allows the coder to choose the alias name depending on context. If you are in the borderland where you translate api responses to application responses you might want to use the alias Foo.Authenticator and the module reference Authenticator.Google while if you are in the domain that only handles authenticators then you can use the alias Foo.Authenticator.{Google, Facebook} and Google as the module reference. Of course, some dev use the domain naming scheme and group things by provider context instead like Foo.Google.Authenticator, Foo.Google.Gmail or Foo.Google.Api.

With the module name Foo.AuthenticatorGoogle you are forced to always use AuthenticatorGoogle as the reference even when the context of the call tells the coder it is the authenticator you are calling and not the api, well, unless you use the alias rename feature…

Honestly, I just hate having to use module references like Utils.list(), Conn.Utils.list() is much more understandable. :slight_smile: