Mix formatter to remove unnecessary newlines?

tensiondriven · August 8, 2022, 8:14pm

I’ve looked around for information on this in the mix format source, custom mix formatters, Elixir style guides, and haven’t found anything, so please forgive me if this has been covered…

mix format does a great job of cleaning up unnecessary spaces in Elixir code and breaks long lines into multiple lines very well.

However, while it’s great at adding newlines when needed, there doesn’t seem to be any provision for removing unnecessary newlines.

I’m working on a team where there’s a tendency to use extensive pattern matching in function definitions, etc, which results in many multiline statements. When this code is edited and shortened, often the revised statements will fit on one line, but because the formatter doesn’t do this for us, I frequently see code that has unnecessary line breaks in it. The below should illustrate what I’m seeking. Note that the entirety of this code block has mix format applied:

  # Functions w/ long params get split into multiple lines (expected, desired):
  def function(
        very_long_parameter_that_causes_line_break1,
        very_long_parameter_that_causes_line_break2
      ) do
    # ...
  end

  # Shortening the long variable names does not collapse the function definition (I would like it to):
  def function(
        shortened_param1,
        shortened_param2
      ) do
    # ...
  end

  # Manually shortening the params works, but it is laborious:
  def function(shortened_param1, shortened_param2) do
    # ...
  end

  # This applies to structs/maps, too. I'd like these unnecessary newlines to be removed...
  def function() do
    %{
      a: 1
    }
  end

  # ... so that after mix format, I have:
  def function() do
    %{a: 1}
  end

Is it possible to detect and remove unnecessary newlines using mix format? I started going down the path of writing my own formatter, but I’m not up on all the details of the Code module and am hoping there’s an easier way. If not, perhaps someone can outline an example formatter that I could write which would do this.

tensiondriven · August 8, 2022, 8:23pm

I see there are projects like Sourceror (v0.11.1) and wojtekmach/fix (fix/fix.ex on GitHub) which aim to do deeper introspection on Elixir code. All I’m looking for here is whitespace formatting.

dimitarvp · August 8, 2022, 9:38pm

Yep, quite a few people noticed that mix format does not aggressively collapse code that was expanded before. I don’t like it either but apparently the maintainers disagree.

There’s a tool built on top of Sourceror, namely hrzndhrn/recode on GitHub (one contributor to it is @Marcus). You can check that tool out, but sadly it doesn’t support the feature you and I want yet.

tensiondriven · August 9, 2022, 3:10am

Brilliant, this is a start. Thanks @dimitarvp

sodapopcan · August 9, 2022, 2:01pm

Just in terms of reasoning, I believe it has to do with the ambiguity involved in defining “unnecessary”. For example, should the following be collapsed just because it fits in the line limit?

%{
  one: some_longer_function_name("some_arg", "some_other_arg"),
  two: "hello!"
}

That would look pretty terrible on one line (subjectively), and I believe the formatter is trying to allow for this type of thing. I’m personally totally cool with this, as tools like Prettier that put everything they can on one line make for some very painful diffs when those lines ultimately get broken up. The big downside, of course, is that you can’t just run the formatter to get stuff exactly as you want it.

dimitarvp · August 9, 2022, 2:21pm

Look at OP’s 3rd example. If you manually expand – needlessly – a very short expression, then the formatter never collapses it back.

sodapopcan · August 9, 2022, 2:26pm

I did, at least assuming that by third example you mean breaking up %{\n\ta: 1\n}. Certainly an argument could be made that if there is only one element it could collapse, but all I was trying to say was that it seems like it’s trying to keep the rule dead simple, which is: “If you want line breaks in your function heads and data structures, you got 'em.”

dimitarvp · August 9, 2022, 3:01pm

Sure, but that particular choice is oddly specific and I suspect it’s tied to the personal preferences of the maintainers. Which, if I were politically inclined, I would say is unfair.

I like Rust’s formatter more – it’s (a) ruthless and (b) very configurable. It would collapse back the example OP gave us. Its philosophy is “if I can fit that in one line then I absolutely will”. Or “if I spot needless curly braces (only one expression) then I am removing them”. A canonical form of [minimal] code, we might say.

Most people just go with the defaults, especially if contributing to open source. But there’s also a place for internal, team-enforced formatter config. IMO the perfect state of affairs: strongly opinionated defaults with the ability to customize.

Code is not a form of personal art expression.

LostKobrakai · August 9, 2022, 3:10pm

IIRC there’s a clear stance from the Elixir team that the formatter is not meant to be an all-purpose formatter. It’s meant to support Elixir’s own development and therefore likely fits a lot of other Elixir open source projects as well. But it just might not be the tool for enforcing custom or internal code styles.

sodapopcan · August 9, 2022, 3:14pm

To be clear, I wasn’t saying you’re wrong, just trying to offer some reasoning as to why it is the way it is. We certainly aren’t aligned on “if it can fit on one line, do it”, but we are absolutely aligned on allowing customization for internal formatters.

Personally the only thing I really hate is not being able to put do on its own line in with statements. with can get super ugly and I find that little change makes everything dramatically clearer.

  with {:ok, foo} <- foo(),
       {:ok, bar} <- bar(foo),
       {:ok, some_longer_thing} <- some_longer_thing(bar)
  do
    something()
  end

The default with the outdent denoting the block always looks to me like someone forgot to format.

That’s a bit of a tangent but does technically have to do with whitespace

dimitarvp · August 9, 2022, 3:30pm

Sorry if I was unclear: I am not saying you’re wrong either. Just preferences.

dimitarvp · August 9, 2022, 3:31pm

That would explain a lot actually, thanks for bringing the info in.

At this point it’s clear that it’s on the community to start these initiatives. We can’t and shouldn’t rely on the core team for so much.

tensiondriven · August 9, 2022, 5:06pm

The strange thing about this particular feature is that it isn’t “reversible”: the formatter will split your code across multiple lines, but there’s no provision to “un-split” it, neither in the formatter nor in any of the related tools (Credo, etc.). I’m not advocating for changing default behaviour or anything of that sort, just looking for a solution to assist in collapsing newlines that the formatter would not otherwise have added.

LostKobrakai · August 9, 2022, 5:19pm

The changes are not reversed because line breaks come not only from the formatter but also from users who wrote the code that way. The formatter carries some user preferences forward, at the expense of not being able to “undo” all the changes it makes to code.
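To make that concrete, here is a small sketch using `Code.format_string!/1` (the function behind mix format) with the default options. The `format` helper and the sample strings are only illustrative names. Both inputs should come back unchanged: the formatter cannot tell whether the line break was typed by a person or added by an earlier formatting run, so it keeps it.

```elixir
# Hypothetical helper: format a source string and return it as a binary.
format = fn source ->
  source |> Code.format_string!() |> IO.iodata_to_binary()
end

# The same map, written expanded and on one line.
expanded = "%{\n  a: 1\n}"
one_line = "%{a: 1}"

# The newline after the opening brace is treated as a user choice,
# so the expanded form stays expanded...
IO.puts(format.(expanded))

# ...and the one-line form stays on one line because it fits.
IO.puts(format.(one_line))
```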

josevalim · August 9, 2022, 5:40pm

Exactly what @LostKobrakai said. The formatter would aggressively collapse before but then a lot of people complained that sometimes they would write newlines on purpose and the formatter collapsed (and “ruined”) them. So we started respecting the user choice when it comes to newlines.

This pretty much sums up most formatter discussions. x% people are happy it works in one way, y% people are not. If we were to change it, now y% people are happy it works in one way, x% people are not.

tensiondriven · August 9, 2022, 6:14pm

This is not one of those discussions, at least it wasn’t intended to be. I have no problem with the formatter.

What I am looking for is a way, any way, to re-format code with extra newlines in it. The formatter isn’t the tool, because it doesn’t support reverting added newlines, either by default or with flags. Credo doesn’t have support for it, as far as I can find.

If it weren’t for comments, I imagine I could drop all newlines from a file and then apply formatter.

If I knew my way around the Code module, it might be possible to write a custom task in fairly short order that would respect comments.
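For anyone who wants to experiment, a rough sketch of such a tool might look like the following. It rests on an assumption about formatter internals that I have not confirmed against the source: that the formatter decides whether to keep a user-written break by looking at the `:newlines` entry in each node’s metadata. The module name `CollapseNewlines` and its `collapse/2` function are made up for this example. The parse options are the ones the `Code.quoted_to_algebra/2` docs recommend, and comments are passed through separately so they are not lost.

```elixir
defmodule CollapseNewlines do
  # Hypothetical helper, not part of Elixir, mix format, or recode.

  # Re-prints `source`, ignoring line breaks the user placed after opening
  # delimiters. 98 is the formatter's default line length.
  def collapse(source, line_length \\ 98) do
    {quoted, comments} =
      Code.string_to_quoted_with_comments!(source,
        literal_encoder: &{:ok, {:__block__, &2, [&1]}},
        token_metadata: true,
        unescape: false
      )

    quoted
    |> Macro.prewalk(&drop_newlines/1)
    |> Code.quoted_to_algebra(comments: comments)
    |> Inspect.Algebra.format(line_length)
    |> IO.iodata_to_binary()
  end

  # Assumption: deleting :newlines lets the formatter pick one line when it fits.
  defp drop_newlines({form, meta, args}) when is_list(meta),
    do: {form, Keyword.delete(meta, :newlines), args}

  defp drop_newlines(other), do: other
end

IO.puts(CollapseNewlines.collapse("%{\n  a: 1\n}"))
```

Because this drops the metadata everywhere, it would also collapse breaks that were added on purpose, so it is best run on one file at a time with the diff reviewed afterwards.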

If vscode had a “select statement” command that was elixir-aware, that could be a good way to fix these as they come up.

Currently I do this manually and it’s wearing out my backspace key.

Given these constraints, recode also seems to be a viable option if support is added.

This comes up on teams that are working on existing codebases and aren’t paying attention to when it’s more appropriate to collapse extra newlines. Other languages like elm (and Haskell?) make heavy use of newlines, and people who use those languages seem less sensitive to extraneous newlines. I frequently open files and find cases where there are extra newlines that don’t appear to have been added intentionally.

So, looking for a way to clean these up.

dimitarvp · August 9, 2022, 7:58pm

I think we all agree on that, I just wish we could configure that particular behavior. I understand it’s a preference that can’t and shouldn’t be argued against.

sabiwara · August 9, 2022, 10:48pm

Maybe another relevant place to add this could be the freedom formatter, which is consistent with the official formatter yet adds some options on top of it.

There is no collapse option yet, but it would seem consistent with its goals so a PR might get accepted?

dimitarvp · August 9, 2022, 10:55pm

Thanks, bookmarked.

In terms of PRs, I wouldn’t even know where to start but I keep dreaming that one day I don’t have to think of money that much and I’ll have more free time. {sighs}

Marcus · August 21, 2022, 1:42pm

recode 0.2.0 now has the task SameLine. This task tries to fit everything on one line of code. The task is deactivated by default, but it can be run explicitly with mix recode --task SameLine. Keep in mind that there may be more changes than expected.
If someone has a better name for the task, please let me know.

The task can be configured with :skip and :ignore.
Imagine the following code snippet.

    test "my test" do
      x = %{
        foo: :fail
      }

      assert_raise RuntimeError, fn ->
        do_some(%{
          x: x,
          y: :foo
        })
      end
    end

recode changes this to:

    test "my test" do
      x = %{foo: :fail}

      assert_raise RuntimeError, fn -> do_some(%{x: x, y: :foo}) end
    end

and with ignore: :fn to:

    test "my test" do
      x = %{foo: :fail}

      assert_raise RuntimeError, fn ->
        do_some(%{x: x, y: :foo})
      end
    end

and with skip: :assert_raise to:

    test "my test" do
      x = %{foo: :fail}

      assert_raise RuntimeError, fn ->
        do_some(%{
          x: x,
          y: :foo
        })
      end
    end