How do I get a well-formatted string from an AST?

Hi, I want to modify Elixir files based on the AST.

For the result I want to use the Elixir formatter.

For a prototype I tried the following and it almost works:

  File.read!("example.ex")
  |> Code.string_to_quoted!()
  |> Macro.to_string()
  # |> Code.format_string!()
  # |> IO.iodata_to_binary

The problem is the parentheses. They’re added everywhere. An example result:

"defmodule(Example) do\n  def(output(str)) do\n    IO.inspect(str)\n  end\nend"

The documentation for Macro.to_string states “This function discards all formatting of the original code.”.
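
To illustrate what gets discarded, here is a minimal sketch (the snippet and its `source` variable are made up for this example, not taken from my real files): comments and the original spacing do not survive the round trip through the AST.

```elixir
# Hypothetical input with a comment and unusual spacing.
source = """
# a note for the reader
x   =   1 +     2
"""

round_tripped =
  source
  |> Code.string_to_quoted!()
  |> Macro.to_string()

# The comment is gone and the spacing is normalized.
IO.puts(round_tripped)
```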

So is it possible to keep the formatting of the original code?

Thanks in advance for every piece of advice or idea :)

  • Sascha

You could pass the :locals_without_parens option to Code.format_string!/2 and see if that fixes the issue.

Since the formatter internally works on quoted code, it seems like it’d be reasonable to suggest a new public API to allow passing quoted code directly.

The formatter works on a heavily annotated AST so that it can preserve the user’s choices in some cases. That’s why it doesn’t work directly on a given AST: it cannot guarantee anything close to a similar result. The best solution here would be to improve Macro.to_string. For example, it could skip adding parens to calls with do/end blocks.

Would it make sense for Code.string_to_quoted/2 to create the heavily annotated AST? Then the formatter could work directly on it? It seems odd to me that the recommended workflow would be:

  1. string to quoted
  2. modify quoted
  3. quoted to string
  4. string to formatted
    1. string to quoted
    2. quoted to io data
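
The steps above could be sketched roughly like this. It is only an illustration: the inline `source` string and the rename from `output` to `print` are invented for the example, and the exact text returned by Macro.to_string/1 may differ between Elixir versions.

```elixir
# Hypothetical source; in practice this would come from File.read!/1.
source = """
defmodule Example do
  def output(str) do
    IO.inspect(str)
  end
end
"""

formatted =
  source
  # 1. string to quoted
  |> Code.string_to_quoted!()
  # 2. modify quoted: rename local calls/definitions named `output` to `print`
  |> Macro.prewalk(fn
    {:output, meta, args} when is_list(args) -> {:print, meta, args}
    other -> other
  end)
  # 3. quoted to string
  |> Macro.to_string()
  # 4. string to formatted (parses again internally and returns iodata)
  |> Code.format_string!()
  |> IO.iodata_to_binary()

IO.puts(formatted)
```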

We don’t want to expose this internal AST in any way, because we would not be able to guarantee compatibility between versions for anything you might do with it. If you want to format a string, you need to use either Code.format_string!/2 or Code.format_file!/2 directly.

@sbrink – did you eventually get this to work for you? I am trying to do the same for a different purpose and I cannot get Code.format_string!/2 to remove the brackets from the string output following Macro.to_string/1.

If you did get it to work, do you have an example of how you did it?

Thanks

Look at its options, specifically the :locals_without_parens option. It is empty by default when run directly (the mix formatter default sets up some options).

Ya, I guess I don’t know how to use it, because when I try something like Code.format_string!(code_string, locals_without_parens: [def: :*]) (which from my understanding should remove brackets from all arities of the def macro) it still doesn’t.

Do you know where I can find the mix format call to it?

Or where in the format task are the defaults specified? I can’t find them from reading through the source.

Hmm…

iex(7)> s
"def(blah(i)) do 42 end"
iex(8)> Code.format_string!(s, locals_without_parens: [def: 1, def: 2, def: 3, def: :*])|>IO.puts()
def(blah(i)) do
  42
end
:ok

And yet the Code.format_string! docs state:

### Parens and no parens in function calls

Elixir has two syntaxes for function calls. With parens and no parens. By
default, Elixir will add parens to all calls except for:

  1. calls that have do/end blocks
  2. local calls without parens where the name and arity of the local call
     is also listed under :locals_without_parens (except for calls with arity
     0, where the compiler always require parens)

The choice of parens and no parens also affects indentation. When a function
call with parens doesn't fit on the same line, the formatter introduces a
newline around parens and indents the arguments with two spaces:

    some_call(
      arg1,
      arg2,
      arg3
    )

On the other hand, function calls without parens are always indented by the
function call length itself, like this:

    some_call arg1,
              arg2,
              arg3

If the last argument is a data structure, such as maps and lists, and the
beginning of the data structure fits on the same line as the function call,
then no indentation happens, this allows code like this:

    Enum.reduce(some_collection, initial_value, fn element, acc ->
      # code
    end)

    some_function_without_parens %{
      foo: :bar,
      baz: :bat
    }

And yet it’s not working, at least in Elixir 1.9.1. Might be worth trying on master and if still not working then submitting a bug issue?

On master:

iex(1)> ast = quote do                                     
...(1)>   def add(a, b), do: a + b
...(1)> end
{:def, [context: Elixir, import: Kernel],
 [
   {:add, [context: Elixir], [{:a, [], Elixir}, {:b, [], Elixir}]},
   [
     do: {:+, [context: Elixir, import: Kernel],
      [{:a, [], Elixir}, {:b, [], Elixir}]}
   ]
 ]}
iex(2)> ast |> Macro.to_string
"def(add(a, b)) do\n  a + b\nend"
iex(3)> ast |> Macro.to_string |> Code.format_string!(locals_without_parens: [def: 1, def: 2, def: 3, def: :*])
["def", "(", "", "add", "(", "", "a", ",", " ", "b", "", ")", "", ")", " do",
 "\n  ", "a", " +", " ", "b", "\n", "end"]
iex(4)> ast |> Macro.to_string |> Code.format_string!(locals_without_parens: [def: 1, def: 2, def: 3, def: :*]) |> IO.puts()
def(add(a, b)) do
  a + b
end
:ok

I’ll file a bug report, thanks @OvermindDL1

:locals_without_parens does not remove parens. It only keeps a call without parens if the parens were not there in the first place. The feature that you want has to be implemented in #9291.
- Jose Valim (via github)
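
A small sketch of that behaviour, using a made-up local call `foo/2` (the name and the `format` helper are hypothetical, chosen only for the demonstration):

```elixir
format_opts = [locals_without_parens: [foo: 2]]

# Helper that formats a code string and returns a binary.
format = fn code, opts ->
  code |> Code.format_string!(opts) |> IO.iodata_to_binary()
end

# No parens in the input, foo/2 not listed: the formatter adds parens.
added = format.("foo 1, 2", [])

# No parens in the input, foo/2 listed: the call is kept without parens.
kept = format.("foo 1, 2", format_opts)

# Parens in the input, foo/2 listed: the parens are NOT removed.
unchanged = format.("foo(1, 2)", format_opts)

IO.inspect({added, kept, unchanged})
```

So the option only matters for calls that reach the formatter without parens, which is never the case for the older Macro.to_string/1 output shown above.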

Turns out it’s not a bug. It doesn’t do what it seems to imply, but maybe via #9291 it will someday!
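
For anyone landing here later: as far as I know, newer Elixir releases (1.13 and later) include Code.quoted_to_algebra/2, which turns quoted code into a formatter document without going through a string first. I have not verified how it relates to #9291, so treat the following as a sketch under that version assumption. The line width of 98 is simply the formatter's usual default.

```elixir
# Assumes Elixir 1.13+, where Code.quoted_to_algebra/2 is available.
ast =
  quote do
    def add(a, b), do: a + b
  end

formatted =
  ast
  # quoted to algebra document
  |> Code.quoted_to_algebra()
  # algebra document to iodata, wrapped at 98 columns
  |> Inspect.Algebra.format(98)
  |> IO.iodata_to_binary()

IO.puts(formatted)
```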
