Space vs function call

https://elixir-lang.org/getting-started/keywords-and-maps.html has an example:

iex> if(false, [{:do, :this}, {:else, :that}])
:that

Now, if we insert a space after the if, we get an error:

iex> if (false, [{:do, :this}, {:else, :that}])
:ERROR (something to do with a function call)

Can someone walk me step by step through how the Elixir interpreter parses the two expressions above, and how the second becomes an error?

It clearly says in the error

If you are making a function call, do not insert spaces between the function name and the opening parentheses

From Syntax reference — Elixir v1.12.3

Parentheses for non-qualified calls are optional, except for zero-arity calls, which would then be ambiguous with variables. If parentheses are used, they must immediately follow the function name without spaces. For example, add (1, 2) is a syntax error, since (1, 2) is treated as an invalid block which is attempted to be given as a single argument to add.
…
Parentheses for qualified calls are optional. If parentheses are used, they must immediately follow the function name without spaces.
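
To see the rule quoted above in action, here is a minimal sketch using Code.string_to_quoted/1, which only parses the string and never evaluates it, so add does not need to exist. The variable names are purely illustrative, and the exact error tuple may differ between Elixir versions.

```elixir
# Parse only; nothing is evaluated, so `add` does not have to be defined.
no_space = Code.string_to_quoted("add(1, 2)")
with_space = Code.string_to_quoted("add (1, 2)")

# Without any parentheses the space is required and the call parses fine.
no_parens = Code.string_to_quoted("add 1, 2")

IO.inspect(no_space, label: "add(1, 2)")
IO.inspect(with_space, label: "add (1, 2)")
IO.inspect(no_parens, label: "add 1, 2")
```

The first and third forms should come back as {:ok, ast} with the same two arguments, while the second should come back as an {:error, ...} tuple.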


Yes, I know that, in the original post I stated:

In particular, you are ignoring the part of the question that states

step by step through how the elixir interpreter parses the two above

Here is what I want to understand. Consider four exprs:

  1. if false do
       :this
     else
       :that
     end

  2. if false, do: :this, else: :that

  3. if(false, [do: :this, else: :that])

  4. if (false, [do: :this, else: :that])

What do they get PARSED INTO that causes #4 to be an error, while #1, #2 and #3 are fine?

Let’s go code spelunking! Let’s start with Code.string_to_quoted because this has the general properties we want: it takes a string, and it will error if you do #4

    case :elixir.string_to_tokens(to_charlist(string), line, column, file, opts) do
      {:ok, tokens} ->
        :elixir.tokens_to_quoted(tokens, file, opts)

      {:error, _error_msg} = error ->
        error
    end

Here we find :elixir.string_to_tokens as well as :elixir.tokens_to_quoted so let’s dig in there and see if we can find the error generating bit. Let’s pop open an iex session to see if the first function returns an error:

iex(7)>     file = Keyword.get(opts, :file, "nofile")
"nofile"
iex(8)>     line = Keyword.get(opts, :line, 1)
1
iex(9)>     column = Keyword.get(opts, :column, 1)
1
iex(10)> string = "if (false, [do: :this, else: :that])"
"if (false, [do: :this, else: :that])"
iex(11)> :elixir.string_to_tokens(to_charlist(string), line, column, file, opts)
{:ok,
 [
   {:identifier, {1, 1, nil}, :if},
   {:"(", {1, 4, nil}},
   {false, {1, 5, nil}},
   {:",", {1, 10, 0}},
   {:"[", {1, 12, nil}},
   {:kw_identifier, {1, 13, nil}, :do},
   {:atom, {1, 17, nil}, :this},
   {:",", {1, 22, 0}},
   {:kw_identifier, {1, 24, nil}, :else},
   {:atom, {1, 30, nil}, :that},
   {:"]", {1, 35, nil}},
   {:")", {1, 36, nil}}
 ]}

Nope, no error. This does begin to answer your question though because we can see what it gets “parsed into”.

Let’s compare this to your #3 case:

iex(13)> :elixir.string_to_tokens(to_charlist(string), line, column, file, opts)
{:ok,
 [
   {:paren_identifier, {1, 1, nil}, :if},
   {:"(", {1, 3, nil}},
   {false, {1, 4, nil}},
   {:",", {1, 9, 0}},
   {:"[", {1, 11, nil}},
   {:kw_identifier, {1, 12, nil}, :do},
   {:atom, {1, 16, nil}, :this},
   {:",", {1, 21, 0}},
   {:kw_identifier, {1, 23, nil}, :else},
   {:atom, {1, 29, nil}, :that},
   {:"]", {1, 34, nil}},
   {:")", {1, 35, nil}}
 ]}

Aha! Note the first item is now :paren_identifier vs :identifier. This likely gets used later.
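
If you want to reproduce just that difference without reading the whole token list, a rough sketch follows. It calls the same :elixir.string_to_tokens/5 as the session above, which is a private, undocumented function and may change or disappear in other Elixir versions; first_token_kind is a made-up helper name.

```elixir
# Hypothetical helper: tokenize a string and return the kind of its first token.
# :elixir.string_to_tokens/5 is private API, so treat this as version-dependent.
first_token_kind = fn source ->
  {:ok, [first | _rest]} =
    :elixir.string_to_tokens(to_charlist(source), 1, 1, "nofile", [])

  elem(first, 0)
end

without_space = first_token_kind.("if(false, [do: :this, else: :that])")
with_space = first_token_kind.("if (false, [do: :this, else: :that])")

IO.inspect(without_space, label: "if(")
IO.inspect(with_space, label: "if (")
```

Going by the output above, the first should be :paren_identifier and the second :identifier, so the tokenizer has already recorded whether the parenthesis touched the name before the parser ever runs.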

Let’s move on to the :elixir.tokens_to_quoted function then:

tokens_to_quoted(Tokens, File, Opts) ->
  handle_parsing_opts(File, Opts),

  try elixir_parser:parse(Tokens) of

OK mostly just going to :elixir_parser so let’s go there:

Aha! This is a big old yrl file so let’s just do the easy thing and search for the word “space” and see if our error shows up:

Sure does! Let’s see when that gets called:

Unfortunately here is where my ability to help starts to reach its limits. I don’t particularly know or understand how yrl files work, so beyond seeing that the column values between #3 and #4 are different, I don’t super duper know how it’s arriving at that case. Probably something to do with that :paren_identifier thing. Hopefully this is a good jumping off point though!

EDIT: Made a small edit to include output from the #3 case.


Beat me to it, I was typing a similar response.

But basically, here:

call_args_no_parens_many -> matched_expr ',' call_args_no_parens_kw : ['$1', '$3'].
call_args_no_parens_many -> call_args_no_parens_comma_expr : reverse('$1').
call_args_no_parens_many -> call_args_no_parens_comma_expr ',' call_args_no_parens_kw : reverse(['$3' | '$1']).

It’s implying that if (foo, bar) is being parsed as a function call without parentheses around its arguments. After the space, the parser therefore expects a plain argument list, so (foo, bar) has to be a single parenthesized expression, and the comma inside it is what triggers the error.
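
A small sketch of that reading, assuming nothing beyond the public Code module: if the parentheses after the space wrap only one expression, the same spacing parses and runs fine, because (false) is then just an ordinary first argument to a call written without parentheses.

```elixir
source = "if (false), do: :this, else: :that"

# `(false)` is a single parenthesized expression, so this is a no-parens call.
{:ok, ast} = Code.string_to_quoted(source)
IO.inspect(ast, label: "ast")

# Evaluating it takes the else branch, just like the original example.
{result, _bindings} = Code.eval_string(source)
IO.inspect(result, label: "result")
```

The arguments in the AST should be false and [do: :this, else: :that], the same as in the forms without a space.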

It doesn’t parse to anything because it fails to parse in the first place.

The first response was fine :man_shrugging:

Usually when I want to see what something parses to I just call Code.string_to_quoted or quote/2, depending on if I want more metadata or not. If it fails to parse then it’s not valid syntax.
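
As a rough sketch of that workflow applied to the four forms from the question: strip_meta is a made-up helper that drops line metadata so only the shape of the AST is compared.

```elixir
# Hypothetical helper: remove all metadata so only the AST shape is compared.
strip_meta = fn ast ->
  Macro.prewalk(ast, &Macro.update_meta(&1, fn _meta -> [] end))
end

valid_forms = [
  "if false do\n  :this\nelse\n  :that\nend",
  "if false, do: :this, else: :that",
  "if(false, [do: :this, else: :that])"
]

asts =
  for source <- valid_forms do
    {:ok, ast} = Code.string_to_quoted(source)
    strip_meta.(ast)
  end

# The three valid spellings collapse to one distinct AST.
IO.inspect(Enum.uniq(asts), label: "distinct ASTs")

# The spelling with a space never gets as far as an AST.
spaced = Code.string_to_quoted("if (false, [do: :this, else: :that])")
IO.inspect(spaced, label: "with space")
```

The first inspect should show a single entry, {:if, [], [false, [do: :this, else: :that]]}, and the second an {:error, ...} tuple.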


The something to do with a function call is the important part. The message is unique, so it’s a good entry point into the source code:

Tracing the rules backwards by searching for the names leads to this big comment:


Yeah I think this is a good way to put it. It tokenizes, but does not parse.

A small addendum: The Elixir codebase is often quite approachable, and I highly recommend poking about in it for these sorts of questions.

As a gentle nudge, when asking for help, do take care to not come down too hard on folks that didn’t quite hit your desired mark with their answer. You are asking for a favor after all.


Thank you, the difference in the :elixir.string_to_tokens output above is precisely what I was looking for. I completely agree that the correct statement is “It tokenizes, but does not parse.”

I absolutely agree that I am asking for help. The issue though is that he claimed the error message answers my question (it does not), and implies I did not even bother to read the error message (I did).

Did I? In the question you omitted the error message, that’s why I thought you didn’t consider it important in the context.

However, further, as the error message doesn’t answer the question, I pointed to the part in the docs where it explains:

For example, add (1, 2) is a syntax error, since (1, 2) is treated as an invalid block which is attempted to be given as a single argument to add

I just thought it would be sufficient. :man_shrugging:t2:

Whatever :slightly_smiling_face: I’m glad that you found what you were looking for


I apologize if I misinterpreted your post: I interpreted it as “RTFEM” (read the “friendly” error message). :slight_smile:
