MLElixir - attempting an ML-traditional syntax entirely within the Elixir AST

I’ve been making an MLElixir thing (not released yet…) for fun in my spare time over the past day. I’m just trying to see how much of an ML-traditional syntax I can get entirely within the Elixir AST, while being properly typed (with occasional fun with Refined Types and such). An example IEx session with it:

  • Basic Types (adding more over time):
iex> import MLElixir
MLElixir
iex> defml 1
1
iex> defml 6.28
6.28
iex> defml :ok
:ok
  • Let untyped variable bindings (how `let … in` even parses as Elixir is sketched just after the let examples below):
iex> defml let _a = 2 in 1
1
iex> defml let a = 1 in a
1
iex> defml let a = 42 in
...> let b = a in
...> let c = b in
...> c
42
  • Let Typed variable bindings (the errors are very simplistic and not descriptive right now; it’s still debugging time, after all):
iex> defml let ![a: int] = 1 in a
1
iex> defml let ![a: float] = 6.28 in a
6.28
iex> defml let a = 1 in
...> let ![b: int] = a in
...> b
1
iex> defml let ![a: int] = 6.28 in a
** (MLElixir.UnificationError) Unification error between `{:"$$TCONST$$", :float, [values: [6.28]]}` and `{:"$$TCONST$$", :int, []}` with message:  Unable to resolve mismatched types
    (typed_elixir) lib/ml_elixir.ex:566: MLElixir.resolve_types!/3
    (typed_elixir) lib/ml_elixir.ex:250: MLElixir.resolve_binding/3
    (typed_elixir) lib/ml_elixir.ex:163: MLElixir.parse_let/3
    (typed_elixir) expanding macro: MLElixir.defml/1
                   iex:16: (file)
  • Let Refined Typed variable bindings:
iex> defml let ![a: int a=1] = 1 in a
1
iex> defml let ![a: int a<=2] = 1 in a
1
iex> defml let ![a: int a>=2] = 1 in a
** (MLElixir.UnificationError) Unification error between `{:"$$TCONST$$", :int, [values: [1]]}` and `{:"$$TCONST$$", :int, [values: [{2, :infinite}]]}` with message:  Unable to resolve
    (typed_elixir) lib/ml_elixir.ex:566: MLElixir.resolve_types!/3
    (typed_elixir) lib/ml_elixir.ex:250: MLElixir.resolve_binding/3
    (typed_elixir) lib/ml_elixir.ex:163: MLElixir.parse_let/3
    (typed_elixir) expanding macro: MLElixir.defml/1
                   iex:6: (file)

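How does `let … in` even get through the Elixir parser? A quick sketch of the trick (nothing MLElixir-specific): `let` is just an ordinary paren-less call and `in` is an ordinary binary operator that binds tighter than `=`, so the source is already valid Elixir and arrives at the macro as plain AST to reinterpret:

defml let a = 42 in a
# arrives at the defml macro roughly as (line metadata trimmed):
# {:let, [], [{:=, [], [{:a, [], nil}, {:in, [], [42, {:a, [], nil}]}]}]}
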
Function calls (shown here via +):

iex> defml 1+2
3
iex> defml 1.1+2.2
3.3000000000000003
iex> defml 1+2.2
** (MLElixir.UnificationError) Unification error between `{:"$$TCONST$$", :int, [values: [1]]}` and `{:"$$TCONST$$", :float, [values: [2.2]]}` with message:  Unable to unify types
    (typed_elixir) lib/ml_elixir.ex:712: MLElixir.unify_types!/3
    (typed_elixir) lib/ml_elixir.ex:108: anonymous fn/3 in MLElixir.Core.__ml_open__/0
    (typed_elixir) lib/ml_elixir.ex:199: MLElixir.parse_ml_expr/2
    (typed_elixir) lib/ml_elixir.ex:145: MLElixir.defml_impl/2
    (typed_elixir) expanding macro: MLElixir.defml/1
                   iex:2: (file)

Opening (‘import’ in Elixir parlance) another module, also showing how to disable the Core opens (as you can see, it is Core that defines the + function):

iex> defml let open MLElixir.Core in 1+2
3
iex> defml no_default_opens: true, do: let open MLElixir.Core in 1+2
3
iex> defml no_default_opens: true, do: 1+2
** (MLElixir.InvalidCall) 6:Invalid call of `+` because of:  No such function found
    (typed_elixir) lib/ml_elixir.ex:196: MLElixir.parse_ml_expr/2
    (typed_elixir) lib/ml_elixir.ex:145: MLElixir.defml_impl/2
    (typed_elixir) expanding macro: MLElixir.defml/1
                   iex:6: (file)
15 Likes

I cannot come up with a good syntax for an anonymous function. I was hoping for fun blah -> blah, especially as that quotes well, becoming [{:->, [], [[{:fun, [], [{:blah, [], Elixir}]}], {:blah, [], Elixir}]}], but putting it in a file (even inside a quoted context) causes a CompileError of unhandled operator ->, which is irritating… what is unhandled about it?! I’m trying to handle it… blah…

Similar separators are no good either, such as =>

I wonder why there is an inconsistency between quote do fun blah -> blah end and someMacro(fun blah -> blah); I wonder if this is an Elixir compiler bug…

EDIT: Also, why does fn require an end regardless of any internal content in a macro? Blah… I’ll probably just use fn, though it feels awfully Elixir-ish with the weird trailing ‘end’ for no reason (considering there is only a single expression inside it)… I really, really badly hate trailing turds (to use an Erlang expression) like ends for no reason though, even in Elixir… Whoever thought of (Ruby creators? morons…) putting blocks in random places like an anonymous function with a single expression should go back to language design…

I’d very much like another idea on how to do function definitions in Elixir’s syntax. ^.^

2 Likes

There is no inconsistency. Maybe you think you are doing fun(blah -> blah), but you are actually doing fun(blah) -> blah. -> can only exist inside a block. In quote do fun blah -> blah end the quote creates a block for you, but when you call a macro as someMacro(fun blah -> blah) you haven’t created a block.

Since the expression after -> is a block, how do you know when to stop adding expressions to the block without an end? For example:

list = Enum.map 1..10, fn num ->
  square = num * num
  square
IO.inspect(list)

Since we don’t have an end, is the IO.inspect part of the anonymous function or not?
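
For contrast, once the terminator is there the question disappears; the anonymous function unambiguously ends at the end:

list = Enum.map 1..10, fn num ->
  square = num * num
  square
end
IO.inspect(list)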

1 Like

I also tried defml(fun x -> x) with the same operator error. ^.^

(EDIT: Yesterday I also tried this, with an error too (defml supports both a single expression and do syntax):

defml do
  fun x -> x
end

)

Exactly! The block should be explicit like it is everywhere else (with do/end). fn by default should have been a single-expression element with an optional block (potentially also via do/end). The -> could just be an infix operator that takes the left side as the arguments and the right side as the single expression (which could itself be a block), kind of like:

list = Enum.map 1..10, fn num -> num * num
IO.inspect(list)
# or
list = Enum.map 1..10, fn num -> do
  square = num * num
  square
  end
IO.inspect(list)
# or even:
list = Enum.map 1..10, fn num -> let square = num * num in square #  ^.^
IO.inspect(list)

Explicit is better than implicit after all. ^.^

Also, wtf magical inconsistency? o.O

/me really hates magically appearing blocks
Although I may be slightly biased, considering none of the languages I’ve used heavily in the past have magically appearing blocks:

  • C/C++: Always delimited via {}, or the braces can be left off in many cases for a single statement.
  • Python: Weird indentation, which is its own oddness, but is consistent at least.
  • Java: Ugh, same as C/C++ though.
  • Rust: Curly braces, optional in many cases when you only have a single expression too.
  • Erlang: <expr>, <expr> is basically a let _ = <expr> in <expr>; when you stop, the ‘block’ ends, but it is not really a block, just a list of delimited expressions.

Etc… :slight_smile:

1 Like

I don’t see why you would get an error there. I just tried this in iex and it worked.

What is the magical inconsistency? It works because quote do fun blah -> blah end has a block. Do you see the do ... end? :slight_smile:

1 Like

Uhh, really? o.O?

(EDIT: Urp, my mistake, I used fn here instead of fun; I’d prefer fn, but it is less useful than fun due to the auto-block magicness.)

Straight from one of my tests:

    defml do
      let id = fn x -> x in
      id 1
    end

Uncommenting it results in:

** (TokenMissingError) test/ml_elixir_test.exs:129: missing terminator: end (for "do" starting at line 1)
    (elixir) lib/code.ex:370: Code.require_file/2
    (elixir) lib/kernel/parallel_require.ex:57: anonymous fn/2 in Kernel.ParallelRequire.spawn_requires/5

Simplifying it to:

    defml do
      fn x -> x
    end

Results in an identical error.

Why yes, I see a block here:

    defml do
      fn x -> x
    end

And I get an end missing. :slight_smile:
And interestingly, replacing defml with quote to become:

    quote do
      fn x -> x
    end

In the test file also fails with an identical error… o.O

I just commented those lines out and the test file passes again, so it is not the file causing the issue… Trying from iex now:

iex> import MLElixir
MLElixir
iex> defml fn x -> x
...>
...> end
#Function<6.52032458/1 in :erl_eval.expr/5>
iex> defml(fn x -> x)
** (SyntaxError) iex:3: "fn" is missing terminator "end". unexpected token: ")" at line 3

iex> defml do fn x -> x end
...>
...>
...>
...>
...> end
#Function<6.52032458/1 in :erl_eval.expr/5>
iex> defml(do: fn x -> x)
** (SyntaxError) iex:4: "fn" is missing terminator "end". unexpected token: ")" at line 4

iex> quote(do: fn x -> x)
** (SyntaxError) iex:4: "fn" is missing terminator "end". unexpected token: ")" at line 4

iex> quote do fn x -> x end
...> end
{:fn, [], [{:->, [], [[{:x, [], Elixir}], {:x, [], Elixir}]}]}

And yet trying to use the original syntax that I wanted:

iex> quote do fun x -> x end
[{:->, [], [[{:fun, [], [{:x, [], Elixir}]}], {:x, [], Elixir}]}]

Hmm, so quote takes this syntax fine, let’s dump it into defml then:

iex> defml fun x -> x
** (SyntaxError) iex:6: syntax error before: '->'

iex> defml(fun x -> x)
** (SyntaxError) iex:6: syntax error before: '->'

iex> defml(do: fun x -> x)
** (SyntaxError) iex:6: syntax error before: '->'

Hence here is the wtf. ^.^

EDIT: I think I may have come up with a fairly consistent way that gets rid of the end oddness when you only have a single expression…

Which brings up: is there a construct in Elixir (not an anonymous function, as those have overhead) where you can make new variable bindings? case is apparently not it, because it does a really stupid thing where:

a = 42
b = case blah() do
    :ok -> a = bloop() <> "world"
      {:ok, a}
    :error -> :error
  end
# Wtf `a` here is `bloop() <> "world"` *or* `42` instead of just always 42?!?  How does a block
# not sanitize variables?!  Definitely not like C++ where every block is a new subscope...

I think I’m just going to have to decorate every single variable with a trailing number or something… Some things in the Elixir AST just do not make sense >.<
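
A minimal sketch of that trailing-number idea (the FreshVars name, the env shape, and the naming scheme are purely illustrative, not MLElixir’s actual scheme): keep a counter in the environment and hand out a fresh AST variable for every binding, e.g. via Macro.var/2:

# Purely illustrative: every new binding gets a fresh, uniquely numbered AST
# variable, so a rebinding inside a clause can never collide with an outer `a`.
defmodule FreshVars do
  def fresh(env, name) do
    counter = env.counter + 1
    var = Macro.var(:"#{name}_#{counter}", __MODULE__)
    {%{env | counter: counter}, var}
  end
end

{env, a0} = FreshVars.fresh(%{counter: -1}, :a)  # a0 == {:a_0, [], FreshVars}
{_env, a1} = FreshVars.fresh(env, :a)            # a1 == {:a_1, [], FreshVars}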

2 Likes

The difference between quote do fun x -> x end and defml fun x -> x is exactly the block. quote is not magically cheating in any way; the parser accepts -> in quote because it’s inside a block. When you call your defml macro you don’t wrap -> in a block.

If you use fn x -> x instead of fun x -> x you need to also end the block with an end since fn creates a block just like do.

It’s easy to remember what creates blocks in Elixir because there are only three ways to do it: do ... end, fn ... end, and ( ... ). do blocks are in fact just sugar for ( ... ), and they are represented the same way in the syntax tree.
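
You can see this directly in iex: a parenthesized `;`-separated sequence and a multi-line do block quote to the same __block__ node (output shown roughly; exact metadata may vary by version):

iex> quote do: (a; b)
{:__block__, [], [{:a, [], Elixir}, {:b, [], Elixir}]}
iex> quote do
...>   a
...>   b
...> end
{:__block__, [], [{:a, [], Elixir}, {:b, [], Elixir}]}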

Check the “Blocks” section in the docs [1] for more details.

[1] https://hexdocs.pm/elixir/master/syntax-reference.html#syntax-sugar

3 Likes

The macro itself could wrap it, though; the problem is that this happens at the compiler level, before the macro is ever hit… :-/

So… this?

iex> import MLElixir
MLElixir
iex> defml(fun x -> x)
** (SyntaxError) iex:2: syntax error before: '->'
iex> defml do fun x -> x end
:test_ok

So it does not seem like just sugar? O.o?

1 Like

The compiler raises a syntax error, which means it failed to parse the code. The parser does not expand macros or execute Elixir code, so it does not matter what you do in your macro. All code still has to be proper Elixir syntax regardless of whether we have macros.

Those are argument parentheses. Try this: defml((fun x -> x)).
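
One way to see that this fails in the parser, before any macro ever runs, is Code.string_to_quoted/1, which only tokenizes and parses the string:

iex> Code.string_to_quoted("defml(fun x -> x)") |> elem(0)
:error
iex> Code.string_to_quoted("defml((fun x -> x))") |> elem(0)
:ok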

1 Like

I was just about to edit my last post; I already tried that. :slight_smile:

Here is my edit content:

So it seems a function call does not create a block scope inside it; this is such a weird syntax (I understand ‘how’ it works the way it does, but not ‘why’ it was initially designed this way…):

iex> defml (fun x -> x)
:test_failed
iex> defml((fun x -> x))
:test_failed
iex> defml(fun x -> x)
** (SyntaxError) iex:2: syntax error before: '->'

However, an issue here is the ‘:test_failed’ response that I am printing; that means it got nil as the incoming AST, and indeed that is what I get for those if I print out my debug steps instead:

iex> defml (fun x -> x)
{:ML, nil}
{:MLAST, {:"$$LIT$$", [type: {:"$$TCONST$$", :atom, [values: [nil]]}], nil}}
{:MLENV,
 %MLElixir.MLEnv{counter: -1,
  funs: %{+: #Function<0.36760603/3 in MLElixir.Core.__ml_open__/0>},
  type_bindings: %{}, type_funs: %{}, type_vars: %{}}}
{:MLDONE, nil}
nil
iex> defml((fun x -> x))
{:ML, nil}
{:MLAST, {:"$$LIT$$", [type: {:"$$TCONST$$", :atom, [values: [nil]]}], nil}}
{:MLENV,
 %MLElixir.MLEnv{counter: -1,
  funs: %{+: #Function<0.36760603/3 in MLElixir.Core.__ml_open__/0>},
  type_bindings: %{}, type_funs: %{}, type_vars: %{}}}
{:MLDONE, nil}
nil
iex> defml 42
{:ML, 42}
{:MLAST, {:"$$LIT$$", [type: {:"$$TCONST$$", :int, [values: '*']}], 42}}
{:MLENV,
 %MLElixir.MLEnv{counter: -1,
  funs: %{+: #Function<0.36760603/3 in MLElixir.Core.__ml_open__/0>},
  type_bindings: %{}, type_funs: %{}, type_vars: %{}}}
{:MLDONE, 42}
42
iex> defml let a = 42 in a
{:ML,
 {:let, [line: 5],
  [{:=, [line: 5],
    [{:a, [line: 5], nil}, {:in, [line: 5], [42, {:a, [line: 5], nil}]}]}]}}
{:MLAST,
 {:"$$LET$$", [type: {:"$$TPTR$$", 0, []}, line: 5],
  [{:"$$VAR$$", [type: {:"$$TPTR$$", 0, []}, line: 5], [:a, nil]},
   {:"$$LIT$$", [type: {:"$$TCONST$$", :int, [values: '*']}], 42},
   {:"$$VAR$$", [type: {:"$$TPTR$$", 0, []}, line: 5], [:a, nil]}]}}
{:MLENV,
 %MLElixir.MLEnv{counter: 0,
  funs: %{+: #Function<0.36760603/3 in MLElixir.Core.__ml_open__/0>},
  type_bindings: %{a: {:"$$TPTR$$", 0, []}}, type_funs: %{},
  type_vars: %{0 => {:"$$TCONST$$", :int, [values: '*']}}}}
{:MLDONE,
 {:__block__, [type: {:"$$TPTR$$", 0, []}, line: 5],
  [{:=, [], [{:a, [type: {:"$$TPTR$$", 0, []}, line: 5], nil}, 42]},
   {:a, [type: {:"$$TPTR$$", 0, []}, line: 5], nil}]}}
42

And yet:

iex> quote(do: (fun x -> x))
[{:->, [], [[{:fun, [], [{:x, [], Elixir}]}], {:x, [], Elixir}]}]

I’m… confused again… o.O

And for note, the :ML tuple is the first thing listed:


  defmacro defml(opts) when is_list(opts) do
    defml_impl(opts[:do], opts)
  end
  defmacro defml(expr) do
    defml_impl(expr, [])
  end

  defp defml_impl(expr, opts) do
    IO.inspect {:ML, expr}
# ...
1 Like

Here’s your bug ^. Your macro receives something like: ast = [{:->, [], [[{:fun, [], [{:x, [], Elixir}]}], {:x, [], Elixir}]}]. ast[:do] == nil because you have no :do block there.

1 Like

Oh I just caught that myself too! That is brilliant. ^.^

EDIT:
Vwoop:

  defmacro defml(opts) when is_list(opts) do
    case opts[:do] do
      nil -> defml_impl(opts, [])
      ast -> defml_impl(ast, opts)
    end
  end

I can probably live with requiring parentheses around all anonymous function definitions, though it would be nice if an infix operator like => (much as I stylistically dislike fat arrows) worked like normal infix operators (-> is really weird… not like a normal operator…). :slight_smile:

EDIT: What, no strikethrough markdown operator on this forum? o.O

EDIT: For note, I’m trying to translate a mini-language I made into Elixir’s AST to use straight in Elixir (right now it parses text into Elixir AST, which makes syntax coloring really bad when used like defml "let a = 42 in a"; it works, just… ugly, we need read-macros in Elixir ^.^). Its function syntax is:

let myAdder = fun
  | (a:int) (b:int b>=0) -> a + b
  | (a:int) (b:int) -> a - b
  | (a:binary) (b:binary) -> String.concat [a, b]
  | a b -> error "unsupported"
in myAdder 10 (-4)

Types are optional if it can be inferred, though fully typed it would look more like:

let myAdder = fun
  | (a : int) (b : int b>=0) : int a+b -> a + b (* Since a has a valid range of infinite here, the return type is infinite+{b's range, which is {0, infinite}}, which is still infinite regardless, it is not that smart yet *)
  | (a : int) (b : int) : int a-b -> a - b (* int's extra information holds possible values allowed, simple math allowed on those values *)
  | (a : string) (b : string) : string -> String.concat [a, b] (* string's can refine on length only, currently *)
  | (a : any) (b : any) -> error "unsupported" (* any's do not have any refinement yet *)
in let curried = myAdder "Hello " (* A new erlang anonymous function is made here that curries myAdder, curried things and closures cannot be made into top-level module functions in this play language (type error) *)
in curried "world" (* Now I call it *)
3 Likes

Reading this thread has been such an emotional rollercoaster. :joy:

To sum up, -> is only allowed between do/end, fn/end and (/). But you need to be careful, because the parens need to apply to the -> and not to the arguments. The reason foo(fun x -> x) does not work becomes clearer if you add multiple arguments: if you have foo(fun x, y -> x), does it mean foo(fun x, (y -> x)) or foo((fun x, y -> x))?

The reason I wrote the document linked by Eric is exactly because we wanted to show there are fewer rules than most would expect, especially because we don’t need to specify the rules for keywords like case, def, defmodule, receive, if, try, etc.

4 Likes

Heh, sorry, I never meant for it to be as such; I’m just fighting the AST (I’m more used to tokenization macros rather than AST macros ^.^) by purposefully seeing how much I can subvert Elixir macros to do things not originally intended. MLElixir is not the end goal by far; rather, something else is. A working hint:


    # The extra type is to give a name to the type in the spec, so the input and output become the same type.
    # If the spec were `identity(any()) :: any()` then you could not state that the output type is based on the input type.
    defmodulet TypedTest_Typed_Identity do
      @type identity_type :: any()
      @spec identity(identity_type) :: identity_type
      def identity(x), do: x
    end

Building an inference algorithm on something I’m more familiar with applying such algorithms to (ML-style ASTs) let me build it up and work out the kinks. ^.^

The string-parser version I made worked pretty well, but I made some bad design choices in it; I wanted to change it up a bit but did not want to re-parse either (plus I was curious about the challenge of making it work in a macro), and this one ended up pretty clean. I just needed to make one more design change (already done, just not in ‘it’) and have it working in the main project now. MLElixir was never designed to actually be used; it’s just what I am going to play in. ^.^

All things considered, I was able to get a typed system with surprisingly few AST nodes. I’m half-tempted to parse Elixir from text and build it out from the base level (plus my new parser, which is not based on leex/yecc in any form, holds column information, so I can report significantly more accurate errors ^.^).

But yeah, foo(fun x, y -> x) is entirely ambiguous if -> were an operator. I’d have opted for foo(fun x y -> x) instead, or to have fun support a block, probably foo(fun x y -> do x end) and so forth, with matching semantics working like foo(fun {x, _} y when x>10 -> x) or so. I am exceedingly biased about getting rid of commas though, as I consider them entirely superfluous with a decent syntax, and they can easily get lost in the noise. The above example I’d personally want to write as foo (fun x y -> x), where () surround the arguments instead of the call, as it makes them disappear most of the time and keeps the block in place. Take for example some calls that I’ve seen ‘in the wild’ like blah = something(var1 + var2, {var3, var4, var5}, var6.var7, var8+var9); read in a more ML’y language it would be blah = something (var1 + var2) {var3, var4, var5} var6.var7 (var8+var9), which I find more readable (coincidentally it also works in Elixir if not being piped or something). However, that is still syntax and not really the purpose behind what I’m trying to do anyway. ^.^

I guess I mostly just expected operator-looking constructs to exist like operators in the AST; it just ended up being very different, which still feels like a special case. I’m guessing Ruby has all these odd constructs, and I did not come from the Ruby world (though to be fair I primarily came from the horror that is C++, no other language has such a sizable spec… and I know it better than I probably wish, which might also contribute to my liking of the simple syntaxes of functional languages like Erlang and ML ^.^), so there are a lot of things assumed in that world that I do not have. :slight_smile:

And yeah, I know dialyzer catches most of this already, but I accept more information than it does, which can soon be specified via other attributes. It is still mostly just for the challenge, though this part may actually be something I use in production, as it only compiles or not, without changing any syntax beyond defmodule to defmodulet, based on what it scans (it can pull type information from other modules decorated in the same way; I might have it parse BEAM files later, as I originally tried to do, but that is too much of a pain for a first version, so explicit decorations can be done instead, which is fine for me, plus it has more options this way :-)).

Tokenization macros could do the same, but without needing to special-case things like -> or so, although, just for convenience, special-casing all block forms would save the most effort in writing them; even that would be optional at that step. :slight_smile:

For example, something like this to implement if via case:

# This would be called when an identifier token with `:if` was reached, and would consume the
# token stream until it reaches the end or it exits early (the end would be the end of the block
# that it was called within), assuming a 'block' form returned a map with the internal tokens of
# the usual `do`, `else`, etc... that is supported now.
deftokenmacro if(tokens) do
  # The :if should already be eaten by this point, but if it was not you could do it by:
  # [:if | tokens] = tokens
  {tokens, condition_ast} = AST.parse_expression(tokens) # returns the expression_ast and the remaining untouched tokens
  {tokens, clauses} = AST.parse_blocks(tokens, do: true, else: false) # Only support the do and else forms in a block here, do is required, else is optional, returns a map of the AST's
  do_ast = clauses.do
  else_ast = clauses[:else] # `nil` will be returned regardless if empty
  if_ast = quote do # Quoting things could even run through the tokenmacro's too
    case unquote(condition_ast) do
      value when value in [false, nil] -> unquote(else_ast)
      _ -> unquote(do_ast)
    end
  end
  {tokens, if_ast} # Return un-eaten tokens and the ast we generated
end

Say case matchers were handled in the AST, and the AST were in the form that Elixir has now; you could make a tokenmacro parse that kind of like this (although you’d probably want case built into the kernel instead of defined in the language, so it could be used in the earliest deftokenmacros ^.^):

# `case` would probably be defined internally, but an example by using only function heads
# without using `case` at all:
deftokenmacro case(tokens) do
  {tokens, value_ast} = AST.parse_expression(tokens) # You'd probably be passing an environment around somehow too to be honest...
  {tokens, blocks} = AST.parse_blocks(tokens, &case_block_parser/2, do: true) # You could even make an `else` block easily to handle non-matches ^.^
  case_ast = {:case, [], [
    value_ast,
    [do: blocks.do]
  ]}
  {tokens, case_ast} # Return the un-eaten tokens and the generated ast, same convention as the `if` example above
end

# defp case_block_parser(block_name, tokens) # The callback format
defp case_block_parser(:do, tokens, astSoFar \\ [])
defp case_block_parser(:do, [], astSoFar) do # successfully used every token in this block, or the next head would have already errored
  astSoFar
end
defp case_block_parser(:do, tokens, astSoFar) do
  {tokens, matchspec_ast} = AST.parse_matchspec(tokens)
  # [:-> | tokens] = tokens # Probably want to report a decent error if missing though, a helper could be like:
  tokens = Tokens.requireNextToken(tokens, :->, "Matchspecs must be followed by a `->`")
  {tokens, body_ast} = AST.parse_expression(tokens) # Which could be a block, a block is just a set of expressions that returns the last one, but is delimited by do/end or (/) or so
  ast = {:->, [], [matchspec_ast, body_ast]}
  case_block_parser(:do, tokens, astSoFar ++ [ast])
end

This version of case would not be parsed like Elixir’s is now (with special whitespace rules and such, though I guess you could add that by attempting to parse a matchspec followed by a :-> at each step, and if that fails, parsing another expression into the block); it would be more like this:

case blah do # A `do` is absolutely required in this version, not that it isn't already ^.^
  x when is_int(x) -> x+42 # Single expression requires no block
  x when is_binary(x) -> do # Multi-expression requires a block, whether do/end or (/) or whatever, I so love explicit blocks, implicit ones are harder to reason ^.^
    x = String.trim(x)
    "Test: " <> x
  x -> x
end

And I could have easily made a version that supported this:

case blah do # A `do` is absolutely required in this version, not that it isn't already ^.^
  x when is_int(x) -> x+42 # Single expression requires no block
  x when is_binary(x) -> do # Multi-expression requires a block, whether do/end or (/) or whatever
    x = String.trim(x)
    "Test: " <> x
else
  blah
end

Or made a version that supported this more different format:

case blah do # A `do` is absolutely required in this version, not that it isn't already ^.^
  | x when is_int(x) -> x+42 # Make all cases delimited by `|`, first one could even be optional
  | x when is_binary(x) -> # Multi-expression is fine, still delimited by `|` so just parse tokens until a `|` or the end of this block is reached at `end`
    x = String.trim(x)
    "Test: " <> x
  | x -> x
end

That last one would be especially easy: if certain tokens were not overridable, like the block constructs do/end and (/) and maybe even others (though keeping that set small is good), then the tokenizer could be block-aware and make parsing the tokens almost trivially easy. You could then build entirely new forms of constructs while keeping things well scoped. :slight_smile:

Even with the case expression in Elixir now, you have to separate cases with a ; (when on one line), which is such an odd construct to use, as ; is usually used to separate expressions, not cases (I’d have chosen | and put it in front of each case):

case condition do blah when is_int(blah) -> a=2; blah+a; bleep -> bleep end
# or in long form
case condition do
  blah when is_int(blah) ->
    a=2
    blah+a
  bleep -> bleep
end
# Compare to:
case condition do blah when is_int(blah) -> a=2; blah+a | bleep -> bleep end
# or in long-form:
case condition do # The first `|` is optional regardless, even here, but consistency...
| blah when is_int(blah) ->
  a=2
  blah+a
| bleep -> bleep
end

Using ; as a case separator suddenly makes this ambiguous, so after every ; you have to try parsing a match then a :-> to see if it is a new case; or you could just check for |, which I also think makes cases more readable (I even un-indented the body as an example, though you still could indent it). ^.^

But yeah, as for deftokenmacro a base set (like case probably) would probably be implemented in Erlang for initial parse-time. ^.^

Blergh, this ended up much longer than I wanted, syntax is often on my mind since I teach it and I often have to poke at such things… >.>

Feel free to ignore the above (except maybe the defmodulet example showing normal-but-typed Elixir, if you are curious); I’m working on typing things, not syntax. ^.^

1 Like

Yes, yes! I didn’t mean it as a negative thing. More like: “will he make it?!”.

The reason we didn’t use tokenization macros is that they are quite a bit more powerful than AST macros, allowing anyone to create their own syntax for Elixir. It would do wonders for what you are doing but could make Elixir a really hard language, as totally new syntax constructs could be introduced at any time.

3 Likes

That is why I had the idea of scoped tokenization macros. :slight_smile:

Once scoped based on non-overridable block tokens, the entire tokenization-macro context stays entirely defined and unable to leak out, so it is true that anyone could create their own DSEL, but only within a specific block, which would be fantastic for math, parsing, or anything else, while not ‘leaking’ out. It is not a concept I’ve seen languages with token macros support yet, but it’s an idea I’ve had for a long time and have been considering implementing in one of my play languages (although the only block context my recent ML one has so far is parentheses (/); I’ve been holding off on adding the traditional ML begin/end blocks ^.^). :slight_smile:

I do have to say that I took the Elixir AST as an idea for how to store the AST; it is very uniform and easy to go over, which would work very well with Elixir-style macros in there too (making PPXs in OCaml, its version of macros, is… well, you need to know compiler guts… ^.^;).

For the Elixir-macro MLElixir test I’m building up modules in reverse order; I have tests for it to become like this:

defml \
  let identity = (fun x -> x) in
  let blah = 42 in
  module {
    identity: identity,
    blah: blah,
  }

# Or you can do it inline too...
defml module {
    identity: (fun x -> x),
    blah: 42,
  }

What it does is parse my internal function format to build up a def Elixir AST instead of an anonymous-function AST, and set values like blah above become 0-arg functions. But this will also create a function on the module named __ml_open__/0 that returns an information structure, currently of this form (this is how I pull information at compile time from other modules):

%{
  types: %{ # Types defined and exported from this module
    t: <TypeDef>, # This is an exported type named `t` (the traditional default type name for a 'module' in many ML languages)
  },
  funs: %{
    +: <AnonFunc/3> # Defines the `+` function
  },
}

The anonymous function for a function is passed the parsing environment (not the Elixir environment, but the one that holds the type information and more), the meta-structure of the call AST node (to do better error reporting), and a list of the argument ASTs, which contain the type information, so it can unify the types as it wants, match against multiple function heads, etc… For + (since it is built in) it mandates that all the arguments are of the same type and of type integer or float, while updating the refined information and so forth; that way 2+2 always returns an int type and 1.1+2.2 always returns a float type, yet 1+2.2 errors out because it cannot unify them. If I ever advance ‘far enough’ I want to put OCaml-style first-class modules in the type system (as a value a module is just an atom after all, so easy to pass around) while adding implicit module support (I’ve been planning for that to be added); that way + could be called on any user-defined type that implements a module to handle it that is in scope, like typeclasses (but more powerful).
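
Purely as a hypothetical sketch of what such an arity-3 entry for `+` could look like (the `{:"$$TCONST$$", …}` tuples only mirror the debug output shown earlier; the `{:ok, type}`/`{:error, msg}` return convention is invented for illustration, where the real code raises MLElixir.UnificationError):

# Hypothetical sketch only, not MLElixir's real internals.
plus = fn _env, _meta, [left_ast, right_ast] ->
  {_, left_meta, _} = left_ast
  {_, right_meta, _} = right_ast

  case {left_meta[:type], right_meta[:type]} do
    # Both arguments carry the same numeric type: the call unifies to it,
    # so 2+2 stays an int and 1.1+2.2 stays a float.
    {{:"$$TCONST$$", t, _}, {:"$$TCONST$$", t, _}} when t in [:int, :float] ->
      {:ok, {:"$$TCONST$$", t, []}}

    # Mismatched types (e.g. 1 + 2.2) fail to unify.
    {l, r} ->
      {:error, "Unable to unify types #{inspect(l)} and #{inspect(r)}"}
  end
end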

For the typed Elixir that I’ve been doing as well (the defmodulet macro), it just uses normal Elixir code, so I may have to invent some new module attributes to support such advanced constructs, but I left the parser open to do just that, so it should be possible regardless. :slight_smile:

/me really loves statically typed languages, they catch so many errors that leak through otherwise that even dialyzer cannot catch…

As an aside, I am hoping the basic functionality of defmodulet will be that you just import TypedElixir, add the t to the end of defmodule (I do not like overriding internal names), and type your existing code, although it will rightfully yell at you if you call a function outside the module whose type it does not know, so you may need to decorate those as well (just normal @specs; you can even put them in another file so no changes are needed to the original module, then just import the types). :slight_smile:

2 Likes

Have you looked at Alpaca? https://github.com/alpaca-lang/alpaca

1 Like

Eyup, and I’ve been commenting in their issues and discussions since it was known by its previous name. ^.^

They’ve done a few things I’m not a fan of, like going the route of 1-arg functions only with automatic currying at the function head, where I’d prefer (to fit the EVM better) N-arg heads by default and auto-currying at the call sites instead. That is how my in-Elixir tests have been, following that style.
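
A tiny plain-Elixir illustration of the difference (nothing to do with Alpaca’s or MLElixir’s actual compiled output): with N-arg heads a full-arity call stays an ordinary call, and only an under-applied call site would pay for a generated closure:

add = fn a, b -> a + b end          # ordinary 2-arity function, called directly
add.(10, 4)                         #=> 14
add_ten = fn b -> add.(10, b) end   # what call-site currying would expand an under-applied call into
add_ten.(4)                         #=> 14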

I’m also very iffy about their ideas of typing PIDs/mailboxes; I’d personally prefer to black-box message types and force tests on a received message to deduce what it is, as that seems far more Erlang’y to me while also being significantly easier to reason about in the EVM’s architecture (anything could send you a message of anything, after all).

^.^

2 Likes

Sorry to be a bit off topic, but I feel like you missed a great opportunity in not calling this project ExML… ok, I’ll show myself out now…

Seriously though awesome project!

9 Likes

Lol! Well it is not released or anything yet, so that is an easily doable change. ^.^

2 Likes