Gleam, a statically typed language for the Erlang VM

lpil · November 4, 2019, 10:18pm

They’re tuples, not maps. I may be adding an atom tag to them to make them Erlang records for easier interop, though I’ve not decided on that yet.

OvermindDL1 · November 4, 2019, 10:21pm

Oh, I thought I read you were changing them to maps! I’d probably add the atom tag; the BEAM is very optimized for branching based on that first element.

But yeah, add in OCaml’s object system too (objects are just row-typed records). ^.^

lpil · November 4, 2019, 10:49pm

The atom tag in structs will be purely for Elixir / Erlang use (as well as some help for the programmer if the struct is printed). There are no union types outside of enum, so it’ll be branching based upon the enum tag rather than the struct tag.

OvermindDL1 · November 4, 2019, 11:13pm

Just combine them in that case.

For note, OCaml has multiple static variant head types, here’s a type demonstrating (I think?) all of them (and of course extensible variants and polymorphic variants have the same possible heads):

type a_record = {i : int; f : float}

type my_enum =
| Plain
| Single of int
| Integrated_Tuple of int * float
| Value_Tuple of (int * float)
| Integrated_Record of {i : int; f : float}
| Value_Record of a_record

let plain = Plain
let single = Single 42
let integrated_tuple = Integrated_Tuple (42, 6.28)
let value_tuple = Value_Tuple (42, 6.28)
let integrated_record = Integrated_Record {i=42; f=6.28}
let value_record = Value_Record {i=42; f=6.28}

Which would correspond to Elixir values like (and yes it really does at the machine code level):

plain = :Plain
single = {:Single, 42}
integrated_tuple = {:Integrated_Tuple, 42, 6.28}
value_tuple = {:Value_Tuple, {42, 6.28}}
integrated_record = %{_: :Integrated_Record, i: 42, f: 6.28}
value_record = {:Value_Record, %{i: 42, f: 6.28}}

And yes, for types like type t = Plain | Single of int, where the embedded types have no overlap, it quite literally compiles down to machine code that encodes a boxed integer hash for the Plain (so in Elixir that is the atom :Plain), or an unboxed integer, say of 42. So that type would, in Elixir parlance, get optimized to be :Plain or 42, with no wrapping tuple cost or anything.

And of course, with PPXs able to add annotations, you can have them generated in new and unique ways as well.

The reason it splits the embedded vs value forms is that the choice has performance and access considerations. You cannot do let my_tup = (42, 6.28) in Integrated_Tuple my_tup, but you can do let my_tup = (42, 6.28) in Value_Tuple my_tup. You can extract the entire ‘value’ costlessly, whereas extracting out of an embedded form means you have to rebuild the tuple/record externally; however, the embedded forms have fewer allocations and are smaller in memory overall. It’s mostly just so you can pick whichever is best from a performance standpoint, lol. ^.^
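A minimal sketch of that difference, trimmed down to the two tuple heads from the type above. The helper names extract_value and extract_inline are made up for illustration, and the physical-equality check is only there to show that the value form hands back the very same tuple.

```ocaml
type my_enum =
  | Integrated_Tuple of int * float   (* two fields stored inline in the constructor *)
  | Value_Tuple of (int * float)      (* one field: a reference to a separate tuple *)

let my_tup = (42, 6.28)

(* let bad = Integrated_Tuple my_tup
   is rejected by the compiler: the constructor expects 2 arguments. *)
let inline_form = Integrated_Tuple (42, 6.28)
let value_form = Value_Tuple my_tup

(* Extracting from the value form returns the existing tuple as-is. *)
let extract_value = function
  | Value_Tuple t -> Some t
  | Integrated_Tuple _ -> None

(* Extracting from the inline form has to build a fresh tuple. *)
let extract_inline = function
  | Integrated_Tuple (i, f) -> Some (i, f)
  | Value_Tuple _ -> None

let () =
  match extract_value value_form with
  | Some t -> assert (t == my_tup)   (* physical equality: same tuple, no copy *)
  | None -> assert false
```

Whether the extra allocation in extract_inline matters in practice depends on the workload; the sketch only shows where the rebuild happens.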

But yes, that would be why leading atoms are still useful, and I’d still put them even in purely standalone records, because OCaml does too (specifically it puts a type header, which allows for introspection in ‘magical’ code, which you shouldn’t ever do and should pretend doesn’t exist, lol).

Speaking of, any plan on supporting extensible variants and/or polymorphic variants (polymorphic variants are just a special case of an unnamed extensible variant though, in OCaml specified via a backtick before a head name)?

SkinnyGeek1010 · November 27, 2019, 6:10pm

I was curious about the priority for some kind of decoder/encoder support for structs. The interop is rather difficult when most of the Elixir ecosystem uses maps to pass data around, and I’m left to create and maintain my own encoder/decoder. These can easily fall out of sync with the Gleam code. Since structs are position-based after compile time, a change in the struct that isn’t mirrored in the user’s en/decoder makes for a nasty bug that’s hard to catch without excessive testing on every value.

At the very least, if we had some kind of public function (for pub structs) that returns a list of the struct fields, that could be used in the decoder which can fail to match if it’s not the same as what the decoder is using.
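A rough sketch of that check, written in OCaml only to match the other snippets in this thread. Nothing here is an existing Gleam API: cat_fields stands in for the hypothetical generated function, and decoder_fields, value, and decode are invented for illustration.

```ocaml
(* Hypothetical: what a compiler could export for a struct with an
   Int field "age" followed by a Bool field "cute". *)
let cat_fields = ["age"; "cute"]

(* The field order the hand-written decoder was written against. *)
let decoder_fields = ["age"; "cute"]

type value = Int of int | Bool of bool

(* Turn a positional struct into a keyed list, refusing to run when the
   exported layout no longer matches what the decoder expects. *)
let decode exported values =
  if exported <> decoder_fields then Error "struct layout changed"
  else if List.length values <> List.length exported then Error "wrong arity"
  else Ok (List.combine exported values)

let () =
  assert (decode cat_fields [Int 3; Bool true]
          = Ok [("age", Int 3); ("cute", Bool true)]);
  (* A reordered struct is caught instead of silently swapping fields. *)
  assert (decode ["cute"; "age"] [Bool true; Int 3]
          = Error "struct layout changed")
```

The point is only that a layout mismatch fails loudly in one place rather than surfacing later as a swapped value.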

I’ve been rewriting a side project with Elixir as a thin outer “shell”, where any controller or GQL resolver data gets passed into a Gleam function to handle all the business logic. However, my tests have become more verbose to check that my decoders are working, and maintaining that en/decoding layer is painful to the point where I don’t really want to move forward until I can either use Gleam for a one-off module or somehow call a decoder/encoder to transform the struct into a map.

Anyhow, I just wanted to share some feedback from an early adopter who’s mainly in Elixir land. Keep up the good work, and I hope that I can start contributing to the documentation and example apps soon. I’m hoping I’ll be able to use my learnings from this side project to kick-start newcomers.

lpil · November 27, 2019, 7:25pm

Great questions and ideas! I think you’re right that currently using Gleam structs from Elixir and Erlang is not very ergonomic. It’s reassuring that other people are having similar ideas to me.

With the next version of Gleam (which should be just around the corner) structs will have a tag as the first element, making them compatible with Erlang records. Once this is in place Erlang users can define a record and use it as per usual, and Elixir users can use the Record module.

In v0.6, if the record interop is working well, I will probably have the Gleam compiler generate an Erlang .hrl header file with the record definition in it to cut down on some boilerplate.

I think I would also like to add an automatically generated constructor function that is capable of coercing common compatible data structures into Gleam structs/enums.

For struct Cat { age: Int, cute: Bool } it could generate a function that accepts #{age => X, cute => Y} as well as {X, Y}, and perhaps some other formats. I’m not sure what the name of this function would be. I would like it to be easy to call from both Elixir and Erlang so Cat isn’t suitable as it’s awkward to call from Elixir.
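To illustrate the idea, here is a small sketch of such a coercing constructor, again in OCaml to match the earlier snippets. It is not what the Gleam compiler generates: the term type is an invented stand-in for Erlang terms, and cat_from_term is a placeholder name, since the real name is undecided.

```ocaml
(* Invented stand-in for the Erlang terms a caller might pass in. *)
type term =
  | Int of int
  | Bool of bool
  | Tuple of term list            (* models {X, Y} *)
  | Map of (string * term) list   (* models #{age => X, cute => Y} *)

type cat = { age : int; cute : bool }

(* Hypothetical generated constructor: accepts either shape and
   returns None for anything it cannot coerce. *)
let cat_from_term = function
  | Tuple [Int age; Bool cute] -> Some { age; cute }
  | Map fields ->
    (match List.assoc_opt "age" fields, List.assoc_opt "cute" fields with
     | Some (Int age), Some (Bool cute) -> Some { age; cute }
     | _ -> None)
  | _ -> None

let () =
  assert (cat_from_term (Tuple [Int 3; Bool true])
          = Some { age = 3; cute = true });
  (* Key order in the map form does not matter. *)
  assert (cat_from_term (Map [("cute", Bool true); ("age", Int 3)])
          = Some { age = 3; cute = true })
```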

Once again, thank you for the feedback! Having people trying the language is both helpful and flattering.

lpil · November 27, 2019, 7:33pm

Also I’d love to see your project if it’s public

SkinnyGeek1010 · November 27, 2019, 8:56pm

It’s private at the moment, but once I upload it to GitHub I’ll add you as a contributor.

rlefevre · December 4, 2019, 9:34am

Hi. Is the "Gleam: Lean BEAM typing machine" talk available somewhere?
Keep up the good work.

michallepicki · December 4, 2019, 10:51am

The Code Sync people seem to be slowly uploading the talks from Code Mesh LDN 2019 to this YouTube playlist, so it will probably land there in the future.

lpil · December 4, 2019, 11:11am

I think it should be out in the next couple of days.

lpil · December 8, 2019, 4:04pm

My talk from Code Mesh LDN 2019 is now out on YouTube.

I’m afraid the audio is a little quiet for some reason so best turn up the volume a little.

AstonJ · December 8, 2019, 6:32pm

Great talk Louis - I’m halfway through the excellent Why Erlang bit… I did not know ‘erlang’ was a unit of measurement as well!

The erlang (symbol E) is a dimensionless unit that is used in telephony as a measure of offered load or carried load on service-providing elements such as telephone circuits or telephone switching equipment. A single cord circuit has the capacity to be used for 60 minutes in one hour. Full utilization of that capacity, 60 minutes of traffic, constitutes 1 erlang.

You learn something new every day

lpil · December 8, 2019, 7:36pm

Thank you!

AstonJ · December 8, 2019, 8:26pm

You’re welcome

I haven’t looked too much into it, but was there a reason why you didn’t go for more Elixir/Ruby-ish syntax?

Eg:

pub fn multiply(x, y) do
  # here we are multiplying x by y
  x * y 
end

Perhaps as an optional way to use Gleam - so all of the following would be valid:

pub fn multiply(x, y) do
  # here we are multiplying x by y
  x * y 
end

# or even...

pub fn multiply(x, y)
  # here we are multiplying x by y
  x * y 
end

pub fn multiply(x, y) {
  // here we are multiplying x by y
  x * y 
}

pub fn multiply(x, y) { x * y }

pub fn multiply(x, y) do: x * y 

#Tho I personally think this last one may not be needed

Personally I feel syntax is important… particularly when there are other viable options (so less of an issue for languages like Swift). It could be the reason why somebody chooses your language over another.

To me the default feels a little noisy - but that could be because I’ve never really gelled with languages that use lots of braces, brackets, semi-colons etc:

[Image: Screenshot 2019-12-08 at 20.14.41]

Scroll up to compare to the alternative.

I accept that as the creator your personal preference is always going to be a major deciding factor, but that could also be who you want to market the language to - is there a particular group of developers you are targeting?

(Hope I haven’t offended you with this comment. I know how our creations are very personal to us, and I’m happy to delete this post if you prefer.)

lpil · December 8, 2019, 10:30pm

I think syntax is largely about familiarity and that rarely is one syntax objectively superior to another. As it’s largely about personal experience and preferences I’m going to have to accept the fact that the syntax is not going to be everyone’s cup of tea. Having said that, syntax is important and plays a big part in people’s first impressions of a new language, and I had a few other goals with the syntax too.

I wanted the syntax to be as small and predictable as possible. So a small number of rules, no ambiguity, no optional syntax. This makes the language easier to parse both for humans (less need for linters like credo) and for compilers (Elixir’s parser is huge due to the high complexity of the syntax).

I wanted the syntax to be familiar to people from the wider programming community, which meant not using ML syntax (which is my personal favourite), Erlang syntax, or Elixir syntax. The C family syntax is the most widely used so I opted for something largely inspired by that.

Lastly I think adopting a niche syntax that is only used in a few languages (i.e. Ruby + Elixir) can be misleading to newcomers. I’ve seen many people come from Ruby to Elixir with incorrect assumptions about the language, and I speculate that the similar syntax results in people thinking the semantics are similar. Using a less niche syntax used by a wider range of languages may help.

lpil · December 8, 2019, 10:50pm

For a bit of fun we can go back in time and see what Gleam’s syntax was like a little under 2 years ago:

github.com/gleam-lang/stdlib/blob/7db6bf594f59a17ded97e2df5bb8c599943e553e/src/Bool.gleam

module Bool

export Bool(..), not/1, compare/2, max/2, min/2

import Order exposing Order(_)

type Bool
  = True
  | False

fn not(bool) =
  case bool
  | True => False
  | False => True

test not =
  not(True) |> Assert.false
  not(False) |> Assert.true

fn compare(a, b) =

This file has been truncated.

Much closer to ML or Python! I quite like this syntax, but the feedback has been most positive when moving to something more C-like.

If we go back 4 years ago we can find the original (largely incomplete) syntax.

github.com/gleam-lang/gleam/blob/94509a2f36e3b29dac0e5dd296cf15a493a67ea7/examples/clauses.glm

module clauses

public speak {
  def (1) { "one" }
  def (2) { "two" }
  def (3) { "three" }
  def (_) { "Er?" }
}

It was largely an experiment to see how we could represent multiple function clauses without duplicating the name.

There are lots of other styles in the history too.

Here is one based upon indentation:

github.com/lpil/syntax/blob/master/02.txt

import maps as m

# This is a comment.

public fibonacci/1:
  doc: """
  Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed
  diam nonumy eirmod tempor invidunt ut labore et dolore magna
  aliquyam erat, sed diam voluptua.
  """
  spec: (number): number

  (n) when n < 0: error("fibonacci of negative number")
  (0): 0
  (1): 1
  (n): fibonacci(n - 1) + fibonacci(n - 2)

  examples:
    "base case of 0":
      params: 0

This file has been truncated.

One thing that was quite cool about that was the first-class support for testable examples and documentation.

AstonJ · December 8, 2019, 11:09pm

Thanks for the reply and for not getting offended, Louis.

I personally don’t think it’s about familiarity so much, though I can see why people might think that given many people (particularly those coming to Elixir from Ruby) have said they like the Elixir syntax. I think it is far more that it just happens to be the type of syntax they like, and that they always would have liked. Otherwise we can’t easily explain why newcomers to programming seem to prefer one syntax over another.

However I agree that it is much more about taste (and perhaps what we personally feel is more natural or intuitive) and as the creator of the language you have every right to go with whichever style of syntax you prefer.

The technical reasons you provided are very valid - if the sacrifice is too great it’s probably not worth it.

With regards to having more than a single way to do something, I actually don’t think that would be a problem. It’s easy for someone (who knows the basics of programming) to see that:

pub fn multiply(x, y)
  # here we are multiplying x by y
  x * y 
end

pub fn multiply(x, y) {
  // here we are multiplying x by y
  x * y 
}

pub fn multiply(x, y) { x * y }

are just three ways to do the same thing.

Just out of curiosity, do you know any other language that offers this kind of dual-syntax flexibility? You could be the first.

With regards to your old syntax, I definitely prefer your new one!

LTheGreats · December 8, 2019, 11:39pm

Here’s an idea: just use combinations of trailing white space characters to mark the beginning and end of blocks (not the contents of the blocks) and replace spaces with underscores as the main keyword separator.

For example:

pub_fn_multiply(x,_y)_\n\n\n\t_x_*_y_\t\n\n\t_

__ would be the punctuator

_ would replace space

\n\n\n plus one \t for how many blocks deep would represent the start of a new block

\t\n\n plus one \t for how many blocks deep would represent the end of the \t count block

Space would replace the underscore

Both the start of block and end of block patterns would be in between the underscores of a punctuator.

LTheGreats · December 8, 2019, 11:42pm

Also, any white space that doesn’t match the pattern should be compiled into code that just halts the BEAM without any explanation.