Gleam, a statically typed language for the Erlang VM

lpil · February 22, 2019, 12:33am

Gleam is a statically typed language for the Erlang VM that I have been slowly building over the last year or so. I’ve had a few people ask me about it so I’m making this thread as a place to talk about it.

The type system features full type inference without annotations, generics, flexible and ad-hoc records using row types, first class modules, ADT style enums, and no null/nil/undefined or subtyping. Users of Elm, OCaml, ReasonML etc will probably find it quite familiar.

Gleam compiles to Erlang and my intention is for interop between Gleam and other BEAM languages (such as Erlang and Elixir) to be straightforward and easy in either direction.

It’s not yet ready for use, but we’re getting close

See the project here -> https://github.com/lpil/gleam

globalkeith · February 22, 2019, 3:14am

Great progress happening here… the book is a nice touch. Exciting times!!

Crowdhailer · February 22, 2019, 8:59am

The book is nice, should definitely add a link to it in the top level README.

Also interesting to see how records are handled.

lpil · February 22, 2019, 9:02am

I’m planning to make the URL for the rendered book public once I’ve written something for each page in the tour.

@Crowdhailer does the record page of the book have enough detail on records for you?

Crowdhailer · February 22, 2019, 9:07am

I’d just add a link to the markdown on GitHub until you do that. It’s readable enough as is, and I didn’t spot it until keith mentioned it; I had just gone through the README.

Yes, although I’m just browsing at the moment; I haven’t tried to write anything.

jeremyjh · February 22, 2019, 1:53pm

This looks really cool. Do you have any plans in the future to do operator overloading or some form of traits/typeclasses so you don’t have to use different operators for floats/ints?

lpil · February 22, 2019, 2:18pm

I am interested in having some form of overloading or ad-hoc polymorphism, though what shape that would take is unclear.

Currently I’m thinking about an implicit parameter system similar to that of Scala, though this is still in an early research stage.

For mathematical operators specifically we could take the approach that SML takes and assume that the arguments are Ints unless one has been locally inferred as being a Float. It’s not extensible, but I think that + etc are special cases already: Gleam users cannot define new operators.
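To make the SML-style rule concrete, here is a minimal sketch in Python (chosen only for brevity; it is not how Gleam or any SML compiler implements it). The function name `resolve_arithmetic_type` and the use of `None` for "not yet inferred" are assumptions made up for illustration.

```python
from typing import Optional

NUMERIC_TYPES = {"Int", "Float"}


def resolve_arithmetic_type(lhs: Optional[str], rhs: Optional[str]) -> str:
    """Pick the operand type for an arithmetic operator such as `+`.

    Each argument is the locally inferred type of one operand, or None
    if inference has not pinned it down yet.
    """
    known = {t for t in (lhs, rhs) if t is not None}
    if not known <= NUMERIC_TYPES:
        raise TypeError(f"non-numeric operand: {sorted(known - NUMERIC_TYPES)}")
    if len(known) == 2:
        # No implicit conversion: Int + Float is rejected.
        raise TypeError("operands must both be Int or both be Float")
    if known == {"Float"}:
        # One operand was locally inferred as Float, so the operator is Float.
        return "Float"
    # Nothing known, or only Int known: default to Int.
    return "Int"
```

The rule is closed over a fixed set of types, which is why it is not extensible to user-defined ones.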

josevalim · February 22, 2019, 3:40pm

I would love to hear the outcome of this research because I tend to hear more positive things about Haskell typeclasses and/or OCaml’s module system compared to Scala’s!

kamilchm · February 22, 2019, 4:08pm

There’s a long thread on implicits in the OCaml community https://discuss.ocaml.org/t/critique-of-implicits/3031

tristan · February 22, 2019, 4:09pm

Do you have plans for how you’ll handle processes, messages and OTP behaviors?

Sadly that seems to usually be the point at which these attempts at statically typed languages on beam tend to stall.

lpil · February 22, 2019, 4:52pm

I would love to hear the outcome of this research because I tend to hear more positive things about Haskell typeclasses and/or OCaml’s module system compared to Scala’s!

Haskell’s compiler works very hard to monomorphise all the uses of type classes where possible, which seems like a hard problem given that which implementation to use must be determined at compile time. I’m not convinced that Haskell’s type classes are the right tool for the job here, so I’d like to explore plenty of options before starting the long journey to a production-ready implementation of such a system.

I’ve a lot less experience with Scala than Haskell, but I think that without some of the other features of Scala (subtyping, implicit conversions) and some restrictive changes to the ergonomics we can avoid some of the pain points of Scala’s implicits. First step is to write a bunch of Scala (in the Gleam style) and see how it feels

Gleam has a module system that is in some ways similar to OCaml’s, as modules are first class values in both. I believe Gleam’s modules will be somewhat easier to work with as values as they are row typed. Currently there’s no concept of functor modules, so they are more limited in this way.

Whatever the end result we are a long way off starting this work. My focus now is to get the compiler ready for v0.1 of the language, after that comes the stdlib and documentation, tooling, and so on. As more Gleam code is written we will learn more about the language and what kind of polymorphism fits it.

There’s a long thread on implicits in the OCaml community https://discuss.ocaml.org/t/critique-of-implicits/3031

This is great, thank you @kamilchm

Do you have plans for how you’ll handle processes, messages and OTP behaviors?

I believe OTP’s behaviours will map nicely onto first class modules with parametric polymorphism, though I’ve not yet tested this theory in practice.

Processes and messages are much more difficult, and I don’t have a solution. For now I’m opting to avoid the problem until we have a solution and any code that uses the low level concurrency primitives will have to be written in Erlang or Elixir.

josevalim · February 22, 2019, 6:09pm

Saved to read later today, thank you!

Hopefully EEP 48 can be helpful here. With Erlang/OTP planning to adopt EEP 48 soon, it is likely that ExDoc will be changed to support multiple languages, so I would love to hear pain points and find ways to make it work for Gleam too.

Do you have a FFI for integration with Erlang/Elixir? Regarding interop, do you support keyword lists or proplists in any way? I would love to learn more about the typing in there. Thank you for the replies and feel free to point me to any docs or references if you can’t go very deep into this.

lpil · February 22, 2019, 8:21pm

I would love to hear pain points and find ways to make it work for Gleam too.

I’m also curious about EEP 48. Back when I was compiling to .beam files via core Erlang I had an idea of how it would work, but now that I’m writing Erlang text files it’s unclear to me how I would make use of it. I need to go back and read the spec

Do you have a FFI for integration with Erlang/Elixir?

Sure, here’s some docs. Let me know if they are unclear or lack some required detail.

https://lpil.uk/gleam/tour/external-function.html

https://lpil.uk/gleam/tour/external-type.html

Regarding interop, do you support keyword lists or proplists in any way?

Depending on the structure of the keyword list you may be able to represent it as a list of an enum.

enum PostgrexOption =
  | Port(Int)
  | Database(String)
  | Username(String)
  | Password(String)
  | Timeout(Int)

pub external fn start_postgrex(List(PostgrexOption)) -> Result(Pid, ()) 
  = "Elixir.Postgrex" "start_link"

Otherwise you may need to use the Erlang FFI to do something more clever.

josevalim · February 22, 2019, 8:30pm

If you are calling compile:forms/2, then it is a matter of passing the extra_chunks option to it with a docs chunk. The docs chunk is a data structure, as specified in EEP 48, serialized as an Erlang term. Here is the code in Elixir. But I am glad to discuss when the time arrives.

Thanks!

This looks neat!

lpil · February 22, 2019, 8:37pm

This is what I previously did when the compiler was written in Erlang, but now the compiler writes .erl files and lets the Erlang compiler do the rest.

term_to_binary/1 isn’t something I have access to in Rust (unless someone has implemented this as a library for me) so I’ll need a different API or I’ll need to write a little Erlang (or Gleam?) program that converts docs from JSON or similar into BEAM chunks.
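For a sense of how much work a hand-rolled encoder would be, here is a rough sketch of the Erlang external term format for a handful of term types. It is written in Python only to keep it short; the byte layout is language-independent, so the same logic could be written in Rust. The `Atom` marker class and the choice of supported types are assumptions for illustration, and floats, maps, big integers and improper lists are deliberately left out.

```python
import struct


class Atom(str):
    """Marks a string that should be encoded as an Erlang atom (hypothetical helper)."""


def _encode(term) -> bytes:
    if isinstance(term, Atom):
        raw = term.encode("utf-8")
        if len(raw) > 255:
            raise ValueError("atom too long for SMALL_ATOM_UTF8_EXT")
        return bytes([119, len(raw)]) + raw              # SMALL_ATOM_UTF8_EXT
    if isinstance(term, bool):
        raise TypeError("use Atom('true') / Atom('false') explicitly")
    if isinstance(term, int):
        if 0 <= term <= 255:
            return bytes([97, term])                     # SMALL_INTEGER_EXT
        if -2**31 <= term < 2**31:
            return bytes([98]) + struct.pack(">i", term)  # INTEGER_EXT
        raise ValueError("big integers are not handled in this sketch")
    if isinstance(term, bytes):
        return bytes([109]) + struct.pack(">I", len(term)) + term  # BINARY_EXT
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are not handled in this sketch")
        return bytes([104, len(term)]) + b"".join(_encode(t) for t in term)  # SMALL_TUPLE_EXT
    if isinstance(term, list):
        if not term:
            return bytes([106])                          # NIL_EXT (empty list)
        body = b"".join(_encode(t) for t in term)
        # LIST_EXT: element count, elements, then the tail (NIL for a proper list)
        return bytes([108]) + struct.pack(">I", len(term)) + body + bytes([106])
    raise TypeError(f"unsupported term: {term!r}")


def term_to_binary(term) -> bytes:
    """Prefix the encoded term with the format version byte (131)."""
    return bytes([131]) + _encode(term)
```

This covers only the byte-level encoding. The docs term itself would still have to be built to the shape EEP 48 specifies and attached as a chunk by something that writes .beam files.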

chrismcg · February 23, 2019, 9:06am

There is obmarg/serde_eetf on GitHub (Erlang external term format support for serde), though I haven’t used it.

lpil · February 25, 2019, 4:09pm

I’ve started adding error messages to the compiler. Here’s a message I imagine I’m going to be seeing a lot of in the near future!

There’s plenty of improvements to make (and bugs to fix) here but having any error printing at all is making a world of difference

eteeselink · February 26, 2019, 8:23am

Do you have plans for how you’ll handle processes, messages and OTP behaviors?

I believe OTP’s behaviours will map nicely onto first class modules with parametric polymorphism, though I’ve not yet tested this theory in practice.

Processes and messages are much more difficult, and I don’t have a solution. For now I’m opting to avoid the problem until we have a solution and any code that uses the low level concurrency primitives will have to be written in Erlang or Elixir.

Did you consider that maybe you don’t need a solution? I’m just a random guy on the internet so don’t weigh my opinion too heavily, but people keep beating the drum that static typing on BEAM is hard/impossible because of the processes and the messages, yet most Elixir code I write is remarkably non-concurrent in nature. E.g. in a typical Phoenix app, most code you’d write runs in a single process (the one tied to the request), managed under the hood by the framework. Wouldn’t it be formidable to have all that code statically checked?

Saying processes and messages make it hard to do static typing on BEAM is a bit like saying microservices make it hard to do static typing in Java. After all, there’s lots of microservices out there written in statically typed languages, with serialized JSON or protobuf or whatever format messages sent between them. All of these make assumptions on the types of those messages, maybe do a bit of validation and then just static-cast it all to MyStructuredDataType and call it a day. Having something akin to this on the BEAM would be fantastic (and personally I’m sad every day I define an Elixir struct in one module and the compiler can’t even warn me about a typo in another).

If you make it possible for pattern matches to do full recursive runtime type checking, then all code following eg a receive block can still be certain that each value is indeed typed as specified (because otherwise the runtime type checker would’ve crashed the process) (or rejected the message, not sure). Sure, that’s still not Haskell-level certainty, but it’s still a lot better than what we have now, and you’ll still be 100% typesafe within each process.

Cool stuff!! Will definitely be following this.

lpil · February 26, 2019, 9:04am

For sure, this is the first goal. I am hoping that one day someone will discover a suitable way of typing Erlang message passing but even if that never happens then I think having everything but ad-hoc processes being soundly typed will bring great value.

If you make it possible for pattern matches to do full recursive runtime type checking, then all code following eg a receive block can still be certain that each value is indeed typed as specified (because otherwise the runtime type checker would’ve crashed the process) (or rejected the message, not sure).

This is what Alpaca currently does, though it’s imperfect in that at runtime there’s not enough information to successfully infer the type of all values. An example would be that in Gleam {'ok', Int} and Ok(Int) have the same runtime representation.
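A small sketch of that ambiguity, in Python for brevity. Atoms are modelled as plain strings, and the nested-tuple encoding of type shapes is made up for this example; it does not reflect how Alpaca or Gleam represent types. A recursive check can confirm that a value fits an expected shape, but it cannot tell which of two static types the sender meant when both lower to the same shape.

```python
def conforms(value, spec) -> bool:
    """Recursively check a runtime value against a shape description."""
    kind = spec[0]
    if kind == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if kind == "atom":
        return isinstance(value, str) and value == spec[1]
    if kind == "tuple":
        fields = spec[1]
        return (
            isinstance(value, tuple)
            and len(value) == len(fields)
            and all(conforms(v, f) for v, f in zip(value, fields))
        )
    if kind == "list":
        return isinstance(value, list) and all(conforms(v, spec[1]) for v in value)
    raise ValueError(f"unknown shape kind: {kind}")


# Two distinct static types that share one runtime representation.
RUNTIME_SHAPES = {
    "{'ok', Int}": ("tuple", [("atom", "ok"), ("int",)]),
    "Ok(Int)": ("tuple", [("atom", "ok"), ("int",)]),
}


def candidate_types(value):
    """Every static type whose runtime shape the value fits."""
    return sorted(name for name, shape in RUNTIME_SHAPES.items() if conforms(value, shape))


message = ("ok", 42)
matches = candidate_types(message)  # both names come back: the value alone is ambiguous
```

Checking against one expected type works, which is the receive-side validation suggested above. Inferring the type from the value alone does not.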

One option would be to restrict the message patterns to a subset of types, though that would mean that we’re unable to write a soundly typed implementation of gen_server as the type information would be lost at the process boundary.

Lots to think about

Cool stuff!! Will definitely be following this.

Thank you for your support!

OvermindDL1 · February 26, 2019, 5:50pm

OCaml is working on an implicit module system as well called Modular Implicits.

Essentially type-based polymorphism in OCaml is done via the witness pattern:

(* Let's define a modular witness that just converts a type to a string *)
module type ShowWitness = sig
  type t
  val show : t -> string
end

(* And let's define a useful helper function to use the above witness (or just use it straight, whatever) *)
let show (type t) (module Show : ShowWitness with type t = t) = Show.show


(* Now let's define the above witness for, oh, booleans to keep it nice and simple *)
module Bool_ShowWitness : ShowWitness with type t = bool = struct
  type t = bool
  let show = string_of_bool
end

(* We can use it straight *)
let "true" = show (module Bool_ShowWitness) true
let "false" = show (module Bool_ShowWitness) false

(* Or save it to a variable to pass in later *)
let bool_showwitness = (module Bool_ShowWitness : ShowWitness with type t = bool)

let "true" = show bool_showwitness true
let "false" = show bool_showwitness false

Ignoring Discourse’s abhorrent syntax coloring (@AstonJ it really might be useful to enable syntax coloring for some other often-referenced languages on these forums ^.^;), you might notice this looks familiar, a lot like Haskell typeclasses, and indeed it is! Haskell typeclasses are defined internally via witnesses; you can even see that it does actually define the witnesses before the arguments in the function types!

What a typeclass does is look up an appropriate matching instance based on the type of the argument passed in, and pass it automatically instead of manually. This has two implications, though. First, it dispatches only on the type, so if you want a temporary override, such as printing the boolean result of the above output as a 0 or 1 instead, you can’t; you have to do it entirely manually. Second, the lookup requires compiling all other files and running the pass afterwards, which makes typeclasses slow down compilation.

What Scala does, and what OCaml is planning, is to look up a matching witness only among module types explicitly brought into scope, or you can pass it manually, whichever. This means you can override which witness is passed in, and it doesn’t take a hit at compile time. It does mean the witness needs to be in scope, but that will probably already be the case when you are passing the type to a function that needs the witness anyway, so it will be ‘for free’ in 99% of cases, and otherwise it just needs a single open into the scope at the place where the type is first used. The above OCaml example with modular implicits would look more like this instead:

module type ShowWitness = sig
  type t
  val show : t -> string
end

let show (type t) {module Show : ShowWitness with type t = t} = Show.show


implicit module Bool_ShowWitness : ShowWitness with type t = bool = struct
  type t = bool
  let show = string_of_bool
end

let "true" = show true
let "false" = show false

And yes, this means you can even write an operator like + that operates over integers and floats both, all while being super fast to compile and super fast at runtime.

That’s what OCaml does currently too: + is for ints and +. is for floats. With modular implicits it could dispatch properly with just + by defining it like let (+) (type t) {module Add : Addition with type t = t} = Add.add, and it’s as simple as that. Just define some internal versions for integer and float, have them opened in the prelude, and it will work with them as well as with user types.

Putting modular implicits in early in the language’s life means that a lot of weird things like +/+. and more just won’t be issues anymore. It’s the best of both worlds: Haskell typeclasses and manual/explicit witnesses.

The critiques of it in the Scala world are similar, and they do exist there, but honestly it’s worked very well there. As long as you keep to a most-recently-opened resolution by default, it just works in 99% of cases; pass it manually for the last few where it doesn’t.
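As a rough model of that resolution rule, here is a sketch in Python. Real implicits are resolved at compile time from static types; this version looks witnesses up at runtime purely to show the lookup order and the manual override, and the `Scope` class and its method names are invented for the example.

```python
from typing import Any, Callable, List, Optional, Tuple

Witness = Callable[[Any], str]


class Scope:
    """Witnesses 'opened' into scope, oldest first."""

    def __init__(self) -> None:
        self._opened: List[Tuple[type, Witness]] = []

    def open(self, typ: type, witness: Witness) -> None:
        self._opened.append((typ, witness))

    def resolve(self, typ: type) -> Witness:
        # The most recently opened witness for this exact type wins.
        for opened_type, witness in reversed(self._opened):
            if opened_type is typ:
                return witness
        raise LookupError(f"no witness in scope for {typ.__name__}")


def show(scope: Scope, value: Any, witness: Optional[Witness] = None) -> str:
    # An explicitly passed witness overrides whatever the scope would pick.
    chosen = witness if witness is not None else scope.resolve(type(value))
    return chosen(value)


scope = Scope()
scope.open(bool, lambda b: "true" if b else "false")

default_result = show(scope, True)                                 # resolved from scope
override_result = show(scope, True, lambda b: "1" if b else "0")   # passed manually
```

Opening a second `bool` witness later would shadow the first, which is the "most recently opened" behaviour; a type with no witness in scope is an error rather than a silent fallback.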

Actually, OCaml’s Algebraic Effects system (currently in a fork but being brought into mainline very piecemeal) handles processes, messages, and OTP behaviours very well. Messages should be black-boxed and their types matched out, as that best matches how the BEAM works and how a process handles different ‘types’ of messages over its life.

Something else that should be in Gleam sooner rather than later is an algebraic effects system. OCaml’s is one of the best designed that I’ve seen, since it’s so new.

Yep, as described above, it slows down compilation excessively. There is a push in the Haskell community to use witnesses directly instead of typeclasses, as it significantly reduces compilation time, though usage has to be explicit there since Haskell has no implicits yet either.

I’d say look at OCaml’s, one thing about the OCaml community is that they will pick a design to absolute pieces before it is accepted into core, and thus OCaml’s modular implicits has had a lot of discussion so far with test implementations. Even a stripped down version of modular implicits would be a good start as it can always be expanded later.

I really don’t like this aspect of Alpaca. You can’t always know what a PID will accept, especially as it varies over time, or the process can be on a remote system, and so forth. PIDs should be black-box message receivers, and a receive call should match based on normal type matching.