darkmarmot

I think Elixir 2.0 should drop structs

Someone recently asked “What feature would you most like removed from your language?”

And, for me, it has to be structs or the way Elixir currently handles them.

In my job, we do a lot of deployments on distributed clusters, either pushing hot code updates against running servers for small changes or running multiple versions of our code side-by-side in the cluster as we do rolling deployments for large changes.

Thus far, we’ve had zero downtime running with a cluster of roughly 30 servers (knock hard on wood) over the last year and a half (since we went into production).

Our biggest difficulty (and danger) has been Elixir’s structs. While Erlang was designed for our kind of environment, it feels like Elixir broke away from it with its struct implementation.

To explain, functions that handle structs don’t happily duck-type them. If you make changes to a struct definition and move it between nodes, you can’t match on it unless all the fields are in perfect agreement. So there’s no easy way to update them in the running system.

We don’t let any data that moves between nodes contain them.

And this is where it gets really ugly. A lot of Elixir’s base data types are implemented as structs, such as Range and DateTime.

So, what seems like a simple non-breaking change for the language, such as a recent version of Elixir adding the step field to Range, could actually cause catastrophic chaos on our distributed system.

I would propose that Elixir consider making structs act as duck-typed maps in the future as I love the language, but I hate to see it limiting the power of the original VM and ecosystem.

bitwalker

Leader

I second @wojtekmach on this: there is nothing about structs that makes them inherently more difficult to deal with when it comes to hot upgrades. As pointed out, pattern matching on structs is already basically duck-typed. Nothing stops you from creating a struct by hand, e.g. %{__struct__: DateTime, foo: :bar}, and then passing it around. Obviously things will explode pretty quickly, but as long as matches only look at the __struct__ field, i.e. %DateTime{} = %{__struct__: DateTime, foo: :bar}, nothing will fail until you actually access one of the missing fields.

I think the trouble with hot upgrades in general is that it is very difficult to carefully reason about how the upgrade process will occur, which is why testing them is so important. You can of course look at the upgrade script to get an idea of what order things will happen in, but it won’t tell you if you have any old versions of data structures hanging around, regardless of whether it’s a struct, a record, or just a plain old tuple/list/map. Obviously anything you create post-upgrade will work properly, and anything you hold in process state can be upgraded predictably, but if you stuff a struct in an ETS table, do a hot upgrade, and then fetch that struct out of ETS, you’re going to get the old version. So you need to make sure that upgrading the data in that table happens as part of the overall upgrade as well, which can be tricky to say the least, primarily for public/protected tables, since you can have readers/writers that are not upgraded yet and may choke on the new schema. Taking into account local vs external function calls, and intra/cross-node messaging, is just another set of layers on the problem.
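
As a rough illustration of the ETS case, here is a minimal sketch of migrating stale structs in a table. Everything in it is hypothetical and not from the thread: the UpgradeSketch.Session struct, the :step field standing in for "a field added in the new release", and the table layout of {key, struct} rows. The old value is simulated with a hand-built map, since two versions of one module cannot coexist in a single script.

```elixir
defmodule UpgradeSketch.Session do
  # Hypothetical "new" definition: assume :step was added in this release.
  defstruct id: nil, user: nil, step: 1
end

defmodule UpgradeSketch do
  # Fill in any fields the current definition has but the stored value lacks.
  # Fields that already exist on the old value win over the defaults.
  def upgrade(%{__struct__: mod} = old) do
    Map.merge(struct(mod), old)
  end

  # Rewrite every {key, struct} row of the table with the upgraded value.
  def migrate_table(table) do
    table
    |> :ets.tab2list()
    |> Enum.each(fn {key, value} -> :ets.insert(table, {key, upgrade(value)}) end)
  end
end

table = :ets.new(:sessions_sketch, [:set, :public])

# Simulate a row written by the previous release, before :step existed.
old = Map.put(%{id: 1, user: "ann"}, :__struct__, UpgradeSketch.Session)
:ets.insert(table, {1, old})

UpgradeSketch.migrate_table(table)
[{1, migrated}] = :ets.lookup(table, 1)
```

This only covers added fields with sensible defaults. Removed or renamed fields, and readers that are still running old code while the table is rewritten, would need their own handling.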

It takes an enormous amount of effort to properly orchestrate a system that uses hot upgrades. It’s an awesome capability to have at our disposal, but not only do you have to manage this complexity for your own code, you have to manage it for all your dependencies as well, since it is extremely rare that any of them even bother to plan for hot upgrades, let alone write appups. I’d argue that it is rarely ever worth the effort to use them, except for very small, purpose-built components which are deployed separately from the rest of your system and can be carefully managed. For example, it’s ridiculous (IMO) to build a web application or backend API that uses hot upgrades. But let’s say that the web application provides an interface for a control plane that has some crazy high uptime requirement. The web application itself doesn’t need hot upgrades, but the control plane might, so you build them as separate deployments and design the web application to talk to the control plane using a protocol that rarely, if ever, changes.

I’m digressing wildly from the point here I guess, but what I’m really getting at is that I don’t think Elixir has done anything to make hot upgrades more difficult than they already were. These problems were all very much present before Elixir existed, and due to the dynamic nature of both Erlang and Elixir, I’m not sure it’s even possible to build tooling that makes it substantially easier. The closest thing to “automatic” appups was what I built into Distillery, but that was only ever intended as a starting point for building out a hot upgrade, since it didn’t do any of the manual stuff that I mentioned earlier. More often than not, people would try to use them for hot upgrades without any manual auditing.

Elixir itself would need to define appups for each application, for every release, much like Erlang does, and do some level of testing to ensure they work, for there to be any chance of hot upgrades not ending badly anyway. Luckily, the bulk of Elixir is library code, but there are some things that would need to be hot-upgradeable (e.g. Registry). Hard to say whether the core team has the bandwidth for that though, which means if you are using hot upgrades and building on top of Elixir, you need to be writing the appups for things in Elixir that you use which require upgrading. I suspect very few people using hot upgrades are doing this.

wojtekmach

Hex Core Team

I must be missing something because I don’t see where the problem is. Please bear with me. This Elixir code:

def f(%Date{} = date) do
  date
end

compiles to this Erlang code:

f(#{'__struct__' := 'Elixir.Date'} = _date@1) ->
    _date@1.

so at runtime there are no checks on anything besides the __struct__ field.

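Of course, if you choose to write f(%Date{year: year}) and you pass anything other than a map containing both __struct__: Date and a year key, i.e. something shaped like %{__struct__: Date, year: year}, then the clause won’t match and you get an error (a FunctionClauseError).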
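
A small sketch of the difference between the two function heads. The MatchSketch module and the hand-built `partial` map are made up for illustration; `partial` stands in for a Date that arrived from a node with a different field set.

```elixir
defmodule MatchSketch do
  # Matches any map whose __struct__ is Date, whatever other keys it has.
  def tag_only(%Date{} = date), do: {:ok, date}

  # Additionally requires a :year key to be present.
  def with_year(%Date{year: year}), do: {:ok, year}
end

# A map that claims to be a Date but has no :year field.
partial = Map.put(%{month: 1}, :__struct__, Date)

# Only the __struct__ key is checked here, so this succeeds.
{:ok, _} = MatchSketch.tag_only(partial)

# Here the missing :year key means no clause matches.
result =
  try do
    MatchSketch.with_year(partial)
  rescue
    FunctionClauseError -> :no_clause
  end
```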

This function:

def f(%Date{} = date) do
  date.year
end

compiles to:

f(#{'__struct__' := 'Elixir.Date'} = _date@1) ->
    case _date@1 of
        #{year := _@1} -> _@1;
        _@1 when erlang:is_map(_@1) ->
            erlang:error({badkey, year, _@1});
        _@1 -> _@1:year()
    end.

so there are no additional checks either.

If you don’t want this behaviour of date.year, then why not call Map.get(date, :year)? Or implement Access for your structs? (You cannot implement it for structs you don’t own, though.)
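
For comparison, a minimal sketch of the two access styles on a value that lacks the key. The AccessSketch module and the `partial` map are hypothetical names used only for this example.

```elixir
defmodule AccessSketch do
  # Dot access insists that the key exists and raises KeyError otherwise.
  def strict_year(date) do
    date.year
  rescue
    e in KeyError -> {:missing, e.key}
  end

  # Map.get/3 tolerates a missing key and returns the default instead.
  def lenient_year(date, default \\ nil), do: Map.get(date, :year, default)
end

# A map that claims to be a Date but has no :year field.
partial = Map.put(%{month: 1}, :__struct__, Date)

strict = AccessSketch.strict_year(partial)
lenient = AccessSketch.lenient_year(partial)
```

Whether a silent nil is actually better than a loud KeyError depends on the caller; the lenient version just moves the decision to whoever consumes the value.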

Is it less about your own code, where you can choose not to use structs, and more about libraries you want to use which do use them, so that you run into the danger of runtime errors when things change?

I’m really curious, what exactly do you mean by structs becoming more like duck-typed maps? Which semantics of structs are you proposing to change? How should they behave?

darkmarmot

Yeah, thanks! I’m talking to my team now to see if we can build a project to reproduce what we thought we had been experiencing (to make sure we weren’t just deluding ourselves and misinterpreting data as well as to provide feedback here since it sounds like there are technical reasons we shouldn’t have experienced what we thought we were). And it’s totally possible we were wrong.

It might take a bit, though, as we’re in the middle of a bunch of stuff – but the basic setup will need to send structs between 2 nodes running different versions of a codebase, using an OTP 22 cluster, rpc and gzip. I’ll update this thread when we have something definitive.

Thanks!
