The issue with the static access operator (the dot)

Qqwy · April 7, 2017, 3:23pm

I would like to spark a discussion about the static access operator: the dot (.).
For those who do not know: it is used in Elixir to access fields of a struct or map when you want to be sure (or already are sure) that a given atom key exists.

It is arguably easier to write foo.bar than foo[:bar] or to bind the value in a pattern match, but in my opinion, relying on . too quickly or too often is a recipe for tightly coupled code.

Namely, changing the internals of a %Foo{} struct becomes impossible without breaking callers when you (or, even worse, users of your library) rely on being able to type foo.bar. If you want to be able to change the internals of a data structure, you should instead have written a function like Foo.bar(foo). Besides being more explicit (though slightly longer to type), this allows you to alter the way bar is obtained from within the structure at a later time.

I have run into this issue multiple times now while developing libraries, so I want to ask for your opinions on this subject: Do you rely on using . a lot? Do you agree that . should be used with care? Are there alternatives? How do you mitigate this issue? Or is it, in your opinion, not an (important) issue at all?

OvermindDL1 · April 7, 2017, 4:19pm

Direct field access irks me about every time I use it, because I know it is doing a hidden function call behind the scenes. If there were truly unique syntax for accessing each thing unambiguously, which could be properly inlined, I would be significantly happier, and I would not have to worry about handling the wrong things. But then this becomes “Why don’t we add types so you know at compile time that you are accessing something that actually exists, instead of just praying”, etc… slippery slope (sudden urge to work on ElixirML some more, blehg, too busy with work projects)…

EDIT: Also, I know you can override some Access.* things, is get one of them? If so you can override access that way…

Qqwy · April 7, 2017, 5:55pm

@OvermindDL1 The Access behaviour (it used to be a protocol) has:

fetch(container, key) (get the value for key from the structure, returning a success tuple).
get(container, key, default) (get the value for key from the structure, with a default if it does not exist).
get_and_update(container, key, function) (call function on the value found for key; depending on the result, return an updated container).
pop(container, key) (remove key and its value from the container).
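As a rough illustration of the four functions listed above, here is a minimal sketch that calls them on a plain map. The map contents are made up for the example; the calls themselves are the ones exposed by the Access module.

```elixir
# A plain map works with every Access function.
container = %{name: "Jux", version: 1}

# fetch/2 returns a success tuple, or :error when the key is missing.
{:ok, "Jux"} = Access.fetch(container, :name)
:error = Access.fetch(container, :missing)

# get/3 falls back to the default when the key is missing.
"Jux" = Access.get(container, :name, "unknown")
"unknown" = Access.get(container, :missing, "unknown")

# get_and_update/3 returns {value_to_return, updated_container}.
{1, %{version: 2}} =
  Access.get_and_update(container, :version, fn current -> {current, current + 1} end)

# pop/2 returns {value, container_without_the_key}.
{"Jux", rest} = Access.pop(container, :name)
false = Map.has_key?(rest, :name)
```

Nothing is mutated here: container still holds both keys afterwards, and the "updated" versions are new maps.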

I do wonder what happens when the static access syntax is compiled. It might actually be inline-able to some extent. (Either at the Elixir -> Erlang or at the Erlang -> BEAM level).

benwilson512 · April 7, 2017, 6:49pm

I’m a bit confused: what is “direct field access” here, foo.bar? If that’s what you mean, it isn’t really a function call. IIRC it translates to basically:

case foo do
  %{bar: val} -> val
  _ -> apply(foo, :bar, []) # or an error handling clause, can't recall
end

foo[:bar], however, is a function call to Access.get, which you can easily see by quoting it:

iex(1)> quote do: foo[:bar]
{{:., [], [Access, :get]}, [], [{:foo, [], Elixir}, :bar]}

I don’t believe any fancy inlining happens there.

PatNowak · April 7, 2017, 7:33pm

The most important things to remember:

foo.bar # strict access, raises an error (KeyError for a map) if the key doesn't exist
foo[:bar] # dynamic access, returns nil if the key doesn't exist

So:
dots for structs
square brackets for keyword lists
both for maps
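The rules of thumb above can be checked with a small sketch. The User module and its name field are hypothetical, introduced only for this example; struct!/2 is used to build the struct so the snippet also runs as a plain script.

```elixir
defmodule User do
  # Hypothetical struct, used only for this illustration.
  defstruct name: "anonymous"
end

map = %{name: "Qqwy"}
user = struct!(User, name: "Qqwy")
keywords = [name: "Qqwy"]

# Dot access works on maps and structs when the key exists.
"Qqwy" = map.name
"Qqwy" = user.name

# Bracket access works on maps and keyword lists; a missing key gives nil.
"Qqwy" = map[:name]
nil = map[:age]
"Qqwy" = keywords[:name]
nil = keywords[:age]

# Dot access on a missing key raises KeyError, and it does so at runtime.
missing_key_error =
  try do
    map.age
  rescue
    error in KeyError -> error
  end

%KeyError{key: :age} = missing_key_error

# Structs do not implement Access by default, so brackets raise at runtime too.
struct_bracket_error =
  try do
    user[:name]
  rescue
    error in UndefinedFunctionError -> error
  end

%UndefinedFunctionError{} = struct_bracket_error
```

So "both for maps" holds, while brackets on a struct only work if its module implements the Access callbacks itself.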

Qqwy · April 7, 2017, 7:37pm

@PatNowak You are mistaken: the dot operator will fail at runtime for both structs and maps (and not at compile time).

OvermindDL1 · April 7, 2017, 7:39pm

I think he meant that dot access is statically ‘called’ and bracket access is dynamically ‘called’, but yeah, the terms could have been clearer.

PatNowak · April 7, 2017, 7:56pm

@Qqwy Thanks, fixed
I meant that strict access is resolved at compile time, because it checks whether the key exists in the map/struct or not. It helps a lot to ensure that data is valid in terms of its content.

I usually use dots when dealing with structs and square bracket access when dealing with maps - especially in case and if statements.

benwilson512 · April 8, 2017, 2:55am

But this isn’t the case. Nothing at all happens at compile time to ensure that the key does or does not exist in the map, nor even that foo is a map.

Qqwy · April 11, 2017, 8:45pm

I’d love to hear more opinions about this.

One of the things I am considering right now is to create a small library that adds overridable functions to a module that defines a struct, one per struct field. Callers would use those instead of .fieldname, so the access methods can be changed in the future.
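One way such a library might look is sketched below. FieldAccessors, defstruct_with_accessors and the Temperature module are all hypothetical names invented for this illustration, not an existing package; the sketch only relies on defstruct, unquote fragments and defoverridable from Kernel.

```elixir
defmodule FieldAccessors do
  # Hypothetical helper, not an existing library.
  defmacro defstruct_with_accessors(fields) do
    quote bind_quoted: [fields: fields] do
      defstruct fields

      for {name, _default} <- fields do
        # Default accessor: simply read the field.
        def unquote(name)(%__MODULE__{} = data), do: Map.fetch!(data, unquote(name))

        # Allow the module to redefine the accessor later on.
        defoverridable [{name, 1}]
      end
    end
  end
end

defmodule Temperature do
  import FieldAccessors

  defstruct_with_accessors celsius: 0, label: "outside"

  def new(celsius), do: %__MODULE__{celsius: celsius}

  # Overrides the generated accessor; callers still write Temperature.label(t).
  def label(%__MODULE__{} = t), do: String.upcase(super(t))
end

t = Temperature.new(21)
21 = Temperature.celsius(t)
"OUTSIDE" = Temperature.label(t)
```

The fields still exist on the struct, so nothing stops a caller from writing t.label; the accessors only help callers who agree to go through the module.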

OvermindDL1 · April 11, 2017, 8:49pm

How would it be different than just making a struct field an anonymous function, apart from saving the . part of the invocation, so that it becomes blah.vweep(42) instead of blah.vweep.(42) or so?

Qqwy · April 11, 2017, 9:02pm

It would be Blah.vweep(yourbla).

Or, to give a clearer example:

In the small programming language I am building, the current runtime state is represented as a %Jux.State{} struct. It used to have a field called stack, but at some point it turned out that it needed to hold a list of stacks, with most of the operations only accessing the first. By then I had already written state.stack everywhere and had to painstakingly replace it. (It did not help that not all my state variables were called state, and that stack was also used as a name for some other things elsewhere.)

I changed it to Jux.State.stack(state).

In the future, I’ll probably write it this way for all but the most trivial structs from the get-go, because this is clearer and allows you to change the inner implementation later on, as the external world is not tightly coupled to your struct’s, well, structure.
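A minimal sketch of what that refactoring might look like follows. Only Jux.State and stack/1 come from the post above; the stacks field name, new/1 and push/2 are assumptions made for the example.

```elixir
defmodule Jux.State do
  # The field name `stacks` is an assumption; the post only says the state
  # went from a single stack to a list of stacks.
  defstruct stacks: [[]]

  # Hypothetical constructor.
  def new(stack \\ []), do: %__MODULE__{stacks: [stack]}

  # Callers keep asking for "the stack"; only this function knows that it is
  # now the first of several.
  def stack(%__MODULE__{stacks: [current | _rest]}), do: current

  # Hypothetical helper that pushes onto the current (first) stack.
  def push(%__MODULE__{stacks: [current | rest]} = state, value) do
    %{state | stacks: [[value | current] | rest]}
  end
end

state = Jux.State.new() |> Jux.State.push(1) |> Jux.State.push(2)
[2, 1] = Jux.State.stack(state)
```

Code that called Jux.State.stack(state) before the change would keep working afterwards, whereas every state.stack had to be found and rewritten by hand.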

OvermindDL1 · April 11, 2017, 9:25pm

Wouldn’t static typing be nice there to catch all occurrences that you need to replace? ^.^

mgwidmann · April 12, 2017, 2:51am

No one wants to give up the flexibility of dynamic typing. With that said, if Dialyzer could somehow be incorporated into the Elixir compiler to provide static type analysis, I think that’d be the holy grail of all this, and this discussion would be over.

Qqwy · April 12, 2017, 8:08am

One thing that neither static typing nor Dialyzer would fix is what happens when you write some code, publish it as a library, and then people start matching on your struct’s structure.

Because of backwards compatibility, the structure of your struct is now ‘frozen’. The old fields will need to be kept in there because some other code might depend on them.

In e.g. Haskell, when you define a data constructor, you can give names to the different fields of the structure (Haskell data constructors are closer to records, I think, in that they are basically a tuple under the hood, so without field naming you are forced to match on the nth position of the data structure). The only thing such a name does is create a function for you that extracts the value at that position in the data structure and returns it. If your code becomes more complex in the future, you can remove the name from the data structure and write out a manual function definition instead.

I think this approach is better, in the sense that it keeps you open to alter or delete your implementation in the future.

OvermindDL1 · April 12, 2017, 4:38pm

This is huge! And it is also why many typed languages (like OCaml, as one of many) do not export the internal type but instead have a set of accessors on the module of the type, for just those reasons.

In fact, OCaml goes so far as to make a special record type called ‘object’ that does not have its structure fully typed (so these are more like Elixir Structs, where Elixir has nothing as powerful as an OCaml record), but it allows you to match on specific parts of it too, so like:

let thing = object
  method vwoop = 42
end

let extra_thing = object
  method vwoop = "a string!"
  method more = 42
end

let different = object
  method no_vwoop = "nope"
end

let i_accept_vwoops v = v#vwoop

let tests =
  let 42 = i_accept_vwoops thing in
  let "a string!" = i_accept_vwoops extra_thing in
  (* let "this will not compile" = i_accept_vwoops different in *)
  ()

So if you uncomment that one line (I think the forum’s syntax coloring for other languages is broken… but it is the line surrounded in (* ... *)) then it will not compile because the different object does not have a vwoop. Also, vwoop is a function, so you can change it later to do something else, like look up a value as in:

let thing = object
  val this_is_not_accessible_externally = 2
  method vwoop = 42 * this_is_not_accessible_externally
end

let extra_thing = object
  method vwoop = "a string!"
  method more = 42
end

let different = object
  method no_vwoop = "nope"
end

let i_accept_vwoops v = v#vwoop

let tests =
  let 84 = i_accept_vwoops thing in
  let "a string!" = i_accept_vwoops extra_thing in
  (* let "this will not compile" = i_accept_vwoops different in *)
  ()

I transparently change thing#vwoop to do an operation, which it does on call (even though it could be done inline in this case, I guess…), but it is a normal function and can do whatever you want. val’s are not exposed externally (but can still be whatever you want internally), and method’s are exposed externally as the ‘interface’ for interacting with the object. They can even return a copy of the object with val’s or method’s changed too, such as in:

let create_thing init = object
	val i_am_hidden = init
	method vwoop = (i_am_hidden, {<i_am_hidden = i_am_hidden * 2>})
end

let tests =
	let thing = create_thing 21 in
	let (21, thing2) = thing#vwoop in
	let (42, thing3) = thing2#vwoop in
	()

Yes this is the barest part of the immutable object system in OCaml, but this is the part that is most used (almost no one uses the inheritance and such because it is just not needed). But these I can see in Elixir as a combination struct and module, which could perhaps also be called an object, so going with your idea I could maybe imagine a syntax like, hmm, actually I just hacked this together so this code works:

defobject MyTesting(init) do
  val blah = init

  def vwoop(mult) do
    blah * mult
  end
end

And the iex session:

iex> thing = MyTesting.make(21)
...
iex> thing.vwoop(2)
42
iex> thing.blah
** (UndefinedFunctionError) ...
iex> thing.vwoop
** (UndefinedFunctionError) ...

A couple of minor changes to the syntax (but more work than I’ve done currently) would let it return altered versions of its current state, like in the OCaml example. ^.^

I’d probably also change make to be new instead, OCaml uses make to make new things and that was the mindset I was in. ^.^

However, yes, if you expose your type then it is hard to change it. That is why you should not expose your type but make it opaque, like in the OCaml world. Dialyzer has an -opaque type declaration (@opaque in Elixir) for a reason.
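For reference, a minimal sketch of an opaque type in Elixir is shown below. The Counter module and its functions are hypothetical; @opaque itself is a standard typespec attribute. Note that it only lets Dialyzer warn about code that inspects the structure from outside the module; it does not stop such code from compiling or running.

```elixir
defmodule Counter do
  # Hypothetical module, used only to illustrate @opaque.
  defstruct count: 0

  # Other modules may pass Counter.t() around, but Dialyzer will complain
  # when they look inside it.
  @opaque t :: %__MODULE__{count: non_neg_integer()}

  @spec new() :: t()
  def new, do: %__MODULE__{}

  @spec increment(t()) :: t()
  def increment(%__MODULE__{count: count} = counter), do: %{counter | count: count + 1}

  @spec count(t()) :: non_neg_integer()
  def count(%__MODULE__{count: count}), do: count
end

2 = Counter.new() |> Counter.increment() |> Counter.increment() |> Counter.count()
```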

OvermindDL1 · April 12, 2017, 4:46pm

I personally find static typing to be far more flexible, in that I do not need to worry whether my data is being accessed properly. Think of OCaml: you almost never need to specify a type anywhere (see my examples in the prior post). It ‘looks’ dynamically typed, but it is not; it is one of the most strongly typed languages out there, yet with nary a type in sight. You get both flexibility and the security that things are right.

Qqwy · April 12, 2017, 5:02pm

Creating a macro that accepts a bit of Elixir AST and transforms all dot access statements to equivalent calling code is not that difficult. You could even create a custom version of defmodule so it is completely hidden from view.

But of course, the static access operator was made ‘unoverridable’ for a reason: Because what you now end up with, looks very much like implicit foo.bar.baz.qux OOP method dispatch syntax.

OvermindDL1 · April 12, 2017, 5:11pm

Which is why I am not publishing this.

However, what you are proposing would allow the same thing, would it not? And really, a record/struct/map of anonymous functions is really just that anyway.

Qqwy · April 12, 2017, 6:40pm

An important difference from a struct filled with anonymous functions is that the functionality of one will be altered when the module is reloaded, and that of the other will not.
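The two styles being compared might be sketched like this. The Blah module, its vweep field and the multiplication are made up for the example. The snippet only shows the two call shapes; it does not reload anything.

```elixir
defmodule Blah do
  # Hypothetical struct that carries its behaviour around as data.
  defstruct vweep: nil

  # The anonymous function is created once, here, and travels with the struct.
  def new(factor), do: %__MODULE__{vweep: fn x -> x * factor end}

  # A named function is looked up by module and name on every remote call.
  def vweep(%__MODULE__{}, x), do: x * 2
end

blah = Blah.new(2)

# Field holding an anonymous function: note the extra dot before the arguments.
84 = blah.vweep.(42)

# Named function on the module.
84 = Blah.vweep(blah, 42)
```

As far as I understand the BEAM, a remote call such as Blah.vweep(blah, 42) picks up the newest loaded version of the module, while a function value already stored in a struct keeps pointing at the code it was created from.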

Anyhow, I like Foo.bar(my_foo) a lot better than my_foo.bar, because it is more explicit and pipeable.
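To make the "pipeable" point concrete, here is a small sketch. The Foo module, its bar field and new/1 are assumptions that mirror the names used in this thread.

```elixir
defmodule Foo do
  # Hypothetical struct mirroring the Foo/bar names used in the discussion.
  defstruct bar: ""

  def new(bar), do: %__MODULE__{bar: bar}

  # Accessor function: explicit about which module owns the data.
  def bar(%__MODULE__{bar: bar}), do: bar
end

my_foo = Foo.new("hello")

# The accessor slots straight into a pipeline.
"HELLO" = my_foo |> Foo.bar() |> String.upcase()

# Dot access needs the value pulled out before the next call.
"HELLO" = String.upcase(my_foo.bar)
```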