Joining list and tuple?

I think I do not understand something. Can someone explain why this works?

iex> [] ++ {:ok, "好"}
{:ok, "好"}

Is a tuple a list?

That works for any term:

iex> [] ++ 3
3

iex> [] ++ "foo"
"foo"

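As the documentation for ++/2 notes, the right-hand side can be any term. If the left-hand side is non-empty and the right-hand side is not a list, the return value is an improper list. If the left-hand side is [], the right-hand term is simply returned as-is, which is what you are seeing here. (Point being, a tuple is not a list.)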

In this case the lhs of ++ is [], an empty list, so there is nothing to put in front of the rhs and ++ returns the rhs unchanged. The reason is that ++ appends its rhs to the end of the list in the lhs. It doesn’t check that the rhs is a list; it just makes it the tail of the lhs.

iex(2)> [1,2,3] ++ [:a,:b,:c]
[1, 2, 3, :a, :b, :c]
iex(3)> [] ++ [:a,:b,:c]
[:a, :b, :c]
iex(4)> [1,2,3] ++ {:ok,42}
[1, 2, 3 | {:ok, 42}]
iex(5)> [] ++ {:ok,42}
{:ok, 42}

If the lhs is non-empty and the rhs is not a list, then the result will be an improper list.
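
To make the two cases concrete, here is a minimal sketch you can paste into iex. The variable names (rhs, joined, tail, same) are just illustrative, and the comments show what I would expect each line to print.

```elixir
rhs = {:ok, 42}

# Non-empty lhs: its cells are copied and the rhs becomes the final tail,
# which gives an improper list.
joined = [1, 2, 3] ++ rhs
[1, 2, 3 | tail] = joined
IO.inspect(tail)             # {:ok, 42}
IO.inspect(is_list(joined))  # true, is_list/1 only looks at the first cell

# Empty lhs: there is nothing to copy, so the rhs comes back unchanged.
# It is still a tuple, not a list.
same = [] ++ rhs
IO.inspect(is_tuple(same))   # true
IO.inspect(is_list(same))    # false
```

So the tuple never becomes a list; it is either returned untouched or parked in the tail position.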

What exactly is an improper list? If a list is a singly-linked list, it has hierarchical structure something like ["a", ["b", ["c"]]], yes? Sorry if I am not understanding.

An improper list is a linked list whose final tail is not the empty list []. So

# this is fine 
[1 | [2 | [3]]] 

# this is an improper list 
[1 | [2 | 3]]

Which you can see if you type the above into an iex shell.
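
Here is a small sketch of that in code, which also covers the nested-list question above. The names proper, improper, and nested are only for illustration.

```elixir
proper = [1 | [2 | [3 | []]]]
improper = [1 | [2 | 3]]

IO.inspect(proper == [1, 2, 3])   # true, the comma syntax is sugar for this
IO.inspect(tl(tl(tl(proper))))    # [], the final tail of a proper list
IO.inspect(tl(tl(improper)))      # 3, the final tail is not a list
IO.inspect(improper)              # [1, 2 | 3]

# A nested list is a different thing: a two-element list whose second
# element happens to be another list.
nested = ["a", ["b", ["c"]]]
IO.inspect(length(nested))                                  # 2
IO.inspect(nested == ["a", "b", "c"])                       # false
IO.inspect(["a" | ["b" | ["c" | []]]] == ["a", "b", "c"])   # true
```

The linking happens in the tail position of each cell (the part after the |), not by nesting lists as elements.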

ok, so lists in elixir are linked lists, so the list [1, 2] is actually (1 -> (2 -> ??))

That ?? has to be something, it can’t be dangling. By convention we pick [], because it makes a lot of recursion type things look really nice, and as a bonus [] at two bytes is the smallest object in the erlang world.

so really it’s (1 -> (2 -> []))

However, ?? could be literally anything, so if it’s anything besides [], then it’s called an improper list. A lot of things (like the entire Enum module) break with improper lists, and you have to handle recursion with care - so be careful. Why would you use it? It does take slightly less space, so if you are implementing something on a system with tight storage requirements (like embedded) with a ton ton ton of really short lists, it might be worth it.
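
A rough sketch of what "break" looks like in practice, plus one way to check for it. ProperCheck.proper_list?/1 is a hypothetical helper written for this post, not something from the standard library.

```elixir
defmodule ProperCheck do
  # Hypothetical helper: walk the cells and check that the final tail is [].
  def proper_list?([]), do: true
  def proper_list?([_ | tail]), do: proper_list?(tail)
  # Anything else (an improper tail, or a non-list) is not a proper list.
  def proper_list?(_), do: false
end

improper = [1, 2 | 3]

# length/1 raises ArgumentError when the list is improper.
length_result =
  try do
    length(improper)
  rescue
    ArgumentError -> :error
  end

# Enum.map/2 also raises; the rescue is generic because the exact
# exception is an implementation detail.
map_result =
  try do
    Enum.map(improper, &(&1 * 2))
  rescue
    _ -> :error
  end

IO.inspect({length_result, map_result})          # {:error, :error}
IO.inspect(ProperCheck.proper_list?([1, 2, 3]))  # true
IO.inspect(ProperCheck.proper_list?(improper))   # false
```

Note that is_list/1 would return true for both lists here, since it only checks the first cell, so it cannot be used for this check.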

Also note the notational differences: if you have an improper list (1 -> (2 -> 3)), the default way to notate it is:

[1, 2 | 3]

which is not the same as

[1, 2, 3]

which is (1 -> (2 -> (3 -> [])))

This strikes me as not a great tradeoff. Is this really the only reason improper lists are allowed?

IIRC it’s generally considered to be a “legacy concept from an earlier era”. Just know you might run into it if you’re using an older Erlang library; otherwise you probably won’t. Also, certain datatypes allow it, e.g. iodata, such as you pass to IO.iodata_to_binary/1.

Thanks, this is very helpful!

I guess my concern was that I might create an improper list inadvertently, and the compiler won’t flag it. So then your Enum breaks and you don’t understand why.

I think it’s pretty unlikely. In about 2.5 years of elixir and probably about 60 kloc that’s never happened to me that I can recall.

Besides, now you’ll understand why!

There was one evening where I was confused because I was building improper lists and they were causing problems because I had no idea what an improper list was or why the cons operator seemed to be appearing randomly in my lists. Once I knew what they were and how to avoid inadvertently creating them it was not a big deal.
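
For anyone wondering how this can happen by accident, here is one plausible way, as a sketch. This is an assumption about a typical slip (putting the new item on the wrong side of the cons operator), not a claim about what happened in any particular codebase.

```elixir
list = [1, 2]

# Intended: add 3 to the end. Actual: a single cell whose head is the
# whole list and whose tail is the bare integer 3.
oops = [list | 3]
IO.inspect(oops)       # [[1, 2] | 3]

# What was probably meant: append with ++ and a one-element list...
appended = list ++ [3]
IO.inspect(appended)   # [1, 2, 3]

# ...or prepend with the cons operator, keeping the list on the right.
prepended = [0 | list]
IO.inspect(prepended)  # [0, 1, 2]
```

The rule of thumb is that whatever sits to the right of | should itself be a list.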

This article and the follow-up to it give some good insight into why iolists, and the fact that improper lists work as iolists, are very helpful at times. Phoenix makes liberal use of this, as the posts explain. I have a library that generates SVG and I can attest to how incredibly convenient this feature of the language is for building up a big sequence of strings.

Ha - I was going to reference that article for the same reasons! So I’ll reference this one instead…

Yes - improper lists make it super easy & efficient to build output from deeply nested structures, and then make it super efficient to write that output to a file or network socket. That’s a big part of why Phoenix can have response times measured in microseconds rather than milliseconds. It’s all about maximising the work the runtime doesn’t have to do. It may be old-fashioned to think like that now that you can rent infinite computing resources, but I wouldn’t say it’s legacy.
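
As a small illustration of the iodata idea, here is a sketch that builds some markup without concatenating any strings. The markup and the names greeting, rows, and page are made up for the example; IO.iodata_to_binary/1 is only called at the end to show what the structure represents.

```elixir
# An improper list with a binary in the tail position is valid iodata.
greeting = ["Hello, " | "world"]
IO.inspect(IO.iodata_to_binary(greeting))  # "Hello, world"

# Each piece is wrapped in a new cell instead of being copied into a
# bigger string.
rows = for n <- 1..3, do: ["<li>", Integer.to_string(n) | "</li>"]
page = ["<ul>", rows | "</ul>"]

IO.inspect(IO.iodata_to_binary(page))
# "<ul><li>1</li><li>2</li><li>3</li></ul>"
```

In real code you would usually hand the nested structure straight to something like IO.write/2 rather than flattening it into a binary first.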
