Is it possible to have lazy evaluation on list comprehensions?

Fl4m3Ph03n1x · June 2, 2022, 10:51am

Background

I have some code that uses list comprehensions in elixir and performs some long operations. I need this code to be lazily evaluated instead of running immediately (eager evaluation).

Code

Imagine I have a list comprehension like the following:

for a <- long_operation(),
  b <- longer_operation() do
  a + b 
end

Now, here a and b will be eagerly evaluated, so as to return a + b. However, both long_operation() and longer_operation() take a very long time to run (as the name implies) and I will only absolutely run them if I have to.

I want to come up with a solution that allows for my code to be lazily evaluated. Say, for example, to have a data structure or some other construct where I can then call run() to actually do the hard work.

Research

My first idea was to just put everything inside of anonymous functions:

for a <- fn -> long_operation() end,
  b <- fn -> longer_operation() end do
  a + b 
end

This worked as well as you would expect, i.e., it didn’t. The main reason being: “You cannot sum two functions”.
And it makes sense.

My next option is then to have a data-structure hold the values of these computations, and have the list comprehension return said data-structure.

I would then call run() or something similar, and then the operation would be executed.

You can probably think of this as the IO construct in languages like Scala or Haskell (pardon for the poor comparison).

Other people have suggested the use of GenServers to achieve this, but this avenue does not sit well with me for two main reasons:

I am adding a runtime dependency to a code that should not even know GenServers exist in the first place.
Also I fail to see how this would help in any way.

Problem

The issue here is that I believe this will force me to implement a mini AST with the instructions of what needs to be executed when I call run(). Not only do I lack the knowledge to do this, It also sounds overkill at first glance.

Surely, Elixir has a way to have expressions being lazily evaluated that I am not aware of.
So this brings me to the question:

What mechanisms does Elixir use to have lazy evaluation?
Are there any data-structures/libraries out there that do this? (lazy evaluation)

Please let me know!

Nicd · June 2, 2022, 10:55am

Have you looked at the Stream module? Stream — Elixir v1.13.4

Fl4m3Ph03n1x · June 2, 2022, 10:59am

I fail to see how this would help in my predicament. I am aware that Stream is lazily evaluated, but I don’t see how it fits in the puzzle of “A generic data-structure that works with list comprehensions and executes functions when I ask it to”.

At best I could call to_list in which case the stream to actually do something but I still don’t see how it fits with the overall dilemma I am facing.

Nicd · June 2, 2022, 11:02am

Ah, I did not realise using list comprehensions was a requirement. As you were able to modify the comprehensions (the function example), I assume you would also be able to rewrite that code to use Stream. I’m not aware of a way to make list comprehensions lazy in the way you require them to be. At most you could implement your own datatypes that have their own Enumerable implementation, but you wouldn’t be able to use a + b just like that inside the comprehension.

LostKobrakai · June 2, 2022, 11:05am

I don’t want to judge too hard here, but this – like a few prev. topics of yours – seems to come from a place of shoehorning features available in other languages into elixir. Doing things lazily is totally possible, but I don’t see why there would be the need to do so in a for comprehention.

What’s the problem with this?

a = fn -> long_operation() end
b = fn -> longer_operation() end
a.() + b.()

Anonymous functions and Stream, where stream is specifically for enumerables (so not single values) and it also often uses anonymous functions.

hauleth · June 2, 2022, 11:05am

Currently there is no support for that. There were proposals to allow for …, into: Stream, but AFAIK currently there is no work towards it AFAIK.

Fl4m3Ph03n1x · June 2, 2022, 11:26am

I am OK with that, in previous topics I did create some datatypes that implement the Enumerable protocol to a good level of success. It’s mostly a matter how getting the reduce part of the protocol to actually reduce what I want, which is easier said than done.

I am not sure how to reply to this.

On one hand, you are not wrong. I am absolutely exploring concepts from places that are alien to Elixir and playing around, testing the waters and seeing what I can use and what I cannot.
On the other hand, I don’t see this as being bad. I see this as being positive. I am also of the opinion that I am not the only person doing these kind of experiments.

When I first started creating a desktop application for Elixir many people suggested I moved away and that “this was not the purpose of Elixir”. And today we have things like elixir-desktop.
You may argue, “Yes, but you would have to be delusional to compare you silly questions to a project like elixir-desktop”, which I would agree, however the point I am trying to make here is that many of the new tools and functions elixir has now, come from alien places. Some of them stick around. Others don’t. I make no claim that anything I do will stick to anyone

But if it helps clear the air, then please be kind enough to consider this an academic exercise

This is quite interesting. Do you know where I could contact the developers and ask this question? Perhaps some mailing list?

LostKobrakai · June 2, 2022, 12:01pm

No worries. I do like the exploration, but I personally favor migrating usecases to ideomatic code within another language instead of shoehorning concepts over. That’s a bit difficult to do with just an abstract example though, as there’s not much to make tradeoffs by from “I’d like to have lazy language features”.

As far as I understand lazy evaluation is baked into both scala and haskell upfront. Elixir on the other hand doesn’t do that. Lazyness is introduced for Stream due to practical concerns – enumerating data, which is to big to fit onto the computer or even is infinite – and not as a “language level” concept. Anonymous functions can also enable lazy evaluation, but it’s also clearly differenciated at the syntax level and clearly not a lazily, single evaluated, variable.

You could consider a macro, which embeds lazy evaluation onto elixir. That would be comparable to e.g. what Nx did with defn, which runs elixir syntax with completely different underlying semantics, but that’s quite a bit of work, especially given it needs to include both the code for assigning the lazy variable as well as any code reading it.

You might be able to use two separate macros between marking something as lazy and reading the value, but unless you want to evaluate the code multiple times, you need to also think about retaining/caching the evaluated values somehow.

LostKobrakai · June 2, 2022, 12:37pm

I was interested how far this can be driven with macros and maybe arrived on an interesting solution:

defmodule Lazy do
  @type t(value) :: (-> value) | value

  defmacro lazy(do: ast) do
    quote do
      fn -> unquote(ast) end
    end
  end

  defmacro lazy(ast) do
    quote do
      fn -> unquote(ast) end
    end
  end

  defmacro resolve({name, _, context} = var) when is_atom(name) and is_atom(context) do
    quote do
      var!(unquote(var)) = Lazy.resolve_lazy(unquote(var))
      _ = var!(unquote(var))
      var!(unquote(var))
    end
  end

  defmacro resolve(not_a_var) do
    quote do
      Lazy.resolve_lazy(unquote(not_a_var))
    end
  end

  @doc false
  @spec resolve_lazy(t(value)) :: value when value: term
  def resolve_lazy(value) do
    if is_function(value, 0) do
      value.()
    else
      value
    end
  end
end

Usage

defmodule A do
  import Lazy

  def test do
    a =
      lazy do
        IO.inspect("eval")
        1
      end

    b = lazy(IO.inspect("1"))

    IO.inspect("Before resolve a")
    _x = resolve(a)
    IO.inspect("Before first resolve b")
    y = resolve(b)
    IO.inspect("Before second resolve b")
    z = resolve(b)
    IO.inspect("After second resolve b")

    resolve(return_lazy()) |> IO.inspect()

    {y, z}
  end

  @spec return_lazy() :: Lazy.t(integer())
  defp return_lazy do
    x = lazy(1 + 2)
    IO.inspect("after return value created")
    x
  end
end

A.test()
# "Before resolve a"
# "eval"
# "Before first resolve b"
# "1"
# "Before second resolve b"
# "After second resolve b"
# "after return value created"
# 3
# -> {"1", "1"}

Fl4m3Ph03n1x · June 2, 2022, 1:07pm

That’s the thing. To understand how to represent concepts in an idiomatic manner, I first need to understand these concepts in their original state and then see how they behave in the language I am using them. Then, I study “how do people in this language deal with this concept X?” and from there I draw my conclusions/learning.

Sometimes I end up using these concepts in their original state.
Sometimes I adapt the concepts I am studying and use the idiomatic manner.
Sometimes I decide I don’t need to use the concepts after all.

This process of learning is not a “one size fits all”, I understand, therefore I am always grateful to anyone willing to go along for the ride

Your macro example truly is amazing and has prompted me to write this quick thread summary.

With everything we have discussed thus far, I feel like a quick summary makes sense:

Lazy evaluation is not possible in Elixir list comprehensions.
Elixir list comprehensions may one day have support for Laziness via Stream, but thus far this has not happened.
Elixir does not support laziness by default. It’s closest thing to supporting this is in the form of Stream and Anonymous functions. Other languages, like Scala and Haskell do support this concepts by default as this concept in part of their very essence. In Elixir, laziness exists because of specific use-case needs and therefore is more limited than in the previously mentioned languages.
One can, via the use of Macros have support for some level of laziness.

These conclusions support what I have been learning thus far. Basically, every time I see a functional library with Monads and such, people always use Macros. I never understood why there was such a need in the first place, but this topic about lazy evaluation makes it clear now!

msimonborg · June 2, 2022, 1:59pm

FWIW, I think the difference is in how one asks the question:

“How do you implement this feature/concept from X language in Elixir”
vs.
“In X language you can solve this problem with Y feature/concept. How do you solve this problem in Elixir?”

The first question is centered on how to use concepts and solutions in a language that maybe doesn’t natively or easily support it. The second question centers the problem itself and leads to answers that go directly to idiomatic solutions to common problems. The second question will still produce answers about how those original concepts can be used in Elixir if they exist.

Fl4m3Ph03n1x · June 2, 2022, 4:11pm

So, for the sake of being precise:

How do you implement an IO Monad from Scala in Elixir?
VS
In Scala you can push impurity to the edges of your system with the IO Monad. How do you solve this problem in Elixir?

I can see the second one would produce some juicy answers. However I also fear people will argue “This is not Scala, leave it be” and kill the discussion right there. It is mostly for this reason I try to formulate my questions in a more generic/broad way. My objective is not to get stuck in the details of Scala/Haskell/Python/OCaml/ReasonML (or whatever language I am using to study concept X), but when I make questions so specific, readers tend to get stuck into that.

However … how would you answer question #2?
I am truly curious

msimonborg · June 2, 2022, 5:29pm

Interesting question! I don’t have a CS background or even a basic knowledge of FP outside of the context of Elixir. To be fair, question 2 still seems very much colored by a problem framed within the idioms and semantics of another language, so I’m not sure I’ll provide a good answer. As you said before, this can be considered more of an academic exercise. Your original post seemed to be more concerned with deferring an expensive operation until it’s needed, not with functional purity and referential transparency as I understand to be the domain of IO monads. Please correct me if I’m wrong, I’d love to learn!

At the risk of speaking to concepts I am fairly uneducated about, the best I can say is that when it comes to dealing with expensive work you might draw power from primitives like lazy evaluation and monads in other FP languages to defer the work until you need it. Elixir and Erlang’s power comes from the primitives of concurrency, parallelism, isolation, and state management. With Elixir you might parallelize that work or process it in the background and store the state for when you need it. Clearly the best solution to that particular use case is different according to the semantics of the language. To “push impurities to the edge of your system” sounds like a more abstract problem discussed in the context of more pure or academic varieties of FP, and maybe Elixir is not the best tool for the job if that’s the problem you need to solve. Just my two cents, I’d love learn from others that are more knowledgeable on this

brettbeatty · June 2, 2022, 6:14pm

I actually made a macro for stream comprehensions a few months back as an academic exercise. Under the hood it just hooks up Stream functions.

JEG2 · June 2, 2022, 6:53pm

This looks like some code that you would find in a lazy language to ensure order of execution. However, that doesn’t make much sense in Elixir where this means the same:

a = long_operation()
b = longer_operation()
a + b

I’m not criticizing. I’m suggesting that a more realistic example may lead to better ideas for solving the problem in idiomatic Elixir.

Other ideas:

lazy =
  for a <- fn -> long_operation() end,
      b <- fn -> longer_operation() end do
    fn -> a.() + b.() end
  end

# ...

lazy.()

Removing the for that I don’t understand:

lazy = [&long_operation/0, &longer_operation/0]

# ...

lazy
|> Enum.map(&apply(&1, []))
|> Enum.sum

I would probably favor some sort of Stream manipulation:

a = Stream.unfold(0, fn 0 -> {long_operation(), 1}; 1 -> nil end)
b = Stream.unfold(0, fn 0 -> {longer_operation(), 1}; 1 -> nil end)

# ...

a
|> Stream.concat(b)
|> Enum.sum

Fl4m3Ph03n1x · June 3, 2022, 9:51am

I understand people here truly want something more concrete and less general. Let’s then consider this piece of code (Scala):

def schedule(attendees: List[String], lengthHours: Int): IO[Option[MeetingTime]] = {
  for {
    existingMeetings <- scheduledMeetings(attendees)
    possibleMeeting = possibleMeetings(scheduledMeetings, 8, 16, lengthHours).headOption
    _ <- possibleMeeting match {
        case Some(meeting) => createMeeting(attendees, meeting)
        case None => IO.unit
      }
  } yield possibleMeeting
}

Now this piece of code would somewhat be equivalent to (Elixir):

@spec schedule([String.t()], integer()) :: IO[Option[MeetingTime]] 
def schedule(attendees, lengthHours) do
  for existingMeetings <- scheduledMeetings(attendees),
    possibleMeeting <- possibleMeetings(scheduledMeetings, 8, 16, lengthHours) do
    case Option.head(possibleMeeting) do
      %Some{} -> createMeeting(attendees, meeting)
      _ -> IO.unit() #does nothing
    end
end

This is an example from one of the books about FP that I have been reading lately. It uses Scala, but it could as easily use Python (for example).

The import things here are:

The spec clearly tells me this function talks to the outside ugly world
It uses an Option type, that plays nice with the list comprehension
It uses an IO type that does plays nice with the list comprehension
The IO type runs nothing. To actually do something you would have to call:

program = schedule(["person1", "person2"], 1)
program.run()

List comprehensions here matter because they allow all the datastructures that obey certain protocols to play nicely with each other (Option, Eihter, IO, Reader, etc …).

Now, I am not defending that Elixir should be coded this way.
This is Elixir. It’s not Python/Scala/Haskell/(insert language here).

However, this idea of:

pushing functions that do things with the outside world pushed to the edges of my system (functions with side effects are usually pushed to the top layer of the system, meaning you are likely going to have 1 file where you simply call program.run() and things happen then)
Having referential transparency (hello easy tests)
Helping the compiler to be smart enough to help me (yes, even dialyzer)
Having a language concept that makes all of the above play well together

This is something that I am trying to explore here.
I am using list comprehensions because other languages use them too, and my Macro knowledge is not great enough. I am simply going for the easiest implement I can think of.

Your insights are always welcome

LostKobrakai · June 3, 2022, 10:31am

Elixir doesn’t natively deal with (explicit) monads and it also isn’t statically typed, which means problems will be solved a lot different than in a language where monads and static types are a thing.

I feel like the ask for more concreteness here is more about this piece of code, not so much the schedule function. What do you expect to gain from this being lazy? As is it’s not really different to just doing schedule(["person1", "person2"], 1) without lazyness. Your list of important things doesn’t really map well to elixir due to monads not being a thing.

This is not enforcable without static typing, which doesn’t exist in elixir.

Elixir commonly uses error tuples like {:ok, result} | {:error, details} instead of an option monad.

I’ve only a vague idea what an IO monad does, but again without static typing there’s no way to enforce pure vs non-pure differenciation for code.

“To actually do something” you call a function in elixir. If you don’t want to do something you don’t call it. For passing around a reference to some function you either use anonymous functions or mfa tuples.

So maybe the most direct mapping to your example would be:

program = fn -> schedule(["person1", "person2"], 1) end
…
program.()
# or
mfa = {SomeModule, :schedule, [["person1", "person2"], 1]}
…
{m, f, a} = mfa
apply(m, f, c)

… where schedule is a plain old elixir function without any fancy lazy evaluation to start with. But again, what’s the reason for the need of lazy evaluation here. Why not call the function later, when needed.

Fl4m3Ph03n1x · June 3, 2022, 11:53am

What is the property of static typing that makes it a requirement for me to know if something is talking to the outside world?

According to my understanding, if I say function returns X and my function never does, dialyzer will pick it up. This is what I am aiming at: to help dialyzer.

Elixir commonly uses error tuples like {:ok, result} | {:error, details} instead of an option monad.

Not to be pedantic, but this is actually the Either monad (aka, result monad or error monad, because it differentiates between an output that is OK and one that is not), the Option monad is usually something like {:ok, reuslt} | :ok (where :ok can be any atom you want. Some people also use nil).

While studying Monads, I found out (thanks to the community) how this concept is represented in Elixir. This would have never happened if I hadn’t started studying Monads, nor would it have happened if I hadn’t asked.

Now I am trying to find out if something similar exists for IO, Effect monad.

I see. Let’s consider this code:

program = fn -> 1 end
program2 = fn -> 2 end

# Obviously, will not work
program3 = program + program2

Lets also say I have a dialyzer type: program(t) :: (-> t).
Now, both program and program2 are of type program(integer).

Is there a way, to have program3 be of type program(integer) and help dialyzer figure it?

The main reason I am insistent in this is that I have this belief (perhaps incorrect) that IO type is nothing more than a data-structure and that there should be no issue to implement it in Elixir for list comprehensions as long as I have the right Enumerable protocol implementation.
Please correct me if I am wrong :S

lud · June 3, 2022, 1:05pm

IO is impure in Elixir, it is implemented with messages. I am not sure you can use a monad for that. As soon as IO code is called, IO operations take place.

I don’t understand why you code has to use list comprehensions. I don’t know Scala, but that for looks like a for loop, not a list comprehension, i.e. it is used to run imperative code. Elixir’s for is a true list comprehension so it is targeted at building lists, not streams.

I know the two concepts are similar, but they are not the same.

edit:

IO type is nothing more than a data-structure

There is no IO type. There are ports, iodata() :: iolist() | binary() and iolist() :: maybe_improper_list(byte() | binary() | iolist(), binary() | []).

Fl4m3Ph03n1x · June 3, 2022, 2:46pm

It is not a for loop, but a classic list comprehension in Scala (also called for comprehensions). Here is a decent article in case you feel curious:
https://docs.scala-lang.org/tour/for-comprehensions.html

As I mentioned:

I am using list comprehensions because other languages use them too, and my Macro knowledge is not great enough. I am simply going for the easiest implement I can think of.

I am aware. Thus the discussion.