Genetic Algorithms in Elixir Book Club!

AstonJ · March 25, 2023, 8:59pm

Welcome to our Genetic Algorithms in Elixir Book Club!

With AI set to play a big role in our industry we’re lucky to have Nx, so here’s our first Nx related book club, on Genetic Algorithms in Elixir (PragProg)!

The book cub is being led by @stevensonmt and participants are:

Everyone is welcome to join the book club or to simply comment on anything you find interesting

code-shoily · March 25, 2023, 9:08pm

Hello everyone! Excited that we’re doing this. Looking forward to learning from you all and the book!

BradS2S · March 25, 2023, 9:09pm

I’d love to be a member

m4hi2 · March 25, 2023, 9:43pm

I have this book and this will be a really cool way of reading it. Looking forward to learn from you all!

stevensonmt · March 25, 2023, 9:45pm

Thanks for setting this up @AstonJ. I’m very excited about this book. I’ve started the book but only gone through the first couple of chapters. I’ll probably try to post every Wednesday with thoughts about the portion of the book I got through that week. I hope everyone else will do something similar. I know life is probably going to interfere at some point and I’ll fall behind, plus most, if not all, of the participants are going to bring more technical acumen to the discussion.

So far I’ll say I really enjoy the writing style. Concise without being dense. Precise without being obscure. I also like the repetition of the basic concepts between the first two chapters. Really helps to cement the ideas for me.

Looking forward to the coming weeks and hearing what everyone has to say about the book and the topic in general.

dr.a · March 25, 2023, 10:16pm

I’ll join in too!

code-shoily · March 26, 2023, 12:29am

I am starting chapter 1 now. Will update y’all tomorrow

bdarla · March 26, 2023, 10:21am

I went over the first chapter. It comprehensively introduces the core steps of a genetic algorithm. The provided solution of the One-Max problem demonstrated that as the size of the problem scales up, it is necessary to consider further optimisations that I seek to find in the rest of the book. I tried to manually explore alternative values for the hyper-parameters but I was not overwhelmed by the performance results.
I looking forward to study the rest of the book.

P. S. An interesting question for me is whether genetic algorithms are suitable for matching two pieces of text, for example, in order to assess whether a patent claim extends beyond the original patent description.

code-shoily · March 26, 2023, 5:41pm

Hi, I have a question, I tried running the one_max.exs script without mutation and it almost always gives me 1000 where the book mentions that it will be difficult for me to reach that. Adding mutation also gives me the same result ~~(although it’s slower)~~. So I was wondering if any of you could get the result matching what the book wrote?

UPDATE: I added an iteration counter and seems like (again based on the few tries I did), WITH MUTATION reaches the best result in ~50% less iteration count.

bdarla · March 26, 2023, 5:59pm

Indeed, also in my experiments the algorithm always reaches 1000 (even when disabling mutation).

code-shoily · March 26, 2023, 6:17pm

I first wondered if it had to do with high machine specs but the book isn’t an old one and I think this has little to do with machine power and more to do with the way random number gets generated?

If I decrease the chromosome count but keep the bast value the same (thereby making it impossible to reach) then I get stuck at “current best: 10”. The algorithm doesn’t “stop” stop, but rather the data become stagnant and the algorithm tried forever. I tried multiple times but it is a solid 1000 each time

fceruti · March 26, 2023, 6:29pm

I was getting cozy in my chair, still blowing my hot tea, when I found this bug in page 1:

130 trillion paths → 1.3 trillion paths.

Quite a lot paths as well, but still, it’s a two-order magnitude error.

code-shoily · March 26, 2023, 6:34pm

Good catch. Post an erratum maybe?

Update: Added the link here for future reference. Also, clicking erratum on my pdf sent me to the book link and then I had to click again to get to the DevTalk link.

code-shoily · March 26, 2023, 6:51pm

I wanted to try out some VegaLite on this and these are what I got:

Without Mutation:
visualization

With Mutation:
visualization-2

I did not configure the chart for look pretty though, just added:

chart =
  Vl.new(width: 400, height: 400)
  |> Vl.mark(:line)
  |> Vl.encode_field(:x, "x", type: :quantitative)
  |> Vl.encode_field(:y, "y", type: :quantitative)
  |> Kino.VegaLite.new()
  |> Kino.render()

and tweaked the algorithm to look like:

algorithm = fn population, iter, algorithm ->
  best = Enum.max_by(population, &Enum.sum/1)
  IO.write("\rCurrent Best (# #{iter}): " <> Integer.to_string(Enum.sum(best)))
  Kino.VegaLite.push(chart, %{x: iter, y: Integer.to_string(Enum.sum(best))})

  if Enum.sum(best) == 1000 do
    best
  else
    population
    |> evaluate.()
    |> selection.()
    |> crossover.()
    |> mutation.()
    |> algorithm.(iter + 1, algorithm)
  end
end

The one without mutation nearly broke my LiveBook though

stevensonmt · March 26, 2023, 8:02pm

I think the algorithm will always run until it reaches the terminating condition, right? The only terminating condition was for Enum.sum(best) == 1000 so it should run until it hits that or maybe hits some sort of timeout. On my machine it definitely slowed down in progress around 950 but eventually reached 1000.

My notes as I was going through chapter 1:

Basic concept/flow of genetic algorithm:

Initialize population
Evaluate population (apply fitness function)
Select parents
Create children (crossover)
Mutate children (introduce randomness)
Loop back to evaluate

Population size (i.e., number of chormosomes)

larger population → longer transformation step
smaller population → more generations needed to produce soln and more
likely to have premature convergence

Premature convergence

focusing on solution that appears “good enough” when more optimal solutions exist
Mutation helps fight premature convergence

Initial example shows how to get to a solution when you know what the answer is, but how
will genetic algorithms work when you don’t know the solution ahead of time? ??

Fitness functions should be sorting/ordering (?)

Single point crossover I assume is faster per generation but may lead to more generations needed if initial population is small.

Mutation

Mutation probability in example is 5%.

Impact of higher/lower probability of mutation?
- I assume higher probability leads to slower optimization as higher chance of undoing
  progress, however it might serve to counterbalance low initial population size/diversity
The example mutation shuffles entire chromosome. How does single gene mutation or transposition mutation impact performance?
- Again, I assume that broader mutations lead to more diversity and can offset small initial
  population size at expense of potentially requiring more generations to acheive optimal solution

code-shoily · March 26, 2023, 8:09pm

Chapter 1 is read!

This was a great read. I don’t have any prior exposure to genetic algorithm, other than my university AI course that happened 20 years ago. So, this was quite a fun experience for me.

My initial reaction was to read the prose first and take a stab at the code later, but the way the example was arranged, I felt it would be more convenient if I experiment with each function and their result on a shell, and what better shell than LiveBook?

I experimented with each artifacts (i.e. population, evaluate, selection, crossover, mutation (when it appeared after the interval), algorithm). I even came up with an acronym (P.E.S.C.A), until M for mutation came along and mutated my acronym into an unpronounceable one. I keep thinking evaluate could evaluation to belong to same parts of speech class as all other functions (or selection → select, mutation → mutate etc). Or maybe that’s just me.

I really liked how, for the first time in my life, I understood what an AI book is trying to tell me in one run. The writing style is enjoyable and the introductory elements fit right in. The presentation, both in terms of english and code, are very easy to visualize. I have not seen any genetic algorithm code in other languages but I felt like both the recursion and piped workflow is perfect representation. I eagerly await reading next chapters to see if this can turn into a generic pattern for larger set of problems.

I have discussed the code running completely without mutate (or is it mutation?) at all attempts I made (contrary to what the book said), but one update is that, when I tried the same on 10, 100 on REPLIT (1 CPU, OTP 24 etc), then it kept dancing around 82, 87 for 200k iterations … but that aside, running all the code is strongly recommended if you ask me, even better with LiveBook. It gives a “feel” for what the author is trying to explain.

That’s my brief summary of my experience with chapter 1. A bunch of things I would like to do (Please folks, tell me if this would be futile):

Create a repository of livemds for the code
Try using the Nx data structures and functions to solve the same problems, who knows, if chapter permits, Explore?
Produce VegaLite charts.
Benchmark things on different backends (I will admit, I don’t even know what backends will do for these problems)

Anyways, I will try to pick one or more of those and share it with y’all. That’ll be all for my chapter 1. I briefly read chapter 2 and already like the way it built upon Chapter 1. Thank you @AstonJ and @stevensonmt for doing this.

stevensonmt · March 27, 2023, 2:43am

Maybe not that complex, but chapter 3 has an example of text matching.

stevensonmt · March 27, 2023, 2:45am

What do you mean by iterator here? I’m mostly familiar with that in the context of Rust and think of anything that’s Enumerable in Elixir as being roughly analogous to Rust iterators via the Enum module.

code-shoily · March 27, 2023, 3:03am

My bad with quick typing on the phone. I mean ‘iteration-counter’ - something that keeps track of each iteration (I mean, every time algorithm is called). Thank you for catching this.

fceruti · March 27, 2023, 3:19am

Thanks, I just left an errata there, I was unaware that medium existed.