It’s going to take me some time to look through all the material that came together, so until then: thank you for all the great ideas and contributions to the discussion!
Another idea is to explore quantization, although I don’t know how easy or hard it would be, or whether doing it in Nx as a tensor compiler has enough novelty.
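To make the quantization idea a bit more concrete, here’s a rough sketch of per-tensor symmetric 8-bit quantization written with `Nx.Defn`. The module and function names are mine, not an existing Nx API — just an illustration of the kind of kernel a project like this would build:

```elixir
defmodule QuantSketch do
  import Nx.Defn

  # Map an f32 tensor onto s8 using a single per-tensor scale,
  # so the largest magnitude lands on +/-127.
  defn quantize(t) do
    scale = Nx.reduce_max(Nx.abs(t)) / 127.0
    q = t |> Nx.divide(scale) |> Nx.round() |> Nx.as_type(:s8)
    {q, scale}
  end

  # Recover an approximation of the original f32 values.
  defn dequantize(q, scale) do
    q |> Nx.as_type(:f32) |> Nx.multiply(scale)
  end
end
```

Because it’s `defn`, this should compile through EXLA like any other numerical definition, which is where the “tensor compiler” angle might get interesting (e.g. fusing the scale computation into surrounding ops).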
If anyone is tracking this, I put together a Gist with slightly more concrete ideas (which other students had asked for): ideas-2023.md · GitHub
That list of ideas seems pretty cool, but I can’t help noticing that they are very much ML project ideas. I don’t know if other students would find it useful, but I think a similar list of ideas for Elixir and Erlang in general would be useful too!
I’ve wondered whether some of the code that Jeff Hawkins and Numenta are writing could be adapted to Elixir, Nx, Axon, et al. For example, it might be interesting to use Elixir processes to model (sets of) cortical columns…
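To sketch what I mean by “processes as cortical columns”: each column could be a small GenServer holding a sensed feature plus its relative location, per Hawkins’ model. Everything here (module name, message shapes) is hypothetical — just a feel for the mapping:

```elixir
defmodule Column do
  use GenServer

  def start_link(id), do: GenServer.start_link(__MODULE__, id)

  @impl true
  def init(id), do: {:ok, %{id: id, location: {0, 0, 0}, sensation: nil}}

  # A column captures not just a sensation but where it was sensed,
  # relative to everything else — so the message carries both.
  @impl true
  def handle_cast({:sense, feature, location}, state) do
    {:noreply, %{state | sensation: feature, location: location}}
  end

  @impl true
  def handle_call(:state, _from, state), do: {:reply, state, state}
end
```

Usage would be along the lines of `{:ok, pid} = Column.start_link(1)` followed by `GenServer.cast(pid, {:sense, :edge, {1, 2, 3}})`.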
One of Hawkins’ areas of interest is cortical columns. In 2016, he hypothesized that cortical columns capture not just a sensation, but also the relative location of that sensation, in three dimensions rather than two (situated capture), in relation to what is around it. Hawkins explains, “When the brain builds a model of the world, everything has a location relative to everything else.”
In 2021, he published A Thousand Brains: A New Theory of Intelligence, a framework for intelligence and cortical computation. The book details the advances he and the Numenta team made in developing their theory of how the brain understands the world and what it means to be intelligent. It also explores how the “thousand brains” theory can affect machine intelligence, how an understanding of the brain bears on the threats and opportunities facing humanity, and what’s missing in current AI.
Wow, this is cool. I had a similar idea: Elixir could be big in AI if we can bring the same “million web servers” energy to ML. For instance, we could build a giant model of thousands of transformers talking to each other simultaneously, which is just not practical in Python.
I have a project to suggest, but it isn’t strictly related to AI. The general context is Erlang (etc.) use cases that involve large numbers of small, infrequently activated processes. The task would be to:
- characterize the VM’s virtual memory behavior
- document (and explore?) ways to optimize this
For example, let’s say that we were trying to model 1M or so cortical columns. Since only a small fraction of the columns are likely to be relevant to a given “thought process”, they won’t be receiving messages (and thus running) very often. So, it would make sense for their memory to be paged out.
One question we might have is: which portions of a process’s memory footprint have good spatial locality of reference, and how much room do they tend to occupy?
Following up on this, we could ask what techniques can be used to improve the (spatial or temporal) locality of reference, etc. Aside from having some interesting and useful things to report, a thesis of this nature could document existing tools and possibly present new ones.
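As a starting point for the kind of measurement this project would need, here’s a rough sketch: spawn many mostly-idle processes, then compare their total heap footprint before and after `:erlang.hibernate/3` compacts them. The module is hypothetical; the BIFs (`Process.info(pid, :memory)`, `:erlang.hibernate/3`) are real:

```elixir
defmodule FootprintSketch do
  def measure(n \\ 1_000) do
    pids = for _ <- 1..n, do: spawn(fn -> idle_loop(:rand.uniform(100)) end)

    before = total_memory(pids)

    # Ask every process to hibernate, discarding its stack and
    # shrinking its heap to the minimum.
    Enum.each(pids, &send(&1, :hibernate))
    Process.sleep(100)

    {before, total_memory(pids)}
  end

  # Public because :erlang.hibernate/3 needs an exported function
  # as the re-entry point when the process wakes up.
  def idle_loop(state) do
    receive do
      :hibernate -> :erlang.hibernate(__MODULE__, :idle_loop, [state])
      _other -> idle_loop(state)
    end
  end

  defp total_memory(pids) do
    Enum.sum(
      for pid <- pids do
        case Process.info(pid, :memory) do
          {:memory, bytes} -> bytes
          # The process may have died between spawn and inspection.
          nil -> 0
        end
      end
    )
  end
end
```

This only looks at per-process heap sizes as the VM reports them; the virtual-memory/paging behavior the project is really after would need OS-level tooling on top of this.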
I don’t quite get the idea of cortical columns, and I’m not so sure that implementing them as processes makes sense. If a group of cortical columns can be represented as tensors, a process should manage many of them for this to be performant enough. It seems that the memory-locality optimization could only be applied to the tensors managed by a single process, because the processes’ interactions are dynamic.
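A rough sketch of that batching idea: one process owns a tensor whose rows are column states, so a single message updates every column with one dense tensor op over contiguous memory. The module, message shape, and the 0.9 decay dynamic are all arbitrary placeholders:

```elixir
defmodule ColumnBatch do
  use GenServer

  def start_link(n_columns, state_dim) do
    GenServer.start_link(__MODULE__, {n_columns, state_dim})
  end

  # The whole batch of column states lives in one {n, d} tensor.
  @impl true
  def init({n, d}), do: {:ok, Nx.broadcast(0.0, {n, d})}

  # One :step message updates all n columns at once instead of
  # sending n separate per-process messages.
  @impl true
  def handle_call({:step, input}, _from, states) do
    new_states = Nx.add(Nx.multiply(states, 0.9), input)
    {:reply, new_states, new_states}
  end
end
```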
I have not done a lot of study/work in this area, though I am curious about it. My readings of science fiction and of how the brain works make me curious about some of this in a different way. How might Axon models work if they were organized hierarchically, like a supervisor tree? Could the output of one model be fed into another to control, manage, or improve that model? Or could the outputs of multiple models be combined for yet another model to summarize or process? Or, even more broadly, how might one Axon model tree communicate with another?
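If it helps, Axon models are graphs, so at least the simple version of this composes today: “feeding one model into another” is just building the second model on top of the first, and combining several models is concatenating their outputs. A hedged sketch (layer sizes and input names are arbitrary):

```elixir
# A "child" model producing some intermediate representation.
child =
  Axon.input("features", shape: {nil, 8})
  |> Axon.dense(16, activation: :relu)

# Hierarchical: the parent consumes the child's output directly.
parent = Axon.dense(child, 4, activation: :softmax)

# Combining: concatenate two models' outputs for a "summarizer".
a = Axon.input("a", shape: {nil, 8}) |> Axon.dense(16)
b = Axon.input("b", shape: {nil, 8}) |> Axon.dense(16)
summary = Axon.concatenate(a, b) |> Axon.dense(4)
```

The harder, more interesting part of the question — separately trained models in separate process trees communicating at runtime, supervisor-style — isn’t covered by this; that would be messages between processes each running its own built model.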