Which (realistic) feature(s) would you most like to see added to LLMs?


My personal favorite is the ability to state how probable it is that their answer is correct, or simply the ability to answer “I don’t know”.

This is what I’ve been asking for, and the answer to why I won’t get it.

https://x.com/heynavtoor/status/2030010676157239600

Which (realistic) feature(s) would you most like to see added to LLMs?

The ability to run completely on our own (affordable and completely private) devices would be nice. I wonder how many years it will be before that’s possible? (Perhaps a good topic for a poll/thread!)

2 Likes
  • Realtime memory

  • An actual model for simulating the world it lives in, instead of just being an (amazing) next token prediction function.

1 Like

Yes, that’s my second favorite for now, but it may well become my #1 soon.

UPDATE:
Actually, the limited way I’m using it now is mostly because there’s no way to use them privately. I’m also amazed by how many people don’t find this problematic at all (i.e. “mi casa es tu casa”, but for code).

1 Like

Is this realistic?

1 Like

I don’t think so… but LLMs (and most other technology, really) weren’t realistic either, until one day they were.

It could take a week, a year, a century, or it may never happen. But it would be nice (and probably also terrifying) if my “wish list” became a reality.

1 Like
  • Memory, (much) larger context, consistent predictable performance.
  • Understanding diagrams, charts, formulas etc.
  • A conceptual understanding of the real world. How it works and why it works. This would improve quality, and extend AI for use in many real world areas. Chair design, car dynamics, road work planning, material science, medicine etc.
  • Adding curiosity, direction and self-learning. Point it in the direction of curing cancers and have it deep dive for a year or three. (Likely comes up with 42).
  • Critical sense, and keeping a tree of relations and truths built from the ground up on hard basic truths. Say a new source makes a claim: how does it fit with what I already know? How do I know what I know, and how sure am I about it? If there’s a conflict, which seems more likely to be correct? Could both be true if a third concept changed? How do I know that, and so on. Then keep an open mind on the options in case new evidence shows up in support of one or the other. Only discard claims that conflict with hard truths.
2 Likes

Again, is this realistic?

Ok, but what would you like added to it now? Something that’s not too far-fetched, or still unproven to even be possible?

I would say yes. The first two are possible or very close today.

Understanding the real world is of course (currently) limited by the reach of our own understanding. All well-established topics are also well documented, and for sciences like math and physics quite exact at everyday scales. (Then there are odd cases like stock markets, whose logic would change overnight if someone “solved” them.)

My big-picture thinking is that if AI can have a latent space where the word shoes, drawings of shoes, and pictures of shoes live closely together, then it should also be able to hold why, how, and other more functional conceptual information too. Say you add ‘shoes’ to ‘running’: you would end up near ‘running shoes’. Instead of just brands, shoe models, and pictures of shoes, there could be knowledge of why they are shaped the way they are, their functionality, production methods, materials, relation to feet, and so on. From such basic building blocks and their relations, I think we could have something much more useful. And I have no doubt other models will be better at it than my 5c brain droppings on a Sunday morning. :slight_smile:
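The ‘shoes’ + ‘running’ idea is roughly what word-vector arithmetic already does. A minimal sketch, with made-up 3-dimensional vectors standing in for a real embedding space (all names and values here are invented for illustration):

```python
import math

# Toy "latent space": hand-made 3-d vectors standing in for real embeddings,
# which would have hundreds or thousands of dimensions.
vocab = {
    "running":       (0.9, 0.1, 0.0),
    "shoes":         (0.1, 0.9, 0.0),
    "running shoes": (0.7, 0.7, 0.1),
    "hammer":        (0.0, 0.1, 0.9),
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity: 1.0 for identical directions, near 0 for unrelated.
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def nearest(query, exclude=()):
    # Closest vocab entry to the query vector by cosine similarity.
    return max((w for w in vocab if w not in exclude),
               key=lambda w: cosine(query, vocab[w]))

# 'running' + 'shoes' lands near 'running shoes'.
combined = tuple(x + y for x, y in zip(vocab["running"], vocab["shoes"]))
print(nearest(combined, exclude=("running", "shoes")))  # running shoes
```

Real embedding models learn these vectors from data; the point is only that addition in such a space can land near a composed concept.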

Curiosity could flow from that. Say the model has all these latent spaces of hammers, saws, and wooden buildings, but notices it has no concept of how or why in that space. Well, if it is curious, it might aim to fill in those missing hows and whys. The next time someone asks about an image of carpenters hammering, the answer can be guided by how a hammer functions and how it is used. Or curiosity, when combined with a knowledge tree, could try to branch out further based on existing knowledge, and where needed extend it. Direction would be where to do this and towards what goal.
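The gap-hunting part could be sketched as a scan over a knowledge store for concepts missing functional relations. Everything here (the relation names, the store layout, all entries) is a made-up illustration, not any real system:

```python
# Toy knowledge store: concept -> {relation: value}. A curious model would
# look for concepts that lack 'how' and 'why' entries and target those.
knowledge = {
    "hammer": {"looks_like": "metal head on a handle",
               "how": "swung to drive nails"},
    "saw":    {"looks_like": "toothed steel blade"},          # no 'how' or 'why'
    "wooden building": {"why": "shelter built from timber"},  # no 'how'
}

REQUIRED = ("how", "why")

def curiosity_queue(store):
    """List (concept, missing_relation) pairs the model might try to fill."""
    gaps = []
    for concept, facts in store.items():
        for rel in REQUIRED:
            if rel not in facts:
                gaps.append((concept, rel))
    return gaps

for concept, rel in curiosity_queue(knowledge):
    print(f"gap: no '{rel}' for '{concept}'")
```

Each gap would then become a direction to explore, which is where the “point it at a goal” part comes in.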

As for critical sense and a knowledge tree, I think that is possible, although the data structure, methods for estimating certainties, ways of facing uncertainty, and how it would all be used would take some doing. And as AIs increasingly face feeding on the earlier creations of lesser AIs, which may or may not be quality food, I think critical sense will only become more important.
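As a rough illustration of what such a data structure might look like: claims carry a certainty, hard truths have certainty 1.0, and a new claim is discarded only when it conflicts with a hard truth, while conflicts between softer claims stay open for later evidence. All names and rules below are assumptions for the sketch, not an existing system:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    certainty: float  # 0.0-1.0; 1.0 means a hard truth
    contradicts: set = field(default_factory=set)  # texts of incompatible claims

class BeliefStore:
    """Minimal sketch: keep claims, discard only those conflicting with hard truths."""
    def __init__(self):
        self.claims = {}

    def add(self, claim):
        for other in self.claims.values():
            in_conflict = (other.text in claim.contradicts
                           or claim.text in other.contradicts)
            if in_conflict and other.certainty >= 1.0:
                return False  # conflicts with a hard truth: discard
        # Otherwise keep it, even if it conflicts with softer claims;
        # both stay as open options until new evidence arrives.
        self.claims[claim.text] = claim
        return True

store = BeliefStore()
store.add(Claim("water boils at 100C at sea level", 1.0))
store.add(Claim("this kettle is broken", 0.6))
accepted = store.add(Claim("water boils at 50C at sea level", 0.3,
                           contradicts={"water boils at 100C at sea level"}))
print(accepted)  # False: rejected, it contradicts a hard truth
```

A real version would also need the “how do I know what I know” chain, i.e. certainty that propagates through the tree of supporting claims, which this sketch leaves out.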

And maybe more to the point: I think it is realistic, and I don’t think it is fair that I should need to know all the answers for it to be so. If I am even close, then the real edge will already be years ahead. :slight_smile:

An actual model for simulating the world it lives in, instead of just being an (amazing) next token prediction function

At this point you’re asking for something else, not an LLM, right?

1 Like

A bit of real-world knowledge could avoid things like this one. (Or, for purists of pure text use, texts describing scenes like this one.)

There is so much right with this image, to an amazing degree, but then there are the hammers and how to use them. A major flaw in the fabric of reality right there.

1 Like

But the manner (mechanism) in which LLMs learn and memorize relations and associations is infinitely far from achieving reasoning; hence it’s not reasoning at all, just spitting out the closest associations as they appeared throughout the learning process.

What’s suggested would not be an add-on to LLMs, but a replacement for them.

But this is understandable given the way it works.

There is a move towards multi-modal LLMs using media other than text as well, so to the extent those are LLMs too, going beyond text seems fair game? “Large language model” does frame it as a language focus, though, so maybe such world-understanding models should really be called Large World Models.

When thinking about the realism of features I wanted to add, I didn’t restrict myself to LLMs’ inner workings or their limits today, as that is likely just transient technology anyway. Rather, I thought of LLMs as generators of meaningful text, code, and other media too, if multi-modal is fair game. (Sound certainly should be, for language? Sign language for deaf people is visual.) Maybe that is too liberal a definition, and maybe my suggestions would go beyond LLMs, but that is functionality I want regardless of name. :slight_smile:

1 Like