Expectations for GPT-4?

Today (few hours ago) OpenAI released their new language model, GPT-4.
What are your expectations for this release?
It has significant better performance, and improved context understanding.
Also, I’ve been wondering if there is Elixir first AI that works like copilot :thinking:
What are your thoughts?

I’ve been using a ChatGPT plugin for Neovim and have found it to be useful for generating basic Elixir code - genservers, filters, and reducers - better than starting from a blank screen.

Not sure what GPT4 will bring.

1 Like

Do you know whether this plugin uploads your source code remotely in order to make suggestions or is it completely locally?

I believe the plugin I linked to calls out to the open-ai api using curl.

1 Like

Maybe you’ll be lucky enough to have that code regurgitated back at you in GPT-4.

3 Likes

My expectations are that many thousands of forum topics (not here in particular but likely here as well) and tweets will appear where people passionately claim that this program “deeply understands” stuff. :003:

In all seriousness, I appreciate these tools’ ability to produce boilerplate so you don’t have to go check how to make a new empty GenServer or such. Beyond that, I have no expectations.

Would love to be proven wrong. Some of my past customers have allowed me to keep old versions of their code – provided I don’t distribute them and only use them for educational purposes, of course – and I’d be curious if I can run some of the “AI” tools on them and have them tell me if there’s a bug in this or that logic. If so, then and only then I’d be actually excited.

6 Likes

Not sure what my own expectations are but was interesting reading Musk’s in this story on DT the other day:

His quote in bold:

On Friday morning, Elon Musk responded to a comment on his favorite website about OpenAI, the research laboratory-turned-company that he co-founded along with a number of other tech luminaries in 2015. Following up on a tweet by finance writer Genevieve Roch-Decter, Musk wrote “OpenAI ws created as an open source… non-profit company to serve as a counterweight to Google, but now it has become a closed source, maximum-profit company effectively controlled by Microsoft. Not what I intended at all.

:upside_down_face:

4 Likes

My expectation (and probably beyond v4) is that inside VSCode, while I’m working on a module, there’s a side tab, that is giving me all sorts of information based on cursor position:

  • [compute] This function is log O(n), n being number of products
  • [compute] This function calls a database 3 times.
  • [compute] This function call these external services: …
  • [docs] Here’s the docs for all the functions you are calling.
  • [docs] Here’s the docs for things you might need.
  • [popular examples] You are using Enum.reduce, here is how it’s used in popular projects.
  • [broad examples] Seems like you are trying to ***, here are some examples of how to do it:

Not necessarily AI-driven, but wishful-driven:

  • [local refs] This function is being called by:
    – directly —
    – indirectly: —
    (and for each one of those, probable type clashes)
  • [local examples] YourProject.make_things_happen is being used like this:

That’s what I would call a true copilot.

10 Likes

I haven’t been looking into AIs much, but to me the first thing they will be able to automatically do (and better than humans) will be generating comprehensive test suites from a codebase.

1 Like

I have tested it and I have got back “template” or “boilerplate” code. Not bad at all.
For example, i asked “simple Erlang nif in rust for elixir” and GPT4 gave me an answer that I need Rust, rustler, create a rustler module with mix, (simple) Rust code, an Elixir module and iex code to test. All with links and code, easily to be copied.

So overall not bad and much better than GPT-3.5.

Atm it is not a “danger” for developers: Architecture, special requirements etc. is still with us. But as a first code example within a minute (with ChatGPT plus) it is a good help. And easier than to go through several links in Github and blogs.

2 Likes

Hmm, this would be a dream tool for developers :thinking:
But I think GPT-4 can’t do that because of tokenization, on small projects (it’s somewhat possible but anything beyond 40ish files is out of GPT4-32k context capability)

Overall, sounds like an interesting project to work on :star_struck:
And regarding space-time complexity well halting problem :smiley: but I’ve seen GPT-3.5 have a good guess on Elixir code and big O so maybe an estimation giver :thinking:

My prediction is that it will be used to write lots of code that the author doesn’t understand, and that folks will continue to insist that a fancy autocomplete is “sentient”.

7 Likes

There’s a cut-off on data it was trained on. It was 2021 for GPT 3, no idea for GPT4

Coupled with the fact that it always tries to generate a plausible-looking answer, expect it to generate a lot of garbage, especially for something reasonably new or unpopular. I tried to ask it to generate something with Ash Framework, and it generate actual correct code… using Ash modules that have never been in existence and have never even been discussed.

I expect GPT4 with it’s better generative model to be even worse in this regard. [1]


Then there’s the problem of choice. If you have many ways of doing something, you will get wildly different results for the same input.

Just today I was playing with it (again, v3, not v4, but I expect the outcome to be the same). The query I had was “using latest Java, and the best libraries, write a Telegram bot integrating with OpenAI”. Running this query three times gave me implementations with OkHttp, unrest and httpclient. It looked plausible, but I didn’t check it.


So, if you’re willing to babysit it like you would a junior dev, and code review every step of the way, and use rather outdated libraries for some common boilerplate code… then it’s pretty good.

[1] This is true for programming and significantly worse for anything outside programming. Basically, as long as it’s popular, non-technical, and in English, the results will often blow your mind. Since the modern world (especially online world) is becoming so predominantly English-speaking, LLMs have an insanely large trove of data to pull from. For anything else the results they produce will very quickly become invalid garbage even if there’s at least some data on the subject available in English. E.g. try asking it about relatively non-popular non-English literature. Same goes for programming languages and technical topics.

2 Likes

Since the input size of GPT4 is ~25000 words you can put the current library into the prompt and ask specific questions. I think this can be a great learning resource. I already created a draft PR to ex_doc to directly create a single file from the docs.
I’m curious what you guys think :slight_smile:

1 Like

The issue of outdated info still remains. Quick googling shows that even GPT-4 is rained on data up to September 2021. So it’s useless for any info in the past 2.5 years which includes some of the biggest announcements and releases in Elixir: the vast majority of Nx, Livebook, etc.

1 Like

I guess there’s plenty that can be done without AI.

Just basic ad-hoc documentation and examples would be amazing!

1 Like

True, outdated training data is a challenge, but GPT-4 can still be useful in many situations. Combining AI with ad-hoc documentation, examples, and community-driven content could provide a more comprehensive solution for developers. Excited to see GPT-4’s evolution and its impact on the developer community.

1 Like

My experience so far has been that GPT has been fine, or even quite good, for more “mainstream” languages, JavaScript, Python, maybe Rust, but useless for Elixir. The Elixir code it produces is almost always garbage. I don’t mean bad code but, not Elixir. It usually writes weird hybrid of Elixir and Ruby.

My expectation of 4 is pretty much the same unless they have used a much broader training set.

1 Like