I have just published v0.4.1 with
- A first cut at adding all the new beta api calls to the user guide.
- DALL-E-3 support in the image API.
- Changes to the FQN for some modules (although I did not bump the major version).
If there’s anything missing or not working as it should, please file an issue (or better yet, a PR).
Still catching up with the new features.
I just published v0.4.2
- Added the Text-To-Speech endpoint
- Changed the FQN of the Audio and Image functions to match the equivalent Python functions (and yet, I left the major and minor versions unchanged).
- Updated the docs.
Has anybody built on top of the new assistants/threads API yet? What’s the sequence of calls you should be making to get the threads use case working? It’s not linear anymore, since it requires you to periodically check whether a run is complete. OpenAI can now suggest that you make multiple function calls, and these can be executed in parallel. Would using a GenServer for every thread, which keeps track of all this state and acts as a bridge between UI ↔ Phoenix App ↔ OpenAI, be a good idea?
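To make the idea concrete, here’s a rough sketch of such a per-thread GenServer. The polling loop and parallel tool execution are the point; all of the API-facing helpers at the bottom (`retrieve_run/1`, `required_tool_calls/1`, etc.) are hypothetical placeholders, not actual openai_ex functions.

```elixir
defmodule ThreadRunner do
  # Sketch: one GenServer per assistant thread, polling until the run
  # completes and executing any requested tool calls in parallel.
  use GenServer

  @poll_interval_ms 1_000

  def start_link(args), do: GenServer.start_link(__MODULE__, args)

  @impl true
  def init(args) do
    schedule_poll()
    {:ok, args}
  end

  @impl true
  def handle_info(:poll, state) do
    case retrieve_run(state) do
      %{"status" => "completed"} = run ->
        notify_ui(state, run)
        {:stop, :normal, state}

      %{"status" => "requires_action"} = run ->
        # the model may request several tool calls; run them in parallel
        run
        |> required_tool_calls()
        |> Task.async_stream(&execute_tool_call/1)
        |> Enum.to_list()

        schedule_poll()
        {:noreply, state}

      _in_progress ->
        schedule_poll()
        {:noreply, state}
    end
  end

  defp schedule_poll, do: Process.send_after(self(), :poll, @poll_interval_ms)

  # ---- placeholders: wire these to the actual beta endpoints / your UI ----
  defp retrieve_run(_state), do: %{"status" => "in_progress"}
  defp required_tool_calls(_run), do: []
  defp execute_tool_call(_call), do: :ok
  defp notify_ui(_state, _run), do: :ok
end
```

One process per thread keeps the run state isolated and lets a LiveView subscribe to just its own thread’s updates.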
I’ve been hearing good things about the Zephyr model created by HF, and was curious whether something that capable could be run locally on my machine.
Since there are multiple apps that proxy the OpenAI API to other models (including local models), this seems to be quite straightforward. The only thing that needs to change is the API base URL.
I have done this now with the function OpenaiEx.with_base_url/2, which parameterizes the API base URL. I tested it against the llama-cpp-python library, and the non-streaming versions of the completion and chat-completion APIs worked on the first try. The streaming versions seem to be causing some difficulty still. I will try the streaming calls against another proxy when I get a chance, to check whether it’s a bug in the proxy.
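For anyone wanting to try the same thing, the setup looks roughly like this. The base URL, API key, and model name are placeholders for whatever your local proxy expects:

```elixir
# Point the client at a local proxy that speaks the OpenAI API.
# Most local proxies ignore the API key, but the client still wants one.
openai =
  "sk-ignored-by-local-proxy"
  |> OpenaiEx.new()
  |> OpenaiEx.with_base_url("http://localhost:8000/v1")

chat_req =
  OpenaiEx.ChatCompletion.new(
    model: "zephyr-7b-beta",
    messages: [OpenaiEx.ChatMessage.user("Say hello in one sentence.")]
  )

response = OpenaiEx.ChatCompletion.create(openai, chat_req)
```

Everything else in the request pipeline stays exactly as it would for the hosted OpenAI endpoint.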
My current dev machine is a Mac with 8 GB of RAM, and it turns out that’s not enough to run a 7B model as well as a Docker dev container (and assorted other apps). I suspect 16 GB would work.
If anyone wants to try local LLMs via an OpenAI API proxy, please give the library a whirl and let me know what you think.
I have not published this to Hex yet, so the mix entry has to point to the GitHub main branch for now.
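Concretely, that mix entry would look something like this (the repo path and branch name here are assumptions; use the project’s actual repository):

```elixir
# mix.exs — pending a Hex release, point at the GitHub main branch.
defp deps do
  [
    {:openai_ex, github: "restlessronin/openai_ex", branch: "main"}
  ]
end
```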
Thanks for the work on this library. I have a couple of questions.
- Do you think openai_ex is a good fit for handling requests in a production app, instead of just Livebooks?
- Is exponential backoff supported via Finch, to handle rate-limiting issues?
@arton sorry for the delay, I’m not on the forum every day, so I didn’t realize you had asked the questions. Here are my 2 cents.
- I don’t see why it couldn’t be used for production. It’s just a very thin wrapper over the HTTP JSON API. If there are changes that need to be made, I’m happy to make them. For instance, there may need to be some way to handle Finch pools, but these design decisions are best driven by an actual use case, so I haven’t tried to add them ex ante.
- The library doesn’t do any exponential backoff. I’d prefer to leave those kinds of decisions to the library user and/or whatever Finch does.
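If you do want backoff on the caller’s side, a minimal sketch might look like this. Nothing like it is built into openai_ex; this simply retries on raised errors, doubling the delay each attempt:

```elixir
defmodule Backoff do
  # Caller-side exponential backoff: retry `fun` up to `attempts` times,
  # sleeping `delay_ms` before the first retry and doubling it each time.
  def retry(fun, attempts \\ 5, delay_ms \\ 500)

  def retry(fun, 1, _delay_ms), do: fun.()

  def retry(fun, attempts, delay_ms) when attempts > 1 do
    fun.()
  rescue
    _error ->
      Process.sleep(delay_ms)
      retry(fun, attempts - 1, delay_ms * 2)
  end
end

# usage, wrapping any openai_ex call:
# Backoff.retry(fn -> OpenaiEx.ChatCompletion.create(openai, chat_req) end)
```

A production version would want to match only on rate-limit responses (HTTP 429) rather than rescuing everything, and add jitter.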