Using Llama 2 With Bumblebee

I am trying to use Meta’s Llama 2 with Bumblebee, but I am getting a 401 error when I try to load it. I have been granted access to the repo on Hugging Face, but I think I need to provide an access token when loading the model, and I am not sure how to do so.

{:ok, model} = Bumblebee.load_model({:hf, "meta-llama/Llama-2-7b-chat-hf"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "meta-llama/Llama-2-7b-chat-hf"})
** (MatchError) no match of right hand side value: {:error, "HTTP request failed with status 401"}
    (stdlib 5.0.2) erl_eval.erl:498: :erl_eval.expr/6
    #cell:oqtu7kdr36by6ud4yug4rxbxs3qw544i:15: (file)

I don’t think Bumblebee supports that yet. Does it, @seanmor5?

Correction: there seems to be an auth_token option, but it is only for cached downloads.

Thanks for looking into this for me. Is there a way to use this with load_model, load_tokenizer, and load_generation_config?

Hey @djaouen, you can specify it in the repository options: {:hf, "meta-llama/Llama-2-7b-chat-hf", auth_token: "..."} : )
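For the other loaders the same repository tuple can be reused. A minimal sketch, assuming the token is exported in an environment variable named HF_TOKEN (the variable name is just an example, not something Bumblebee requires):

```elixir
# Build one repository tuple with auth_token and pass it to all three
# loaders. HF_TOKEN is an assumed env var holding your Hugging Face
# access token.
repo = {:hf, "meta-llama/Llama-2-7b-chat-hf",
        auth_token: System.fetch_env!("HF_TOKEN")}

{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)
```
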


Thanks, @jonatanklosko!

Did you ever get it to work?

Not yet. I was waiting for Bumblebee to get upgraded so that I could use Bumblebee.Text.Llama.

Looks like v0.3.1 has the official support for Bumblebee.Text.Llama. Is that not working for you?

Actually, I don’t know what version of Bumblebee I was using, as it’s been long enough that Hugging Face deleted my notebook. But I will certainly look into that if I build out a similar project in the future. Thanks for the info!

I was able to get “NousResearch/Llama-2-7b-hf” working using 0.3.1. Does anyone know if we can use GGML locally, so I don’t have to wait five minutes for inference of two sentences on my M2?

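As far as I know, Bumblebee can’t load GGML weights directly — it reads the checkpoint formats published on the Hub. On an M2 the biggest win is usually making sure Nx runs on the EXLA backend rather than the default pure-Elixir one. A minimal setup sketch (the version requirements shown are examples, not pinned recommendations):

```elixir
# Livebook / script setup: pull in bumblebee plus EXLA so tensor
# operations run through XLA-compiled code instead of Nx's default
# pure-Elixir BinaryBackend.
Mix.install([
  {:bumblebee, "~> 0.3.1"},
  {:exla, "~> 0.5"}
])

# Route all Nx computations through EXLA by default.
Nx.global_default_backend(EXLA.Backend)
```
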