Persisting Livebooks on HuggingFace

aar2dee2 · August 14, 2023, 8:17am

Hey, Livebook devs!

I’m using Hugging Face to deploy Livebook. I have a Hugging Face Space created using the Install Livebook button on the Livebook home page. I create notebooks using the New button in the deployment and pick autosave in the notebook settings.
After a few days of inactivity, the space is set to sleep by HF automatically. However, when I run the space again, the notebooks are no longer there.
Maybe there’s a setting I need to change?

jonatanklosko · August 14, 2023, 11:15am

Hey @aar2dee2, by default files on a Hugging Face Space are not persisted. You can enable persistent storage in the Space settings, but it is billed. Alternatively you can configure an S3 bucket (or anything S3-compatible) as an additional file system (in Livebook Settings) and keep the notebooks there. Note that the Livebook settings are themselves saved in a file, which is also lost on restart, so you’d need to redo them every time. We plan to make this configurable via env var (see "Add startup filesystems", issue #2143 in livebook-dev/livebook on GitHub), so you’d set it once in the HF Space settings.

aar2dee2 · August 14, 2023, 12:07pm

thanks, @jonatanklosko!

Curious if you think there’s a way to persist the code to a GitHub repo, so I can autosave + commit to GitHub and load the Livebook from there.
(I know this is somewhat redundant, but a useful hack for now).

Happy to work on a PR for this, if useful.

jonatanklosko · August 14, 2023, 12:23pm

Currently there isn’t. We did consider a GitHub-based file system, where each save would become a commit. The history wouldn’t be clean, but the idea was basically to use GH as plain storage. One reason we did not do it was the lack of fine-grained personal access tokens, but these are now available. We are making some changes to the pluggable storages with regard to Hub and we may revisit GitHub in the future, just not right now : )
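To make the "each save becomes a commit" idea concrete, here is a rough sketch of what a single save could look like against GitHub's REST "create or update file contents" endpoint. This is not something Livebook does; the function name `build_save_request` and all of its arguments are made up for illustration, and the token is assumed to be a fine-grained personal access token with write access to the repo's contents.

```python
import base64
import json
import urllib.request


def build_save_request(owner, repo, path, content: bytes, token, previous_sha=None):
    """Build (but do not send) a request that writes one file as one commit.

    Hypothetical helper: the names here are illustrative, not Livebook's.
    """
    body = {
        "message": f"Save {path}",
        # The contents endpoint expects the file body base64-encoded.
        "content": base64.b64encode(content).decode("ascii"),
    }
    # Updating an existing file requires the blob SHA of the version
    # being replaced; creating a new file does not.
    if previous_sha is not None:
        body["sha"] = previous_sha
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the save. The JSON response should include the new blob SHA under `content.sha`, which is what you would pass as `previous_sha` on the next save of the same file. One commit per autosave is exactly why the history would not be clean.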

aar2dee2 · August 14, 2023, 12:34pm

thank you!

w0rd-driven · August 14, 2023, 2:29pm

I’ve had ideas to do this as an app combined with Kino supervised processes. You’d want something that watches the filesystem for changes, a lot like how Phoenix hot reloading works. Once you detect a change after a threshold, git commit and git push.
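As a very rough sketch of that watch, wait, then commit loop (written in Python only to keep it short, not as a Kino app): it polls the notebook directory, waits until changes have been quiet for a threshold, then shells out to git. The names `snapshot`, `Debouncer`, `commit_and_push` and `watch`, the polling approach, and the default timings are all assumptions for illustration. It also assumes the directory is already a git repo with a remote and credentials configured.

```python
import subprocess
import time
from pathlib import Path


def snapshot(root: Path) -> dict[str, int]:
    """Map each .livemd file under root to its last-modified time (ns)."""
    return {str(p): p.stat().st_mtime_ns for p in root.rglob("*.livemd")}


class Debouncer:
    """Fires once after changes were seen and then stayed quiet for `quiet` seconds."""

    def __init__(self, quiet: float):
        self.quiet = quiet
        self.last_change = None  # time of the most recent change, or None

    def update(self, changed: bool, now: float) -> bool:
        if changed:
            self.last_change = now
            return False
        if self.last_change is not None and now - self.last_change >= self.quiet:
            self.last_change = None
            return True
        return False


def commit_and_push(repo: Path, message: str = "Autosave notebooks") -> None:
    """Stage everything, commit only if something is staged, then push."""
    subprocess.run(["git", "add", "--all"], cwd=repo, check=True)
    # `git diff --cached --quiet` exits 0 when nothing is staged.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo)
    if staged.returncode == 0:
        return
    subprocess.run(["git", "commit", "-m", message], cwd=repo, check=True)
    subprocess.run(["git", "push"], cwd=repo, check=True)


def watch(repo: Path, quiet: float = 30.0, interval: float = 2.0) -> None:
    """Poll forever; commit and push once edits have settled."""
    debouncer = Debouncer(quiet)
    previous = snapshot(repo)
    while True:
        time.sleep(interval)
        current = snapshot(repo)
        if debouncer.update(current != previous, time.monotonic()):
            commit_and_push(repo)
        previous = current
```

Calling `watch(Path("notebooks"))` (a placeholder path) would run the loop. This is the naive version: no conflict handling, no retries, and a failed push just raises.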

You could get away with a naive approach pretty quickly because there’s now the egit library to manipulate git. I believe there was also a git-sync set of shell scripts, so the pieces are definitely there. I just haven’t had the free time to work through a PoC.

By being a notebook app you could bake in some customizations like configuring commit templates or things like Slack alerts if something goes wrong. You could even have somewhat of a GUI to do it manually or resolve merge conflicts. There’s a lot of layers to a robust onion.

Hugging Face also gives you the repo that holds the Space’s Dockerfile. That could be an option in a pinch, but I haven’t played with changing it. I’d rather keep mine pure, but that’s worth exploring too.

aar2dee2 · August 18, 2023, 5:52am

This is very interesting. Pls reach out if you’re working on a hack. Would love to work together!