I am trying to convert a Livebook smart cell into a one-file Elixir script, mostly for educational reasons. I can successfully `Mix.install` the dependencies, load the model (`"Salesforce/blip-image-captioning-base"`), create a featurizer and a tokenizer, run the configuration step, and create a serving:
```elixir
serving =
  Bumblebee.Vision.image_to_text(model_info, featurizer, tokenizer, generation_config,
    compile: [batch_size: 1],
    defn_options: [compiler: EXLA]
  )
```
And then I load an image like so:
```elixir
image =
  "/tmp/22911352.jpg"
  |> File.read!()
  |> Nx.from_binary(:u8)
  |> ...
```
I cannot figure out what the last step in the image processing pipeline should be. The original smart cell does the following:
```elixir
image =
  image.file_ref
  |> Kino.Input.file_path()
  |> File.read!()
  |> Nx.from_binary(:u8)
  |> Nx.reshape({image.height, image.width, 3})
```
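An assumption about where `image` comes from here: it looks like the value of a `Kino.Input.image/2` form field. With Kino's default `format: :rgb` (an assumption, but it matches the reshape above), the file behind `file_ref` holds raw RGB bytes rather than an encoded JPEG, which is why `Nx.from_binary/2` plus `Nx.reshape/2` works in the smart cell. A hypothetical sketch of that value:

```elixir
# Hypothetical shape of a Kino.Input.image/2 value; field names are taken
# from the smart-cell code above, and :format is assumed to be Kino's
# :rgb default (raw pixel bytes on disk, not an encoded image file).
image = %{
  file_ref: :opaque_ref,  # handle to resolve via Kino.Input.file_path/1
  height: 3468,
  width: 4624,
  format: :rgb
}
```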
The height and width are 3468 and 4624, but when I hard-code those values and do

```elixir
...
|> Nx.reshape({3468, 4624, 3})
```
I get the following error:
```
cannot reshape, current shape {4190582} is not compatible with new shape {3468, 4624, 3}
```
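The mismatch in the error is arithmetic: raw u8 RGB pixels at that resolution need 3468 * 4624 * 3 = 48,121,056 bytes, while `File.read!/1` returned only the 4,190,582 bytes of the still-compressed JPEG stream, so a decode step has to come before any reshape. A minimal sketch, assuming the `stb_image` package is added to the `Mix.install` list (`StbImage.read_file!/1` and `StbImage.to_nx/1` are that library's API; the hex version is a guess):

```elixir
# Raw u8 RGB pixels for a 3468 x 4624 image need this many bytes:
expected_bytes = 3468 * 4624 * 3
# far more than the 4_190_582 bytes of the compressed JPEG file,
# so the missing pipeline step is a JPEG decode, not a reshape.

# Hypothetical decode step, assuming {:stb_image, "~> 0.6"} in Mix.install.
# StbImage.read_file!/1 decodes the file; StbImage.to_nx/1 returns a
# {height, width, channels} u8 tensor, so no manual reshape is needed.
decode = fn path ->
  path
  |> StbImage.read_file!()
  |> StbImage.to_nx()
end
```

If that assumption holds, `decode.("/tmp/22911352.jpg")` should yield a tensor that can be passed straight to `Nx.Serving.run/2`.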
The very last step needs to be:

```elixir
Nx.Serving.run(serving, image)
```
This is my first look at Livebook, and I don't yet understand where `image` in the smart cell comes from.