Image - an image processing library based upon Vix

Amazing work Kip!

Awesome!! Can’t wait!! :003:

Looking forward to seeing what you add and thanks for all your work, I think all these features are going to interest a lot of people :smiley:

2 Likes

@AstonJ, StableDiffusion “image-to-image” mode is not yet available in Axon/Bumblebee but it is being tracked. I’ll go ahead and add “text-to-image” for now (in the next couple of days).

2 Likes

Thanks for the update Kip, I look forward to seeing text-to-image …and image-to-image when/if it becomes available :smiley:

Nearly there with Image.Generation.text_to_image/2 (need some more work on documentation before release). Huge amounts of fun with this. On my M1 Max it takes about a minute to generate an image, so it’s definitely better to have a supported GPU.

iex> [i] = Image.Generation.text_to_image("impressionist purple numbat in the style of monet")
[%Vix.Vips.Image{ref: #Reference<0.1344162992.4105306142.245952>}]

numbat

8 Likes

Looking good Kip! I’m sure lots of fun will be had with it once it’s released! :003:

Also I’m not sure whether you (or @seanmor5) are aware but Google extended Stable Diffusion and created DreamBooth. It’s open source and offers a way to tweak Stable Diffusion to produce variations of photos in different styles. It’s what Avatar AI uses and is exactly what I am looking for - it would be awesome if we could have this! :star_struck:

The project I would use it for is…

Drum roll...

Edit: screenshot snipped! I need to stop talking about potential projects because apparently when you share details of them it gives you the same sense of achievement of having completed them - so ends up working against you! Open to my arm being twisted tho! But it will take a lot of twisting, haha!

I’m in no major rush for it tho - if I start that project it will be when LV is at 1.0 (or when Chris thinks we’re unlikely to see major changes to it).

3 posts were split to a new topic: Elixir Deployment Options - GPU edition!

Motivated by the discussion on extracting frames from video, the newly published Image version 0.22 supports some basic frame-extraction capabilities based upon the excellent Evision.VideoCapture module.

Enhancements

  • Adds Image.Video.image_from_video/2 to extract images from frames in a video file or video camera. Includes support for :frame and :millisecond seek options. Seek options are only supported for video files, not video streams.

  • Adds Image.Video.stream!/2 that returns an enumerable stream of frames as images. The stream takes a range as a parameter; see the streaming example below.

  • Adds Image.Video.scrub/2 that scrubs the video head forward a number of frames.

  • Adds Image.Video.seek/2 to seek the video head to the requested frame or millisecond. Seeking is supported for video files only, not video streams. Seeking is not guaranteed to be frame accurate (due to underlying OpenCV issues). A short scrubbing and seeking sketch follows the examples below.

Examples

# Extracting an image
iex> {:ok, video} = Image.Video.open "./test/support/video/video_sample.mp4"
iex> {:ok, _image} = Image.Video.image_from_video(video)
iex> {:ok, _image} = Image.Video.image_from_video(video, frame: 0)
iex> {:ok, _image} = Image.Video.image_from_video(video, millisecond: 1_000)

# Streaming images
# Extract every second frame starting at the
# first frame and ending at the last frame.
iex> "./test/support/video/video_sample.mp4"
...> |> Image.Video.stream!(frame: 0..-1//2)
...> |> Enum.to_list()
[
  %Vix.Vips.Image{ref: #Reference<0.2048151986.449445916.177398>},
  %Vix.Vips.Image{ref: #Reference<0.2048151986.449445916.177400>},
  %Vix.Vips.Image{ref: #Reference<0.2048151986.449445916.177402>},
  ...
]
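
Scrubbing and seeking follow the same pattern. A quick sketch only: the seek/2 options shown here are assumed to mirror image_from_video/2, and scrub/2 and seek/2 are assumed to return {:ok, video}; check the docs for the exact signatures.

# Scrubbing and seeking
iex> {:ok, video} = Image.Video.open "./test/support/video/video_sample.mp4"
iex> {:ok, video} = Image.Video.seek(video, frame: 100)
iex> {:ok, _image} = Image.Video.image_from_video(video)
iex> {:ok, video} = Image.Video.scrub(video, 10)
iex> {:ok, _image} = Image.Video.image_from_video(video)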
5 Likes

@kip there is another regression straight from dependabot here:


== Compilation error in file lib/image/options/video.ex ==
Error: ** (CompileError) lib/image/options/video.ex:8: Evision.VideoCapture.__struct__/0 is undefined, cannot expand struct Evision.VideoCapture. Make sure the struct name is correct. If the struct name exists and is correct but it still cannot be found, you likely have cyclic module usage in your code

This is similar to the Bumblebee stuff: you should safeguard against missing optional dependencies and flag your features on/off accordingly, I reckon?
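
A common guard is to compile the modules that reference an optional dep only when it’s actually available. A sketch only, not necessarily how Image should structure the fix:

# Only compile this module when the optional :evision
# dependency is present, so %Evision.VideoCapture{}
# can be expanded safely.
if Code.ensure_loaded?(Evision.VideoCapture) do
  defmodule Image.Options.Video do
    # ... option validation that pattern matches on
    # %Evision.VideoCapture{} ...
  end
end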

There’s mix compile --no-optional-deps --warnings-as-errors to make sure that the application compiles successfully without optional dependencies present. Sounds like that would be useful to have in CI.
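
Something like this in mix.exs would let CI run the check as a single task (the alias name is made up):

# In mix.exs (remember to wire aliases() into project/0)
defp aliases do
  [
    "compile.no_optional": "compile --no-optional-deps --warnings-as-errors"
  ]
end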

Arrggghhhh. Fixed in Image version 0.22.1. Thanks for the report of this author’s idiocy!

1 Like

Yep, I certainly need to take some time to configure CI properly. That’s a chore I find hard to prioritise but you’re right.

1 Like

I just published Image version 0.23.0 with the following improvements and bug fixes. In particular Image.normalize/1 and Image.autolevel/1 are steps towards automated image improvement.

Bug Fixes

  • Fix specs for Image.Options.Write. Thanks to @jarrodmoldrich. Closes #36.

  • Fix spec for Image.exif/1. Thanks to @ntodd for the PR. Closes #35.

Enhancements

  • Adds Image.normalize/1 which normalizes an image by expanding its luminance to cover the full dynamic range.

  • Adds Image.autolevel/1 which scales each band of an image to fit the full dynamic range. Unlike Image.normalize/1, each band is scaled separately.

  • Adds Image.erode/2 which erodes pixels from the edge of an image mask. This can be useful to remove a small amount of colour fringing around the edge of an image.

  • Adds Image.dilate/2 which dilates pixels from the edge of an image mask.

  • Adds Image.trim/2 which trims an image to the bounding box of the non-background area.

  • Adds Image.flatten/1 which flattens an alpha layer out of an image.

  • Image.Options.Write.validate_options/2 now validates options appropriate to each image type in order to make validation more robust.

  • Adds a :minimize_file_size option to Image.write/2 for JPEG and PNG files which, if true, applies a range of techniques to minimize the size of the image file at the expense of save time and potentially image quality. A short usage sketch follows this list.
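
A quick usage sketch tying a few of these together (file names are illustrative; check the docs for exact options and return values):

iex> {:ok, image} = Image.open("photo.jpg")
iex> {:ok, normalized} = Image.normalize(image)
iex> {:ok, leveled} = Image.autolevel(image)
iex> Image.write(leveled, "photo_improved.jpg", minimize_file_size: true)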

12 Likes

Today is object detection Sunday. Greatly inspired by the fabulous talk by @hansihe at the Warsaw Elixir Meetup earlier this month, I transcribed his live coding example and implemented it in a new experimental module, Image.Detection.

Based upon his solid observations I also added some quality of life improvements to the detect branch of Image:

  • Image.from_kino/2 and Image.from_kino!/2 to easily consume the image data from a Kino.Input.Image data source in Livebook (a small Livebook sketch follows this list)
  • Image.Shape.rect/3 and Image.Shape.rect!/3 to have a composable way to draw rectangles - specifically object bounding boxes in this case.
  • Image.embed/4 and Image.embed!/4 to make it much easier to conform an image to the dimensions required by an ML model.
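
For instance, in a Livebook cell (a sketch: the input name and the 640x640 model size are made up, and from_kino!/1 is assumed to accept the value returned by Kino.Input.read/1):

# Read an uploaded image from a Kino input and
# conform it to the dimensions a model expects.
input = Kino.Input.image("Photo")

# ... in a later cell, after an image is uploaded ...
image =
  input
  |> Kino.Input.read()
  |> Image.from_kino!()
  |> Image.embed!(640, 640)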

The code is ready for fun and experimentation. Given some of the tricky dependency configuration at the moment it can only be used as a GitHub dependency for now. You can add it in a mix.exs as:

{:image, github: "elixir-image/image", branch: "detect"}

Demo example

Livebook coming this week!

iex> i = Image.open!("./test/support/images/elixir_warsaw_meetup.png")
%Vix.Vips.Image{ref: #Reference<0.3196308165.1633288229.90166>}
iex> Image.Detection.detect(i)
{:ok, %Vix.Vips.Image{ref: #Reference<0.3196308165.1633288229.90638>}}

elixir_warsaw_detected

The code

It’s amazing how little code this takes with Nx, Axon and AxonOnnx.

  def detect(%Vimage{} = image, model_path \\ default_model_path()) do
    # Import the model and extract the
    # prediction function and its parameters.
    {model, params} = AxonOnnx.import(model_path)
    {_init_fn, predict_fn} = Axon.build(model, compiler: EXLA)

    # Flatten out any alpha band then resize the image
    # so the longest edge is the same as the model size,
    # then add a black border to expand the shorter dimension
    # so the overall image conforms to the model requirements.
    prepared_image =
      image
      |> Image.flatten!()
      |> Image.thumbnail!(@yolo_model_image_size)
      |> Image.embed!(@yolo_model_image_size, @yolo_model_image_size)

    # Move the image to Nx. This is nothing more
    # than moving a pointer under the covers
    # so it's efficient. Then conform the data to
    # the shape and type required for the model.
    # Last we add an additional axis that represents
    # the batch (we use only a batch of 1).
    batch =
      prepared_image
      |> Image.to_nx!()
      |> Nx.transpose(axes: [2, 0, 1])
      |> Nx.as_type(:f32)
      |> Nx.divide(255)
      |> Nx.new_axis(0)

    # Run the prediction model, extract
    # the only batch that was sent
    # and transpose the axis back to
    # {width, height} layout for further
    # image processing.
    result =
      predict_fn.(params, batch)[0]
      |> Nx.transpose(axes: [1, 0])

    # Filter the data by certainty,
    # zip with the class names, draw
    # bounding boxes and labels, and then
    # trim off the extra pixels we added
    # earlier to get back to the original
    # image shape.
    result
    |> Yolo.NMS.nms(0.5)
    |> Enum.zip(classes())
    |> draw_bbox_with_labels(prepared_image)
    |> Image.trim()
  end

Next steps

This is a proof-of-concept only. The API will almost certainly change - not all use cases require painting a bounding box with labels. Feedback, however, is most welcome!

Thanks again to @hansihe, the work is all his.

15 Likes

Is there anywhere I can see a full script for this? I tried to get it running in Livebook and a newly created mix project but can’t get the dependencies right :upside_down_face:

1 Like

Sure, it’s referenced in the post: in the config line for your mix.exs and the link behind the Image.Detection link. Just look at the detect branch at https://github.com/elixir-image/image.

I had to do quite some munging to get the deps right too hence the GitHub dependencies.

1 Like

I’ve made a simple Livebook to demonstrate how to install, configure and do object detection:

Run in Livebook

2 Likes

Is there an easy way to make a copy of a mutable Image? I looked around the docs for something, but I couldn’t find anything. That’s the reason I used the closure to make images for drawing in the talk.

1 Like

@hansihe, a mutable image is already copied for you and operations are serialised behind a GenServer. So it is mutable - but only by you. And since all operations are serialised, it’s also thread safe.

You can perform multiple mutations on a single copy of the image by using Image.mutate/2 (I have just updated the documentation on Image.mutate/2 quite a bit to make this clearer).

Mutation example

# The image is copied once and all operations
# are serialized behind a GenServer.
# When the function returns, the GenServer
# is shut down and the underlying
# mutated `t:Vix.Vips.Image.t/0` is returned.

iex> Image.mutate image, fn mutable_image ->
...>  mutable_image
...>  |> Image.Draw.rect!(0, 0, 10, 10, color: :red)
...>  |> Image.Draw.rect!(2, 20, 10, 10, color: :green)
...>  |> Image.Draw.rect!(50, 50, 10, 10, color: :blue)
...> end

By wrapping multiple mutations in a single Image.mutate/2 call there should be performance improvements, although for safety each mutation is still serialised through the GenServer.

1 Like

Has anyone succeeded in pulling vix & image to a Windows machine? The docs say pre-built binaries are available but nothing is fetched.

Never tried on windows, but I didn’t have much luck on my Raspberry Pi either :slight_smile: