Raw camera frames on Scenic with Ports and OpenCV

In my journey of playing with Ports, the OpenCV Python wrapper, and real-time object detection, I’ve finally ended up playing (for the first time) with Scenic… what a joy (thanks @boydm!) :blush:!!

What I’m trying to do is render raw camera frames in a window with Scenic (then I’ll render object detection labels and bounding boxes). At the moment I’m using a Python script with OpenCV, which reads the frames from the camera, converts the numpy array to a binary with 3 uint8 bytes (RGB) per pixel, and pushes the frames to Elixir via a port.

The example below works well, but since I’ve just started with Scenic, I’m just wondering if there is a better/easier pattern, maybe a ready-to-use library that reads camera frames!?!?

What impressed me is that the result with Scenic + Port is far smoother than rendering the frames using OpenCV with cv2.imshow("Frame", arr) in Python! From what I can see there is almost no perceptible delay.
I’ll try to post a quick video showing the comparison.

# camera.py

import os
from struct import pack
import cv2

def setup_io():
  # With :nouse_stdio on the Elixir side, the port talks to this process on
  # fd 3 (Elixir -> Python) and fd 4 (Python -> Elixir) instead of stdin/stdout.
  return os.fdopen(3, "rb"), os.fdopen(4, "wb")

def write_frame(output, message):
  # Prepend a 4-byte big-endian length header, matching {:packet, 4} on the port.
  header = pack("!I", len(message))
  output.write(header)
  output.write(message)
  output.flush()

def open_camera(source=0):
  cap = cv2.VideoCapture(source)
  cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
  cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
  return cap

def get_and_write_frame(cap, output_f):
  _, arr = cap.read()
  # OpenCV captures BGR; convert to RGB and flatten to raw bytes (3 x uint8 per pixel).
  arr = cv2.cvtColor(arr, cv2.COLOR_BGR2RGB)
  write_frame(output_f, arr.tobytes())

def run():
  input_f, output_f = setup_io()
  cap = open_camera(0)

  while True:
    get_and_write_frame(cap, output_f)

run()

Camera Scene on the Elixir side

defmodule Cv.Scene.Camera do

  use Scenic.Scene
  require Logger

  alias Scenic.Graph
  alias Scenic.ViewPort

  import Scenic.Primitives

  def init(_, opts) do
    {:ok, %ViewPort.Status{size: {_width, _height}}} = ViewPort.info(opts[:viewport])

    # The rect's :fill points at the "camera_frame" dynamic texture; every
    # Texture.put under that key updates what the rect displays.
    graph =
      Graph.build(font: :roboto)
      |> rect({1280, 720}, id: :camera, fill: {:dynamic, "camera_frame"})

    # {:packet, 4} matches the 4-byte big-endian length header written by
    # camera.py, so each message from the port is exactly one frame.
    _port = Port.open({:spawn, "python camera.py"}, [:binary, {:packet, 4}, :nouse_stdio])

    {:ok, graph, push: graph}
  end

  def handle_info({_port, {:data, raw_frame}}, graph) do
    # raw_frame is already raw RGB bytes, ready to go straight into the texture.
    Scenic.Cache.Dynamic.Texture.put("camera_frame", {:rgb, 1280, 720, raw_frame, []})

    {:noreply, graph}
  end
end

On the Scenic side there is no need for any decoding: handle_info receives the raw frame bytes, which can be put directly into the camera_frame dynamic texture (I still have to better understand how textures work in Scenic…).
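
One cheap safeguard worth considering (a sketch, not part of the code above): cameras don't always honor the resolution requested with cap.set, and a frame of an unexpected size would be handed to the texture as-is, so a guard on the expected byte count makes a mismatch obvious:

  def handle_info({_port, {:data, raw_frame}}, graph)
      when byte_size(raw_frame) == 1280 * 720 * 3 do
    # Exactly one 1280x720 RGB frame: push it into the dynamic texture.
    Scenic.Cache.Dynamic.Texture.put("camera_frame", {:rgb, 1280, 720, raw_frame, []})
    {:noreply, graph}
  end

  def handle_info({_port, {:data, raw_frame}}, graph) do
    # Anything else is logged and dropped instead of corrupting the texture.
    Logger.warn("unexpected frame size: #{byte_size(raw_frame)} bytes")
    {:noreply, graph}
  end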

At the beginning I was concerned about performance, thinking that the port could be a bottleneck, but luckily I was wrong! A 720p (1280x720) RGB raw frame is ~2.7 MB (1280 × 720 × 3 bytes), I measured a round-trip time of ~0.6 ms for a message of that size via ports, and the maximum throughput I could get through a port is ~1.7 GB/s on a MacBook Pro.
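
For anyone who wants to reproduce a rough round-trip number, one quick way (a sketch, not necessarily how the figures above were measured) is to bounce a frame-sized binary off cat: with {:packet, 4} the echoed length header is parsed straight back into a single message.

payload = :binary.copy(<<0>>, 1280 * 720 * 3)   # ~2.7 MB, the size of one 720p RGB frame
port = Port.open({:spawn, "cat"}, [:binary, {:packet, 4}])

{micros, :ok} =
  :timer.tc(fn ->
    # Send the frame out and wait for cat to echo it back through the port.
    Port.command(port, payload)

    receive do
      {^port, {:data, _echoed}} -> :ok
    end
  end)

IO.puts("round trip: #{micros / 1000} ms")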

Some snapshots from :observer!

640x480 frames

1280x720 frames

2 Likes

FFmpeg can read from a camera device (/dev/video0), but it probably gives you YUV; you can tell it to convert to RGB with a pixel format conversion. Combined with a pipe, I think you can use a single ffmpeg command to push the bytes to Elixir.

1 Like

Thanks! I’m now able to get camera frames with ffmpeg, but I can’t prepend a size header to each frame. I need the header to let the port split messages (frames) for me.

My current ffmpeg command is

ffmpeg -f avfoundation -framerate 30 -i "0" -s 640x480 -pix_fmt 0rgb

Any idea how to pipe the output to fd 4?

Try pipe:4 as the output file name.

As for the size: if you know the width and height and the format is RGB, each frame is always width*height*3 bytes (e.g. 640 × 480 × 3 = 921,600 bytes). I think it is not difficult to add a bit of read logic into Cv.Scene.Camera to read exactly that many bytes per frame and hand the frame off to Scenic.

As far as I know, for the port to read, split, and immediately deliver the fixed-size width*height*3 frames to the process, it needs a size header prepended to the payload.

Try open_port with :stream, the docs say:

Output messages are sent without packet lengths. A user-defined protocol must be used between the Erlang process and the external object.
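
Putting those two suggestions together, the read logic could look roughly like this - an untested sketch of a reworked Cv.Scene.Camera, assuming ffmpeg writes raw rgb24 frames (3 bytes per pixel, unlike 0rgb which is 4) at 640x480 to pipe:4, and the port is opened with :stream so the scene has to reassemble the frames itself:

defmodule Cv.Scene.Camera do
  use Scenic.Scene

  alias Scenic.Graph
  import Scenic.Primitives

  @width 640
  @height 480
  @frame_size @width * @height * 3

  # pipe:4 is the fd the port reads from when opened with :nouse_stdio.
  @cmd "ffmpeg -f avfoundation -framerate 30 -i \"0\" " <>
         "-s #{@width}x#{@height} -pix_fmt rgb24 -f rawvideo pipe:4"

  def init(_, _opts) do
    graph =
      Graph.build()
      |> rect({@width, @height}, fill: {:dynamic, "camera_frame"})

    _port = Port.open({:spawn, @cmd}, [:binary, :stream, :nouse_stdio])

    {:ok, %{graph: graph, buffer: <<>>}, push: graph}
  end

  # :stream delivers arbitrarily sized chunks with no framing, so accumulate
  # them in a buffer and split it into fixed-size frames ourselves.
  def handle_info({_port, {:data, chunk}}, %{buffer: buffer} = state) do
    {:noreply, %{state | buffer: consume(buffer <> chunk)}}
  end

  # Peel complete width*height*3-byte frames off the front of the buffer,
  # push each one into the dynamic texture, and keep the remainder.
  defp consume(<<frame::binary-size(@frame_size), rest::binary>>) do
    Scenic.Cache.Dynamic.Texture.put("camera_frame", {:rgb, @width, @height, frame, []})
    consume(rest)
  end

  defp consume(partial), do: partial
end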

Thank you @xlphs for the help with ffmpeg, I’m still having some issues but I’ll try again to make it work :blush:
At the moment I’m going ahead with Python/OpenCV + ports, especially because my next step is to run YOLO on the frames, so reading camera frames directly in Python + OpenCV can be pretty handy.

About performance, Scenic beats OpenCV hands down at rendering the frames, which is surprising considering that Scenic receives the raw frames from Python via ports, and there is almost no lag (a ~2.7 MB message is sent and received via the port in less than 1 ms).

Short comparison (video of myself :sweat_smile: saying :wave: and drinking :coffee:) trying to show the difference between the two.

The Python version is pretty simple, while the Scenic version is the one in the first post.

import cv2

# Plain OpenCV capture-and-display loop for comparison with the Scenic version.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

while True:
  _, arr = cap.read()
  cv2.imshow("OpenCV imshow 1280x720", arr)
  cv2.waitKey(1)
4 Likes