I’m putting this in its own thread as that makes more sense.
A little prehistory: I have an RPI5 system with a Hailo 8 working, but the non-AI/video part is not quite where we want it. The Rockchip 3588 processor, found on SBCs often compared to the RPI5, seems like a more capable beast. It has twice the number of processor cores, but more importantly it has a built-in NPU and a very capable VPU. It also has other specialized hardware which might be of interest, such as a GPU, JPEG encoding and decoding, a 48MPx ISP with functions like [auto focus, HDR, RAW conversion, lens corrections], 8K HDMI output, 4K HDMI input, more audio functions than I’ll ever need, various IO, and then some. So I got a Radxa 5T, which also comes in an industrial version, with the full intent of running Nerves on it in anger.
Getting this up and booting with Nerves was quite straightforward. There are two different kernel options:
- The manufacturer’s kernel, which has most hardware working fine but is an older custom kernel (6.1) with many proprietary blobs for drivers. (The NPU driver itself is actually open source.) The issues I ran into with this one were getting WiFi working (the Radxa 5T SBC I got has a WiFi chip newer than what 6.1 supports) and getting a GPU-accelerated browser for kiosk mode using the Mali-G610 GPU. I backported the WiFi driver so that was fine, but getting a GPU-accelerated browser was not so easy.
- The other alternative, which I tested next: an open source 6.18 kernel with patches for the 3588 made by Collabora, with Mesa3D graphics, the Rocket driver and the Teflon TFLite delegate. I believe many of these, or maybe all (?), are now in mainline Linux. Anyway, getting Nerves booted was trouble-free, and I also got a GPU-accelerated Chromium running in kiosk mode fairly easily. Headphone sound is a bottomless mystery to me though, so that is not working yet.
The problem with this option is that the NPU and VPU then rely on the open source drivers and kernel to work. Teflon is young and has basically just implemented the operations needed to run their test MobileNet (if I remember correctly). I’ve written more in another thread about trying to get these open source alternatives to work with Yolo 8, and in the end I basically explored the NPU registers and poked them directly. (The LUT was surprising, as both Rocket and the official documentation had the number of LUT values wrong. Further, rather than being an actual lookup table it is more like an interpolated lookup graph.) In the end I found that the Rocket NPU initialization does not seem to support everything needed for all operations on the NPU, and poking registers directly was hardly a good solution. So not really a way forward.
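To illustrate what I mean by "interpolated lookup graph" as opposed to a plain table, here is a minimal Python sketch. It is only a conceptual model: the knot values, the function name `make_interpolated_lut` and the use of linear interpolation with clamping are all assumptions for illustration, not the actual NPU register layout or its exact interpolation scheme.

```python
from bisect import bisect_right


def make_interpolated_lut(xs, ys):
    """Build a lookup function from knots (xs[i], ys[i]).

    Hypothetical model: inputs between two knots are linearly
    interpolated, inputs outside the knot range are clamped.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two knots, and xs/ys of equal length")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("xs must be strictly increasing")

    def lookup(x):
        # Clamp below the first knot and above the last knot.
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        # Find the segment [xs[i], xs[i + 1]) that contains x.
        i = bisect_right(xs, x) - 1
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return lookup


# Made-up knots, purely for demonstration.
lut = make_interpolated_lut([0.0, 1.0, 2.0], [0.0, 10.0, 40.0])
halfway_first = lut(0.5)   # between the first two knots
halfway_second = lut(1.5)  # between the last two knots
```

A plain table would only return the stored values at the knots; the interpolated version gives a value for any input in between, which is why the number of stored values and how they are spaced matters so much.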
At that point I thought I could take the original Rockchip open source NPU driver and patch it into something that would work with the open source 6.18 kernel. But it seems I can’t do that without also having to change the 6.18 kernel (which is adapted to Rocket, I presume). That actually makes sense, as Rockchip themselves also had to fork their kernel from mainline to get their combination working. But if I changed the 6.18 kernel then other parts depending on it might accidentally break now or in the future. And all this was just for the NPU… I also wanted the VPU working, and the open source alternative is not fully operational there either. A better plan was needed.
So, back to square one. Instead of trying to wrestle the Rockchip NPU and VPU drivers to fit a 6.18 open source kernel, the better plan was to keep the older Rockchip 6.1 kernel with the hardware working, and then somehow massage the Mali-G610 GPU driver blob into providing browsers with GPU acceleration.
Internetting, I discovered that the key issue had been narrowed down to some missing communication between the browser and the renderer. And someone had even made a fix! But after applying the fix, the speed was just 20 fps in the browser. The hook fix did make the GPU work, but it did so by copying frames from the GPU to the CPU and then to memory for the renderer. Hence the low speed.
The better solution would be zero-copy: passing references to the frames instead of copying them. So far that seems to work. The browser is now GPU accelerated at around 60 fps. Video playback in the browser should be accelerated by the specialized VPU though, so that is next on the list.
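As a rough analogy for the difference between the two approaches, here is a small Python sketch. It has nothing to do with the actual GPU driver or browser code: `hand_over_by_copy` and `hand_over_by_reference` are hypothetical names, and a `bytearray` stands in for a frame buffer. On real hardware the "reference" would be some kind of shared buffer handle rather than a `memoryview`.

```python
def hand_over_by_copy(gpu_frame: bytearray) -> bytearray:
    """Mimic the slow path: every frame is duplicated twice."""
    cpu_staging = bytes(gpu_frame)    # copy 1: "GPU" buffer to CPU staging
    return bytearray(cpu_staging)     # copy 2: staging to the renderer's memory


def hand_over_by_reference(gpu_frame: bytearray) -> memoryview:
    """Mimic the zero-copy path: the renderer gets a view of the same bytes."""
    return memoryview(gpu_frame)


frame = bytearray(16)                 # stand-in for one frame of pixel data
copied = hand_over_by_copy(frame)
shared = hand_over_by_reference(frame)

frame[0] = 255                        # the producer writes a new pixel value

copy_sees_update = copied[0] == 255   # the copy is a stale snapshot
view_sees_update = shared[0] == 255   # the view reads the producer's buffer
```

The copy path costs work proportional to the frame size on every frame, while the reference path costs roughly the same regardless of resolution, which fits the jump from 20 fps to around 60 fps.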
So far so promising.