CameraDevice listener callback performance (Mac)


#1

Hi folks,

I’m working on a live video processing app and have been playing with the CameraDevice class a bit. I noticed something odd, which is that adding a listener resulted in a rather significant CPU and memory penalty, even if the callback does nothing. On my newish MacBook Pro (running Sierra) this means roughly 50% CPU usage just passing frames to the listener, which is well beyond what my app can tolerate in overhead.

After doing some digging, I noticed this call in the Mac CameraDevice Pimpl:

internal->callListeners ([CIImage imageWithCVImageBuffer: videoFrame],
                         (int) CVPixelBufferGetWidth (videoFrame),
                         (int) CVPixelBufferGetHeight (videoFrame));

And then in the callListeners method, another conversion using this method:

Image juce_createImageFromCIImage (CIImage* im, int w, int h)
{
    CoreGraphicsImage* cgImage = new CoreGraphicsImage (Image::ARGB, w, h, false);

    CIContext* cic = [CIContext contextWithCGContext: cgImage->context options: nil];
    [cic drawImage: im inRect: CGRectMake (0, 0, w, h) fromRect: CGRectMake (0, 0, w, h)];
    CGContextFlush (cgImage->context);

    return Image (cgImage);
}

So effectively we’re taking a CoreVideo pixel buffer, converting it to a CIImage, drawing that into a CoreGraphics context, and then handing the result to a JUCE Image. There are at least two issues here. One is that (in my experience) these context draws are comparatively slow operations. The other is that all the Objective-C object allocations per frame are costly (quite unlike the core JUCE classes).

My question in a roundabout way: would it not be preferable to copy the contents of the pixel buffer directly into a JUCE Image? I did a little hacking on CameraDevice and managed to get CPU usage down to nominal levels with the implementation below:

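void callListeners (CVImageBufferRef frame, int w, int h)
{
    Image image (Image::ARGB, w, h, false);

    {
        // Scoped so the bitmap data is released before listeners see the image.
        const Image::BitmapData bm (image, Image::BitmapData::writeOnly);

        CVPixelBufferLockBaseAddress (frame, kCVPixelBufferLock_ReadOnly);

        // Copy row by row: either buffer may pad its rows, so a single
        // w * h * 4 memcpy is only safe when both strides are exactly w * 4.
        const uint8* const src = static_cast<const uint8*> (CVPixelBufferGetBaseAddress (frame));
        const size_t srcStride = CVPixelBufferGetBytesPerRow (frame);

        for (int y = 0; y < h; ++y)
            memcpy (bm.getLinePointer (y), src + (size_t) y * srcStride, (size_t) w * 4);

        CVPixelBufferUnlockBaseAddress (frame, kCVPixelBufferLock_ReadOnly);
    }

    const ScopedLock sl (listenerLock);

    for (int i = listeners.size(); --i >= 0;)
    {
        CameraDevice::Listener* const l = listeners[i];

        if (l != nullptr)
            l->imageReceived (image);
    }
}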

Note that since QuickTime defaults to the 2vuy pixel format, this requires setting up the capture session to explicitly output BGRA, which avoids having to do any conversions. Otherwise everything seems to be working smoothly.
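For anyone wondering why BGRA in particular needs no conversion: as far as I understand it, a JUCE Image::ARGB pixel is a packed 32-bit value with alpha in the top byte, and on a little-endian machine (any Intel Mac) those bytes land in memory in B, G, R, A order, which is the order a 32BGRA pixel buffer uses. Here is a rough sketch in plain C++ that checks the byte order without touching JUCE or CoreVideo. The helper names packARGB, bytesInMemory and isLittleEndian are made up for illustration and are not part of any library.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: packs four 8-bit channels into one 32-bit ARGB value,
// with alpha in the most significant byte.
inline std::uint32_t packARGB (std::uint8_t a, std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (std::uint32_t (a) << 24) | (std::uint32_t (r) << 16)
         | (std::uint32_t (g) << 8)  |  std::uint32_t (b);
}

// Hypothetical helper: returns the four bytes of a packed pixel in the order
// they sit in memory on the machine running this code.
inline std::array<std::uint8_t, 4> bytesInMemory (std::uint32_t pixel)
{
    std::array<std::uint8_t, 4> bytes {};
    std::memcpy (bytes.data(), &pixel, sizeof (pixel));
    return bytes;
}

// Hypothetical helper: true when the least significant byte is stored first.
inline bool isLittleEndian()
{
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy (&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host, bytesInMemory (packARGB (a, r, g, b)) comes back as b, g, r, a, so a BGRA frame can be copied byte for byte. On a big-endian host the order would be a, r, g, b and the straight copy would not line up. Camera frames are opaque, so the premultiplied-alpha question shouldn't come up here, but I haven't verified that for every capture format.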

I’d love some feedback on this because barring a new JUCE release that covers it I’ll probably be writing my own AVFoundation wrapper in the near future and would like to avoid similar problems. But I’m pretty new to JUCE and haven’t done a ton of QT/CG/CV stuff before. Does this look like a workable solution? If so I’d be happy to submit a PR, but if this is janky or dangerous let me know as well.

cheers,

[m]


#2

We’ve already added AVFoundation support to the develop branch. Does this improve things?


#3

Thanks for your reply, t0m, I didn’t even think to look there. I’ll check it out and let you know how it goes.


#4

Hi t0m, I got a chance to browse the develop branch. It looks as though AVFoundation support is only in the MoviePlayerComponent at the moment, not the CameraDevice, which at a glance looks to be the same as the current stable implementation. My issue above has less to do with AVFoundation vs QuickTime per se than with the relative performance of CIImage rendering vs pixel buffer access in the listener callback.

It’s nice to see that AVFoundation support is coming along, though, and I’m looking forward to seeing this stuff make its way into master for JUCE 5. In the meantime I’ll probably just use my own implementation, as I’ll need to do something similar for movie playing anyway in order to get access to frame data. I get the impression that these classes were not intended for my use case, and that’s fine.

Thanks for taking the time to respond!