I'm working on a plug-in that has a frequency-response style graph which is updated roughly every 50 ms. I noticed that updating the graph takes a considerable amount of CPU; in some cases it uses much more CPU than the DSP processing itself. I see similar performance on Mac and Windows.
The bulk of CPU usage seems to be from drawImageAt. After doing some profiling on the Mac I found most of the time is spent doing color conversion routines. Here's the profiling info:
    Running Time       Self    Symbol Name
    1776.0ms  10.2%     0.0    juce::Graphics::drawImageAt(juce::Image const&, int, int, bool) const
    1776.0ms  10.2%     0.0    juce::Graphics::drawImageTransformed(juce::Image const&, juce::AffineTransform const&, bool) const
    1776.0ms  10.2%     0.0    juce::CoreGraphicsContext::drawImage(juce::Image const&, juce::AffineTransform const&)
    1775.0ms  10.2%     0.0    juce::CoreGraphicsContext::drawImage(juce::Image const&, juce::AffineTransform const&, bool)
    1767.0ms  10.2%     1.0    CGContextDrawImage
    1766.0ms  10.2%     0.0    ripc_DrawImage
    1667.0ms   9.6%     1.0    ripc_AcquireImage
    1666.0ms   9.6%     0.0    CGSImageDataLock
    1665.0ms   9.6%     1.0    img_data_lock
    1659.0ms   9.5%     0.0    img_alphamerge_read
    1482.0ms   8.5%     0.0    img_colormatch_read
    1130.0ms   6.5%     0.0    CGColorTransformConvertData
    1111.0ms   6.4%     0.0    CGCMSTransformConvertData
    1111.0ms   6.4%     0.0    CMSTransformConvertData
    1111.0ms   6.4%     0.0    CMSColorWorldConvertData
    1109.0ms   6.4%     0.0    ConvertImageGeneric
    1108.0ms   6.4%     0.0    ColorSyncTransformConvert
    1105.0ms   6.3%     1.0    ColorSyncCMMApplyTransform
    1103.0ms   6.3%     0.0    AppleCMMApplyTransform
    1102.0ms   6.3%     1.0    DoApplyTransform
    1096.0ms   6.3%     1.0    CMMProcessBitmap(CMMConversionParams*)
    1090.0ms   6.3%     0.0    ConversionManager::ApplySequenceToBitmap(CMMConvNode*, CMMEncoDec&, CMMRuntimeInfo*, unsigned long, CMMProgressNotifier*)
    1089.0ms   6.2%     2.0    long ConversionManager::DoConvert<CMM8Bits>(CMM8Bits&, CMMConvNode*, CMMEncoDec&, CMMRuntimeInfo*, unsigned long, CMMProgressNotifier*)
     385.0ms   2.2%     0.0    CMM8Bit3ChanNoConvEncoder::DoEncode(CMM8Bits&, CMMRuntimeInfo*, unsigned long*, unsigned long*)
     385.0ms   2.2%   385.0    CMM8Bit3ChanNoConvEncoder::InnerDoEncode(CMM8Bits const&, CMM8BitBuffer&, unsigned long*, unsigned long*)
     366.0ms   2.1%     0.0    CMMConvRGBToRGB::Convert(CMM8Bits&, CMMRuntimeInfo*, unsigned int, unsigned int) const
     355.0ms   2.0%   355.0    vCMMVectorConvert8BitRGBToRGB
      11.0ms   0.0%    11.0    CMMConvRGBToRGB::Convert8BitMtxOnlyWithLookup(CMM3x3Type, int*, unsigned int, unsigned int, void const*, void const*) const
     317.0ms   1.8%     0.0    CMM8Bit3ChanNoConvDecoder::DoDecode(CMM8Bits const&, CMMRuntimeInfo*, unsigned long)
     317.0ms   1.8%   317.0    CMM8Bit3ChanNoConvDecoder::InnerDoDecode(CMM8Bits const&, CMM8BitBuffer const&, unsigned long)
Based on this blog post (http://1014.org/?article=516), a colleague of mine made some changes that removed the color conversion overhead and seemed to give a big performance improvement. In juce_mac_CoreGraphicsContext.mm he added the following code:
    //==============================================================================
    class MacColorSpace
    {
    public:
        static CGColorSpaceRef GetCGColorSpace ()
        {
            return MacColorSpace::instance ().colorspace;
        }

    private:
        MacColorSpace ()
        {
            colorspace = CreateMainDisplayColorSpace ();
        }

        ~MacColorSpace ()
        {
            CGColorSpaceRelease (colorspace);
        }

        static MacColorSpace& instance ()
        {
            static MacColorSpace sInstance;
            return sInstance;
        }

        static CGColorSpaceRef CreateMainDisplayColorSpace ();

    private:
        CGColorSpaceRef colorspace;
    };

    //==============================================================================
    CGColorSpaceRef MacColorSpace::CreateMainDisplayColorSpace ()
    {
    #if TARGET_OS_IPHONE
        return CGColorSpaceCreateDeviceRGB ();
    #elif MAC_OS_X_VERSION_MIN_REQUIRED >= MAC_OS_X_VERSION_10_7
        ColorSyncProfileRef csProfileRef = ColorSyncProfileCreateWithDisplayID (0);
        if (csProfileRef)
        {
            CGColorSpaceRef colorSpace = CGColorSpaceCreateWithPlatformColorSpace (csProfileRef);
            CFRelease (csProfileRef);
            return colorSpace;
        }
    #else
        CMProfileRef sysprof = NULL;
        if (CMGetSystemProfile (&sysprof) == noErr)
        {
            CGColorSpaceRef colorSpace = CGColorSpaceCreateWithPlatformColorSpace (sysprof);
            CMCloseProfile (sysprof);
            return colorSpace;
        }
    #endif
        return 0;
    }
    // In CoreGraphicsContext::CoreGraphicsContext add:
    rgbColourSpace = MacColorSpace::GetCGColorSpace();

    // In CoreGraphicsContext::~CoreGraphicsContext() remove:
    CGColorSpaceRelease (rgbColourSpace);
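To make the ownership part of the change easier to discuss, here is a minimal portable sketch of the same pattern: one lazily created, process-wide resource that every context borrows and none of them releases. It does not touch CoreGraphics at all. `FakeColourSpace`, `createFakeColourSpace`, `SharedColourSpace`, `FakeContext` and `creationCount` are all hypothetical names invented for illustration; only the shape (function-local static, borrow in the constructor, no release in the destructor) mirrors the code above.

```cpp
#include <cassert>

// Hypothetical stand-in for an expensive-to-create handle such as a CGColorSpaceRef.
struct FakeColourSpace { int id; };

// Counts how many times the "expensive" creation actually runs (illustration only).
static int creationCount = 0;

static FakeColourSpace* createFakeColourSpace()
{
    ++creationCount;
    return new FakeColourSpace { creationCount };
}

// Owns the single shared instance; created on first use, destroyed at program exit.
class SharedColourSpace
{
public:
    static FakeColourSpace* get() { return instance().colourSpace; }

private:
    SharedColourSpace() : colourSpace (createFakeColourSpace()) {}
    ~SharedColourSpace() { delete colourSpace; }

    SharedColourSpace (const SharedColourSpace&) = delete;
    SharedColourSpace& operator= (const SharedColourSpace&) = delete;

    static SharedColourSpace& instance()
    {
        // Function-local static: initialised once, thread-safe since C++11.
        static SharedColourSpace shared;
        return shared;
    }

    FakeColourSpace* colourSpace;
};

// Hypothetical stand-in for a graphics context that borrows the shared handle.
struct FakeContext
{
    FakeContext() : rgbColourSpace (SharedColourSpace::get())
    {
        assert (rgbColourSpace != nullptr);
    }

    // Deliberately no release here: the context does not own the colour space.
    FakeColourSpace* rgbColourSpace;
};
```

Constructing any number of `FakeContext` objects runs the creation function only once. One consequence worth keeping in mind for the real code is that the shared object is created once per process, so it would not follow a later change of display profile unless it is explicitly refreshed.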
After making these changes, drawImageAt appears to be about 10 times faster on the Mac in this scenario (down from 10.2% to 1.0% of the total running time in the profile):
    144.0ms   1.0%   0.0    juce::Graphics::drawImageAt(juce::Image const&, int, int, bool) const
    144.0ms   1.0%   1.0    juce::Graphics::drawImageTransformed(juce::Image const&, juce::AffineTransform const&, bool) const
    142.0ms   1.0%   0.0    juce::CoreGraphicsContext::drawImage(juce::Image const&, juce::AffineTransform const&)
    141.0ms   1.0%   0.0    juce::CoreGraphicsContext::drawImage(juce::Image const&, juce::AffineTransform const&, bool)
    132.0ms   0.9%   0.0    CGContextDrawImage
    132.0ms   0.9%   1.0    ripc_DrawImage
    115.0ms   0.8%   0.0    ripc_RenderImage
    114.0ms   0.8%   0.0    RIPLayerBltImage
    101.0ms   0.7%   0.0    ripd_Mark
     10.0ms   0.0%   0.0    ripd_Lock
      2.0ms   0.0%   1.0    ripd_Unlock
      1.0ms   0.0%   1.0    CGBlt_initialize
      1.0ms   0.0%   0.0    <Unknown Address>
      8.0ms   0.0%   1.0    ripc_GetRenderingState
      4.0ms   0.0%   0.0    ripc_AcquireImage
      2.0ms   0.0%   2.0    ripc_GetImageTransformation
      1.0ms   0.0%   1.0    DYLD-STUB$$CGGStateGetShouldAntialias
      1.0ms   0.0%   0.0    <Unknown Address>
      5.0ms   0.0%   0.0    juce::CoreGraphicsImage::createImage(juce::Image const&, CGColorSpace*, bool)
Does this seem like a reasonable change to you guys? Can you think of any downsides to doing it this way? Any chance this could be incorporated into the Juce sources?
The performance on Windows of drawImageAt also seems to be pretty slow, although I haven't profiled it yet, so I'm not sure what might be the cause. It would be great to improve that also.
Thoughts?
Thanks,
Chris