Something extremely inefficient in WavAudioFormatReader chunk handling

I had a fairly large audio file for which I create a lot of subsection readers. This works well and is fast. However, after I added 1000 markers to the audio file in WaveLab, each of the WavAudioFormatReader (InputStream* in) constructors started taking a very long time.

I would expect WaveLab to add chunks (apparently at least ‘cue ’ and ‘ltxt’) for these markers, and that this would not be totally free of overhead of course, but the constructor must be doing something stupid for 1000 markers to turn something that takes a split second into 30 seconds. If I’m not mistaken, the time is spent mostly in the StringPairArray::set method.

Is there even any internal use for these chunks in JUCE? Is there some way to skip the reading of such chunks unless explicitly needed?
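To make the "skip" idea concrete, here is a rough sketch in plain standard C++ (not JUCE code) of walking the top-level chunks of a RIFF/WAVE buffer without looking at their payloads. It relies on the RIFF layout, where each chunk is a four-character ID followed by a 4-byte little-endian size and a payload padded to an even length. `ChunkInfo`, `fourCC`, `readLittleEndian32` and `listChunks` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record describing one top-level chunk.
struct ChunkInfo
{
    std::string id;      // four-character chunk ID, e.g. "fmt " or "data"
    std::size_t offset;  // position of the chunk payload within the buffer
    std::uint32_t size;  // payload size in bytes, excluding any pad byte
};

// Reads four bytes at pos as a chunk ID.
static std::string fourCC (const std::vector<std::uint8_t>& bytes, std::size_t pos)
{
    assert (pos + 4 <= bytes.size());
    return std::string (reinterpret_cast<const char*> (bytes.data() + pos), 4);
}

// Reads a 32-bit little-endian value at pos.
static std::uint32_t readLittleEndian32 (const std::vector<std::uint8_t>& bytes, std::size_t pos)
{
    assert (pos + 4 <= bytes.size());
    return  static_cast<std::uint32_t> (bytes[pos])
         | (static_cast<std::uint32_t> (bytes[pos + 1]) << 8)
         | (static_cast<std::uint32_t> (bytes[pos + 2]) << 16)
         | (static_cast<std::uint32_t> (bytes[pos + 3]) << 24);
}

// Lists the top-level chunks of a RIFF/WAVE buffer. Only the 8-byte chunk
// headers are read; every payload is skipped by advancing the position.
// Returns an empty list if the buffer does not start with a RIFF/WAVE header.
std::vector<ChunkInfo> listChunks (const std::vector<std::uint8_t>& bytes)
{
    std::vector<ChunkInfo> chunks;

    if (bytes.size() < 12 || fourCC (bytes, 0) != "RIFF" || fourCC (bytes, 8) != "WAVE")
        return chunks;

    std::size_t pos = 12;

    while (pos + 8 <= bytes.size())
    {
        const std::uint32_t size = readLittleEndian32 (bytes, pos + 4);
        chunks.push_back ({ fourCC (bytes, pos), pos + 8, size });

        // Skip the payload, plus one pad byte when the size is odd.
        pos += 8 + static_cast<std::size_t> (size) + (size & 1u);
    }

    return chunks;
}
```

A caller could then parse only the entries whose `id` it cares about (for example "fmt " and "data") and ignore the marker chunks. With a real stream, the same loop would seek past each payload instead of indexing into a buffer.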

I can live with this by just not adding any markers to the files. That’s fortunately an option. I just wanted to report it in case there’s actually something fishy in there. BTW I’m testing on Mac.

Are you making copies of data structures or variables by mistake? Maybe every time you make a new marker you’re also copying or creating a new duplicate data structure. Obviously idk what you’re doing because there’s no code to base my opinion off, but you could set a…

DBG(…)

When creating a new marker so it prints out the data you’re passing and you can check if it’s passing the data as a duplicate?

Thanks for your reply! The markers are created in WaveLab, the audio editing app from Steinberg. WaveLab stores these markers in the WAV file. All I do in code is create a WavAudioFormatReader for the file. The constructor then parses these chunks implicitly, without being asked to, and that takes IMO way too long.

Oh sorry, I thought you were the one creating these objects. In which case I don’t really know how to help with this. Perhaps you can contact Steinberg on their forums and let them know of this and they may say “oh yeah, you forgot to do xyz” or something. Apologies!

No problem. And thanks for your suggestions!

I was thinking that I’ve observed a fault in the design or implementation of JUCE’s WavAudioFormatReader, possibly one of two scenarios:

a) the constructor is doing something wrong and inefficiently while parsing these chunks, like passing data back and forth by value, falsely assuming the data is small, or

b) the class is doing something in the constructor it really shouldn’t be doing. Perhaps a reader class shouldn’t read this much data in the constructor. Just as a reader doesn’t read any of the audio stream upon construction, it shouldn’t read all this chunk data either until it is needed.
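Scenario b) could be sketched roughly like this in plain standard C++ (C++17). It is only an illustration of deferring the work, not how JUCE is structured: `LazyMetadataReader`, `Metadata` and the injected parser callback are all hypothetical names for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical metadata container: key/value strings parsed from chunks.
using Metadata = std::map<std::string, std::string>;

// Hypothetical reader that stores a parsing callback at construction time
// and only runs it the first time the metadata is actually requested.
class LazyMetadataReader
{
public:
    explicit LazyMetadataReader (std::function<Metadata()> parserToUse)
        : parser (std::move (parserToUse))
    {
        assert (parser != nullptr); // construction itself does no parsing
    }

    // Parses on first call, then returns the cached result on later calls.
    const Metadata& getMetadata()
    {
        if (! cached.has_value())
            cached = parser();

        return *cached;
    }

    bool hasParsedMetadata() const { return cached.has_value(); }

private:
    std::function<Metadata()> parser;
    std::optional<Metadata> cached;
};
```

With this shape, creating 1000 readers costs almost nothing up front, and only the readers whose metadata is requested pay for the chunk parsing, once each.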

AFAIK there’s nothing wrong with storing data in a file. I guess it should be the user’s responsibility to access the data as efficiently as possible. Since I can avoid this problem altogether, I can’t invest more time into debugging and solving it. I just wanted to report this rather surprising behavior.

Thanks for reporting this issue. I think I have a pretty good idea what the problem is, but it would be helpful to test with the same input to make sure that I’ve actually fixed the issue.

Are you able to share any problematic file so that I can test it with my changes in place? No worries if not, but it would give me more confidence in the fix. Thanks again!

Sure, here’s an audio file recorded in JUCE. The markers.xml contained in the package was imported to the wav in WaveLab Pro 10, using Markers/Functions/Import Markers from XML File. The file was saved and no further modifications were made. It takes a while to load this into WavAudioFormatReader.

As I mentioned, my actual use case had 1000 markers and 1000 readers. This example has more markers and should demonstrate the issue easily with only a single reader instance.

Lotsa Markers.zip (2.7 MB)

Have you tried using a BufferedInputStream for the file?

Rail

Just tested out my potential changes, and the time to read has gone from 5.6s to 0.6s (in debug - I’d expect a bigger difference in release but I don’t like waiting for the linker…)

The slowdown seems to be due to quadratic behaviour when looking up chunk names in a StringPairArray, so I wouldn’t expect using a buffered input stream to help that much, although it’s still probably a good idea.
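As a rough illustration of the kind of quadratic behaviour meant here, the sketch below uses plain standard C++ rather than JUCE. `LinearPairArray` and `comparisonsForDistinctKeys` are hypothetical names for this example, and the linear scan inside `set` is an assumption about how an array-backed string map behaves, not a copy of the StringPairArray source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an array-backed string map. set() scans the
// existing keys before appending, so each insertion is O(n) and inserting
// n distinct keys is O(n^2) overall.
struct LinearPairArray
{
    std::vector<std::pair<std::string, std::string>> items;
    std::size_t comparisons = 0; // total key comparisons performed so far

    void set (const std::string& key, const std::string& value)
    {
        for (auto& item : items)
        {
            ++comparisons;

            if (item.first == key)
            {
                item.second = value; // key already present: overwrite
                return;
            }
        }

        items.emplace_back (key, value); // key not found: append
    }
};

// Stores n distinct keys (made-up names "Cue0", "Cue1", ...) and returns
// how many key comparisons that took.
std::size_t comparisonsForDistinctKeys (std::size_t n)
{
    LinearPairArray array;

    for (std::size_t i = 0; i < n; ++i)
        array.set ("Cue" + std::to_string (i), "value");

    assert (array.items.size() == n); // every key was distinct
    return array.comparisons;
}
```

For n distinct keys the helper performs n(n-1)/2 comparisons, so 1000 keys cost 499500 comparisons and doubling the key count roughly quadruples the work. A hash-based container such as `std::unordered_map` would keep each insertion close to constant time on average.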

I would recommend turning off the link-time-code-generation feature that Projucer, unfortunately, has enabled by default when creating projects. It adds a LOT of linker time, but the resulting code is not really any faster. It’s usually not worth the trade-off.

Great! A 10x improvement already makes a big difference.

The fix is now on develop:

Hopefully this will resolve the issues you were seeing. Please let us know if you still run into problems.
