Request to expose AudioThumbnailDataFormat

We found that the AudioThumbnailDataFormat is handy for some basic DSP operations. Currently the definition resides in juce_AudioThumbnail.cpp, which makes it unavailable outside the thumbnail code.
I created a new header file called "juce_AudioThumbnailDataFormat.h":

/*
 *  juce_AudioThumbnailDataFormat.h
 *  Juce
 */

struct JUCE_API AudioThumbnailDataFormat
{
    char thumbnailMagic[4];
    int samplesPerThumbSample;
    int64 totalSamples;         // source samples
    int64 numFinishedSamples;   // source samples
    int numThumbnailSamples;
    int numChannels;
    int sampleRate;
    char future[16];
    char data[1];

    void swapEndiannessIfNeeded() throw()
    {
        flip (samplesPerThumbSample);
        flip (totalSamples);
        flip (numFinishedSamples);
        flip (numThumbnailSamples);
        flip (numChannels);
        flip (sampleRate);
    }

    static void flip (int& n)   { n = (int) ByteOrder::swap ((uint32) n); }
    static void flip (int64& n) { n = (int64) ByteOrder::swap ((uint64) n); }
};



We would appreciate it if this could be added to JUCE.

Seems like a reasonable request, I’ll take a look…

In reference to

After all the [quote]tarting-up the thumbnail stuff[/quote], how would I get access to the ThumbData?
It was available before 1.52 was released, but that no longer seems to be the case.

Yes… it no longer uses just a blob of memory like the old version, it has a bunch of internal classes to hold the data in a smarter way, and exposing them wouldn’t make much sense. You can still write the thumbnail to a stream and get the raw data that way, of course.

But what are you actually trying to do? There might be a much better way to do it than just providing access to the internal structure.

We are looking through the Min/Max values per sample. The values are being used to find silent gaps in long recordings. It’s much faster than trying to do it with the raw samples.
Maybe you could provide access to AudioThumbnail::ThumbData::getMinMax()?
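For anyone following along, the silence search described above could look roughly like this. It is only a sketch under stated assumptions: MinMaxPair, SilentGap and findSilentGaps are hypothetical names, not part of JUCE, and the min/max values are assumed to be signed 8-bit levels already pulled out of the thumbnail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for one thumbnail sample: the min and max level of a
// block of source samples, assumed here to be stored as signed 8-bit values.
struct MinMaxPair
{
    std::int8_t minValue;
    std::int8_t maxValue;
};

// A silent run, expressed in thumbnail-sample indices [start, end).
struct SilentGap
{
    int start;
    int end;
};

// Returns every run of at least minLength consecutive thumbnail samples whose
// absolute peak does not exceed threshold.
std::vector<SilentGap> findSilentGaps (const std::vector<MinMaxPair>& thumb,
                                       int threshold, int minLength)
{
    assert (threshold >= 0 && minLength > 0);

    std::vector<SilentGap> gaps;
    const int numThumbSamples = (int) thumb.size();
    int runStart = -1; // -1 means "not currently inside a silent run"

    // One extra iteration past the end flushes a run that reaches the last sample.
    for (int i = 0; i <= numThumbSamples; ++i)
    {
        bool silent = false;

        if (i < numThumbSamples)
        {
            const int peak = std::max (std::abs ((int) thumb[i].minValue),
                                       std::abs ((int) thumb[i].maxValue));
            silent = peak <= threshold;
        }

        if (silent && runStart < 0)
        {
            runStart = i;
        }
        else if (! silent && runStart >= 0)
        {
            if (i - runStart >= minLength)
                gaps.push_back ({ runStart, i });

            runStart = -1;
        }
    }

    return gaps;
}
```

The indices are in thumbnail samples, so they would still need multiplying by samplesPerThumbSample to get source sample positions. Each thumbnail sample summarises a whole block of source samples, which is why scanning them is so much cheaper than scanning the raw audio.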

Yes, that sounds like a reasonable request - I’ll see what I can do…

Yikes, this thumbnail stuff still has issues.

  1. I noticed that after loading a pre-saved thumbnail, I lose the sample level zoom. This looks like an easy one to fix. AudioThumbnail::loadFrom calls clear() no matter what source you had set.

void AudioThumbnail::loadFrom (InputStream& input)
{
    clear();
    ...
}

I’ve changed it to:

void AudioThumbnail::loadFrom (InputStream& input)
{
    if (input.readByte() != 'j' || input.readByte() != 'a'
         || input.readByte() != 't' || input.readByte() != 'm')
    {
        clear();
        return;
    }

    ...
}

Which means it’ll call clear() only if I passed in a non-jat file.
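As a side note, the same header check can be written against a plain std::istream. This is just an illustrative sketch: hasThumbnailMagic is a hypothetical helper, not a JUCE function, and it reads all four bytes in one go rather than one readByte() per comparison.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical helper: returns true if the next four bytes of the stream are
// 'j', 'a', 't', 'm'. Always attempts to consume exactly four bytes.
bool hasThumbnailMagic (std::istream& input)
{
    char header[4] = {};
    input.read (header, sizeof (header));

    // gcount() reports how many bytes the last read actually delivered, so a
    // truncated stream is rejected rather than compared against stale zeros.
    return input.gcount() == 4 && std::memcmp (header, "jatm", 4) == 0;
}
```

One difference from the chained readByte() version: because || short-circuits, that version stops reading at the first mismatching byte, so the stream position after a failed check varies. Reading the header in one call leaves the position predictable either way.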

  2. Performance slowdown when loading/saving thumbnail files. I’ve profiled that code and found that AudioThumbnail::loadFrom is considerably slower now. For example, for a 2 GB aiff whose thumbnail weighs 10.4 MB, loading took:

    juce 1.52
    Performance count for "Thumbnail " - average over 1 run(s) = 18844 millisecs, total = 18.84426 seconds

    juce tip, pre 1.52, dated 20 Sep 2010
    Performance count for "Thumbnail " - average over 1 run(s) = 239 millisecs, total = 0.23917 seconds

My guess is that the new thumbnail code needs some tuning and optimization.

  3. The binary format of the jat file changed slightly, just enough to make it incompatible. I didn’t look deep enough to say what’s changed. Comparing the file size shows it’s off by 7 bytes. It’ll be counterproductive to ask users to re-generate all the thumbnails just because they’ve updated the software.

Note that when I say juce 1.52, I mean juce 1.52 + changes to juce_AudioThumbnail from 13 Apr 2011. My goal is to get our software on juce 1.52, so I didn’t try the latest tip. I doubt it’ll make any difference with respect to the issues reported.

Lastly, thanks for adding back some of the old functionality. I did have to add a bit more code to get silence detection working. Instead of using AudioThumbnail::getApproximateMinMax, I added AudioThumbnail::getApproximateAbsMinMax, which is similar but returns absolute values and their sample positions.

[code]void AudioThumbnail::getApproximateAbsMinMax (const double startTime, const double endTime, const int channelIndex,
                                              float& minValue, float& maxValue, int& minSampleIndex, int& maxSampleIndex) const throw()
{
    MinMaxValue result;
    const ThumbData* const data = channels [channelIndex];

    if (data != NULL && sampleRate > 0)
    {
        const int firstThumbIndex = (int) ((startTime * sampleRate) / samplesPerThumbSample);
        const int lastThumbIndex  = (int) (((endTime * sampleRate) + samplesPerThumbSample - 1) / samplesPerThumbSample);
        data->getAbsMinMax (jmax (0, firstThumbIndex), lastThumbIndex, result, minSampleIndex, maxSampleIndex);
    }

    minSampleIndex *= samplesPerThumbSample;
    maxSampleIndex *= samplesPerThumbSample;

    minValue = result.minValue / 128.0f;
    maxValue = result.maxValue / 128.0f;
}[/code]

I think your slowdown must be because of the type of stream you’re reading from - the old code just pulled the whole stream into memory in one lump, but now it uses finer-grained read methods, which may be less efficient if e.g. it’s a FileInputStream. Try wrapping your stream in a BufferedInputStream and it’ll probably be just as quick as the old one. (Actually, it might be wise for me to add a BufferedInputStream inside the thumbnail code itself, as it wouldn’t do any harm…)
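To illustrate why buffering should help with byte-at-a-time reads, here is a rough standalone sketch. It does not use the JUCE stream classes: CountingSource and BufferedReader are hypothetical stand-ins, with the counter playing the role of the expensive underlying file read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical unbuffered source that counts calls to its (notionally
// expensive) read function.
class CountingSource
{
public:
    explicit CountingSource (std::vector<unsigned char> bytes) : data (std::move (bytes)) {}

    // Copies up to maxBytes into dest and returns the number of bytes copied.
    std::size_t read (unsigned char* dest, std::size_t maxBytes)
    {
        ++numReadCalls;
        const std::size_t n = std::min (maxBytes, data.size() - position);
        std::copy_n (data.begin() + static_cast<std::ptrdiff_t> (position), n, dest);
        position += n;
        return n;
    }

    int numReadCalls = 0;

private:
    std::vector<unsigned char> data;
    std::size_t position = 0;
};

// Serves single bytes out of a block that is refilled only when it runs dry.
class BufferedReader
{
public:
    BufferedReader (CountingSource& sourceToUse, std::size_t bufferSize)
        : source (sourceToUse), buffer (bufferSize)
    {
        assert (bufferSize > 0);
    }

    // Returns the next byte, or -1 at the end of the stream.
    int readByte()
    {
        if (readPos == validBytes)
        {
            validBytes = source.read (buffer.data(), buffer.size());
            readPos = 0;

            if (validBytes == 0)
                return -1;
        }

        return buffer[readPos++];
    }

private:
    CountingSource& source;
    std::vector<unsigned char> buffer;
    std::size_t readPos = 0, validBytes = 0;
};
```

Reading 1000 bytes one at a time through a 256-byte buffer touches the underlying source only five times (four fills plus the final empty read that signals the end), instead of once per byte. Whether that translates into a real speed-up depends on what else the loading code is doing.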

I’m a bit surprised that you say it’s not binary compatible, as I did make an effort to get that right… Am up to my neck in other work this week, but keep nagging me if I don’t take a look at that soon!

I’ve tried wrapping the stream in a BufferedInputStream, with bufferSize = 128 and 1024. I didn’t see any performance boost.

It does look like you’ve made the effort to keep binary compatibility, but it still isn’t compatible. I diff’d a 1 second silence wav file using a hex editor. On the left is the 1.52 dump, on the right is the pre 1.52 tip I mentioned before.

There’re a few differences that might give you a hint:

  1. The new is 7 bytes shorter. I’ve verified that it doesn’t depend on the audio content.
  2. I’ve hi-lighted the numFinishedSamples on both. They don’t match, but I don’t know really if it matters.
  3. The MinMax values are different for the same audio content.

Will keep naggin’…

[quote]1. The new is 7 bytes shorter. I’ve verified that it doesn’t depend on the audio content.
2. I’ve hi-lighted the numFinishedSamples on both. They don’t match, but I don’t know really if it matters.
3. The MinMax values are different for the same audio content.[/quote]

None of that matters… The numFinishedSamples is probably just getting rounded to the nearest block size, and 7 bytes at the end is irrelevant. The min/max values are calculated in a slightly different way, so there could be differences in rounding. Why does any of this affect you?

I must admit that I’ve tripped on something that I can’t reproduce now. While I was updating our code to 1.52, I noticed that the waveform display has discrepancies between the left and right channels. One channel seemed to draw out of sync with the other. Sorry for the noise.

Don’t apologize! AudioThumbnail* is a very powerful set of classes, but also one that has recently had functionality added to it. If you look at the records, I’ve also been asking about it, and I signed on right now to find out what I did in the last few days to break thumbnails in my own code :frowning: (certainly my fault, it was working last week).

The performance issues are potentially a big deal, and Jules is always open to anything that will make his code work better! The key there is coming up with a simple benchmark that we can try on different platforms…

There’s a “killer feature” for the AudioThumbnail family that right now cannot be attained - that’s to zoom continuously down to the sample level without having to keep a sample level graphic for your entire audio buffer. The trouble is that you have to set a specific zoom level.

So… AudioThumbnail is a pretty big class with a lot of hair. Perhaps it needs splitting up?

It might benefit from having a few simple pure virtual interfaces, one or two regarding data updating and one or two regarding redrawing, and then some concrete implementations.

That would, for example, solve the “sample level zoom” issue reasonably well. You’d glue together two concrete implementations, one at the top level which you stored, and another at the lower level that didn’t even implement the “save” interface - so the sample level would only be generated “on the fly”.
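To make the idea a bit more concrete, here is a rough sketch of the kind of split I mean. Every name in it (LevelRange, LevelSource, CachedLevels, OnTheFlyLevels, ZoomableLevels) is made up for illustration and none of it is existing JUCE API. The coarse level is the part you would store, the fine level is computed from the samples on demand, and a small composite picks between them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct LevelRange
{
    float minValue;
    float maxValue;
};

// Read-only view of level data; drawing code would depend only on this.
class LevelSource
{
public:
    virtual ~LevelSource() = default;
    virtual LevelRange getRange (std::int64_t startSample, std::int64_t numSamples) const = 0;
};

// Coarse level: one precomputed range per block. The only part worth saving.
class CachedLevels : public LevelSource
{
public:
    CachedLevels (std::vector<LevelRange> blocksToUse, int samplesPerBlockToUse)
        : blocks (std::move (blocksToUse)), samplesPerBlock (samplesPerBlockToUse)
    {
        assert (samplesPerBlock > 0 && ! blocks.empty());
    }

    LevelRange getRange (std::int64_t startSample, std::int64_t numSamples) const override
    {
        const auto numBlocks = (std::int64_t) blocks.size();
        const auto first = std::min (std::max<std::int64_t> (startSample / samplesPerBlock, 0), numBlocks - 1);
        const auto end   = (startSample + numSamples + samplesPerBlock - 1) / samplesPerBlock;
        const auto last  = std::min (std::max (end, first + 1), numBlocks);

        LevelRange result = blocks[(std::size_t) first];

        for (auto i = first + 1; i < last; ++i)
        {
            result.minValue = std::min (result.minValue, blocks[(std::size_t) i].minValue);
            result.maxValue = std::max (result.maxValue, blocks[(std::size_t) i].maxValue);
        }

        return result;
    }

private:
    std::vector<LevelRange> blocks;
    int samplesPerBlock;
};

// Fine level: computed from the raw samples on demand, never stored.
class OnTheFlyLevels : public LevelSource
{
public:
    explicit OnTheFlyLevels (const std::vector<float>& samplesToRead) : samples (samplesToRead)
    {
        assert (! samples.empty());
    }

    LevelRange getRange (std::int64_t startSample, std::int64_t numSamples) const override
    {
        const auto total = (std::int64_t) samples.size();
        const auto first = std::min (std::max<std::int64_t> (startSample, 0), total - 1);
        const auto last  = std::min (std::max (startSample + numSamples, first + 1), total);

        const auto range = std::minmax_element (samples.begin() + first, samples.begin() + last);
        return { *range.first, *range.second };
    }

private:
    const std::vector<float>& samples;
};

// Glue: spans shorter than one coarse block go to the fine level.
class ZoomableLevels : public LevelSource
{
public:
    ZoomableLevels (const LevelSource& coarseToUse, const LevelSource& fineToUse, int samplesPerBlockToUse)
        : coarse (coarseToUse), fine (fineToUse), samplesPerBlock (samplesPerBlockToUse) {}

    LevelRange getRange (std::int64_t startSample, std::int64_t numSamples) const override
    {
        const LevelSource& chosen = numSamples < samplesPerBlock ? fine : coarse;
        return chosen.getRange (startSample, numSamples);
    }

private:
    const LevelSource& coarse;
    const LevelSource& fine;
    const int samplesPerBlock;
};
```

The switch-over rule here (fewer samples requested than one coarse block) is just the simplest thing that works for the sketch; a real implementation would more likely decide based on samples per pixel. The fine level never appears in any save path, which is the point of the split.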

But you wouldn’t have to do that immediately, just abstracting things and splitting out the parts would make e.g. people like hayes who want to use the data much happier!

The performance you’re reporting puzzles me… The old version used to just create a MemoryBlock and dump the entire sample file into it, so obviously that’s as fast as it could be. But the fact that it now reads each byte should make very little difference, because as long as the stream is buffered, and your code is optimised, there’s no reason why it’d take more than a few extra cycles per sample. You’re not comparing performance in debug mode, are you?

That’s basically how it works now… As long as you supply an InputSource it’ll load it up on the fly to read the fine detail, right…?

Does it?! I find when I zoom in it bottoms out at the samples/pixel ratio I set in the constructor…?

Yep - that’s what the InputSource is for.

Interesting! I’ll check it out. It’s a great feature (if it works :-D), so you should advertise it!

Finally, partial success in recovering the lost performance!

I’ve profiled the application and made some important observations, then came up with a theory that seems to be holding up. That said, I get migraines from digging too deep into the thumbnail code, so I’m waiting to hear back.

The time profile pointed to a couple places, one is AudioThumbnail::setLevels. I also used ‘Quartz Debug’ to look for drawing issues. It was clear that the 1.52 based application was double drawing, which it didn’t with the previous one.

My theory is that although we load a pre-saved thumbnail, the thumbnail code still re-generates the thumbnail from scratch for no reason. This involves writing to the same data structure while taking locks and such, which can really slow down performance.
To test this, I commented out the call to AudioThumbnail::setSource before loading the pre-saved thumbnail. This made the double drawing go away, but wasn’t enough. Then I added the BufferedInputStream, which made loading a snap.

All this still doesn’t give me a solution, as I must call AudioThumbnail::setSource for sample level zooming. I’m also in the dark as to why generating a thumbnail is under-performing.

It definitely did.
Make sure you fix what I’ve reported as item 1. You are going to fix it, Jules, right?

You said that after reloading a thumbnail, you lose the sample level zoom… well, to have sample level zoom you need to provide an InputSource for it, so I guess you’re just loading the thumbnail data but not also giving it the inputsource?

[quote]You said that after reloading a thumbnail, you lose the sample level zoom… well, to have sample level zoom you need to provide an InputSource for it, so I guess you’re just loading the thumbnail data but not also giving it the inputsource?[/quote]

I know that, but as I said it’s broken. AudioThumbnail::loadFrom clears the InputSource once invoked.