Flexible Array Members

martinrobinson-2 · October 14, 2019, 1:03pm

I’m looking for thoughts on the use of the typical flexible array member pattern from C in C++. I’m pretty sure it’s officially and currently UB, although there is a proposal:

https://thephd.github.io/vendor/future_cxx/papers/d1039.html

For example, it is used widely for serialisation/deserialisation in the audio codecs in JUCE:

struct BWAVChunk
{
    char description[256];
    char originator[32];
    char originatorRef[32];
    char originationDate[10];
    char originationTime[8];
    uint32 timeRefLow;
    uint32 timeRefHigh;
    uint16 version;
    uint8 umid[64];
    uint8 reserved[190];
    char codingHistory[1];

    void copyTo (StringPairArray& values, const int totalSize) const
    {
        values.set (WavAudioFormat::bwavDescription,     String::fromUTF8 (description,     sizeof (description)));
        values.set (WavAudioFormat::bwavOriginator,      String::fromUTF8 (originator,      sizeof (originator)));
        values.set (WavAudioFormat::bwavOriginatorRef,   String::fromUTF8 (originatorRef,   sizeof (originatorRef)));
        values.set (WavAudioFormat::bwavOriginationDate, String::fromUTF8 (originationDate, sizeof (originationDate)));
        values.set (WavAudioFormat::bwavOriginationTime, String::fromUTF8 (originationTime, sizeof (originationTime)));

        auto timeLow  = ByteOrder::swapIfBigEndian (timeRefLow);
        auto timeHigh = ByteOrder::swapIfBigEndian (timeRefHigh);
        auto time = (((int64) timeHigh) << 32) + timeLow;

        values.set (WavAudioFormat::bwavTimeReference, String (time));
        values.set (WavAudioFormat::bwavCodingHistory,
                    String::fromUTF8 (codingHistory, totalSize - (int) offsetof (BWAVChunk, codingHistory)));
    }

    static MemoryBlock createFrom (const StringPairArray& values)
    {
        MemoryBlock data (roundUpSize (sizeof (BWAVChunk) + values[WavAudioFormat::bwavCodingHistory].getNumBytesAsUTF8()));
        data.fillWith (0);

        auto* b = (BWAVChunk*) data.getData();

        // Allow these calls to overwrite an extra byte at the end, which is fine as long
        // as they get called in the right order..
        values[WavAudioFormat::bwavDescription]    .copyToUTF8 (b->description, 257);
        values[WavAudioFormat::bwavOriginator]     .copyToUTF8 (b->originator, 33);
        values[WavAudioFormat::bwavOriginatorRef]  .copyToUTF8 (b->originatorRef, 33);
        values[WavAudioFormat::bwavOriginationDate].copyToUTF8 (b->originationDate, 11);
        values[WavAudioFormat::bwavOriginationTime].copyToUTF8 (b->originationTime, 9);

        auto time = values[WavAudioFormat::bwavTimeReference].getLargeIntValue();
        b->timeRefLow = ByteOrder::swapIfBigEndian ((uint32) (time & 0xffffffff));
        b->timeRefHigh = ByteOrder::swapIfBigEndian ((uint32) (time >> 32));

        values[WavAudioFormat::bwavCodingHistory].copyToUTF8 (b->codingHistory, 0x7fffffff);
...<etc>

Does anyone know if there is any official “it works in compilers X and Y” guarantee, or a guarantee that only covers certain kinds of usage? Or are we literally risking “killing the cat” in all cases? In the JUCE classes it is typically used by allocating memory with a MemoryBlock and then casting to a C++ type. Another example is:

static MemoryBlock createFrom (const StringPairArray& values)
{
    MemoryBlock data;
    auto numLoops = jmin (64, values.getValue ("NumSampleLoops", "0").getIntValue());

    data.setSize (roundUpSize (sizeof (SMPLChunk) + (size_t) (jmax (0, numLoops - 1)) * sizeof (SampleLoop)), true);

    auto s = static_cast<SMPLChunk*> (data.getData());

    s->manufacturer      = getValue (values, "Manufacturer", "0");
    s->product           = getValue (values, "Product", "0");
    s->samplePeriod      = getValue (values, "SamplePeriod", "0");
    s->midiUnityNote     = getValue (values, "MidiUnityNote", "60");
    s->midiPitchFraction = getValue (values, "MidiPitchFraction", "0");
    s->smpteFormat       = getValue (values, "SmpteFormat", "0");
    s->smpteOffset       = getValue (values, "SmpteOffset", "0");
    s->numSampleLoops    = ByteOrder::swapIfBigEndian ((uint32) numLoops);
    s->samplerData       = getValue (values, "SamplerData", "0");

    for (int i = 0; i < numLoops; ++i)
    {
        auto& loop = s->loops[i];

…

Here, SampleLoop loops[1]; is the last member of SMPLChunk, which is what s->loops[i] refers to.
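For the serialisation case specifically, one way to sidestep the question is to never cast the buffer to a struct at all and only move bytes with memcpy. The following is a minimal sketch of that idea, not the JUCE implementation: LoopHeader, Loop, packChunk and readLoop are hypothetical names invented for illustration, and byte order is deliberately ignored (real chunk code would still need something like the swapIfBigEndian calls above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size header; the field names are illustrative only.
struct LoopHeader
{
    std::uint32_t manufacturer;
    std::uint32_t numLoops;
};

// Hypothetical trailing element that follows the header numLoops times.
struct Loop
{
    std::uint32_t start;
    std::uint32_t end;
};

// Serialises the header followed by all loops into a byte buffer.
// Only memcpy touches the buffer, so no object is accessed through a
// pointer of the wrong type and no array is indexed past its bound.
inline std::vector<unsigned char> packChunk (std::uint32_t manufacturer,
                                             const std::vector<Loop>& loops)
{
    LoopHeader header { manufacturer, static_cast<std::uint32_t> (loops.size()) };

    std::vector<unsigned char> bytes (sizeof (LoopHeader) + loops.size() * sizeof (Loop));
    std::memcpy (bytes.data(), &header, sizeof (header));

    if (! loops.empty())
        std::memcpy (bytes.data() + sizeof (LoopHeader),
                     loops.data(),
                     loops.size() * sizeof (Loop));

    return bytes;
}

// Copies loop number `index` back out of a buffer produced by packChunk.
inline Loop readLoop (const std::vector<unsigned char>& bytes, std::size_t index)
{
    LoopHeader header;
    assert (bytes.size() >= sizeof (header));
    std::memcpy (&header, bytes.data(), sizeof (header));
    assert (index < header.numLoops);

    Loop loop;
    std::memcpy (&loop,
                 bytes.data() + sizeof (LoopHeader) + index * sizeof (Loop),
                 sizeof (loop));
    return loop;
}
```

Calling packChunk (7, { { 1, 2 }, { 3, 4 } }) produces a buffer of sizeof (LoopHeader) + 2 * sizeof (Loop) bytes, and readLoop (bytes, 1) copies the second loop back out. The cost is an extra copy per access, which is usually acceptable for file metadata.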

Of course, there is the other option: allocate enough (and correctly aligned) memory, then use placement new, potentially with one of these single-element arrays as the last member. I’ve not encountered any problems on current Clang or MSVC with either the JUCE classes or other implementations (which use the placement new technique).

reuk · October 14, 2019, 2:22pm

I think it’s practically impossible to do this in a standards-conformant way because placement new for arrays has a surprising characteristic:

“Array allocation may supply unspecified overhead, which may vary from one call to new to the next. The pointer returned by the new-expression will be offset by that value from the pointer returned by the allocation function.” (cppreference)

This means that, even if you carefully create a pointer with the correct alignment for the type you want to store, and use placement new to create an array ‘starting from’ that pointer, the compiler might add an unspecified amount of padding and muck up the alignment. There was a talk that includes a discussion of this topic at CppCon this year, found here.
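To make the distinction concrete, here is a rough sketch that avoids the array form of placement new by constructing the header and each trailing element with the single-object form, which has no such overhead. All names here (Header, Element, elementsOffset, storageSize, createInPlace, elementAt) are made up for illustration, and std::launder means C++17 is assumed. This only addresses the overhead issue quoted above; I’m not claiming it settles every other question about treating the trailing elements as an array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical header and trailing element types.
struct Header  { std::uint32_t count; };
struct Element { std::uint32_t value; };

// Byte offset of the first trailing element: the header size rounded up
// to the alignment that Element requires.
constexpr std::size_t elementsOffset()
{
    return (sizeof (Header) + alignof (Element) - 1) / alignof (Element) * alignof (Element);
}

// Total number of bytes needed for a header plus n trailing elements.
constexpr std::size_t storageSize (std::size_t n)
{
    return elementsOffset() + n * sizeof (Element);
}

// Constructs a Header and `count` Elements inside caller-provided storage.
// The storage must be at least storageSize (count) bytes and suitably
// aligned for Header. Each object is created with single-object placement
// new, so no array-new bookkeeping can shift the addresses.
inline Header* createInPlace (void* storage, std::uint32_t count)
{
    auto* bytes  = static_cast<unsigned char*> (storage);
    auto* header = ::new (storage) Header { count };

    for (std::uint32_t i = 0; i < count; ++i)
        ::new (bytes + elementsOffset() + i * sizeof (Element)) Element { i * 10 };

    return header;
}

// Returns a pointer to trailing element `index` in storage previously
// filled by createInPlace. The caller must keep index below the count.
inline Element* elementAt (void* storage, std::size_t index)
{
    auto* bytes = static_cast<unsigned char*> (storage);
    return std::launder (reinterpret_cast<Element*> (bytes + elementsOffset()
                                                     + index * sizeof (Element)));
}
```

Typical use would be to declare alignas (Header) unsigned char storage[storageSize (3)]; and then call createInPlace (storage, 3), after which elementAt (storage, 2) points at the third element. Both types are trivially destructible here, so nothing needs destroying before the storage goes away.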

If you want to write portable code that doesn’t trigger UB, I think your options are:

Use clang/gcc with the VLA extension enabled
Avoid VLAs completely

martinrobinson-2 · October 14, 2019, 2:54pm

Great, thanks! That quote appears to relate specifically to array allocation via the new[] variants, though, rather than to a structure containing an array of literal (or POD) types. I’m not arguing that this isn’t fundamentally UB!

reuk · October 14, 2019, 3:07pm

I think it applies to all array forms of new, whether they are doing placement new or allocating fresh memory, although I’m not sure about that.
