Recovering from malformed sysex

Hi!

We had some problems with malformed sysex messages hanging our midi-thread. After some snooping around, it seems that a malformed sysex message causes juce to perpetually parse the incoming midi stream as a mesage, until the sysex finished byte (0xf7) is received.

Looking at the midi specification, any data bytes in the sysex message should not have it's 7th bit set. This is checked in Juce_MidiDataConcenator.h: (from juce git, starting from line 124)


if (pendingBytes > 0 && *d >= 0x80)
            {
                if (*d >= 0xfa || *d == 0xf8)
                {
                    callback.handleIncomingMidiMessage (input, MidiMessage (*d, time));
                    ++d;
                    --numBytes;
                }
                else
                {
                    if (*d == 0xf7)
                    {
                        *dest++ = *d++;
                        pendingBytes++;
                        --numBytes;
                    }
                    break;
                }
            }
            else
            {
                *dest++ = *d++;
                pendingBytes++;
                --numBytes;
            }

There are checks for midi start and midi clock (0xfa and 0xf8), but other other midi messages don't stop the parsing of the sysex.

To solve this,  i added an else statement for these other bytes. Here is my code (it's not really from the tip, a bit behind):

 

if (pendingBytes > 0 && *d >= 0x80)
{
  if (*d >= 0xfa || *d == 0xf8)
  {
    callback.handleIncomingMidiMessage (input, MidiMessage (*d, time));

    ++d;
    --numBytes;    
    pendingBytes = 0; // don't process sysex after wierdness.
  }

  else
  {
    if (*d == 0xf7)
    {
      *dest++ = *d++;
      pendingBytes++;
      --numBytes;
    }
                  
#pragma MODIFIED_BY_RESOLUME // added else statement: data bytes with bit 7 are malformed, quit parsing

    else
    {
      Logger::writeToLog("Illegal sysex data with bit 7 set, stopping parse..");

      int used = 0;
      const MidiMessage m (d, numBytes, used, 0, time);

      if (used <= 0)
        break; // malformed message..

      callback.handleIncomingMidiMessage (input, m);

      numBytes -= used;
      d += used;
      pendingBytes = 0; // don't process sysex after wierdness.
    }
  }
}

 

I hope it is of use to someone.

 

Cheers,

 

Tim

Thanks.. That code looks slightly broken to me though, as you've removed the final else statement, which would surely mean that an infinite loop is possible if pendingBytes == 0 or the top bit is clear (?)

Right, i forgot to paste that in.. 

So:


if (pendingBytes > 0 && *d >= 0x80) 
{ 
  if (*d >= 0xfa || *d == 0xf8) 
  { 
    callback.handleIncomingMidiMessage (input, MidiMessage (*d, time)); 
    ++d; --numBytes; 
    pendingBytes = 0; // don't process sysex after wierdness. 
  } 
  else 
  { 
    if (*d == 0xf7) 
    { 
      *dest++ = *d++; 
      pendingBytes++; 
      --numBytes; 
    } 

#pragma MODIFIED_BY_RESOLUME // added else statement: data bytes with bit 7 are malformed, quit parsing 
    else 
    { 
      Logger::writeToLog("Illegal sysex data with bit 7 set, stopping parse.."); 
      int used = 0; 

      const MidiMessage m (d, numBytes, used, 0, time); 
      if (used <= 0) 
        break; // malformed message.. 

      callback.handleIncomingMidiMessage (input, m); 

      numBytes -= used; 
      d += used; 
      pendingBytes = 0; // don't process sysex after wierdness. 
    } 
  } 
} 

else 
{
  *dest++ = *d++; 
  pendingBytes++; 
  --numBytes;
}

 

 

 

 

Ok... In the original version it would always break after hitting an 0xf7, but you've changed it so that the loop continues - was that deliberate, because it doesn't seem right to me?

Hmm.. I should not post this kind of thing in the middle of the night because this is where mistakes get made.

Of course it should break if there is a 0xf7 byte, or if any byte comes in that has it 8th bit set. In the last case it should just abort the partially parsed message (as is indicated in the midi-spec btw: http://www.blitter.com/~russtopia/MIDI/~jglatt/tech/midispec/sysex.htm , last paragraph)


do
        {
            if (pendingBytes > 0 && *d >= 0x80)
            {
                if (*d >= 0xfa || *d == 0xf8)
                {
                    callback.handleIncomingMidiMessage (input, MidiMessage (*d, time));
                    ++d;
                    --numBytes;
                }
                else
                {
                    if (*d == 0xf7)
                    {
                        *dest++ = *d++;
                        pendingBytes++;
                        --numBytes;
                    }
                    
                    // any byte that has it's 8th bit set at this point is not-legal: stop parsing
                    else 
                    { 
                        int used = 0;
                        pendingBytes = 0; // don't process sysex after wierdness.

                        // try to interpret the rest of the bytes as a normal midi message
                        const MidiMessage m (d, numBytes, used, 0, time); 
                        if (used <= 0) 
                           break; // malformed message.. 
   
                        callback.handleIncomingMidiMessage (input, m); 

                        numBytes -= used; 
                        d += used;  
                    }
                       
                    break; // this break was missing before..
                }
            }
            else
            {
                *dest++ = *d++;
                pendingBytes++;
                --numBytes;
            }
        }

while (numBytes > 0);

So after an "illegal" byte is parsed during the sysex, I try to interpret the rest of the bytes as a normal midi message. For our app, this works. It's not dropping any messages. But, I'm not quite sure this is actually the right way to do it.

Ok, well I've made some changes which look ok to me, but I don't have any test data for it - let me know if you hit any problems with it!