XML file: get line number of XmlElement*?

Hi chaps & chapettes,

(If you're in a hurry, jump to my question in bold).

I'm in the process of writing a PresetValidator class for my MIDI Processor plugin. The user will be hand rolling their own XML file to define the desired plugin behaviour, so I will need to provide very clear validation errors.

I'm loading the XML file into an XmlDocument and then using XmlElement's methods to validate each tag in turn. e.g.

bool PresetValidator::isNoteMessage (XmlElement* x) {
    if (x->hasTagName ("noteMessage") &&
        x->getNumAttributes() == 2 &&
        x->getIntAttribute ("pitch") >= 0 &&
        x->getIntAttribute ("pitch") <= 127 &&
        x->getIntAttribute ("velocity") >= 1 &&
        x->getIntAttribute ("velocity") <= 127 &&
        x->getNumChildElements() == 0)
    {
        return true;
    }

    errors << "<" << x->getTagName() << "> failed validation. The expected form is " <<
           "<noteMessage pitch=\"integer 0 to 127\" velocity=\"integer 1 to 127\" />";
    numErrors++;
    return false;
}

I would like to know if it's possible to get the line number within the XmlDocument of a given XmlElement* so I can let the user know where the encountered error is within the XML file.

Either that, or can anyone suggest an alternative, clear way of pointing the user to the location of an error when validating an XML file?

I'm also open to constructive criticism about my above method - I think it reads fairly clearly, but I'm sure it's possible to do things a lot more efficiently! I'm pretty new to C++ and JUCE, so any advice would be welcomed.

Many thanks in advance for your suggestions!

 

Your code looks fine - I've seen much worse!

There's no line number info stored in the XmlElement structure - typically with XML the errors are syntactic, so line numbers are reported during parsing, but not retained afterwards. It's very hard to go the opposite way and work backwards from an element to a line number. You could write a parser that searches a file for tags and counts until it finds the index of the one that you're interested in, but that'd be quite a complicated task!
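For anyone wanting to try the tag-counting idea, here is a rough sketch in plain C++ (no JUCE). The function name lineOfNthElement is made up for illustration, and it assumes you already know the element's position in document order (root = 0). It skips comments, CDATA, processing instructions and end tags, but it is not a real XML parser: a DOCTYPE with an internal subset, for example, would confuse it.

```cpp
#include <cstddef>
#include <string>

// Returns the 1-based line on which the n-th element start tag begins
// (n is 0-based, counted in document order), or -1 if there are fewer elements.
// Hypothetical helper, not part of JUCE.
int lineOfNthElement (const std::string& xml, int n)
{
    int line = 1, seen = 0;
    std::size_t i = 0;

    // Moves i forward to 'to' (clamped to the end), counting newlines on the way.
    auto advanceTo = [&] (std::size_t to)
    {
        if (to > xml.size()) to = xml.size();
        for (; i < to; ++i)
            if (xml[i] == '\n') ++line;
    };

    // Moves i just past the next occurrence of 'terminator' (or to the end).
    auto skipPast = [&] (const std::string& terminator)
    {
        const auto end = xml.find (terminator, i);
        advanceTo (end == std::string::npos ? xml.size() : end + terminator.size());
    };

    while (i < xml.size())
    {
        if (xml[i] != '<')                              advanceTo (i + 1);
        else if (xml.compare (i, 4, "<!--") == 0)       skipPast ("-->");
        else if (xml.compare (i, 9, "<![CDATA[") == 0)  skipPast ("]]>");
        else if (i + 1 < xml.size()
                 && (xml[i + 1] == '/' || xml[i + 1] == '?' || xml[i + 1] == '!'))
            skipPast (">");   // end tag, processing instruction or simple DOCTYPE
        else
        {
            if (seen++ == n)
                return line;  // this is the start tag we were asked for

            // Skip the rest of the tag; a '>' inside a quoted attribute value
            // does not end the tag.
            char quote = 0;
            for (++i; i < xml.size(); ++i)
            {
                const char c = xml[i];
                if (c == '\n') ++line;
                if (quote != 0)                  { if (c == quote) quote = 0; }
                else if (c == '"' || c == '\'')  quote = c;
                else if (c == '>')               { ++i; break; }
            }
        }
    }

    return -1;
}
```

The reported line is where the tag opens, so an element whose attributes are spread over several lines is reported at its first line.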

Many thanks once again Jules for your super impressive support - I'm in London, so hope to buy you a beer at some point!

Does that mean that I'm approaching the task of validating an XML file against a schema in the wrong way, i.e. that I should be writing a parser which handles the raw data line by line (rather than XmlElement by XmlElement)?

Or is the general approach of traversing the DOM and using XmlElement's methods OK, and I should just point the user to the location of any element(s) failing validation based on their hierarchical location within the element tree (rather than the line number), perhaps using XmlElement's findParentElementOf method?
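To make the hierarchical-location idea concrete, here's a rough sketch of the kind of path I have in mind. It uses a made-up Node struct rather than XmlElement so that it stands alone; Node, addChild, findPath and describeLocation are all hypothetical names. Since the elements here hold no parent pointer, the path is built by searching down from the root, numbering siblings that share a tag name.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element: a tag name plus child elements.
struct Node
{
    std::string tag;
    std::vector<std::unique_ptr<Node>> children;
};

// Convenience for building a tree by hand.
Node* addChild (Node& parent, const std::string& tag)
{
    parent.children.push_back (std::make_unique<Node>());
    parent.children.back()->tag = tag;
    return parent.children.back().get();
}

// Depth-first search for 'target' below 'current'. On success, 'path' has
// had one "/tag[n]" step appended per level, where n is the 1-based position
// among siblings with the same tag name.
bool findPath (const Node& current, const Node* target, std::string& path)
{
    if (&current == target)
        return true;

    std::map<std::string, int> countPerTag;

    for (const auto& child : current.children)
    {
        const int index = ++countPerTag[child->tag];
        const auto oldLength = path.size();

        path += "/" + child->tag + "[" + std::to_string (index) + "]";

        if (findPath (*child, target, path))
            return true;

        path.resize (oldLength);  // not down this branch, undo the step
    }

    return false;
}

// Returns e.g. "/preset/trigger[2]/noteMessage[1]", or an empty string if
// 'target' is not part of the tree under 'root'.
std::string describeLocation (const Node& root, const Node* target)
{
    std::string path = "/" + root.tag;
    return findPath (root, target, path) ? path : std::string();
}
```

An error message could then read something like "/preset/trigger[2]/noteMessage[1] failed validation", which the user can follow by counting tags in their file.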

Seems like what you're doing is pretty sensible, it's just that I never added any way to get the line numbers per element.

I think that if I hit this use-case in my own code, I'd probably fix it by adding an option to XmlDocument so that you can provide a callback that is called for each parsed element (passing the line number). Then you could implement a callback that looks for your target element and remembers its line. That'd probably be the way to solve this with the least code duplication.

Many thanks for the advice as always Jules, but I'm not sure I understand 100%. Did you mean:

A:

  • Add an optional callback parameter to XmlDocument::getDocumentElement
  • When first parsing a new XML preset file, give getDocumentElement the callback parameter 
  • For each element parsed, call the callback function and pass it the element's memory address and corresponding line number as parameters
  • The callback function stores in memory an object containing a LUT of XmlElement addresses and corresponding line numbers
  • When my PresetValidator::validate(XmlElement*) method is later called, it is able to get a line number for any XmlElement* from the LUT in memory

B:

Parse an XmlDocument using getDocumentElement as normal (no callback parameter), run the resulting XmlElement tree through my PresetValidator, then if/when an error is encountered:

  • Call getDocumentElement (again) on the offending XmlDocument, this time around give getDocumentElement the optional callback function parameter (and an XmlElement* we want the line number for?)
  • getDocumentElement iterates through the XML file's elements (again), this time calling the callback function for each element and passing as parameters the current XmlElement being parsed, its line number and the XmlElement that PresetValidator is looking for
  • the callback function tests each XmlElement for a match against the one we're looking for and then holds the corresponding line number (only) in memory. (I don't know how I would actually go about testing two XmlElements for a match if they aren't literally the same element with identical memory addresses).
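On the matching problem in B: one possibility, assuming the second pass visits elements in the same order as the first, is to identify the offending element by its position in document order instead of by address. Below is a rough sketch using a made-up Element struct in place of XmlElement (Element and documentOrderIndex are hypothetical names, and std::vector<Element> inside Element needs C++17).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element.
struct Element
{
    std::string tag;
    std::vector<Element> children;
};

// Pre-order walk from 'current'. Returns true once 'target' is reached, at
// which point 'index' holds the number of elements visited before it, i.e.
// its 0-based position in document order. Call with index initialised to 0.
bool documentOrderIndex (const Element& current, const Element* target, int& index)
{
    if (&current == target)
        return true;

    ++index;  // 'current' comes before everything in its subtree

    for (const auto& child : current.children)
        if (documentOrderIndex (child, target, index))
            return true;

    return false;
}
```

Start tags appear in the file in exactly this pre-order, so the callback in the second pass would only need to count elements and remember the line number when its counter reaches the stored index.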

C:

None of the above, you idiot!

You could re-export the XML file, adding a marker attribute of your choice to the element that caused the error, then use this to find and show the location to the user with whatever post-processing you fancy? Highlighter pen in a dialog box, say.

It's not a perfect solution but it's quick to try.

Yeah, I was roughly thinking like your B option. Would probably be a separate function rather than an optional parameter though, and it'd require a bit of care to avoid adding any overhead to the normal parser to support something like this, which is an incredibly rare edge-case.