XML parsing performance


#1

Hi all,

I have a set of XML files that contain the saved data of my application.

They are files of about 400 kB each, and I have to parse them all at a certain point of my application.

Is it possible that the current parser takes almost 8 seconds to do the job?
Even taking out the logic of my program that works with the resulting tree of XmlElement objects, still the time taken is 5 seconds (I’m talking about Release builds).

Preloading the content of those files into a string and then parsing it gives no meaningful improvement (parsing the string alone still takes almost 5 seconds).
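
For anyone wanting to reproduce this kind of measurement, here is a rough sketch of how the file read and the parse could be timed separately. It uses only the standard library. `averageMs` and `readWholeFile` are names made up for illustration, and the actual parse call is left out because it depends on the code being tested.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <sstream>
#include <string>

// Runs `work` `runs` times and returns the mean wall-clock time
// per run, in milliseconds.
template <typename Fn>
double averageMs(int runs, Fn&& work)
{
    assert(runs > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / runs;
}

// Reads a whole file into a string. Returns an empty string if the
// file cannot be opened.
std::string readWholeFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

Usage would be something like `averageMs(10, [&] { text = readWholeFile(path); })` for the read, then a second call wrapping only the parse of `text`, so the two costs show up as separate numbers.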

The tests were done on Windows Vista. Surprisingly enough, running the same code on an equivalent Mac results in run times of 2 seconds or less. How is this possible?


#2

Try a release build. On Windows the debug-mode memory allocators can make XML parsing quite slow.


#3

[quote]Even taking out the logic of my program that works with the resulting tree of XmlElement objects, still the time taken is 5 seconds (I’m talking about Release builds).
[/quote]


#4

Ok… Well, I guess the difference between the Mac and the PC must be in the memory allocator performance; there’s not really anything else that would be different between them.

The XML parsing really is extremely efficient, I’ve optimised it heavily over the years, and doubt whether it’d be possible for anything to do the job significantly faster. If you profile it you’ll probably just find that it spends most of its time inside malloc.
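
To get a feel for how many heap allocations a DOM-style tree involves, a sketch like the one below replaces the global `operator new` with a counting version. This is not the parser's code: `Node` is a hypothetical stand-in for an element class, and `allocationCount` and `allocationsForTree` are names invented here. It assumes C++14 or later.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Bumped by every call to the replaced operator new below.
static std::atomic<std::size_t> allocationCount{0};

void* operator new(std::size_t size)
{
    allocationCount.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size == 0 ? 1 : size))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical stand-in for a DOM node: one heap object per element,
// plus whatever its name string and child vector allocate.
struct Node
{
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Builds a flat tree with `childCount` children and returns how many
// heap allocations that took.
std::size_t allocationsForTree(std::size_t childCount)
{
    const std::size_t before = allocationCount.load();
    {
        Node root;
        for (std::size_t i = 0; i < childCount; ++i)
        {
            auto child = std::make_unique<Node>();
            child->name = "a-fairly-long-element-name-" + std::to_string(i);
            root.children.push_back(std::move(child));
        }
    }
    return allocationCount.load() - before;
}
```

Each child costs at least one allocation for the node itself, usually more for the name and the growing child vector, which is why allocator speed can dominate the run time.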


#5

yfede, use a profiler like “Very Sleepy” (turn on debug symbols in the release build) and see what’s taking the time and what might be optimizable.


#6

If you really, really need performance, use the pugixml parser (search on Google Code). It’s very well done and extremely fast, but it uses a destructive parsing mode.

You’ll lose the nice JUCE features, however.


#7

I think I’ll go with a background thread that loads all of the XML files into memory, making them available to the instances of the application above in a “cached” manner.
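
A minimal sketch of that idea, using only the standard library: `FileCache` is a hypothetical class that caches raw file text. In the real application the parse would presumably also happen inside `load`, so the cache would hold parsed trees rather than strings.

```cpp
#include <fstream>
#include <future>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical cache: starts loading every file on a background thread
// and hands out the contents on request, blocking only if that file
// has not finished loading yet.
class FileCache
{
public:
    explicit FileCache(const std::vector<std::string>& paths)
    {
        for (const auto& path : paths)
            pending.emplace(path,
                            std::async(std::launch::async, &FileCache::load, path).share());
    }

    // Returns the file's text, waiting for the background load if needed.
    // Throws std::out_of_range for a path that was not given to the constructor.
    const std::string& get(const std::string& path) const
    {
        return pending.at(path).get();
    }

private:
    // Reads a whole file into a string (empty if it cannot be opened).
    static std::string load(std::string path)
    {
        std::ifstream in(path, std::ios::binary);
        std::ostringstream buffer;
        buffer << in.rdbuf();
        return buffer.str();
    }

    std::map<std::string, std::shared_future<std::string>> pending;
};
```

Constructing the cache early (at startup, say) and calling `get` only when the data is needed gives the loads time to finish in the background.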