Not parsing an xml


#1

In my opinion JUCE’s XML parser is too restrictive about which XML files it can read. I know maybe this isn’t standard XML, but from a server (not my own) I’m receiving this XML (excerpt):

<?xml version="1.0" encoding="utf-8" ?> <root> <test>ID &lt;&gt; 20</test> </root>

and this can’t be parsed correctly by XmlDocument unless I change the XML itself. It returns “tag name missing”… Is it a bug, or is it a feature not to read some kinds of documents?


#2

That should really be parsed correctly… Try changing line 617 of juce_XmlDocument.cpp to

if (entity.startsWithChar (T('<')) && entity [1] != 0)
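
To make it clearer what that condition changes, here is a minimal standalone sketch in plain C++. It uses std::string rather than JUCE’s String, and the function name looksLikeXmlFragment is made up for illustration (it is not part of JUCE). It only mirrors the test above: an expanded entity is treated as possible XML if it starts with '<' and has at least one more character after it, so a lone "<" coming from &lt; stays plain text.

```cpp
#include <string>

// Hypothetical helper, not JUCE API. Mirrors the condition
//   entity.startsWithChar ('<') && entity [1] != 0
// using std::string instead of juce::String.
bool looksLikeXmlFragment (const std::string& expandedEntity)
{
    // Must start with '<' and contain at least one further character.
    return expandedEntity.size() > 1 && expandedEntity[0] == '<';
}
```

With this, looksLikeXmlFragment ("<") is false, while looksLikeXmlFragment ("<b>x</b>") is true.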


#3

gr8 that worked ! thanx !


#4

Don’t know why, initially this seemed to work, but now getAllSubText() on the tag just returns “> 20”, not “ID <> 20”… what the hell is going on? Maybe it’s an encoding problem?


#5

Try using the LibXML2 library (http://www.xmlsoft.org/). LibXML2 is the best free multi-platform XML library available today. It’s written in C, but you can download a C++ wrapper library separately, or write your own for convenience.


#6

Yeah, sure, but actually I’m just trying to limit the number of dependencies that my apps require… there are also the TinyXml classes for that, but I would really like to use the JUCE ones…


#7

[quote=“kraken”]don’t know why, initially this seemed to work, but now the getAllSubText() of the tag just return “> 20” not “ID <> 20”… what the hell is going on ? maybe is an encoding problem ?[/quote]

It’s a slightly tricky issue about whether you should parse the XML inside an entity. In the past it didn’t, so if you had a bit of text like &somexml; which expands to a bit of XML, you’d get the raw XML text when you called getAllSubText(). Then I changed it to parse the entity’s contents, but that meant that the < from &lt; was getting parsed and causing your error, because it looks like some broken XML. The tweak I suggested just stops it trying to parse something that is only a single “<”.
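
As a rough illustration of the result a caller would expect from getAllSubText() here, this is a small standalone sketch, not the actual JUCE code. The function name expandPredefinedEntities is hypothetical, and it only handles the five predefined XML entities. Each expanded character is appended to the text collected so far and is never re-parsed as a tag.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Illustrative only, not JUCE's implementation: expands the five predefined
// XML entities in a run of character data. Anything else is left untouched.
std::string expandPredefinedEntities (const std::string& text)
{
    static const std::map<std::string, char> predefined = {
        { "lt", '<' }, { "gt", '>' }, { "amp", '&' },
        { "quot", '"' }, { "apos", '\'' }
    };

    std::string result;

    for (std::size_t i = 0; i < text.size();)
    {
        if (text[i] == '&')
        {
            const std::size_t end = text.find (';', i + 1);

            if (end != std::string::npos)
            {
                const auto found = predefined.find (text.substr (i + 1, end - i - 1));

                if (found != predefined.end())
                {
                    // Append to what has been collected so far; the expanded
                    // character is plain text, not the start of a tag.
                    result += found->second;
                    i = end + 1;
                    continue;
                }
            }
        }

        // Ordinary character (or an '&' that is not a known entity).
        result += text[i++];
    }

    return result;
}
```

Given "ID &lt;&gt; 20", this returns "ID <> 20", which is the text the original poster expected.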

What exactly is the xml that’s not working now?

With all due respect, I bet it’s a pain-in-the-ass to code with this, compared to my parser…


#8

yeah i understand.

this is the xml as i get it:

<?xml version="1.0" encoding="utf-8" ?>
<Project>
	<Layer>
		<name>ISOLE_LAGUNA</name>
		<table>UGN.ISOLE_LAGUNA</table>
		<shape_field>GEOM</shape_field>
		<where>ID &lt;&gt; 398</where>
		<shape_type>0</shape_type>
		<ident>ID</ident>
		<always_listed>Y</always_listed>
		<line_color>230</line_color>
		<line_thick>1</line_thick>
		<text_color>1</text_color>
		<text_height>10</text_height>
		<text_visible>N</text_visible>
		<text_font>SansSerif</text_font>
		<poly_fill_color>255</poly_fill_color>
		<poly_fill_style>1</poly_fill_style>
		<poly_fill_xor>N</poly_fill_xor>
	</Layer>
</Project>

#9

ah probably i’ve found:
XmlDocument @ line 645:

should be

or it will suppress any previous content… right ?
now is working ok, but maybe there are other special cases not handled correctly.


#10

Ah yes, I think you’re right about that!


#11

You’re not quite right. If you mean simple XML parsing, I agree with you, but not 100%. Consider the age of the library and the time that has been spent testing its functionality and usability. LibXML2 is much more than the simple XML parser it may appear to be. It’s a system that provides a variety of tools for managing XML. What about XPath, which I use intensively? What about XInclude? What about Schema? What about XPointer? What about XQuery? What about excellent recovery from errors? What about streaming XML? What about HTML support? What about easy XML restructuring? Can your XML system support all those features? TinyXML certainly can’t. For me, XML is not just a file for storing and retrieving my banal application settings. XML is an extremely flexible technology, and a good set of C++ wrapper classes can help you use it.

By the way, I see you use XML in some parts of your library. What about JUCE XML resource files? What about constructing controls dynamically from XML data in memory or in files? Binary data can easily be stored in and retrieved from XML as MIME Base64 encoded data inside a <![CDATA[…]]> section. So, what? :)
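
To make the binary-data idea concrete, here is a minimal sketch in plain C++ with no XML library involved. The names base64Encode and makeBinaryElement are made up for this example, and the element name is whatever the caller passes in. It Base64-encodes a byte buffer and wraps it in a CDATA section inside one element.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Standard Base64 encoding with '=' padding.
std::string base64Encode (const std::vector<std::uint8_t>& data)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    out.reserve (((data.size() + 2) / 3) * 4);

    for (std::size_t i = 0; i < data.size(); i += 3)
    {
        const std::size_t remaining = data.size() - i;

        // Pack up to three bytes into one 24-bit group.
        std::uint32_t group = static_cast<std::uint32_t> (data[i]) << 16;
        if (remaining > 1) group |= static_cast<std::uint32_t> (data[i + 1]) << 8;
        if (remaining > 2) group |= data[i + 2];

        // Emit four 6-bit symbols, padding with '=' where input ran out.
        out += alphabet[(group >> 18) & 0x3f];
        out += alphabet[(group >> 12) & 0x3f];
        out += remaining > 1 ? alphabet[(group >> 6) & 0x3f] : '=';
        out += remaining > 2 ? alphabet[group & 0x3f] : '=';
    }

    return out;
}

// Hypothetical helper: wraps the encoded bytes in <tagName><![CDATA[...]]></tagName>.
std::string makeBinaryElement (const std::string& tagName,
                               const std::vector<std::uint8_t>& data)
{
    return "<" + tagName + "><![CDATA[" + base64Encode (data) + "]]></" + tagName + ">";
}
```

For the three bytes of "Man", makeBinaryElement ("icon", ...) gives <icon><![CDATA[TWFu]]></icon>. Since the Base64 alphabet contains no '<', '&' or ']' characters, the encoded text can never end the CDATA section early.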


#12