Varying Success with URL::readEntireXmlStream


#1

As the subject says, I’m having mixed (but mainly unsuccessful) results with URL::readEntireXmlStream. What I’m basically doing is trying to parse a webpage as XML and then sort through it to extract various bits of information. What I don’t understand is that it works pretty well with certain pages (http://www.chemical-records.co.uk/sc/servlet/Info?Track=31R038), normally getting through on the first or second attempt, but is never successful on other pages (http://www.chemical-records.co.uk/sc/servlet/Info?Track=LFTD004), even though they are basically the same page, just fed with different information from the database.

I tried readEntireTextStream as well to make sure it wasn’t the XML parsing that’s failing, and got the same results. Does anyone have any pointers they could throw my way on this? I’ve spent a whole day on it and am not much further along. Could it be that the page is not getting loaded quickly enough, so the program just carries on without it? If so, is there any way to pause execution for a while so it can download the information?

Again, any help or ideas much appreciated.

Dave.


#2

XML ain’t HTML. You will get weird results if the page is not well-formed, and even if it is, there will be problems.
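
To illustrate the well-formedness point, here is a rough sketch in plain standard C++ (not JUCE code, and not how any real XML parser works internally). The function name tagsAreBalanced is made up for this example. It only checks that tags open and close in matching order, and it naively skips anything starting with `!` or `?`, so treat it as a toy for seeing why typical HTML trips up a strict XML parser:

```cpp
#include <stack>
#include <string>

// Hypothetical helper for illustration only: returns true if every opening
// tag is closed by a matching closing tag in the right order.
// Simplifications: comments, CDATA and quoted '>' characters are not handled.
bool tagsAreBalanced (const std::string& text)
{
    std::stack<std::string> openTags;
    std::size_t pos = 0;

    while ((pos = text.find ('<', pos)) != std::string::npos)
    {
        const std::size_t end = text.find ('>', pos);

        if (end == std::string::npos)
            return false; // a '<' with no closing '>'

        std::string tag = text.substr (pos + 1, end - pos - 1);
        pos = end + 1;

        if (tag.empty())
            return false; // "<>" is not a tag

        if (tag[0] == '!' || tag[0] == '?')
            continue; // naively skip doctype, comments, processing instructions

        if (tag.back() == '/')
            continue; // self-closing tag such as <br/>

        const bool isClosing = (tag[0] == '/');

        if (isClosing)
            tag.erase (0, 1);

        // The element name is everything up to the first whitespace.
        const std::string name = tag.substr (0, tag.find_first_of (" \t\r\n"));

        if (isClosing)
        {
            if (openTags.empty() || openTags.top() != name)
                return false; // closing tag does not match the innermost open tag

            openTags.pop();
        }
        else
        {
            openTags.push (name);
        }
    }

    return openTags.empty(); // anything left open means the text is unbalanced
}
```

With this sketch, tagsAreBalanced ("<p>one<br/>two</p>") returns true, while tagsAreBalanced ("<p>one<br>two</p>") returns false, because the bare <br> that HTML allows is never closed. A strict XML parser would reject that second page in much the same way.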