I’m working on a 3D application and I have to read big (50 MB) text files containing 3D model data. As the file format is line oriented I tried to do it the clean way (the juce way) and use InputStream::readNextLine(). It turns out to be very slow.
The profiler revealed that the program spends most of its time in String::operator+=, in fact 10 times more than in any other relevant function, including the actual I/O operations.
As the lines in my file type are generally longer than 32 characters, I tried to recompile juce with s.preallocateStorage(256) instead of the original s.preallocateStorage(32) in InputStream::readNextLine, but it only gave a marginal improvement.
I think that inside InputStream::readNextLine the code should not use a juce String to accumulate the characters. Reference counting, graceful allocation growth and “juceness” in general are not important here; a simple C data type could hold the characters and be converted to a juce String at the end.
Jules, what do you think? If it is not possible in juce I will have to fall back to file block reading and good old character pointers …
Thanks Jules, that’s exactly what I needed here. With the 50 MB test file, the 40 sec read time came down to 5 secs.
A minor comment: you should use tchar instead of char as the type of buffer (otherwise it did not compile for me). I hope you will include this in the next release, it’s already in my personal build of juce.
Thanks again, I love your reactivity, imagine I ask the same thing from the MFC team at Microsoft …
[quote=“Gwynhale”]Thanks Jules, that’s exactly what I needed here. With the 50 MB test file, the 40 sec read time came down to 5 secs.
A minor comment: you should use tchar instead of char as the type of buffer (otherwise it did not compile for me). I hope you will include this in the next release, it’s already in my personal build of juce.
Thanks again, I love your reactivity, imagine I ask the same thing from the MFC team at Microsoft …[/quote]
There’s still an MFC team?
Bring me my broadsword! There is killing to be done.
I don’t understand this part.
If a line ends with only ‘\r’, is it still seen as a whole line?
Shouldn’t it take ‘\r\n’ to end the line, and thus a continue instead of a break?
If there’s only a \r without a \n after it, then you’d still want that to be counted as a line, wouldn’t you? Some systems (classic Mac OS text files, for example) use a single \r as a new-line.