Compiling "big" files with Visual Studio


#1

I’m trying to compile a plugin with a “big” file (around 15 MB) but I’m getting this error:

fatal error C1060: compiler is out of heap space

I tried both using IntroJucer/ProJucer binary data and the old BinaryBuilder but got the same error. The .cpp file generated is around 57 MB (from that 15 MB file).

Under OSX everything is fine (as usual). I also tried to allocate less memory to Windows (using bcdedit /set IncreaseUserVa 3072) and sometimes the error I get is different:

Command line error D8040: error creating or communicating with child process

I’m lost. Is there anything else I can try?


#2

From what I understand BinaryBuilder won’t really help since it lumps all binary data into a single .cpp, while Projucer is a bit smarter and can break up binary data into multiple files if the files become too big. However, I think Projucer’s behavior is only beneficial if you have a lot of small files as opposed to a few big ones, as I don’t believe it will break up single big files into multiple .cpp files (which would solve your issue).

I suspect you’re the first to have major issues with this limitation because the JUCE embedded resource system is generally used for small chunks of data like samples, images, icons, etc. Have you considered whether storing the file separately on disk and accessing it from within your app may be more appropriate? You could even pull some fanciness like having it compressed/streamed/etc. instead of always resident in memory.


#3

Thanks for the reply. Yes, I just noticed that, one big file won’t be split into multiple .cpp.
Unfortunately I can’t really store it on disk, I really need to have it bundled with the plugin.
I’m wondering, is there a way to split it manually and then recombine it in the code somehow?
Or any other option I may try?


#4

What I would suggest as a workaround which I think should work (and how I think Projucer should implement this) is to make binary blobs nested arrays so that they can have their contents defined in separate files.

I.e. in BinaryData.h, we’d declare the file as 5 contiguous chunks of 1024 bytes, to be defined elsewhere:
extern const char bigFile_bin[5][1024];
const int bigFile_binSize = 5002;

Then you would make a couple of .cpp files that specify the chunks of the file: bigFile_bin[0] in one .cpp, bigFile_bin[1] in another, etc. To access it, I’m pretty sure you can just cast bigFile_bin to a const char*; since the entire multidimensional array is stored contiguously, it should “just work” so long as you respect the length parameter (unless your data is exactly divisible by the sub-array chunk size, you’ll have leftover padding bytes at the end).

To actually make this I’d suggest taking the giant array the Projucer put out, splitting it up manually into separate files, and padding it with zeros at the end.
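The split-and-pad step does not have to be done by hand in a text editor. Below is a minimal sketch in plain standard C++, assuming the raw file has already been read into memory; `Bytes` and `splitIntoChunks` are hypothetical names and not part of JUCE or the Projucer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Splits `data` into chunks of exactly `chunkSize` bytes.
// The last chunk is padded with zeros, so the caller must keep the
// original length (data.size()) to know where the real data ends.
std::vector<Bytes> splitIntoChunks (const Bytes& data, std::size_t chunkSize)
{
    assert (chunkSize > 0);

    std::vector<Bytes> chunks;

    for (std::size_t offset = 0; offset < data.size(); offset += chunkSize)
    {
        const std::size_t count = std::min (chunkSize, data.size() - offset);
        const auto first = data.begin() + static_cast<std::ptrdiff_t> (offset);

        Bytes chunk (chunkSize, 0);   // zero-filled, so any tail is already padded
        std::copy (first, first + static_cast<std::ptrdiff_t> (count), chunk.begin());
        chunks.push_back (chunk);
    }

    return chunks;
}
```

With the 5002-byte example above and a chunk size of 1024, this yields 5 chunks, the last holding 906 real bytes followed by 118 zeros. Each chunk could then be written out as its own array literal.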


#5

Ok, sounds reasonable, although I’m not really sure how to do it in practice. Even on my fast machine, opening the huge .cpp takes a lot of time.
Anyway, I’ll experiment with this tomorrow. Thanks again.


#6

In this situation your choice of text editor is important. Pretty much anything will choke and die on a file that size except Sublime Text (from my own experience), which I can only assume is powered by magic.


#7

If the file is big because you’re adding many smaller files to the binary data, then the Projucer has a setting for this!
If you look on the top-level settings for the project, there’s a setting called “BinaryData.cpp size limit” - if you set that to e.g. 5MB then it’ll create multiple smaller binarydata.cpps.

(But if you try to put a single large 15MB file in there, it’ll obviously not be able to split that single C++ literal declaration across different source files, so this won’t be much help)


#8

Thanks Jules, but yes, I have a large 15 MB file that I need to split. It would be nice to have such a feature in ProJucer.

Anyway, I managed to manually split the array but now I’m not really sure what to do. I have this:

extern const char bigFile_bin[2][7830077];
const int bigFile_binSize = 15660154;

In each of the two big .cpp files I have this:

static const unsigned char temp_binary_data_1[] = {137,80,78,...};

Usually ProJucer adds something like:
const char* bigFile_bin = (const char*) temp_binary_data_1;

What should I do when the array is split into two?


#9

Does the /bigobj flag help? https://msdn.microsoft.com/en-us/library/ms173499.aspx


#10

Unfortunately no, just tried. Thanks anyway.


#11

There’s no point trying to split it like that - there’s no way I can think of to declare a single literal array that combines two arrays from other compile units.

You might as well physically split your binary file into several pieces and add them to the binary data, because however you do this you’re going to end up with several arrays containing chunks of it, which your app will need to stitch together somehow.
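A minimal sketch of that stitching step, assuming the file was physically split into two pieces that each end up as a separate array. The names `chunk0`, `chunk1`, `Chunk`, `stitchChunks` and `loadBigFile` are hypothetical stand-ins (in a real project the arrays would be whatever symbols the Projucer generates), and plain standard C++ is used here to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the arrays generated from each piece of the file.
static const unsigned char chunk0[] = { 137, 80, 78 };
static const unsigned char chunk1[] = { 71, 13 };

struct Chunk
{
    const unsigned char* data;
    std::size_t size;
};

// Copies every chunk, in order, into one contiguous buffer.
std::vector<unsigned char> stitchChunks (const std::vector<Chunk>& chunks)
{
    std::size_t total = 0;
    for (const auto& c : chunks)
        total += c.size;

    std::vector<unsigned char> result;
    result.reserve (total);

    for (const auto& c : chunks)
    {
        assert (c.data != nullptr || c.size == 0);
        result.insert (result.end(), c.data, c.data + c.size);
    }

    return result;
}

// Example: rebuild the original 5-byte "file" from its two pieces.
std::vector<unsigned char> loadBigFile()
{
    return stitchChunks ({ { chunk0, sizeof (chunk0) },
                           { chunk1, sizeof (chunk1) } });
}
```

The trade-off is a runtime copy: the data exists once in the binary and once more in the stitched buffer for as long as that buffer is alive.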


#12

I see…well it’s really strange that a 15 MB file can cause so much trouble with Visual Studio. Thanks anyway.


#13

GCC on Linux used to have similar trouble a few years ago (much more so even than MSVC).

It’s rarely a problem for programmers on Windows, because the standard way to include large chunks of data in an executable is using resources (with LoadResource and friends). The possible downside of this approach is that the address of the data blob is no longer a compile-time constant.

Another thing you can try: what happens if you define big datafiles as an array of uint64 instead of char? This would have only 1/8 of the amount of integer literals in that CPP file.


#14

I made it use chars rather than int64s to avoid endianness issues, since doing it with int64s would mean the whole source file would have to be twice as big and contain both big- and little-endian versions of the data.
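To illustrate the endianness concern, here is a small sketch (not what the Projucer does) of packing bytes into 64-bit words. `packWord` and `unpackWords` are hypothetical helpers. Reading such a word array directly through a `const char*` would give the bytes in their original order only on a little-endian host; recovering them with shifts works on any host but needs a runtime copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs up to 8 bytes into one 64-bit word, first byte in the lowest 8 bits.
std::uint64_t packWord (const unsigned char* bytes, std::size_t count)
{
    std::uint64_t word = 0;

    for (std::size_t i = 0; i < count && i < 8; ++i)
        word |= static_cast<std::uint64_t> (bytes[i]) << (8 * i);

    return word;
}

// Recovers the original byte sequence using shifts, so the result does not
// depend on the byte order of the machine running the code.
std::vector<unsigned char> unpackWords (const std::uint64_t* words, std::size_t numBytes)
{
    assert (words != nullptr || numBytes == 0);

    std::vector<unsigned char> out (numBytes);

    for (std::size_t i = 0; i < numBytes; ++i)
        out[i] = static_cast<unsigned char> ((words[i / 8] >> (8 * (i % 8))) & 0xff);

    return out;
}
```

For example, the eight bytes 137,80,78,71,13,10,26,10 pack into the single literal 0x0a1a0a0d474e5089, which is one token for the compiler instead of eight.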

Using the new C++11 literal syntax it might actually be possible to write it more compactly as a big string literal with escaped character codes. Of course whether that’d help with Visual Studio depends on whether it’s the parser or the code generator that’s causing the algorithmic complexity problem…


#15

Is your resource already compressed? You could zip it first and add the zipped content as resource.


#16

And isn’t base64 encoding a more compact way of storing data than just a char array?


#17

No, it’s much, much worse!


#18

Well, yes and no. Yes, the encoded data will be bigger than the raw bytes, but its textual representation is much smaller than a char array’s, so base64 seems a better choice when embedding a resource in a .cpp file.
A char array has to be written out as text, and every value is followed by a ‘,’, so that’s a loss to begin with.

As a test I embedded a PNG both as a char array and as a base64-encoded string literal in a .cpp file.
The array version is 11 KB; the base64 one is 4 KB.

And you know what’s cool, you don’t even need the extra size variable because the size is taken from the string. So the binarydata file looks cleaner as well.

I ran a test and decoding works fine. Of course, you need to decode the base64 string into a MemoryBlock before you do anything with it.
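For reference, the decoding step can be sketched in plain standard C++. This is a stand-in for illustration, not the JUCE MemoryBlock route mentioned above: `base64Value` and `decodeBase64` are hypothetical names, and the sketch assumes the standard RFC 4648 alphabet with `=` padding. Base64 text uses 4 characters per 3 bytes, whereas a decimal array literal can need up to 4 characters for a single byte (e.g. `255,`).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Maps one base64 character to its 6-bit value, or -1 if it is not in the alphabet.
int base64Value (char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes standard base64 text; stops at '=' padding or any other
// character outside the alphabet.
std::vector<unsigned char> decodeBase64 (const std::string& text)
{
    std::vector<unsigned char> out;
    unsigned int buffer = 0;   // holds the most recent 6-bit groups
    int bits = 0;              // number of not-yet-emitted bits in `buffer`

    for (char c : text)
    {
        const int v = base64Value (c);
        if (v < 0)
            break;

        buffer = (buffer << 6) | static_cast<unsigned int> (v);
        bits += 6;

        if (bits >= 8)
        {
            bits -= 8;
            out.push_back (static_cast<unsigned char> ((buffer >> bits) & 0xff));
        }

        assert (bits < 8);
    }

    return out;
}
```

The decoded buffer is a runtime copy, so the resource occupies memory twice while both the literal and the decoded bytes are in use.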


#19

@Rebbur I think that’s missing the point. The issue is not the size of the .cpp file, but how much memory the compiler uses when compiling it. String literals won’t work for large files: MSVC also has a 64k size limit on string literals, and you cannot split the data into smaller chunks because string literals are always null-terminated.

@jules Yes the different endianness is annoying. Doubling the size of the .cpp may be a problem if you’re very low on disk space, but I expect for the compiler it won’t matter if the preprocessor strips away half of the file.

I’ve no idea what causes the memory usage either but it seems to be (in this particular case) increasing with the amount of integer literals. MSVC takes a similar amount of memory and time compiling either a char[] or uint64[] of length 20,000.

You can also keep the large files as separate files and make sure they get installed in the same folder as your DLL or EXE. Load them using File::getSpecialLocation(File::currentExecutableFile).getSiblingFile("myData.dat").