Best practice for reading only part of a large file while saving memory?

Hi guys,

I’m trying to optimize how my program reads a large file. Basically, I have a file that’s over 2GB in size, but I only need to read roughly 100MB of data from it each time. The read is triggered by a user action, so it isn’t frequent at all.

The original design was to load the whole file (2GB) into memory at the very beginning, which works fine. However, I still view it as a waste of memory that should be optimized. So I tried loading the whole file (+2GB), extracting what I need (+100MB) into another memory block, and then freeing the whole file (-2GB). The final memory usage is about 100MB, but this method still produces a peak memory usage of 2GB for a short time.

Is there any better method to avoid that 2GB peak memory usage?

Thanks,
Liang

The typical method of dealing with files larger than your RAM budget is to use memory-mapped files to access only the data you need. Telling the OS that the file is to be memory mapped means that when you access portions of your process’s address space within that mapping, the OS pages the corresponding parts of the file in and out on demand, without consuming all available memory.

This is standard practice, but I’m not sure whether JUCE provides framework functions for doing this. It wouldn’t surprise me if it did somewhere, but I just haven’t encountered it yet myself…
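
For reference, this is roughly what the technique looks like with the raw POSIX calls (a minimal sketch assuming Linux/macOS; Windows has an equivalent CreateFileMapping/MapViewOfFile API, and the file name and window size here are made up):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = open ("huge_file.bin", O_RDONLY);   // hypothetical file name
    if (fd < 0) return 1;

    // Map a 100MB window of the file. The offset passed to mmap must be
    // a multiple of the page size (sysconf (_SC_PAGESIZE)); 0 always is.
    const size_t windowSize = 100 * 1024 * 1024;
    void* window = mmap (nullptr, windowSize, PROT_READ, MAP_PRIVATE, fd, 0);
    if (window == MAP_FAILED) { close (fd); return 1; }

    // Touching these bytes makes the OS page in only the parts we read;
    // the full 2GB file is never resident in memory all at once.
    const unsigned char* bytes = static_cast<const unsigned char*> (window);
    std::printf ("first byte: %u\n", bytes[0]);

    munmap (window, windowSize);
    close (fd);
    return 0;
}
```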

See some details on the technique here:

Memory Mapped Files And Shared Memory For C++.

Thanks for your reply.

JUCE already has a MemoryMappedFile class. I guess using it together with MemoryInputStream is the right way to do memory-mapped files with the JUCE framework. I’m gonna give it a try 🙂
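
Something like this looks like it should work (a minimal sketch, assuming a JUCE version that has the constructor for mapping just a section of the file; the region is illustrative):

```cpp
#include <juce_core/juce_core.h>

void readWindow (const juce::File& bigFile)
{
    // Map only the ~100MB region we need instead of the whole 2GB file.
    juce::Range<juce::int64> region (0, 100 * 1024 * 1024);
    juce::MemoryMappedFile mapped (bigFile, region, juce::MemoryMappedFile::readOnly);

    if (mapped.getData() != nullptr)
    {
        // Wrap the mapped bytes in a stream without copying them
        // (the 'false' flag tells MemoryInputStream not to take a copy).
        juce::MemoryInputStream stream (mapped.getData(), mapped.getSize(), false);

        // ... read from 'stream' as usual ...
    }
}   // the mapping is released when 'mapped' goes out of scope
```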

Liang


It’s a good technique to learn no matter which framework you’re using. Glad you found JUCE’s own classes for the subject…