Simultaneously editing and streaming a large file?


#1

Let's take an example case:

1 - A 100MB sample file (i.e. too large to reasonably keep resident in memory) is selected by the user. They start it playing.

2 - While it's playing, they apply a volume envelope to a 10MB area in the middle of the file.  That 10MB section must be read from disk (most likely in blocks), modified and written back to disk without needing to touch the unmodified 90MB.

To achieve part 2, I'd probably choose to open a single stream for read+write, but JUCE can't do this.  I could instead have two streams open on the same file (one for read, one for write) and then perform all editing via these streams, but something tells me that having a file open multiple times for both read and write is probably bad news (or at the very least, I'll be exposing myself to platform-specific behaviours).
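For reference, here is a rough sketch of what part 2 could look like with a single read+write stream, using plain std::fstream rather than any JUCE class. Everything in it is an assumption for illustration: the helper name applyGainInPlace is made up, the data is treated as raw 16-bit samples with no header, and a constant gain stands in for the volume envelope.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: scales the 16-bit samples stored in the byte range
// [byteOffset, byteOffset + byteCount) by `gain`, one block at a time.
// Bytes outside that range are never read or written.
bool applyGainInPlace (const std::string& path,
                       std::streamoff byteOffset,
                       std::streamoff byteCount,
                       float gain,
                       std::size_t blockBytes = 64 * 1024)
{
    assert (byteOffset >= 0 && byteCount >= 0);

    if (blockBytes < sizeof (std::int16_t))
        return false;

    // One stream opened for both reading and writing; the file must exist.
    std::fstream file (path, std::ios::in | std::ios::out | std::ios::binary);
    if (! file)
        return false;

    std::vector<std::int16_t> block (blockBytes / sizeof (std::int16_t));
    const std::streamoff end = byteOffset + byteCount;

    for (std::streamoff pos = byteOffset; pos < end;)
    {
        const std::streamoff bytes = std::min<std::streamoff> (
            end - pos,
            static_cast<std::streamoff> (block.size() * sizeof (std::int16_t)));

        // Read one block of the edited region.
        file.seekg (pos);
        file.read (reinterpret_cast<char*> (block.data()), bytes);
        if (file.gcount() != bytes)
            return false;

        // Modify it in memory, clamping to the 16-bit range.
        const std::size_t numSamples
            = static_cast<std::size_t> (bytes) / sizeof (std::int16_t);
        for (std::size_t i = 0; i < numSamples; ++i)
        {
            const float scaled = static_cast<float> (block[i]) * gain;
            block[i] = static_cast<std::int16_t> (
                std::max (-32768.0f, std::min (32767.0f, scaled)));
        }

        // Seek back and overwrite the same bytes.
        file.seekp (pos);
        file.write (reinterpret_cast<const char*> (block.data()), bytes);
        if (! file)
            return false;

        pos += bytes;
    }

    file.flush();
    return file.good();
}
```

The explicit seek before every read and every write matters: a stream opened for both directions needs a repositioning call when switching between reading and writing.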

Since this is a common use case, I'd like to ask what others have done to solve it?


#2

I can't help but wonder, why don't you just read the file into memory, perform your volume envelope task and then save it? Either to the same file or, preferably, to a new one to keep the original intact.

And, by the way, 100MB might have been considered a big file 10 years ago, when people developed for 32-bit Windows XP.


#3

Memory-mapping the file might be the way to go?


#4

100MB is still big on a mobile device in 2013! I was also originally wondering if mmap() would do the job, but after some pondering over the weekend and (admittedly) a quick peek at Audacity's source, I think this might be a nice way to do it:

 

- When a sample file is opened, the app builds a list of chunks of, e.g., 1MB each. This list of chunks always stays in RAM, and at this point each chunk simply points to the appropriate offset in the file.

- When a section of the file is edited, the affected chunks are read in, modified and written to temp files.  The chunk list is updated so that some chunks now point to the temp files and the remaining chunks still point to the original file.

- Note that the original file is not changed: modified chunks build up and should eventually be 'committed' (i.e. written to the original file) when the user explicitly saves.

- The chunk system lends itself well to efficient undo/redo, and is a natural place to also store thumbnail info etc.
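To make the chunk list idea above a bit more concrete, here is a minimal sketch of the bookkeeping only (no actual file I/O). All the names (Chunk, buildChunkList, redirectToTempFile, chunksToCommit) are invented for this example and are not taken from Audacity or JUCE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical chunk descriptor: says where the bytes for one chunk live.
struct Chunk
{
    std::string  sourcePath;    // the original file, or a temp file once edited
    std::int64_t sourceOffset;  // byte offset of the chunk inside sourcePath
    std::int64_t length;        // chunk length in bytes
    bool         modified;      // true once the chunk points at a temp file
};

// Builds the initial list: every chunk points into the original file.
std::vector<Chunk> buildChunkList (const std::string& originalPath,
                                   std::int64_t fileSize,
                                   std::int64_t chunkSize)
{
    assert (fileSize >= 0 && chunkSize > 0);

    std::vector<Chunk> chunks;
    for (std::int64_t offset = 0; offset < fileSize; offset += chunkSize)
        chunks.push_back ({ originalPath, offset,
                            std::min (chunkSize, fileSize - offset), false });
    return chunks;
}

// Called after the edited data for a chunk has been written to a temp file
// (assumed here to hold exactly that one chunk, starting at offset 0).
void redirectToTempFile (Chunk& chunk, const std::string& tempPath)
{
    chunk.sourcePath   = tempPath;
    chunk.sourceOffset = 0;
    chunk.modified     = true;
}

// On an explicit save, only these chunk indices need copying back.
std::vector<std::size_t> chunksToCommit (const std::vector<Chunk>& chunks)
{
    std::vector<std::size_t> indices;
    for (std::size_t i = 0; i < chunks.size(); ++i)
        if (chunks[i].modified)
            indices.push_back (i);
    return indices;
}
```

A playback reader would walk this list in order and pull each chunk's bytes from whichever file it currently points to. Undo could be handled by keeping the previous Chunk value around before redirecting it.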

 

Main downside: playing the modified file (e.g. auditioning edits prior to saving them) is a matter of following the modified chunk list, which is fine for streaming audio sources with linear access patterns but a PITA for synths which expect random access to resident samples. Memory mapping might still come to the rescue here:

http://stackoverflow.com/questions/10454964/mapping-non-contiguous-blocks-from-a-file-into-contiguous-memory-addresses

However, crappy WinRT doesn't support memory mapping to explicit addresses, and I've been trying not to 'design myself out of' supporting this platform, although heaven knows why.


#5

If you have chunks of fixed size, you can easily compute an ID from the offset and see whether the chunk needs to be streamed from the file or is already there because it was modified.
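Something along these lines, as a rough sketch: with a fixed chunk size the ID is just an integer division of the byte offset, and a set of IDs records which chunks have been modified. The ModifiedChunks type and its member names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical tracker for fixed-size chunks.
struct ModifiedChunks
{
    std::int64_t chunkSize;                       // e.g. 1MB, must be > 0
    std::unordered_set<std::int64_t> modifiedIds; // IDs of edited chunks

    // With fixed-size chunks the ID is a single division.
    std::int64_t idForOffset (std::int64_t byteOffset) const
    {
        assert (chunkSize > 0 && byteOffset >= 0);
        return byteOffset / chunkSize;
    }

    // Marks every chunk that overlaps the byte range [begin, end).
    void markModified (std::int64_t begin, std::int64_t end)
    {
        if (end <= begin)
            return;

        for (std::int64_t id = idForOffset (begin); id <= idForOffset (end - 1); ++id)
            modifiedIds.insert (id);
    }

    // True if the chunk holding byteOffset has been edited, false if it can
    // still be streamed straight from the original file.
    bool isModified (std::int64_t byteOffset) const
    {
        return modifiedIds.count (idForOffset (byteOffset)) != 0;
    }
};
```

For the 100MB example with 1MB chunks, editing the range from 45MB to 55MB would mark IDs 45 through 54, and everything else keeps streaming from the original file.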


#6

The blocks would start out at a uniform size, but certain editing operations (e.g. deleting or inserting a section in the middle of the sample) would have to delete, merge or split blocks, so they'd end up at odd sizes.
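Here is a small sketch of why that happens, assuming a block is nothing more than an (offset, length) pair into some backing file. The Block type and deleteRange function are hypothetical names for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical block descriptor: a run of bytes in a backing file.
struct Block
{
    std::int64_t sourceOffset;  // where the run starts in the backing file
    std::int64_t length;        // how many bytes it covers
};

// Removes `count` bytes starting at logical position `start` from the
// sequence described by `blocks`. No sample data is moved: blocks are
// dropped, trimmed or split, which is why their sizes stop being uniform.
std::vector<Block> deleteRange (const std::vector<Block>& blocks,
                                std::int64_t start,
                                std::int64_t count)
{
    assert (start >= 0);

    if (count <= 0)
        return blocks;

    std::vector<Block> result;
    const std::int64_t end = start + count;
    std::int64_t blockStart = 0;  // logical position of the current block

    for (const Block& b : blocks)
    {
        const std::int64_t blockEnd = blockStart + b.length;

        // Part of this block that lies before the deleted range.
        const std::int64_t headLength = std::min (blockEnd, start) - blockStart;
        if (headLength > 0)
            result.push_back ({ b.sourceOffset, headLength });

        // Part of this block that lies after the deleted range.
        const std::int64_t tailStart = std::max (blockStart, end);
        if (blockEnd > tailStart)
            result.push_back ({ b.sourceOffset + (tailStart - blockStart),
                                blockEnd - tailStart });

        blockStart = blockEnd;
    }

    return result;
}
```

Once blocks have different lengths, finding the block for a given sample position is no longer a single division; it needs a walk or a binary search over the running block start positions.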

I think I'm just going to impose a size limit on samples used by the synth (e.g. 4-8MB; it is a mobile app) and then keep the initial part of the sample (up to this size limit) always resident in memory in its 'linear' format (i.e. without the synth's sample player needing to hop around different blocks).

Using mmap to rewire non-contiguous blocks into contiguous memory would be a very cool technique if it's possible, but it's not available on all the platforms I want to consider.