Heap memory recycling allocator


#1

In multi-threaded applications, all threads use the same critical section for locking the heap memory, which can be a bottleneck and a cause of spikes/clicks/pops/buffer-repeats during audio playback.
This allocator recycles memory blocks deallocated by “delete” in its own per-thread cache.
It overloads the global new and delete operators (which is IMHO resolved by the linker, as part of the C++ standard), so no header file is required.
This is a proof of concept; the caches won’t be cleaned up at the end, so it WILL LEAK AS HELL (but if an application is about to close anyway, this should make no difference).
Of course, while the allocator is enabled you won’t find your real memory leaks, but since it’s a wrapper around all other new/deletes it could be used to create your own leak detector.

It would be cool to have something like an “OwningThreadLocalPointer”, which calls a destructor when a thread gets closed.

Proof of concept, only tested with MS VS2008

To enable it please define USE_CACHINGALLOCATER

Every memory block is classified into a size category.
Every size category/thread uses its own cache.
Size categories:
0 = 0…2 bytes
1 = 3…4 bytes
2 = 5…8 bytes
3 = 9…16 bytes
4 = 17…32 bytes
5 = 33…64 bytes
6 = 65…128 bytes
7 = 129…256 bytes
8 = 257…512 bytes
9 = 513…1024 bytes, and so on…
-1 = > 2<<CK_PREALLOCATOR_RECYCLE_MEMORYBLOCKS_UP_TO_SIZE_CATEGORY

Bigger memory blocks are simply allocated with malloc/free.

File:
https://github.com/jchkn/ckjucetools/blob/master/CachingAllocaterMultiThreaded/CachingAllocaterMultiThreaded.cpp


#2

Interesting and fun to play with, but in my humble experience, I’ve found that

a) Whatever great idea you think you’ve had about improving malloc, someone else has had it before, and the default implementation probably already does it better than anything you could ever write.

b) When you actually hit a performance hot-spot where malloc really is causing trouble, then blaming malloc is the wrong reaction. Instead you’ll get much better results by blaming the code that calls malloc too often, and improving your algorithms at a higher and more localised level.


#3

Thanks for your opinion.

a) The problem isn’t that malloc isn’t fast; the problem is that it shares the same lock across all threads! (At least here in VS2008.)
If thread A allocates 128 MB and thread B only 16 bytes, thread B has to wait.

b) The question is why we get performance hot-spots where malloc causes trouble: mostly because of the lock!
All these techniques we use to avoid malloc exist because of this single-lock attitude.
And sometimes the complexity gets way out of hand: it can be painful to implement a “malloc”-free design, and it won’t look more elegant.


#4

Yeah, I understood your post, but what I meant is that I’d be very surprised if many/most of the standard malloc implementations don’t already use thread-local tricks internally. It’s hardly a new idea!

I’d be interested to see your benchmarks, but I just think you’re being optimistic about how much improvement you can get!


#5

The Visual Studio runtimes (all versions) definitely use just a single global critical section, and not thread-local tricks.


#6

You should have a look at Google’s tcmalloc: http://goog-perftools.sourceforge.net/doc/tcmalloc.html

dlmalloc http://g.oswego.edu/dl/html/malloc.html
and TLSF http://www.gii.upv.es/tlsf/ are also worth looking at


#7

Thanks. Did you use any of them successfully on different platforms (Win/Mac)?
I ask because I had a quick look at tcmalloc and found it a little exorbitant and confusing for what I need; it also seems to have only rudimentary Windows support.
I can’t think why my solution shouldn’t be as quick as the fastest allocators (assuming the thread-local cache is filled).


#8

Well, the interesting question is what happens when one thread allocates a piece of memory and another thread tries to delete it later?


#9

Nothing; it will be cached in the freeing thread’s own cache, and if that cache is full it will be released by the C++ runtime’s own free function.
Every block of memory stores its size category in its header, so when the thread cache releases it, it is completely independent.

If an application heavily frees memory blocks allocated by other threads, this may result in a little more use of the original C++ malloc/free functions. You can see this as a downside if you want.
If I have a little more time, I will add some more statistical information, so it shows the “practical” usage of the heap lock.
I will also write a test benchmark app which does all the “bad things” (like new() on the audio thread, performance intrusion through critical sections) and measures the audio drop-outs; we’ll see if the caching allocator makes any difference.
Today’s machines are so damn fast it will be hard to measure a difference. If you have already chosen a heap-lock-free design you shouldn’t see any benefit.


#10

[quote=“chkn”]Thanks. Did you use any of them successfully on different platforms (Win/Mac)?
I ask because I had a quick look at tcmalloc and found it a little exorbitant and confusing for what I need; it also seems to have only rudimentary Windows support.
I can’t think why my solution shouldn’t be as quick as the fastest allocators (assuming the thread-local cache is filled).[/quote]

I didn’t use tcmalloc or dlmalloc in real projects, but we’re using TLSF as a real-time memory allocator for Lua.