10.3.15

tcmalloc

Last week I had the pleasure of working with the custom memory allocator from Google perf tools: http://goog-perftools.sourceforge.net/. It's called tcmalloc, so let me write a few words about it.

Tcmalloc is a replacement for the glibc malloc, which is what most standard operator new implementations use today. It's faster mainly because most allocations are served from a per-thread cache without taking a lock, which is also why it's designed with multithreading in mind. Here we run into trouble, because not every system supports the kind of MT that the tcmalloc implementation relies on. There is something called TLS (Thread Local Storage), and it could be that your system does not support TLS. If this is the case you should, in my opinion, try to find another solution. Why?
Indeed, it's possible to switch off TLS in the config.h file before you run ./configure on tcmalloc, and that will definitely switch TLS off for you, but then you won't see any speed-up from tcmalloc, so there is no point in changing it. Secondly, it's definitely designed for multithreaded environments. If you are running on one simple thread... well, in the next post I will show you some real results so that you can decide whether it's worth your time.
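
To make the TLS point a bit more concrete, here is a minimal sketch of a per-thread cache built on C++ thread_local. It is NOT tcmalloc's real implementation: ToyThreadCache, toy_alloc and toy_free are names I made up, there is only one block size, and the backing allocator is plain malloc. It only shows why a cache that lives in thread local storage needs no lock.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical toy cache, NOT tcmalloc's real data structure.
// thread_local gives every thread its own instance, so no lock is needed.
struct ToyThreadCache {
    static constexpr std::size_t kBlockSize = 64;  // one size class, to keep it short
    std::vector<void*> free_blocks;                // blocks this thread released earlier
    std::size_t cache_hits = 0;                    // allocations served from the cache

    ~ToyThreadCache() {
        // Hand the cached blocks back to the real allocator when the thread exits.
        for (void* p : free_blocks) std::free(p);
    }
};

inline ToyThreadCache& toy_cache() {
    thread_local ToyThreadCache cache;  // this is the part that requires TLS support
    return cache;
}

// Fast path: reuse a block cached by this thread. Slow path: fall back to malloc.
void* toy_alloc() {
    ToyThreadCache& c = toy_cache();
    if (!c.free_blocks.empty()) {
        void* p = c.free_blocks.back();
        c.free_blocks.pop_back();
        ++c.cache_hits;
        return p;
    }
    return std::malloc(ToyThreadCache::kBlockSize);
}

// Releasing a block only touches the calling thread's cache.
void toy_free(void* p) {
    if (p) toy_cache().free_blocks.push_back(p);
}
```

Without TLS the cache would have to be shared between threads and protected by a lock, which is roughly the speed-up you give away when you switch TLS off.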

So firstly, when to use it?
Only when you really have to. Indeed, it sounds nice to replace a basic function with another one and get a 5% average speed-up, but only in theory. In practice it may lead to many memory disasters, especially when you are dealing with dynamically linked libraries.
Why dynamically linked?
Well, consider that you are working on a huge project with many libraries. Compiling only your own library with tcmalloc won't work correctly with the other libraries: you will quickly see problems caused by freeing memory that is not exactly yours (allocated by tcmalloc but released by the plain free, or the opposite).
There is a way to get rid of this, and it's called LD_PRELOAD. It's a kind of workaround that ensures all libraries are loaded with tcmalloc even if they were not compiled with it. Sounds good? Yes indeed, you just need a basic export LD_PRELOAD=/path_to_lib.so and that is it. Rebuild. Have fun. This is currently the only possible way to make it work if you are working on a complex project with many libraries and your backend links to a lot of stuff that is not on your side. LD_PRELOAD makes the dynamic loader load tcmalloc before every other library, so any dynamically linked lib, including the ones opened with dlopen, ends up calling tcmalloc instead of the default malloc.

To sum up the most important parts:
  • It's designed to excel in multithreaded environments, but it may also give good results in a simple single-threaded application, mainly because it lets you set the page size manually, so you can configure it to your needs.
  • gperftools provides more than just tcmalloc, but tcmalloc is the base component for the rest.
  • The google folder in the gperftools package is deprecated. Please read the .h comments carefully when you configure it; it's quite common for people to link against the old deprecated google folder in the pkg.
  • If you are going to use tcmalloc with other tools from gperftools, please pay attention: a lot of things can go wrong in this case, especially in a 64-bit environment (stack unwinding, dlopen bugs and so on).
  • The easiest way to use it is not to link it but to load it with LD_PRELOAD, and for now that is also the safest way. It depends on your needs, but in big projects it will usually be the safest way.
  • Switch off TLS in config.h if your system does not support it. But if it doesn't, then why would you want to use tcmalloc?
  • DON'T use dlopen to load tcmalloc, it will not work properly: firstly because you will be forced to switch off TLS support, and secondly because you will quickly run into memory-freeing problems.
  • If you get an error like:
    ___tls_get_addr: symbol not found
    it means you are doomed, and your system is kinda old ;-) You can still fix it by turning off TLS support, but well... didn't I tell you that it's not worth it? ;-)
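
If you want to try the multithreaded claim on your own machine, a tiny allocation stress loop is enough to start with. This is only a rough sketch: churn and timed_run are names I made up, and the thread count, iteration count and block size are arbitrary knobs, not recommended values. It calls plain malloc and free, so the same binary can be run with and without LD_PRELOAD.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// One worker: allocate and free `iterations` blocks, return how many allocations succeeded.
std::size_t churn(std::size_t iterations, std::size_t block_size) {
    std::size_t ok = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        void* p = std::malloc(block_size);
        if (p) {
            // Touch the block so the malloc/free pair is not optimised away.
            static_cast<volatile char*>(p)[0] = 1;
            ++ok;
        }
        std::free(p);
    }
    return ok;
}

// Runs `threads` workers concurrently and returns the elapsed wall-clock time in milliseconds.
double timed_run(unsigned threads, std::size_t iterations, std::size_t block_size) {
    std::vector<std::thread> pool;
    auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([=] { churn(iterations, block_size); });
    for (auto& th : pool) th.join();
    std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Print the result of something like timed_run(4, 1000000, 64) from main, build with thread support enabled, and compare a run with LD_PRELOAD set against a run without it. Don't read too much into a loop this simple, because real programs allocate in very different patterns.
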
In the next tcmalloc episode I'm going to present some real examples of performance improvements, or the lack of them :-)

Rgds
$(TA)$
