11.4.16

Shared Libraries from a performance point of view... Introduction [Part 1]

Hi, it's been a while since my last post here, but there was a good reason for not posting anything: I was spending time learning new things, and meanwhile I was also able to check the performance of some of the shared libraries I work with.

So let me present some of my thoughts, but first let me start with an introduction to shared objects (SO), from high-level to low-level stuff. For UNIX developers most of this should already be known, but I hope even some of you will find it useful.

Let's start!

Static linking vs Dynamic linking





I could not make it clearer than those two simple pictures. So I hope we are all fine with this quick introduction to what dynamic linking is ;) Now let's go just a bit deeper (still floating! ;) ).

Dependent vs Runtime loaded library

Now, for new folks, things get a bit more interesting. We can distinguish two types of dynamic libraries: dependent ones and runtime loaded ones. Let me show you some simple code to distinguish the two:

Dependent library
Runtime loaded library
Runtime loaded way of use
How runtime loaded libraries really work

OK, now you may be a bit confused. Let's try to clarify it, briefly, as we are going to jump to much more interesting features. Runtime loaded means that the library is loaded on request, dynamically (with dlopen), whereas dependent libraries are specified at link time for your product (but the same symbol reference policy applies as on the static vs dynamic linking slide). Runtime loaded libraries are the more natural fit when we talk about dynamic objects. But be aware that "dynamic library" does not always mean the same thing. Keep that in mind, it will be needed later.
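To make the runtime loaded case concrete, here is a minimal sketch built on the POSIX dlopen/dlsym/dlclose calls. The helper name call_runtime_loaded and its double(double) signature are my own assumptions for illustration, not something from the original pictures. A dependent library needs none of this: you just call the function and name the library on the link line.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load the library at lib_path on request, look up a
 * function of type double(double) named symbol, call it with x and store
 * the result in *out. Returns 0 on success, -1 if the library or the
 * symbol cannot be found. */
int call_runtime_loaded(const char *lib_path, const char *symbol,
                        double x, double *out)
{
    /* RTLD_LAZY: resolve function references on first use. */
    void *handle = dlopen(lib_path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* POSIX idiom for turning the void* from dlsym into a function pointer. */
    double (*fn)(double) = NULL;
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    *out = fn(x);
    dlclose(handle); /* the library may be unloaded again here */
    return 0;
}
```

For example, on a typical glibc system call_runtime_loaded("libm.so.6", "cos", 0.0, &result) should load the math library only when that line runs (the library file name is system specific). On older glibc versions you may need to link with -ldl.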

Why do we use dynamic libraries at all?

Well, apart from the many different reasons people may give you for why dynamic is good, I just want to focus on the one that matters most from my perspective: hot loads. This means no more than that you do not have to recompile the whole project; it is enough to provide a new lib that will be hot swapped with the old one. Interesting, but also dangerous! In fact very dangerous... but still used in some production environments. (Be aware that we are talking here about pure dynamic libraries, as I like to call them: runtime loaded libraries, i.e. dlopen!)

But as you may think (and you are right!) there is no hot swap without... compatibility! And that compatibility is, for many reasons, crucial for any library developer, especially backward compatibility! So let's take a look at what compatibility means for us.

Compatibility between client and Draw Library

I think it's clear enough, but let's make it a bit harder... Can Client 1.1 use the Draw 1.0 library? Normally no, but if Client 1.1 has draw_polygon() declared as WEAK_IMPORT, then it can! The weak_import attribute tells the linker that this dependency is optional. But there is more, because the dynamic loader performs the version compatibility test only on dependent libraries. Dynamic libraries opened at runtime with dlopen don't go through this test! Oops. Moreover, you can compare a weakly imported function against NULL to check whether it is supported by the particular dynamic library that the code is currently running against. That's a lot of different possibilities!

weak_import attribute example
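
A minimal sketch of the NULL check, under a few assumptions of mine: the int draw_polygon(int) signature and the client_draw_polygon helper are made up for illustration, and on non-Apple toolchains I fall back to the GCC/Clang weak attribute, which gives similar behaviour for ELF libraries.

```c
#include <stddef.h>
#include <stdio.h>

/* weak_import is the Apple spelling; plain weak is assumed to be the
 * closest equivalent on GCC/Clang for ELF targets. */
#if defined(__APPLE__)
#define WEAK_IMPORT __attribute__((weak_import))
#else
#define WEAK_IMPORT __attribute__((weak))
#endif

/* Hypothetical signature: present in Draw 1.1, missing in Draw 1.0. */
extern int draw_polygon(int sides) WEAK_IMPORT;

/* Client code: use draw_polygon() only if the loaded library provides it. */
int client_draw_polygon(int sides)
{
    if (draw_polygon != NULL) {
        return draw_polygon(sides);
    }
    puts("draw_polygon() is not available in this Draw library");
    return -1;
}
```

When the client runs against a library without draw_polygon(), the symbol's address is NULL and the client takes the fallback path instead of failing to load.
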
In fact we are still at the very basics of dynamic libraries, but that is good: we have to understand them correctly before going deeper into the performance point of view. So if any concept mentioned so far is not clear, please clarify it before going on. OK? OK, let's take a look at something that, from the developer's point of view, is one of the most important parts of any library: the Library Interface.

Library Interface

Interface dependent vs runtime loaded

This should be clear: as a library we are obliged to export an interface for our clients (who in most cases are unknown to us). There are some basic rules for defining a Library Interface. Firstly, functions on the client side are referenced (linked) to their implementation, and that dependency may be resolved at link time or later (lazily). Secondly, we should export (I use this term intentionally) only the functions that make up the interface, and we should ship only the header files that declare them. Moreover, we should export wrappers only. A good example of such a wrapper function is a sort function.

Wrapper Example
What does that mean? It means that the internal, specific functions should usually be hidden, and the wrappers should be able to choose which one is best for the given data (for example). Inside the library itself we should call the internal functions directly (this avoids the overhead of going through an exported symbol, so it improves overall performance). And to make it all clear and enjoyable we should define an export strategy for our library: briefly, what, when and how we are going to export our interface. That kind of crucial info should be provided as a starting doc for any library developer who is going to contribute.
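
The wrapper idea could look roughly like this. All names (lib_sort, sort_small, sort_large) and the threshold of 16 are my own assumptions for this sketch. The internal functions are static, so they never show up in the exported symbol table, and only the wrapper is visible to clients.

```c
#include <stddef.h>
#include <stdlib.h>

#define SMALL_INPUT_LIMIT 16 /* hypothetical threshold, tune for real data */

/* Internal: insertion sort, cheap for very small inputs. */
static void sort_small(int *data, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        int key = data[i];
        size_t j = i;
        while (j > 0 && data[j - 1] > key) {
            data[j] = data[j - 1];
            j--;
        }
        data[j] = key;
    }
}

static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Internal: hand larger inputs to the C library's qsort. */
static void sort_large(int *data, size_t count)
{
    qsort(data, count, sizeof *data, compare_ints);
}

/* Exported wrapper: the only symbol clients see. It picks the internal
 * implementation that fits the input size. */
__attribute__((visibility("default")))
void lib_sort(int *data, size_t count)
{
    if (data == NULL || count < 2) {
        return;
    }
    if (count <= SMALL_INPUT_LIMIT) {
        sort_small(data, count);
    } else {
        sort_large(data, count);
    }
}
```

Code inside the library that already knows its input is small can call sort_small() directly and skip the wrapper. Building the library with -fvisibility=hidden would make "hidden unless marked" the default export strategy.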

Library Dependencies

When you write your own library, you usually also use other libraries that you depend on.
A very easy way to check which libraries you depend on is to use:

readelf -d yourlibrary.so 

which produces output that partially looks like:


What is interesting here from the dependency point of view is (NEEDED), which is nothing more than DT_NEEDED, which we will discuss a bit later on. NEEDED entries are the libraries that your library depends on and that will be searched when resolving symbols. In my case, as you can see, we have quite a lot of dependencies for this particular library, including libc, libgcc, libstdc++ and even libpthread (that is 100% normal). Important note to remember:
The more dependent libraries your library has, the longer it takes to load! You should keep the number of external references to symbols in dependent libraries to a minimum (only the ones you really need). This practice will further optimize your library's load time.

There is one more thing to discuss about library dependencies. When dependencies are loaded dynamically, some libraries may first need to initialize some values/allocators/whatever, and each library developer has to take care of that part by providing (if needed) library initializers and library finalizers. Please pay attention: you should not name them _init or _fini, because those are system initializers that you would otherwise override! And remember that you should never export them as part of your Library Interface.

How do you specify them? Quite easily (the function bodies are just a printf example):

Library constructor (initializer)

Library destructor (finalizer)
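
In text form, an initializer and finalizer pair could be sketched like this with the GCC/Clang constructor and destructor attributes. The names draw_lib_initializer, draw_lib_finalizer and draw_lib_is_ready are my own picks for illustration. Both functions are static, so they are not exported, and neither is called _init or _fini.

```c
#include <stdio.h>

/* Hypothetical library state set up by the initializer. */
static int lib_ready = 0;

/* Runs when the library is loaded, before any client call into it. */
__attribute__((constructor))
static void draw_lib_initializer(void)
{
    lib_ready = 1;
    printf("[%s] library initialized\n", __func__);
}

/* Runs when the library is unloaded or the process exits normally. */
__attribute__((destructor))
static void draw_lib_finalizer(void)
{
    lib_ready = 0;
    printf("[%s] library finalized\n", __func__);
}

/* Exported only so that a client can observe the effect in this sketch. */
int draw_lib_is_ready(void)
{
    return lib_ready;
}
```

By the time a client can call draw_lib_is_ready(), the initializer has already run, so it returns 1.
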
Is everything clear so far?
I hope so, because this was just a quick brief on the most basic concepts of shared library development. In Part 2 we are going to go deeper into the interesting stuff: the Dynamic Linker Startup Process! So fasten your seat belts before Part 2.

PS. I used some pictures from Google to present the concepts I'm talking about.
Credits go to: http://www.vincehuston.org/dp/all_uml.html
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/DynamicLibraries/100-Articles/DynamicLibraryDesignGuidelines.html and some that are not named because I have lost their websites ;) (ping me if you know the original source)