Migrating code to multi-core processors has become a major design decision for embedded developers. To help simplify migration efforts, operating system vendors have introduced solutions for asymmetric multiprocessing (AMP), symmetric multiprocessing (SMP), and bound multiprocessing (BMP). However, once the basic multiprocessing model is chosen, the real work begins. It isn’t enough to get software to run on a multi-core processor; the key to success is optimizing the software to make full use of all the processor’s cores.
This paper examines various techniques for optimizing code on multi-core processors. It addresses threading models for multiple concurrent tasks and parallel processing for increased performance. It discusses how to minimize lock contention with mutexes and semaphores by engineering the appropriate levels of lock granularity. Finally, the paper explores methodologies for resolving performance problems that result from inefficient use of CPU cache.