Embedded Design Handbook

ID 683689
Date 8/28/2023
Public

4.4.3.3.2. Profiler Overhead

Using the GNU profiler consumes additional memory and processor cycles.

Memory

The impact of the profiling information on the .text section size grows with the number of functions in the application, and is proportionally larger when the application consists of many small functions. The code overhead (the increase in the size of the .text section) appears when you enable profiling, due to the addition of the nios2_pcsample() and mcount() functions. The GNU profiler instruments the system timer with a call to nios2_pcsample(), and instruments every function with a call to mcount(). The .text section therefore increases by the size of the additional function calls plus the sizes of these two functions.

To view the impact on the .text section, compare the size of the .text section reported in the .objdump file for a build with profiling enabled against the size reported for a build without profiling.

The GNU profiler uses buckets to store data on the heap during profiling. Each bucket is two bytes in size and holds samples for 32 bytes of code in the .text section. The total number of profiler buckets allocated from the heap is the size of the .text section divided by 32. The heap memory that the GNU profiler buckets consume is therefore:

((.text section size) / 32) × 2 bytes
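The formula above can be sketched in C as a quick sizing check. This is a minimal illustration, not part of the profiler itself: the function name and macro names are hypothetical, and the sketch assumes plain integer division because the text does not say whether the profiler rounds the bucket count up.

```c
#include <stdint.h>

/* Constants taken from the description above. */
#define PROFILER_BYTES_PER_BUCKET       2u   /* each bucket is two bytes */
#define PROFILER_CODE_BYTES_PER_BUCKET  32u  /* each bucket covers 32 bytes of .text */

/*
 * Estimate the heap consumed by profiler buckets for a given .text size.
 * Hypothetical helper: integer division truncates here, which is an
 * assumption; the actual allocation may round differently.
 */
static inline uint32_t profiler_bucket_heap_bytes(uint32_t text_size_bytes)
{
    uint32_t buckets = text_size_bytes / PROFILER_CODE_BYTES_PER_BUCKET;
    return buckets * PROFILER_BYTES_PER_BUCKET;
}
```

For example, a .text section of 65,536 bytes yields 2,048 buckets, or 4,096 bytes of heap.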

The GNU profiler measures all functions in the object code that are compiled with profiling information. This set of functions includes the library functions, which include the run-time library and the BSP.

Processor Cycles

The GNU profiler tracks each individual function with a call to mcount(). Therefore, if the application code contains many small functions, the impact of the GNU profiler on processor time is larger. However, the resolution of the profiled data is higher. To calculate the additional processor time consumed by profiling with mcount(), multiply the amount of time that the processor requires to execute mcount() by the number of run-time function calls in your application run.

On every clock tick, the processor calls the nios2_pcsample() function. To calculate the additional processor time required to perform profiling with nios2_pcsample(), multiply the time the processor requires to execute this function by the number of clock ticks that your application run takes. This tick count includes the time spent calling and executing mcount().

To calculate the number of additional processor cycles used for profiling, add the overhead you calculated for all the calls to mcount() to the overhead you calculated for all the calls to nios2_pcsample().
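The calculation described in this section can be sketched in C as follows. The struct, field names, and function names are hypothetical and exist only for illustration; the per-call cycle counts are values you would have to measure on your own system, and the sketch assumes each call costs a constant number of cycles.

```c
#include <stdint.h>

/* Hypothetical inputs; all values must be measured or counted by the user. */
typedef struct {
    uint64_t mcount_cycles;    /* cycles to execute one mcount() call */
    uint64_t function_calls;   /* run-time function calls in the application run */
    uint64_t pcsample_cycles;  /* cycles to execute one nios2_pcsample() call */
    uint64_t clock_ticks;      /* clock ticks during the run, including mcount() time */
} profiler_overhead_inputs;

/* Overhead from instrumented function entries: cost per call times call count. */
static inline uint64_t mcount_overhead_cycles(const profiler_overhead_inputs *in)
{
    return in->mcount_cycles * in->function_calls;
}

/* Overhead from timer-driven sampling: cost per sample times tick count. */
static inline uint64_t pcsample_overhead_cycles(const profiler_overhead_inputs *in)
{
    return in->pcsample_cycles * in->clock_ticks;
}

/* Total additional processor cycles: the sum of both overheads. */
static inline uint64_t total_profiler_overhead_cycles(const profiler_overhead_inputs *in)
{
    return mcount_overhead_cycles(in) + pcsample_overhead_cycles(in);
}
```

With purely illustrative numbers (50 cycles per mcount() call, 100,000 function calls, 200 cycles per nios2_pcsample() call, and 1,000 clock ticks), the estimate is 5,000,000 + 200,000 = 5,200,000 additional cycles.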