Wednesday, December 28, 2022

ESP32 Performance Profiling

The ESP32 is very capable, but even the most capable device can be overwhelmed under heavy use. So how can we find out what is taking so long, or which function we should optimize to make things better?



FreeRTOS Real Time Stats

FreeRTOS has a function, vTaskGetRunTimeStats, that reports run-time statistics for each task. Collecting those statistics costs some performance, so the feature has to be enabled explicitly with the CONFIG_FREERTOS_GENERATE_RUN_TIME_STATS configuration option.

The following output comes from the real_time_stats example in esp-idf.

Getting real time stats over 100 ticks
| Task | Run Time | Percentage
| stats | 938 | 0%
| IDLE | 403920 | 20%
| IDLE | 242954 | 12%
| spin3 | 225340 | 11%
| spin5 | 225360 | 11%
| spin6 | 225344 | 11%
| spin1 | 225392 | 11%
| spin4 | 225392 | 11%
| spin2 | 225360 | 11%
| esp_timer | 0 | 0%
| ipc1 | 0 | 0%
| ipc0 | 0 | 0%
Real time stats obtained

While this can help you zero in on the task that takes the most time, it won't help you find the slowest function or the stack trace that leads to it.

So when analyzing a potential performance issue, I'd use it as a first step to find the tasks that take an unusual amount of CPU time.
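As a quick sanity check on the table above, here is a minimal Python sketch that recomputes the percentage column on the host. It assumes the percentage is each task's run time divided by the sum of all run times, truncated to an integer; the function name `percentages` is made up for this illustration. The run times add up to 2,000,000 counter units, which fits a fixed measurement window shared by the two cores.

```python
# Run-time counters copied from the example output (task name, run time).
SAMPLES = [
    ("stats", 938), ("IDLE", 403920), ("IDLE", 242954),
    ("spin3", 225340), ("spin5", 225360), ("spin6", 225344),
    ("spin1", 225392), ("spin4", 225392), ("spin2", 225360),
    ("esp_timer", 0), ("ipc1", 0), ("ipc0", 0),
]


def percentages(samples):
    """Return (name, run_time, percent) tuples, percent truncated to an int.

    Assumption: the denominator is the sum of all listed run times, i.e. the
    CPU time available on all cores during the measurement window.
    """
    total = sum(run_time for _, run_time in samples)
    if total == 0:
        return [(name, run_time, 0) for name, run_time in samples]
    return [(name, run_time, run_time * 100 // total)
            for name, run_time in samples]


if __name__ == "__main__":
    for name, run_time, percent in percentages(SAMPLES):
        print(f"| {name} | {run_time} | {percent}%")
```

With these numbers the six spin tasks together account for about two thirds of the available CPU time, which is the kind of hint that tells you where to look next.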

Profilers

There are two general types of profilers: sampling profilers and tracing profilers. Sampling profilers capture the state of the program at a fixed interval. Tracing profilers inject hooks that are executed before and after each function.
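To make the tracing idea concrete, here is a small host-side sketch in Python rather than ESP32 code. A decorator plays the role of the injected hooks: it runs code on entry and on exit of every wrapped function. The names `traced` and `call_stats` are invented for this example; on a real target the hooks are inserted by the compiler, not by a decorator.

```python
import functools
import time

# function name -> [number of calls, total seconds spent inside]
call_stats = {}


def traced(func):
    """Wrap func with entry and exit hooks, as a tracing profiler would."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()            # entry hook
        try:
            return func(*args, **kwargs)
        finally:
            entry = call_stats.setdefault(func.__name__, [0, 0.0])
            entry[0] += 1                      # exit hook: count the call
            entry[1] += time.perf_counter() - start
    return wrapper


@traced
def square(x):
    return x * x


@traced
def sum_of_squares(n):
    return sum(square(i) for i in range(n))
```

Calling `sum_of_squares(4)` records one call to `sum_of_squares` and four calls to `square`. The counts are exact, but every single call pays for the hooks, which is the main cost of tracing compared to sampling.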

While it's possible to run both on the ESP32, I ran into problems trying to use a tracing profiler with PlatformIO: it applies the same build_flags (-pg) to both the bootloader and the application, and the build then fails because of a missing _mcount function.

That leaves the sampling profiler. But how do we determine which function is currently running?

We can do it in two ways:

1. Sample each FreeRTOS task. Since each task's stack pointer is accessible, we can inspect every task's stack for its current PC (program counter) and determine which function it is in.

2. Sample the currently executing function. This can be done from an interrupt: since the interrupt shares the stack of the currently executing task, all we need to do is skip the frames of the sampling code itself, and the rest of the stack belongs to the running task. Since the ESP32 has two cores, we need to do this on both.

I chose option 2, since it tells me more about what is actually running.
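The "skip the sampling frames" step of option 2 can be pictured with a short host-side simulation in Python. This is only an illustration of the idea, not the profiler's code: the frame names in `PROFILER_FRAMES` are hypothetical, and each stack is assumed to be a list of function names with the innermost frame first, as a stack walk from inside the interrupt would produce.

```python
# Hypothetical names for the profiler's own frames (not from a real API).
PROFILER_FRAMES = ("profiler_capture_stack", "profiler_timer_isr")


def task_frames(stack, profiler_frames=PROFILER_FRAMES):
    """Drop the leading frames that belong to the sampling interrupt.

    `stack` lists function names innermost first. Whatever remains after
    the profiler's frames belongs to the interrupted task.
    """
    skip = 0
    while skip < len(stack) and stack[skip] in profiler_frames:
        skip += 1
    return stack[skip:]


def sample_all_cores(stacks_by_core):
    """Apply the same filtering to the stack captured on each core."""
    return {core: task_frames(stack)
            for core, stack in stacks_by_core.items()}


if __name__ == "__main__":
    captured = {
        0: ["profiler_capture_stack", "profiler_timer_isr",
            "spin_task", "spin_task4", "vPortTaskWrapper"],
        1: ["profiler_capture_stack", "profiler_timer_isr",
            "cpu_ll_waiti", "esp_vApplicationIdleHook",
            "prvIdleTask", "vPortTaskWrapper"],
    }
    print(sample_all_cores(captured))
```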

ESP32 Semihosting Profiler

It works by sampling the entire call stack and keeping statistics on the number of times each function was seen in the call stack. It then sends that information to the host computer through the semihosting file system.
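The bookkeeping can be sketched in a few lines of Python. This is a simplified model, not the profiler's actual data structure: it assumes each sample is a call stack listed outermost first, and it counts a caller -> callee pair at most once per sample so that recursion does not inflate the numbers. The function name `count_edges` is invented for this example.

```python
from collections import Counter


def count_edges(samples):
    """Count in how many samples each (caller, callee) pair appears.

    Each sample is a call stack listed outermost first. A pair is counted
    once per sample, even if it occurs several times in that stack.
    """
    edges = Counter()
    for stack in samples:
        for edge in set(zip(stack, stack[1:])):
            edges[edge] += 1
    return edges


if __name__ == "__main__":
    samples = [
        ["vPortTaskWrapper", "prvIdleTask", "esp_vApplicationIdleHook"],
        ["vPortTaskWrapper", "prvIdleTask", "esp_vApplicationIdleHook"],
        ["vPortTaskWrapper", "spin_task4", "spin_task"],
    ]
    for (caller, callee), count in count_edges(samples).items():
        print(f"{caller} -> {callee} : {count}")
```

Counting pairs instead of single functions keeps enough information to rebuild a call graph later, which is what a callgrind viewer needs.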

Once sampling is done, the raw samples are processed to resolve the function name and source line of each entry. The results are then displayed and a callgrind file is generated.

prvIdleTask tasks.c:3973  -> esp_vApplicationIdleHook freertos_hooks.c:63 : 783 307926512 76424201
vPortTaskWrapper port.c:131 -> prvIdleTask tasks.c:3973  : 783 307926512 76424201
esp_vApplicationIdleHook freertos_hooks.c:63 -> cpu_ll_waiti cpu_ll.h:183 : 781 307926512 76424201
vPortTaskWrapper port.c:131 -> spin_task4 real_time_stats_example_main.c:163 : 206 70213898 30992208
vPortTaskWrapper port.c:131 -> spin_task1 real_time_stats_example_main.c:151 : 204 70204906 31215652
vPortTaskWrapper port.c:131 -> spin_task5 real_time_stats_example_main.c:167 : 202 86176568 34547921
vPortTaskWrapper port.c:131 -> spin_task2 real_time_stats_example_main.c:155 : 201 97310444 38941256
vPortTaskWrapper port.c:131 -> spin_task6 real_time_stats_example_main.c:171 : 201 89345478 35511218
vPortTaskWrapper port.c:131 -> spin_task3 real_time_stats_example_main.c:159 : 201 68584391 30480011
spin_task4 real_time_stats_example_main.c:163 -> spin_task real_time_stats_example_main.c:143 : 134 62236086 27673084
...
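Turning a raw sampled address into a function name boils down to a range lookup. The Python sketch below shows one way to do it with a sorted symbol table and a binary search. The addresses, sizes and the `resolve` helper are made up for illustration; a real tool would read the symbols from the firmware's ELF file, for example with a toolchain utility such as addr2line, which can also report the source line that this sketch leaves out.

```python
import bisect

# Hypothetical symbol table: (start address, size in bytes, function name),
# sorted by start address. Real values would come from the ELF file.
SYMBOLS = [
    (0x400D1000, 0x40, "spin_task"),
    (0x400D1040, 0x20, "spin_task4"),
    (0x400D2000, 0x80, "prvIdleTask"),
]
STARTS = [start for start, _, _ in SYMBOLS]


def resolve(address):
    """Return the name of the function containing `address`, or None."""
    # Find the last symbol whose start address is <= address.
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return None
    start, size, name = SYMBOLS[index]
    # The address may fall in a gap between two symbols.
    return name if address < start + size else None


if __name__ == "__main__":
    for address in (0x400D1004, 0x400D1048, 0x400D1800):
        print(hex(address), resolve(address))
```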


As always, you can find the fruits of my labor on my GitHub account.

