[riot-devel] Changing Cortex-M PendSV priority and handling in mutex.c
pekka.nikander at aalto.fi
Mon Sep 17 14:25:53 CEST 2018
>> 1. Having PendSV at the same priority with the others, as today, causes instability with nRF52 SoftDevice
> I encountered similar trouble when integrating the softdevice the first
> time. Back then I used the SOFTDEVICE_PRESENT define to skip all RIOT
> specific ISR priority meddling. Is that not working anymore?
While porting to the nRF SDK 15.0 and SoftDevice 6.0, I needed to rework quite extensively what SOFTDEVICE_PRESENT actually does. Among other details, I needed to change the RIOT CPU_DEFAULT_IRQ_PRIO to 6. Without that, basically all attempts to use xtimer led to either blocked threads or hardfaults, sooner or later.
(Yes, I know, https://github.com/RIOT-OS/RIOT/pull/9473 is still pending. But it looks like I will still need to fight for quite some time to get it right, and to make some smaller PRs, before I can go back to it.)
>> 2. Having PendSV at the lowest priority increases performance
>> So, maybe it is time to revert that change?
> We can certainly reconsider.
> I think there's also the concept of ISR priority groups, with group
> priority and subpriority within groups. The former control preemption,
> the latter control the order of execution should the group priority be
> the same.
> Maybe this can be exploited, by putting all ISRs in the same group
> priority but give pendsv the lowest subpriority.
I switched to using one bit of ISR subpriority, so that most RIOT interrupts at 6 and PendSV at 7 now share the same group (preemption) priority, but PendSV is always executed last. So far the system seems stable. :-)
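To spell out the arithmetic behind this, here is a minimal host-side sketch. It assumes three implemented priority bits (logical priorities 0..7) and models only the split into group priority and subpriority; it does not touch any NVIC registers. The names PRIO_BITS, split_prio(), can_preempt() and runs_first() are made up for illustration and are not RIOT or CMSIS APIs.

```c
#include <assert.h>

/* Assumption: 3 implemented priority bits, i.e. logical priorities 0..7. */
#define PRIO_BITS 3u

/* Hypothetical helper type: a priority split into its two fields. */
typedef struct {
    unsigned group; /* group (preemption) priority, lower value = more urgent */
    unsigned sub;   /* subpriority, only orders pending exceptions */
} prio_split_t;

/* Split a logical priority, treating the lowest sub_bits bits as subpriority. */
static prio_split_t split_prio(unsigned prio, unsigned sub_bits)
{
    assert(sub_bits <= PRIO_BITS);
    assert(prio < (1u << PRIO_BITS));

    prio_split_t s;
    s.group = prio >> sub_bits;
    s.sub = prio & ((1u << sub_bits) - 1u);
    return s;
}

/* Preemption is decided by the group priority alone:
 * returns 1 if 'incoming' may preempt a handler running at 'running'. */
static int can_preempt(unsigned incoming, unsigned running, unsigned sub_bits)
{
    return split_prio(incoming, sub_bits).group
           < split_prio(running, sub_bits).group;
}

/* Of two pending exceptions, the one with the lower group priority value is
 * taken first; within the same group the lower subpriority value wins.
 * (Ties on both fields are resolved by exception number on real hardware,
 * which is not modelled here; 'a' is returned in that case.) */
static unsigned runs_first(unsigned a, unsigned b, unsigned sub_bits)
{
    prio_split_t sa = split_prio(a, sub_bits);
    prio_split_t sb = split_prio(b, sub_bits);

    if (sa.group != sb.group) {
        return (sa.group < sb.group) ? a : b;
    }
    return (sa.sub <= sb.sub) ? a : b;
}
```

With one subpriority bit, 6 and 7 both land in group 3, so neither can preempt the other, but a pending interrupt at 6 is taken before a pending PendSV at 7. With zero subpriority bits, 6 and 7 are distinct preemption levels and an interrupt at 6 can land in the middle of the PendSV handler.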
But I've been running the system with subpriorities for less than an hour now. Hence, I cannot say much about the real stability yet. Even before adopting subpriorities, but with cpsid and cpsie in isr_svc(), the system was semistable: it never crashed, but xtimer_*sleep() still blocked, sometimes soon, sometimes only after hours.
While testing with various options and debugging quite a lot, I've seen the xtimer_*sleep variants both crash and block a lot. Hence, I strongly suspect that there are some latent bugs there. But I don't properly understand what is going on. Most likely some interrupt landing in the middle of the PendSV handler somehow interacts with their mutex use, easily causing blocking and also easily corrupting the stacks.
My student had such big problems with xtimer_*sleep related instabilities that he reverted to using xtimer_set_msg instead.
If the current approach turns out to be stable, I'll update #9373 to use it at some point. That won't be enough to make it pass, since apparently there are still some bugs in IPSP as well.