[riot-devel] Changing Cortex-M PendSV priority and handling in mutex.c

Nikander Pekka pekka.nikander at aalto.fi
Sat Sep 15 10:02:41 CEST 2018


Hi all,

In Cortex-M MCUs, the PendSV interrupt is used to implement the actual context switch.  The way that it is currently used with RIOT causes instability with nRF52 SoftDevice.  Fixing this seems to be quite easy (see below), but might cause instability elsewhere.  

My apology for this unfortunately quite long mail.  However, I think we need to understand the background and discuss the details.

Rationale and background
------------------------

The general background for PendSV is explained rather well starting from page 194 in Yiu's "The Definitive Guide to the ARM Cortex-M0" [1].  There, and also in e.g. at the ARM community forums [2], there is a clear recommendation of having PendSV at the lowest priority.  However, RIOT currently places PendSV at the *same* priority with other interrupts [3].

I am proposing that RIOT also adopts the recommended practice and -- perhaps after some further testing -- places, by default, PendSV at the lowest priority level.

As a second change, I propose that the order of lines 61 and 62 in `thread.c` are changed [3].  (See also my previous email on this list [4]).

There are two reasons for these changes:

1. Having PendSV at the same priority with the others, as today, causes instability with nRF52 SoftDevice

Exactly why PendPV at the same priority with other interrupts causes instability, I don't know.  It may be related to the fact that the Nordic SoftDevice uses an interrupt trampoline.  That is, the MCU interrupts don't land directly at RIOT, but the physical interrupt vectors point to a trampoline function inside the SoftDevice, which then calls the RIOT handler.  It may be that the SoftDevice uses PendSV also for some of its own functionality, too, perhaps not always calling RIOT, or first doing something and then calling RIOT.

In practical terms, I get HardFaults at irq_restore() or thread_yield_higher() around lines 61 or 62 in `mutex.c`, depending on their order.  Most of the time they stem from xtimer_usleep().  With the current order, with irq_restore() first and thread_yield_higher() second, the instability is much worse, i.e. much more crashes.  Independent of the order, the crashes are indeterministic, but relatively easy to trigger under a number of conditions that I don't fully understand yet.

2. Having PendSV at the lowest priority increases performance

The PendSV interrupt is designed to be used to perform the actual context switch just before returning from an interrupt to a normal thread context.  The basic idea is to switch the thread stack and context only when there are no more interrupts pending.  If there are still pending interrupts, running them may cause a higher priority thread to become ready to run.  If so, with the current RIOT practice, PendSV may first change the thread, then, without this thread getting any running time, the next pending interrupt may make another thread runnable and trigger another PendSV, causing another thread change.  The first thread change is wasted.

Kaspar and Oleg, you seem to have changed this in #3511 [5].  Looking at #3450 [6], I think the main problem was that PendSV calls sched_run().  But, looking at isr_pendsv(), I don't see it calling sched_run().  Only isr_svc() does.

So, maybe it is time to revert that change?

Proposed changes
----------------

1. Make it possible to have separate priority to PENDSV:

```
   diff --git a/cpu/cortexm_common/cortexm_init.c b/cpu/cortexm_common/cortexm_init.c
   index 1015d4e87..9350b32cb 100644
   --- a/cpu/cortexm_common/cortexm_init.c
   +++ b/cpu/cortexm_common/cortexm_init.c
   @@ -30,6 +30,22 @@
    #define DONT_OVERRIDE_NVIC 1
    #endif
    
   +/*
   + * According to the ARM Cortex-M documentation, the recommended best
   + * practice is to place the PendSV at the lowest interrupt priority.
   + * However, RIOT has traditionally placed it in the same priority with
   + * the other interrupts.
   + *
   + * For now, by default we preserve the traditional RIOT behaviour,
   + * but allow specific CPUs, boards, or apps to change this.
   + *
   + * It is anticipated that this may be changed to the lowest
   + * priority at some point, after sufficient testing.
   + */
   +#if !defined(CPU_PENDSV_IRQ_PRIO)
   +#define CPU_PENDSV_IRQ_PRIO CPU_DEFAULT_IRQ_PRIO
   +#endif
   +
    #include "cpu.h"
    
    /**
   @@ -46,8 +62,8 @@ extern const void *_isr_vectors;
    CORTEXM_STATIC_INLINE void cortexm_init_isr_priorities(void)
    {
        /* initialize the interrupt priorities */
   -    /* set pendSV interrupt to same priority as the rest */
   -    NVIC_SetPriority(PendSV_IRQn, CPU_DEFAULT_IRQ_PRIO);
   +    /* set pendSV interrupt to its own priority */
   +    NVIC_SetPriority(PendSV_IRQn, CPU_PENDSV_IRQ_PRIO);
        /* set SVC interrupt to same priority as the rest */
        NVIC_SetPriority(SVCall_IRQn, CPU_DEFAULT_IRQ_PRIO);
        /* initialize all vendor specific interrupts with the same value */
```

2. Fix the nRF52 SD instability

```
   diff --git a/cpu/nrf52/include/cpu_conf.h b/cpu/nrf52/include/cpu_conf.h
   index f8b4183b2..efb850ae8 100644
   --- a/cpu/nrf52/include/cpu_conf.h
   +++ b/cpu/nrf52/include/cpu_conf.h
   @@ -44,6 +44,7 @@ extern "C" {
     */
    #ifdef SOFTDEVICE_PRESENT
    #define CPU_DEFAULT_IRQ_PRIO            (6U)
   +#define CPU_PENDSV_IRQ_PRIO             (7U)
    #else
    #define CPU_DEFAULT_IRQ_PRIO            (2U)
    #endif
```

3. Change the order of when PendSV is triggered and the interrupts are enabled in `mutex.c`

```
   diff --git a/core/mutex.c b/core/mutex.c
   index e75c5882f..f7bde2364 100644
   --- a/core/mutex.c
   +++ b/core/mutex.c
   @@ -58,8 +58,8 @@ int _mutex_lock(mutex_t *mutex, int blocking)
            else {
                thread_add_to_list(&mutex->queue, me);
            }
   -        irq_restore(irqstate);
            thread_yield_higher();
   +        irq_restore(irqstate);
            /* We were woken up by scheduler. Waker removed us from queue.
             * We have the mutex now. */
            return 1; 
```

Do these make sense?  The first two are essential to fix the nRF52 SD instability, but of course the comment doesn't make sense just for that.

The third one may be more controversial, perhaps even for Cortex-M.  Furthermore, I don't understand what it would mean for other MCUs but nRF52, since I don't understand the details of their architectures.

--Pekka

[1] https://books.google.fi/books?id=5OZblBzjsJ0C&lpg=PA174&ots=m3bMkoQjNt&dq=cortex-m%20pendsv%20priority&hl=fi&pg=PA194#v=onepage&q&f=false
[2] https://community.arm.com/processors/f/discussions/9485/cortex-m-rtos-related-exceptions-and-concepts
[3] https://github.com/RIOT-OS/RIOT/blob/master/cpu/cortexm_common/cortexm_init.c#L38
[4] https://lists.riot-os.org/pipermail/devel/2018-September/005923.html
[5] https://github.com/RIOT-OS/RIOT/pull/3511
[6] https://github.com/RIOT-OS/RIOT/pull/3450



More information about the devel mailing list