[riot-devel] Current best practice for sleeping in a device driver?

Joakim Nohlgård joakim.nohlgard at eistec.se
Thu Sep 13 13:02:49 CEST 2018


Hi again Pekka,
see my replies inline below.

On Wed, Sep 12, 2018 at 8:56 AM, Nikander Pekka <pekka.nikander at aalto.fi> wrote:
> Thanks Joakim,
>
> No, I didn't know about cortexm_isr_end().  (See also in the end.)  Adding
> it, locking mutex twice and releasing it in the ISR handler works now.
>
> However, if I use the same mutex as the driver is using anyway, there is a
> potential conflict window:
>
> ```
>    void isr(void) {
>        mutex_unlock(&lock);
>        cortex_isr_end();
>    }
>
>    int blocking_drive_function(...) {
>        /* Block the driver from other threads */
>        mutex_lock(&lock);
>        ....
>        /* Prevent the interrupt from happening too early */
>        uint32_t irq_state = irq_disable();
>        ....
>        /* Wait for an interrupt */
>        irq_restore(irq_state);
>        // NOTE: Potential conflict window here
>        mutex_lock(&lock);
>        ...
>        /* Release the driver for other threads */
>        mutex_unlock(&lock);
>    }
> ```
>
> If the IRQ takes place early, it will be served between irq_restore() and
> mutex_lock().  The driver appears to work, since the ISR handler releases
> the mutex and the blocking function just reacquires the mutex without
> blocking.
>
> However, if a HIGHER priority thread is also ready to run and accesses the
> driver, it will get the mutex before the waiting thread, allowing two
> threads into the driver at the same time.  Not good.
>
> Of course, I can use a separate mutex and lock it twice.  But that is not
> beautiful.

A separate mutex to ensure mutual exclusion for the driver calls would
be preferable IMO. The ISR can only serve one thread at a time anyway,
so it does not seem like your driver will be usable from parallel
threads.

>
> Furthermore, locking a mutex twice from the same thread is not considered a
> standard practice.  Some mutex implementations even give you a panic if you
> do that.  Hence, I suspect not many people think of it as a potential
> solution.

The mutex here is not used as a traditional mutex, but rather as a
binary semaphore. Right now there is no semaphore API in RIOT, but
there may be a need for it.
A different approach to achieve a binary semaphore in RIOT is to use
thread flags (http://api.riot-os.org/group__core__thread__flags.html#gab3113f4c21484922b730e372eae6dc0a).
Then you will need to pass the PID number to the ISR instead of a
mutex pointer
See the Kinetis I2C driver for one example of using thread flags.

>
> Hence, I would suggest that we add a couple of new functions, for example:
>
>    void thread_sleep_keep_mutex(mutex_t *m);
>    void thread_wakeup_mutex(mutex_t *);
>
> The first one would basically just lock the mutex second time, but it could
> be called with interrupts already disabled.  It would enable interrupts
> while the thread is sleeping, and disable them again before returning, if
> they were disabled when called.  As an additional safeguard, it would check
> that the calling thread already possesses the mutex.
>
> The second one would wakeup one thread sleeping on the mutex.  In practice,
> it would be just syntactic sugar over mutex_unlock().
>
> If that makes sense, I can make an initial implementation and open a PR.
>
> Unfortunately I don't have time to properly test it or writing test cases.
> For that I would need some help.

I too have far too little time to spend on these things.

>
> Now, looking at _mutex_lock() in this light, I started to wonder if there is
> a similar potential conflict window:
>
> ```
>     else if (blocking) {
>         thread_t *me = (thread_t*)sched_active_thread;
>         DEBUG("PID[%" PRIkernel_pid "]: Adding node to mutex queue: prio: %"
>               PRIu32 "\n", sched_active_pid, (uint32_t)me->priority);
>         sched_set_status(me, STATUS_MUTEX_BLOCKED);
>         if (mutex->queue.next == MUTEX_LOCKED) {
>             mutex->queue.next = (list_node_t*)&me->rq_entry;
>             mutex->queue.next->next = NULL;
>         }
>         else {
>             thread_add_to_list(&mutex->queue, me);
>         }
>         irq_restore(irqstate);
> ------>
>         thread_yield_higher();
>         /* We were woken up by scheduler. Waker removed us from queue.
>          * We have the mutex now. */
>         return 1;
> ```
>
> Could bad things happen there between irq_restore() and
> thread_yield_higher(), as the interrupts may be enabled before PendSV is set
> and the thread stack changed (on Cortex-M)?  I am not sure, but I suspect
> so.

I did not have time to investigate this, but will there really be an
issue when the current thread is already blocked inside the critical
section with IRQs disabled?

>
> AFAICS, there is at least a potential performance issue here, since if there
> is an interrupt there that does not send PendSV, the usual PendSV interrupt
> optimisation will not take place.

Not sure what you mean by the "usual PendSV interrupt optimization".
It was a while since I looked at the scheduler code.

>
> BTW, w.r.t. cortexm_isr_end(), I didn't even think of needing to set PendSV
> that since in my Cortex-M scheduler (several years ago) I set the PendSV
> flag directly in the scheduler and synchronisation primitives themselves,
> whenever needed.

Platform specific optimized synchronization primitives could possibly
yield a performance boost, but we need to make sure that there are
comprehensive regression tests that can detect any deviations from the
generic implementation behavior. This may take significant effort to
implement...

>
> Indeed, looking at the code, I think the `sched_context_switch_request`
> should be replaced a suitable inline function.
>
> Should I open an issue on replacing `sched_context_switch_request` with
> inline functions for setting, clearing, and testing it?  Once the inline
> functions are in place, the code could then be optimised for various CPUs,
> allowing e.g. the Cortex-M support to get rid of cortexm_isr_end().

Getting rid of cortexm_isr_end would help reducing hard to spot
mistakes in the periph drivers, but this is a very important piece of
code and any changes must be made carefully and with confidence. I am
not sure how to create an automatic test for this, since there are so
many different drivers which affect this flag and all are related to
specific hardware modules in the CPU.

/Joakim

>
> --Pekka
>
>
> On 11.9.2018, at 22:34, Joakim Nohlgård <joakim.nohlgard at eistec.se> wrote:
>
> Hi again,
> Are you on a cortex m platform? Did you remember to add cortexm_isr_end() at
> the end of the ISR function?
> /Joakim
>
> Den tis 11 sep. 2018 20:21Nikander Pekka <pekka.nikander at aalto.fi> skrev:
>>
>> Hi Joakim,
>>
>> I tried also exactly that, but for some reason I couldn't make it work.
>> When I tried to unlock it in the ISR, the thread waiting for the mutex for
>> the second time still didn't acquire it, but was blocked again in the
>> scheduler.  I didn't dive deep into the scheduler to see why.  Maybe I did
>> some mistake.
>>
>> Could you please point to some example code?
>>
>> --Pekka
>>
>> On 11.9.2018, at 20:48, Joakim Nohlgård <joakim.nohlgard at eistec.se> wrote:
>>
>> Hi Pekka,
>> What we have done in several other places is to use a mutex. The device
>> driver blocking code will attempt to lock the mutex twice, and the ISR will
>> unlock it to let the thread continue.
>> Does that work in your use case?
>>
>> Best regards, Joakim
>>
>> Den tis 11 sep. 2018 16:40Nikander Pekka <pekka.nikander at aalto.fi> skrev:
>>>
>>> Hi,
>>>
>>> Sorry for my ignorance, but what is the current best practice to sleep in
>>> a blocking device driver, waiting for an interrupt to take place?  I
>>> couldn't find anything reasonable.
>>>
>>> In many other context I've seen people to use `sleep` and `wakeup`, and
>>> even went that far that I tried to introduce a variant of `thread_sleep`
>>> that can be entered with interrupts disabled (see below), but couldn't get
>>> it working.
>>>
>>> Here is an excerpt of my current code:
>>> ```
>>> void isr_saadc(void) {
>>>     NRF_SAADC->EVENTS_END = 0;
>>>     NRF_SAADC->INTENCLR = SAADC_INTEN_DONE_Msk;
>>>     thread_wakeup(sleeper);
>>> }
>>>
>>> int adc_sample(adc_t line, adc_res_t res)
>>> {
>>>     ...
>>>     /* enable EVENTS_END interrupt */
>>>     const uint32_t irqs = irq_disable();
>>>     sleeper = thread_getpid();
>>>     NRF_SAADC->INTENSET = SAADC_INTEN_END_Msk; /* Disabled at ISR handler
>>> */
>>>
>>>     ...
>>>
>>>     /* Wait for the interrupt */
>>>     thread_sleep_with_interrupts_disabled(irqs);
>>> ```
>>>
>>> This doesn't work.  The thread doesn't wake up, even though the interrupt
>>> does take place and `thread_wakeup` states that the thread _is_ sleeping
>>> when called:
>>> ```
>>> Created thread idle. PID: 1. Priority: 15.
>>> Created thread main. PID: 2. Priority: 7.
>>> main(): This is RIOT! (Version:
>>> 2018.10-devel-585-gd2de4-dhcp-85-86.eduroam.aalto.fi-feature-refactor-ble-core)
>>> Entering main function.
>>> Entering main loop.
>>> thread_wakeup: Trying to wakeup PID 2...
>>> thread_wakeup: Thread is sleeping.
>>> thread_wakeup: Trying to wakeup PID 2...
>>> thread_wakeup: Thread is sleeping.
>>> ```
>>>
>>> And then nothing happens.
>>>
>>> Here is the tentative patch to introduce
>>> `thread_sleep_with_interrupts_disabled`:
>>> ```
>>> diff --git a/core/include/thread.h b/core/include/thread.h
>>> index 64a828ab3..4abdb3d25 100644
>>> --- a/core/include/thread.h
>>> +++ b/core/include/thread.h
>>> @@ -399,6 +399,20 @@ int thread_getstatus(kernel_pid_t pid);
>>>   */
>>>  void thread_sleep(void);
>>>
>>> +/**
>>> + * @brief Puts the current thread into sleep mode with the interrupts
>>> already disabled.
>>> + *
>>> + * @param[in] interrupt_state  Saved interrupt state from irq_disable().
>>> + *
>>> + *          Places the calling thread to sleep and re-enables interrupts
>>> to
>>> + *          the `interrupt_state` before calling thread_yield_higher().
>>> + *          Meant to be called from drivers that want to be waken up
>>> + *          from an interrupt routine.
>>> + *          The calling thread to be woken up externally.
>>> + *
>>> + */
>>> +void thread_sleep_with_interrupts_disabled(int interrupt_state);
>>> +
>>>  /**
>>>   * @brief   Lets current thread yield.
>>>   *
>>> diff --git a/core/thread.c b/core/thread.c
>>> index 6f7a71362..05400a0af 100644
>>> --- a/core/thread.c
>>> +++ b/core/thread.c
>>> @@ -62,9 +62,12 @@ void thread_sleep(void)
>>>      }
>>>
>>>      unsigned state = irq_disable();
>>> +    thread_sleep_with_interrupts_disabled(state);
>>> +}
>>> +
>>> +void thread_sleep_with_interrupts_disabled(int state) {
>>>      sched_set_status((thread_t *)sched_active_thread, STATUS_SLEEPING);
>>>      irq_restore(state);
>>>      thread_yield_higher();
>>>  }
>>>
>>>  int thread_wakeup(kernel_pid_t pid)
>>>
>>> ```
>>>
>>> --Pekka Nikander
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel at riot-os.org
>>> https://lists.riot-os.org/mailman/listinfo/devel
>>
>> _______________________________________________
>> devel mailing list
>> devel at riot-os.org
>> https://lists.riot-os.org/mailman/listinfo/devel
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel at riot-os.org
>> https://lists.riot-os.org/mailman/listinfo/devel
>
> _______________________________________________
> devel mailing list
> devel at riot-os.org
> https://lists.riot-os.org/mailman/listinfo/devel
>
>
>
> _______________________________________________
> devel mailing list
> devel at riot-os.org
> https://lists.riot-os.org/mailman/listinfo/devel
>


More information about the devel mailing list