[riot-notifications] [RIOT-OS/RIOT] nrfmin can get stuck and never reach RX (while TX works) (#10878)

pystub notifications at github.com
Sat Jan 26 17:52:22 CET 2019

#### Description

The nrfmin radio driver can become stuck in a state where it can no longer receive packets over the radio.

The issue triggers sporadically, anywhere from seconds to hours into operation. A larger number of nodes trying to send concurrently seems to increase the chance of it occurring.

My initial investigation showed that when the driver is in this state, a packet has been received but is apparently never passed on and cleared, which prevents the driver from switching the peripheral into receive mode. I.e. the second condition fails:


For more info please see attempted fixes at the end.
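
The stuck condition described above can be modelled on a host machine. The sketch below is a simplified illustration, not the driver's code: `model_t`, `model_try_enter_rx`, `model_packet_received`, and `model_packet_consumed` are hypothetical names, and the only behaviour it assumes is what the description states, namely that a receive buffer that still holds a packet blocks the switch into receive mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model; none of these names come from the driver. */
typedef struct {
    uint8_t rx_len;   /* length of the packet held in the receive buffer */
    bool in_rx;       /* true while the modelled radio is listening */
} model_t;

/* Enter receive mode only if the buffer is free (the assumed gate). */
static bool model_try_enter_rx(model_t *m)
{
    if (m->rx_len != 0) {
        m->in_rx = false;
        return false;   /* buffer still occupied: RX is refused */
    }
    m->in_rx = true;
    return true;
}

/* A packet arrives: the buffer is filled and the radio leaves RX. */
static void model_packet_received(model_t *m, uint8_t len)
{
    m->rx_len = len;
    m->in_rx = false;
}

/* The upper layer consumes the packet, which frees the buffer. */
static void model_packet_consumed(model_t *m)
{
    m->rx_len = 0;
}
```

In this model, if `model_packet_consumed` is never called after `model_packet_received`, every later `model_try_enter_rx` call fails, which matches the observed symptom of sending still working while receiving does not.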

#### Steps to reproduce the issue

1. Use nrfmin, 6LoWPAN, and GNRC in a program that periodically (broad|multi)casts.
2. Leave two or more nodes on.

#### Expected results

Packets should be received until conditions become unsatisfactory for chip operation or radio connection.

#### Actual results

Packets can be sent over the radio, but receiving does not work.

#### Versions

(a version of RIOT a few weeks old; will be filled in later)

My particular NRF51 chip **is not affected by** PAN 20 (which means the `STATE` register can be used in my case).

#### Attempted fixes

I attempted to fix this by making certain code run in the interrupt handler (assuming that is the most correct place, in an effort to avoid overflowing something else's stack):

    void isr_radio(void)
        if (NRF_RADIO->EVENTS_END == 1) {
            NRF_RADIO->EVENTS_END = 0;
            /* did we just send or receive something? */
            if (state == STATE_RX || rx_buf.pkt.hdr.len > 0) {
                /* drop packet on invalid CRC */
                if ((NRF_RADIO->CRCSTATUS != 1) || !(nrfmin_dev.event_callback)) {
                    rx_buf.pkt.hdr.len = 0;
                    NRF_RADIO->TASKS_START = 1;
                } else {
                    rx_lock = 0;
                    nrfmin_dev.event_callback(&nrfmin_dev, NETDEV_EVENT_ISR);
            if (state == STATE_TX) {


This greatly improved reliability, but the send function would now sometimes spin forever waiting for `state` to change from `STATE_TX`:


After a few attempts I managed to avoid that by adding code that waits for `EVENTS_READY` after triggering `TASKS_TXEN` and `TASKS_RXEN`:



Changed to:

        NRF_RADIO->TASKS_RXEN = 1;
        while (NRF_RADIO->EVENTS_READY == 0) {}
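
An unbounded busy-wait like the one above can hang if the event never fires. A bounded variant might look like the following sketch; `wait_for_event` is a hypothetical helper, not part of the driver or of RIOT, and the poll limit is an arbitrary assumption that would need tuning against the radio's ramp-up time.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: poll an event register until it becomes non-zero,
 * giving up after max_polls reads. Returns true if the event was seen.
 */
static bool wait_for_event(volatile const uint32_t *event, uint32_t max_polls)
{
    while (max_polls-- > 0) {
        if (*event != 0) {
            return true;
        }
    }
    return false;   /* timed out: the caller decides how to recover */
}
```

Under these assumptions, the wait would become something like `if (!wait_for_event(&NRF_RADIO->EVENTS_READY, 10000)) { /* recover */ }`, so a missed event turns into an error path instead of a hang.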

With these changes, the system sometimes hard faults after running for a long time. I was unable to recover the details because the UART had disconnected before the fault.

#### Other observations

`cortexm_isr_end()` is documented as something to be called at the end of every interrupt routine, but in this driver, in `isr_radio`, line https://github.com/RIOT-OS/RIOT/blob/3e6336ce89d64d58ab07764ef7f65fc86800cb85/cpu/nrf5x_common/radio/nrfmin/nrfmin.c#L315 can return early and so does not fulfill that specification. Should this be corrected?
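
One way to guarantee the epilogue always runs is a single-exit handler. The sketch below is a host-side model only: `model_isr`, `mock_isr_end`, and the two counters are hypothetical stand-ins, with `mock_isr_end` playing the role of `cortexm_isr_end()`; it is not a patch for `isr_radio`.

```c
#include <stdbool.h>

/* Hypothetical counters used to observe the model's behaviour. */
static int epilogue_calls = 0;
static int packets_delivered = 0;

/* Stand-in for the ISR epilogue that must run on every path. */
static void mock_isr_end(void)
{
    epilogue_calls++;
}

/* Single-exit handler: no branch returns before the epilogue. */
static void model_isr(bool event_end, bool crc_ok)
{
    if (event_end) {
        if (crc_ok) {
            packets_delivered++;   /* hand the packet to the upper layer */
        }
        /* on a bad CRC the packet is dropped, but we do not return early */
    }
    mock_isr_end();
}
```

The design choice is simply to replace the early `return` with an `if`/`else` structure so that control always falls through to the final call.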

Also, setting `NRF_RADIO->POWER` to `0` (off) appears to reset some of the registers. When waking the peripheral up, `nrfmin_init` should probably be called again.

