[riot-notifications] [RIOT-OS/RIOT] gnrc_ipv6_nib: fix acquire race on gnrc_ipv6_nib_get_next_hop_l2addr() (#16450)
notifications at github.com
Wed May 5 18:45:11 CEST 2021
### Contribution description
When two threads use `gnrc_ipv6_nib_get_next_hop_l2addr()` to determine a next hop (e.g. when there is both an IPv6 sender and a 6LoWPAN fragment forwarder), a race condition may occur in which one thread acquires the NIB and the other acquires the network interface, resulting in a deadlock. This is fixed by releasing the NIB (if acquired) before trying to acquire the network interface, and re-acquiring the NIB once the network interface is acquired.
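The release-then-re-acquire idea can be sketched outside of RIOT with plain POSIX mutexes. This is only an illustration of the locking pattern, not the actual patch: `nib_lock`, `netif_lock` and `acquire_netif_then_nib()` are hypothetical stand-ins for the NIB lock, the interface lock and the relevant code path in `nib.c`.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NIB lock and one network interface lock. */
static pthread_mutex_t nib_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t netif_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Acquire the interface without ever blocking on it while the NIB is held.
 * nib_held tells whether the caller currently holds nib_lock.
 * On return the caller holds both locks, always taken in the order
 * netif -> NIB.
 */
static void acquire_netif_then_nib(bool nib_held)
{
    if (nib_held) {
        /* drop the NIB first so another thread can make progress */
        pthread_mutex_unlock(&nib_lock);
    }
    pthread_mutex_lock(&netif_lock);
    /* re-acquire the NIB only after the interface is ours */
    pthread_mutex_lock(&nib_lock);
}
```

Because the NIB is unlocked for a short window, anything read from it before the call may be stale afterwards, so a caller following this pattern would typically redo or re-validate its lookup once both locks are held.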
As an example of how the race condition can come about: first, the IPv6 thread acquires both the `netif` and the NIB for a look-up of an off-link (global) address
and then continues to https://github.com/RIOT-OS/RIOT/blob/6e4434a76049356d5c24be606084cd38abc6d432/sys/net/gnrc/network_layer/ipv6/nib/nib.c#L261-L269
However, if the 6LoWPAN thread (with a fragment forwarding module like `gnrc_sixlowpan_frag_minfwd` or `gnrc_sixlowpan_frag_sfr`, via the VRB [here]()) makes a lookup in the meantime, but with a specific `netif != NULL` (the same as the new `netif` in the IPv6 thread) in L198, it gets stuck on L199, since the IPv6 thread still holds the lock on the NIB while itself waiting on the `netif` in L268. I did not actually observe this with the link-local variant, but in theory it can happen there as well.
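The interleaving described above can be replayed step by step with two POSIX mutexes. The sketch below is a single-threaded simulation (so it cannot actually hang): it takes each lock on behalf of one of the two threads and then uses `pthread_mutex_trylock()` to check that the next step of each thread would block. The names `nib_mutex`, `netif_mutex` and `replay_interleaving()` are made up for illustration and do not exist in RIOT.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NIB lock and the shared interface lock. */
static pthread_mutex_t nib_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t netif_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Replays the problematic interleaving in a single thread.
 * Returns true if, at the end, both simulated threads would be blocked
 * on a lock held by the other one, i.e. a deadlock.
 */
static bool replay_interleaving(void)
{
    /* "IPv6 thread": still holds the NIB while switching interfaces */
    pthread_mutex_lock(&nib_mutex);
    /* "6LoWPAN thread": holds the specific interface it was given */
    pthread_mutex_lock(&netif_mutex);

    /* "6LoWPAN thread" next wants the NIB (L199 in the description) */
    bool sixlo_blocked = (pthread_mutex_trylock(&nib_mutex) == EBUSY);
    /* "IPv6 thread" next wants the interface (L268 in the description) */
    bool ipv6_blocked = (pthread_mutex_trylock(&netif_mutex) == EBUSY);

    /* clean up so the function can be called again */
    pthread_mutex_unlock(&netif_mutex);
    pthread_mutex_unlock(&nib_mutex);

    return sixlo_blocked && ipv6_blocked;
}
```

With real threads and blocking `pthread_mutex_lock()` calls in place of the `trylock` probes, neither thread could ever proceed, which is the classic lock-order inversion.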
### Testing procedure
Both `tests/gnrc_ipv6_nib` and `tests/gnrc_ipv6_nib_6ln` should still pass and `gnrc_networking` should still work. I found this by running experiments very similar to https://github.com/5G-I3/IEEE-LCN-2019/, but with a higher send interval (250-750ms), for SFR with congestion control. Without this fix, I ran into the race condition quite often with low congestion windows. With this fix, I have not been able to reproduce the race condition so far.
### Issues/PRs references
You can view, comment on, or merge this pull request online at:
-- Commit Summary --
* gnrc_ipv6_nib: fix acquire race on gnrc_ipv6_nib_get_next_hop_l2addr()
-- File Changes --
M sys/net/gnrc/network_layer/ipv6/nib/nib.c (22)
-- Patch Links --