<h3>Contribution description</h3>
<p>When two threads call <code>gnrc_ipv6_nib_get_next_hop_l2addr()</code> to determine a next hop (e.g. when there is both an IPv6 sender and a 6LoWPAN fragment forwarder), a race condition can occur in which one thread holds the NIB and waits for the network interface while the other holds the network interface and waits for the NIB, resulting in a deadlock. This PR fixes that by releasing the NIB (if acquired) before trying to acquire the network interface and re-acquiring the NIB once the network interface is acquired.</p>
<p>As an example of how the race condition can come about: first, the IPv6 thread acquires both the <code>netif</code> and the NIB for a look-up of an off-link (global) address:</p>
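<p>The release-then-re-acquire order described above can be sketched roughly as follows. This is only an illustration, not the actual patch: POSIX mutexes stand in for the NIB and interface locks, and <code>demo_netif_t</code>, <code>demo_nib_lock</code>, <code>demo_switch_netif()</code> and <code>demo_lookup()</code> are hypothetical names that do not exist in RIOT.</p>

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins (not RIOT types): one lock per interface ... */
typedef struct {
    int pid;                /* interface identifier */
    pthread_mutex_t lock;   /* stand-in for the interface lock */
} demo_netif_t;

/* ... and one global lock for the NIB. */
static pthread_mutex_t demo_nib_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Switch from the pre-assumed interface `cur` (may be NULL) to `next`.
 * The caller holds the NIB lock (and `cur`, if not NULL) on entry and
 * holds the NIB lock and the returned interface on exit. The NIB lock
 * is dropped before blocking on `next`, so this thread never waits for
 * an interface while holding the NIB.
 */
static demo_netif_t *demo_switch_netif(demo_netif_t *cur, demo_netif_t *next)
{
    if (cur == next) {
        return cur;                         /* nothing to switch */
    }
    if (cur != NULL) {
        pthread_mutex_unlock(&cur->lock);   /* drop pre-assumed netif */
    }
    pthread_mutex_unlock(&demo_nib_lock);   /* release the NIB first */
    pthread_mutex_lock(&next->lock);        /* may block; NIB is free */
    pthread_mutex_lock(&demo_nib_lock);     /* re-acquire the NIB */
    return next;
}

/* Single-threaded walk-through of one lookup; returns the pid of the
 * interface that was held at the end of the lookup. */
static int demo_lookup(demo_netif_t *assumed, demo_netif_t *actual)
{
    if (assumed != NULL) {
        pthread_mutex_lock(&assumed->lock);
    }
    pthread_mutex_lock(&demo_nib_lock);

    demo_netif_t *netif = demo_switch_netif(assumed, actual);
    assert(netif != NULL);
    int pid = netif->pid;

    pthread_mutex_unlock(&demo_nib_lock);
    pthread_mutex_unlock(&netif->lock);
    return pid;
}
```

<p>In a sketch like this, anything read from the NIB before the release should be treated as possibly stale after the re-acquire, since another thread may have modified the NIB in between.</p>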
<p><div class="border rounded-1 my-2">
  <div class="f6 px-3 py-2 lh-condensed border-bottom color-bg-secondary">
    <p class="mb-0 text-bold">
      <a href="https://github.com/RIOT-OS/RIOT/blob/6e4434a76049356d5c24be606084cd38abc6d432/sys/net/gnrc/network_layer/ipv6/nib/nib.c#L198-L199">RIOT/sys/net/gnrc/network_layer/ipv6/nib/nib.c</a>
    </p>
    <p class="mb-0 color-text-tertiary">
        Lines 198 to 199
      in
      <a data-pjax="true" class="commit-tease-sha" href="/RIOT-OS/RIOT/commit/6e4434a76049356d5c24be606084cd38abc6d432">6e4434a</a>
    </p>
    </div>
    <div itemprop="text" class="blob-wrapper blob-wrapper-embedded data">
    <table class="highlight tab-size mb-0 js-file-line-container" data-tab-size="8" data-paste-markdown-skip="">

        <tbody><tr class="border-0">
          <td id="L198" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="198"></td>
          <td id="LC198" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> <span class="pl-c1">gnrc_netif_acquire</span>(netif); </td>
        </tr>

        <tr class="border-0">
          <td id="L199" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="199"></td>
          <td id="LC199" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> <span class="pl-c1">_nib_acquire</span>(); </td>
        </tr>
    </tbody></table>
  </div>
</div>
</p>
<p>and then continues to <div class="border rounded-1 my-2">
  <div class="f6 px-3 py-2 lh-condensed border-bottom color-bg-secondary">
    <p class="mb-0 text-bold">
      <a href="https://github.com/RIOT-OS/RIOT/blob/6e4434a76049356d5c24be606084cd38abc6d432/sys/net/gnrc/network_layer/ipv6/nib/nib.c#L261-L269">RIOT/sys/net/gnrc/network_layer/ipv6/nib/nib.c</a>
    </p>
    <p class="mb-0 color-text-tertiary">
        Lines 261 to 269
      in
      <a data-pjax="true" class="commit-tease-sha" href="/RIOT-OS/RIOT/commit/6e4434a76049356d5c24be606084cd38abc6d432">6e4434a</a>
    </p>
    </div>
    <div itemprop="text" class="blob-wrapper blob-wrapper-embedded data">
    <table class="highlight tab-size mb-0 js-file-line-container" data-tab-size="8" data-paste-markdown-skip="">

        <tbody><tr class="border-0">
          <td id="L261" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="261"></td>
          <td id="LC261" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> <span class="pl-k">if</span> ((netif != <span class="pl-c1">NULL</span>) && (netif-><span class="pl-smi">pid</span> != (<span class="pl-k">int</span>)route.<span class="pl-smi">iface</span>)) { </td>
        </tr>

        <tr class="border-0">
          <td id="L262" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="262"></td>
          <td id="LC262" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c"><span class="pl-c">/*</span> drop pre-assumed netif <span class="pl-c">*/</span></span> </td>
        </tr>

        <tr class="border-0">
          <td id="L263" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="263"></td>
          <td id="LC263" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c1">gnrc_netif_release</span>(netif); </td>
        </tr>

        <tr class="border-0">
          <td id="L264" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="264"></td>
          <td id="LC264" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> } </td>
        </tr>

        <tr class="border-0">
          <td id="L265" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="265"></td>
          <td id="LC265" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> <span class="pl-k">if</span> ((netif == <span class="pl-c1">NULL</span>) || (netif-><span class="pl-smi">pid</span> != (<span class="pl-k">int</span>)route.<span class="pl-smi">iface</span>)) { </td>
        </tr>

        <tr class="border-0">
          <td id="L266" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="266"></td>
          <td id="LC266" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c"><span class="pl-c">/*</span> get actual netif <span class="pl-c">*/</span></span> </td>
        </tr>

        <tr class="border-0">
          <td id="L267" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="267"></td>
          <td id="LC267" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     netif = <span class="pl-c1">gnrc_netif_get_by_pid</span>(route.<span class="pl-smi">iface</span>); </td>
        </tr>

        <tr class="border-0">
          <td id="L268" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="268"></td>
          <td id="LC268" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c1">gnrc_netif_acquire</span>(netif); </td>
        </tr>

        <tr class="border-0">
          <td id="L269" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="269"></td>
          <td id="LC269" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> } </td>
        </tr>
    </tbody></table>
  </div>
</div>
</p>
<p>However, if in the meantime the 6LoWPAN thread (with a fragment forwarding module like <code>gnrc_sixlowpan_frag_minfwd</code> or <code>gnrc_sixlowpan_frag_sfr</code>, via the VRB <a href="">here</a>) also makes a lookup, but with a specific <code>netif != NULL</code> (the same as the new <code>netif</code> in the IPv6 thread), it acquires that <code>netif</code> in L198 and then gets stuck on L199, since the IPv6 thread still holds the lock on the NIB while itself waiting for the <code>netif</code> in L268. I did not actually observe this with the link-local variant</p>
<p><div class="border rounded-1 my-2">
  <div class="f6 px-3 py-2 lh-condensed border-bottom color-bg-secondary">
    <p class="mb-0 text-bold">
      <a href="https://github.com/RIOT-OS/RIOT/blob/6e4434a76049356d5c24be606084cd38abc6d432/sys/net/gnrc/network_layer/ipv6/nib/nib.c#L210-L215">RIOT/sys/net/gnrc/network_layer/ipv6/nib/nib.c</a>
    </p>
    <p class="mb-0 color-text-tertiary">
        Lines 210 to 215
      in
      <a data-pjax="true" class="commit-tease-sha" href="/RIOT-OS/RIOT/commit/6e4434a76049356d5c24be606084cd38abc6d432">6e4434a</a>
    </p>
    </div>
    <div itemprop="text" class="blob-wrapper blob-wrapper-embedded data">
    <table class="highlight tab-size mb-0 js-file-line-container" data-tab-size="8" data-paste-markdown-skip="">

        <tbody><tr class="border-0">
          <td id="L210" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="210"></td>
          <td id="LC210" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> <span class="pl-k">if</span> (!<span class="pl-c1">ipv6_addr_is_link_local</span>(dst) && (iface != <span class="pl-c1">0</span>)) { </td>
        </tr>

        <tr class="border-0">
          <td id="L211" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="211"></td>
          <td id="LC211" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c"><span class="pl-c">/*</span> release preassumed interface <span class="pl-c">*/</span></span> </td>
        </tr>

        <tr class="border-0">
          <td id="L212" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="212"></td>
          <td id="LC212" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c1">gnrc_netif_release</span>(netif); </td>
        </tr>

        <tr class="border-0">
          <td id="L213" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="213"></td>
          <td id="LC213" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     netif = <span class="pl-c1">gnrc_netif_get_by_pid</span>(iface); </td>
        </tr>

        <tr class="border-0">
          <td id="L214" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="214"></td>
          <td id="LC214" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line">     <span class="pl-c1">gnrc_netif_acquire</span>(netif); </td>
        </tr>

        <tr class="border-0">
          <td id="L215" class="blob-num border-0 px-3 py-0 color-bg-primary js-line-number" data-line-number="215"></td>
          <td id="LC215" class="blob-code border-0 px-3 py-0 color-bg-primary blob-code-inner js-file-line"> } </td>
        </tr>
    </tbody></table>
  </div>
</div>
</p>
<p>but in theory it can also happen there.</p>
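<p>To make the interleaving above easier to follow, here is a small deterministic model of it. It uses no real threads or RIOT APIs; <code>model_lock_t</code>, <code>model_try_acquire()</code>, <code>model_release()</code> and <code>model_deadlocks()</code> are hypothetical helpers that only record which thread owns which lock, under the assumption that both threads want the same <code>netif</code>.</p>

```c
#include <assert.h>
#include <stdbool.h>

enum { NOBODY = 0, IPV6_THREAD = 1, SIXLO_THREAD = 2 };

/* Hypothetical model lock: only records the current owner. */
typedef struct {
    int owner;
} model_lock_t;

/* Take the lock if it is free; report whether `tid` owns it afterwards. */
static bool model_try_acquire(model_lock_t *l, int tid)
{
    if (l->owner == NOBODY) {
        l->owner = tid;
    }
    return l->owner == tid;
}

/* Release the lock if `tid` is the owner. */
static void model_release(model_lock_t *l, int tid)
{
    if (l->owner == tid) {
        l->owner = NOBODY;
    }
}

/*
 * Replay the interleaving from the description. Returns true if both
 * threads end up blocked on a lock held by the other one (deadlock).
 * `release_nib_first` selects the behaviour with the fix applied.
 */
static bool model_deadlocks(bool release_nib_first)
{
    model_lock_t nib = { NOBODY };
    model_lock_t netif = { NOBODY };

    /* IPv6 thread: holds the NIB, has dropped its pre-assumed netif */
    model_try_acquire(&nib, IPV6_THREAD);
    /* 6LoWPAN thread: acquires the netif (L198 in the description) */
    model_try_acquire(&netif, SIXLO_THREAD);

    if (release_nib_first) {
        /* fix: give up the NIB before waiting for the netif */
        model_release(&nib, IPV6_THREAD);
    }

    /* IPv6 thread: wants the actual netif (L268) */
    bool ipv6_blocked = !model_try_acquire(&netif, IPV6_THREAD);
    /* 6LoWPAN thread: wants the NIB (L199) */
    bool sixlo_blocked = !model_try_acquire(&nib, SIXLO_THREAD);

    return ipv6_blocked && sixlo_blocked;
}
```

<p>In this model, <code>model_deadlocks(false)</code> reports a deadlock, while with <code>model_deadlocks(true)</code> the 6LoWPAN thread obtains the NIB and can finish its lookup, after which the IPv6 thread would be able to continue.</p>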

<h3>Testing procedure</h3>
<p>Both <code>tests/gnrc_ipv6_nib</code> and <code>tests/gnrc_ipv6_nib_6ln</code> should still pass and <code>gnrc_networking</code> should still work. I found this by running experiments very similar to <a href="https://github.com/5G-I3/IEEE-LCN-2019/">https://github.com/5G-I3/IEEE-LCN-2019/</a>, but with a higher send interval (250-750 ms), for SFR with congestion control. Without this fix, I ran into the race condition quite often with low congestion windows. With this fix, I have not been able to reproduce the race condition so far.</p>

<h3>Issues/PRs references</h3>
<p>None.</p>


<hr>

<h4>You can view, comment on, or merge this pull request online at:</h4>
<p>  <a href='https://github.com/RIOT-OS/RIOT/pull/16450'>https://github.com/RIOT-OS/RIOT/pull/16450</a></p>

<h4>Commit Summary</h4>
<ul>
  <li>gnrc_ipv6_nib: fix acquire race on gnrc_ipv6_nib_get_next_hop_l2addr()</li>
</ul>

<h4>File Changes</h4>
<ul>
  <li>
    <strong>M</strong>
    <a href="https://github.com/RIOT-OS/RIOT/pull/16450/files#diff-fba26fd610698922256809a70531c3b18dc624a771563ea8887a08bbf762c457">sys/net/gnrc/network_layer/ipv6/nib/nib.c</a>
    (22)
  </li>
</ul>

<h4>Patch Links:</h4>
<ul>
  <li><a href='https://github.com/RIOT-OS/RIOT/pull/16450.patch'>https://github.com/RIOT-OS/RIOT/pull/16450.patch</a></li>
  <li><a href='https://github.com/RIOT-OS/RIOT/pull/16450.diff'>https://github.com/RIOT-OS/RIOT/pull/16450.diff</a></li>
</ul>
