[riot-notifications] [RIOT-OS/RIOT] cpu/esp8266: Improvements to the `esp_wifi` netdev driver (#10862)

Gunar Schorcht notifications at github.com
Fri Jan 25 14:57:43 CET 2019


@smIng @MichelRottleuthner Thanks for testing. I know how frustrating testing can be when problems only occur sporadically and are not reproducible.

> 1. ESP completely freezes, no ping reply, shell not responding, neither crash nor reboot though

The shell freezes? Do you mean the shell on the esp8266? I have never seen this.

> 2. ESP crashing with kernel panic and rebooting

This corresponds to problem 3 described in issue #10861. I inspected the code to figure out why this happens. Unfortunately, the crash occurs at different addresses each time. I tried to catch the exception, but there is no OCD (on-chip debugger) that works satisfactorily, so I was not able to. That is, all I can do is guess.

> 3. ESP disconnects and not able to reconnect again
a) [esp_wifi] disconnected ... reason 2
b) [esp_wifi] disconnected ... reason 8

3.b) is an intentional disconnect triggered by `esp_wifi` when the send function blocks; reason 8 stands for "leave association". It is an attempt by `esp_wifi` to release buffers and to reinitialize the WiFi interface. Unfortunately, it is not clear what actually happens inside the SDK, and no other way to do this is known. Sometimes it works, sometimes it does not.

3.a) comes from the SDK; reason 2 stands for "authentication expired". This may happen either after an intentional disconnect or after the AP has sent a de-authentication message (reason 7). I can observe with `airmon` and `wireshark` that the AP starts the PSK authentication procedure with `key message (1 of 4)`, but the esp8266 does not answer with `key message (2 of 4)` to continue the authentication.
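To make the two disconnect reasons easier to tell apart in the logs, something like the following could be used. This is only a minimal sketch: the enum and function names are hypothetical and are not taken from `esp_wifi` or the SDK; only the numeric values 2 and 8 and their meanings come from the explanation above.

```c
#include <stdint.h>

/* Hypothetical names for illustration only. The values 2 and 8 and their
 * meanings are the ones discussed above; nothing else is assumed. */
enum {
    WIFI_REASON_AUTH_EXPIRED = 2,   /* 3.a): reported by the SDK */
    WIFI_REASON_ASSOC_LEAVE  = 8,   /* 3.b): disconnect triggered by esp_wifi */
};

/* Translate a disconnect reason code into a short readable string. */
const char *wifi_reason_str(uint8_t reason)
{
    switch (reason) {
    case WIFI_REASON_AUTH_EXPIRED:
        return "authentication expired";
    case WIFI_REASON_ASSOC_LEAVE:
        return "leave association";
    default:
        return "unknown";
    }
}

/* Return 1 if the reason code is the one the driver itself produces when
 * the send function blocks, 0 otherwise. */
int wifi_disconnect_is_intended(uint8_t reason)
{
    return reason == WIFI_REASON_ASSOC_LEAVE;
}
```

A log line could then print `wifi_reason_str(reason)` next to the raw number, so that an intentional disconnect (3.b) is not mistaken for one initiated by the SDK or the AP (3.a).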

> 4. (when pinging in both directions) I got same as (3a) but ESP prints the following without ever recovering:
`# E:M 48`

This is an error message from the SDK and occurs when the memory is exhausted. Nobody knows exactly what this error message stands for. My guess is that it stands for "Error in Memory management, only 48 bytes left" or something like that. Once 3.a) has occurred, in a noticeable number of cases reconnecting fails continuously until the memory is exhausted. According to heap statistics, a lot of memory (several kBytes) is allocated with each reconnect, but this memory is not released when the reconnect fails. This seems to be a known bug in the SDK.
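The heap observation above could be tracked along these lines. This is a rough sketch under stated assumptions, not what `esp_wifi` does: the type and function names are hypothetical, and the caller is assumed to pass in the current free heap size from whatever heap statistics the platform provides.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical bookkeeping for free heap across failed reconnects. */
typedef struct {
    size_t last_free;    /* free heap seen at the previous sample */
    size_t total_lost;   /* bytes that disappeared across failed reconnects */
    unsigned failures;   /* number of failed reconnects recorded */
} heap_watch_t;

/* Start tracking from the current free heap size. */
void heap_watch_init(heap_watch_t *w, size_t free_now)
{
    w->last_free = free_now;
    w->total_lost = 0;
    w->failures = 0;
}

/* Record one failed reconnect together with the free heap afterwards. */
void heap_watch_on_failed_reconnect(heap_watch_t *w, size_t free_now)
{
    if (free_now < w->last_free) {
        w->total_lost += w->last_free - free_now;
    }
    w->last_free = free_now;
    w->failures++;
}

/* Estimate how many more failed reconnects fit into the remaining heap,
 * based on the average loss so far. Returns UINT_MAX if no loss was seen. */
unsigned heap_watch_attempts_left(const heap_watch_t *w)
{
    if (w->failures == 0 || w->total_lost == 0) {
        return UINT_MAX;
    }
    size_t avg = w->total_lost / w->failures;
    if (avg == 0) {
        return UINT_MAX;
    }
    return (unsigned)(w->last_free / avg);
}
```

With such an estimate, the driver or an application could at least log a warning, or decide to reboot deliberately, before the SDK runs completely out of memory.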

The situation is quite frustrating, and after some days of testing and trying, I don't know much more than before.

The big question is how reliable it is under normal network conditions. Most of these problems only occur under heavy network load. And some of them also occur in the `master` version, at least 1., 3.a) and 3.b). I'm not really sure what to do, but normally I would prefer stability over performance.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/RIOT-OS/RIOT/pull/10862#issuecomment-457580504