[riot-devel] Odd problems with xtimer

Michael Andersen michael at steelcode.com
Wed Feb 24 19:20:12 CET 2016


On Wed, Feb 24, 2016 at 5:56 AM, malo <malo at 25cmsquare.io> wrote:

> Hello Kaspar,
> sorry for "out out order" message, but I was not in the list when this thread was started.
>
> Adding my two cents:
> the current xtimer is "just" not thread safe and should not be used/set from multiple threads (hope we agree on this). the question is, if there is somewhere in the riot core single xtimer_t* set from multiple threads?
> Or was just userspace misuse...
> Hope the second is the case:)
>
>
It was my code that initially discovered it. I was quite deliberate about
not accessing the same object from two threads. In fact the timer that was
found to be in the list twice was one created by riot to wait for message
with timeout. Not ruling out user error, but it would have to be a more
subtle user error :-)

I've been busy on other things, so I have not had time to dig into where
this is coming from, but a temporary hotfix that stopped my mesh from
crashing was to put a modified version of remove_timer  inside the critical
section of add timer. At least that way the critical section does not
contain the spin wait. Not advocating this as the solution, it's just a
hack until I get time to improve it.

>
> As for returning error on xtimer_set - what you would do with the return value?
> maybe something like?:
> while (true)
> {
>   if (!xtimer_set())
>   {
>     yield();
>   }
>   else
>   {
>     break;
>   }
> }
> quite ugly...
>
>
> Sure there should be check in _add_timer_to_long_list and _add_timer_to_list for duplicates and assert in the case that trying to add pointer which is already in - just to warn user that is doing something bad.
>
> wbr
> malo
>
> > <https://github.com/RIOT-OS/RIOT/blob/master/sys/xtimer/xtimer_core.c#L209-L211> would
> > end with list_head equal to the timer (assuming no other timer has the
> > same time), and then the next two lines would basically link the timer
> > to itself.
> >
> > I could be wrong though, that is just a guess.
>
> I think your analysis is correct, I managed to create a test case that
> shows pretty much the behaviour you're describing.
>
> Guarding most of xtimer_set() (using disableIRQ/restoreIRQ) fixes the
> problem, but disables interrupts for the backoff spin loop.
>
> While hanging within xtimer is probably the worst, I'm not sure what
> would be the best semantic for concurrently setting the same timer object:
>
> - return an error (cleanest, but currently xtimer_set() never returns an
> error)
> - first xtimer_set() wins (easy to implement by somehow tagging the
> timer object, but probably unexpected)
> - second xtimer_set() wins (very hard to do as xtimer_set() can be
> called in ISR context, and there's no way to wait for, e.g., a mutex)
> - guard the whole timer setting procedure (would disable interrupts
> while spinning)
> - ?
>
> Opinions?
>
> Kaspar
>
>
> _______________________________________________
> devel mailing list
> devel at riot-os.org
> https://lists.riot-os.org/mailman/listinfo/devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.riot-os.org/pipermail/devel/attachments/20160224/53959fef/attachment-0001.html>


More information about the devel mailing list