[riot-notifications] [RIOT-OS/RIOT] sys: Added simple memory barrier API (#11438)

Marian Buschsieweke notifications at github.com
Fri Apr 26 09:45:06 CEST 2019


@kaspar030, @jcarrano: I'd like to share a little experiment:

`foo.c`:
```C
#include <stdint.h>
#include <stdio.h>

extern int bar(void);
typedef union {
    uint32_t u32;
    uint8_t u8[4];
} foo_t;

foo_t a = { .u32 = 0 };

int main(void)
{
    a.u8[0] = 0xa0;
    int i = bar();
    a.u8[1] = 0xa1;
    a.u8[2] = 0xa2;
    a.u8[3] = 0xa3;

    printf("a = %x, i = %d\n", (unsigned)a.u32, i);
    return 0;
}
```

`bar.c`:
```C
int bar(void) {
    return 0x1337;
}
```

Compilation with:
```
gcc -c -O3 -o foo.o foo.c
gcc -c -O3 -o bar.o bar.c
gcc -o no_lto foo.o bar.o
```

and with:
```
gcc -flto -c -O3 -o foo.o foo.c
gcc -flto -c -O3 -o bar.o bar.c
gcc -flto -O3 -o lto foo.o bar.o
```

Partial output of `objdump -d no_lto` (everything but `main` and `bar` omitted):
```
0000000000001060 <main>:
    1060:	48 83 ec 08          	sub    $0x8,%rsp
    1064:	c6 05 05 30 00 00 a0 	movb   $0xa0,0x3005(%rip)        # 4070 <a>
    106b:	e8 80 01 00 00       	callq  11f0 <bar>
    1070:	ba a2 a3 ff ff       	mov    $0xffffa3a2,%edx
    1075:	c6 05 f5 2f 00 00 a1 	movb   $0xa1,0x2ff5(%rip)        # 4071 <a+0x1>
    107c:	48 8d 3d 7d 0f 00 00 	lea    0xf7d(%rip),%rdi        # 2000 <_fini+0xde8>
    1083:	66 89 15 e8 2f 00 00 	mov    %dx,0x2fe8(%rip)        # 4072 <a+0x2>
    108a:	8b 35 e0 2f 00 00    	mov    0x2fe0(%rip),%esi        # 4070 <a>
    1090:	89 c2                	mov    %eax,%edx
    1092:	31 c0                	xor    %eax,%eax
    1094:	e8 87 ff ff ff       	callq  1020 <printf@plt>
    1099:	31 c0                	xor    %eax,%eax
    109b:	48 83 c4 08          	add    $0x8,%rsp
    109f:	c3                   	retq

00000000000011f0 <bar>:
    11f0:	b8 37 13 00 00       	mov    $0x1337,%eax
    11f5:	c3                   	retq
```

Partial output of `objdump -d lto` (everything but `main` omitted, `bar` no longer present):
```
0000000000001060 <main>:
    1060:	48 83 ec 08          	sub    $0x8,%rsp
    1064:	ba 37 13 00 00       	mov    $0x1337,%edx
    1069:	be a0 a1 a2 a3       	mov    $0xa3a2a1a0,%esi
    106e:	31 c0                	xor    %eax,%eax
    1070:	48 8d 3d 89 0f 00 00 	lea    0xf89(%rip),%rdi        # 2000 <_fini+0xe0f>
    1077:	c7 05 ef 2f 00 00 a0 	movl   $0xa3a2a1a0,0x2fef(%rip)        # 4070 <a>
    107e:	a1 a2 a3 
    1081:	e8 9a ff ff ff       	callq  1020 <printf@plt>
    1086:	31 c0                	xor    %eax,%eax
    1088:	48 83 c4 08          	add    $0x8,%rsp
    108c:	c3                   	retq
```

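Without LTO, the call to `bar()` effectively acts as a compiler barrier: the compiler cannot see into `bar()` and has to assume it might access `a`, so the store to `a.u8[0]` is emitted before the call. With LTO, `bar()` gets inlined, and the stores to `a` are reordered across what used to be the call to `bar()` and combined into a single 32-bit store.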
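For illustration, here is a minimal sketch of how an explicit compiler barrier would keep the first store separate even when `bar()` is inlined. The `compiler_barrier()` macro and the `fill_a()` function are hypothetical names chosen for this example, not the API proposed in this PR. The sketch assumes a GCC- or Clang-compatible compiler, since it relies on extended inline assembly with a `"memory"` clobber.

```c
#include <stdint.h>

/* Hypothetical helper (assumption, not the API from this PR): the empty asm
 * statement with a "memory" clobber tells GCC/Clang that memory may be read
 * or written at this point, so memory accesses must not be moved or merged
 * across it. No CPU fence instruction is emitted. */
#define compiler_barrier() __asm__ volatile ("" ::: "memory")

typedef union {
    uint32_t u32;
    uint8_t u8[4];
} foo_t;

foo_t a = { .u32 = 0 };

/* Stand-in for bar() from the experiment; static inline so the compiler can
 * see through it even without LTO. */
static inline int bar(void)
{
    return 0x1337;
}

/* Hypothetical wrapper around the store sequence from main() above. */
int fill_a(void)
{
    a.u8[0] = 0xa0;
    /* The store to a.u8[0] has to be emitted before this point and cannot be
     * folded into the stores that follow. */
    compiler_barrier();
    int i = bar();
    a.u8[1] = 0xa1;
    a.u8[2] = 0xa2;
    a.u8[3] = 0xa3;
    return i;
}
```

Note that this only restricts the compiler. It says nothing about the order in which the hardware makes the stores visible to other cores or to DMA.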

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/RIOT-OS/RIOT/pull/11438#issuecomment-486961340