In the kernel include file (include/linux/compiler.h) they are defined like this:
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
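So what does this mean? After a bit of googling, it turns out that likely(x) tells the compiler that the condition x is expected to be true, so the compiler places the code for the true case closer (right after the test, on the fall-through path), which should raise the instruction cache hit rate a little. unlikely(x) says x is expected to be false, so the compiler places the code for the false case closer instead. The !!(x) simply normalizes any nonzero value to 1 so that it can be compared against the expected value 1 or 0.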
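To make the usage concrete, here is a minimal user-space sketch. The two macros are copied from the definitions above; parse_digit is a hypothetical helper invented for this illustration, not something from the kernel. It assumes GCC or a compatible compiler, since __builtin_expect is a compiler builtin.

```c
#include <stdio.h>

/* Same definitions as in include/linux/compiler.h, copied for user space. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper (an assumption for this sketch): converts an
 * ASCII digit to its value and returns -1 for anything else.
 * The error case is marked unlikely, so the compiler is free to
 * move it off the fall-through path. */
int parse_digit(char c)
{
    if (unlikely(c < '0' || c > '9'))
        return -1;      /* rare path */
    return c - '0';     /* common path */
}
```

Calling parse_digit('7') gives 7 and parse_digit('x') gives -1. The hint only influences how the compiler lays out the code; it never changes the result.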
It is quicker to just look at an example, such as the following C code:
#include <stdio.h>

int main(void)
{
    int x, y;

    x = 1;
    if (x)
    {
        /* taken when x is nonzero */
        y = x + 1;
        printf("Y: %d\n", y);
    }
    else
    {
        /* taken when x is zero */
        y = x + 2;
        printf("Y: %d\n", y);
    }
    printf("X: %d, Y: %x\n", x, y);
    return 0;
}
This is the assembly code I saw after compiling it with an ARM compiler:
main.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <main>:
main():
0: e92d4030 stmdb sp!, {r4, r5, lr}
4: e3a04001 mov r4, #1 ; 0x1
8: e0845004 add r5, r4, r4
c: e1a01005 mov r1, r5
10: e59f0018 ldr r0, [pc, #24] ; 30 <.text+0x30>
14: ebfffffe bl 0
18: e1a01004 mov r1, r4
1c: e1a02005 mov r2, r5
20: e59f000c ldr r0, [pc, #12] ; 34 <.text+0x34>
24: ebfffffe bl 0
28: e3a00000 mov r0, #0 ; 0x0
2c: e8bd8030 ldmia sp!, {r4, r5, pc}
30: 00000000 andeq r0, r0, r0
34: 00000008 andeq r0, r0, r8
unlikely:
main3.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <main>:
main():
0: e92d4010 stmdb sp!, {r4, lr}
4: e3a00001 mov r0, #1 ; 0x1
8: ebfffffe bl 0
c: e3a04002 mov r4, #2 ; 0x2
10: e3500000 cmp r0, #0 ; 0x0
14: e1a01004 mov r1, r4
18: 03a04003 moveq r4, #3 ; 0x3
1c: 01a01004 moveq r1, r4
20: e59f001c ldr r0, [pc, #28] ; 44 <.text+0x44>
24: 059f0018 ldreq r0, [pc, #24] ; 44 <.text+0x44>
28: ebfffffe bl 0
2c: e1a02004 mov r2, r4
30: e3a01001 mov r1, #1 ; 0x1
34: e59f000c ldr r0, [pc, #12] ; 48 <.text+0x48>
38: ebfffffe bl 0
3c: e3a00000 mov r0, #0 ; 0x0
40: e8bd8010 ldmia sp!, {r4, pc}
44: 00000000 andeq r0, r0, r0
48: 00000008 andeq r0, r0, r8
likely:
main2.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <main>:
main():
0: e92d4010 stmdb sp!, {r4, lr}
4: e3a00001 mov r0, #1 ; 0x1
8: ebfffffe bl 0
c: e3a04002 mov r4, #2 ; 0x2
10: e3500000 cmp r0, #0 ; 0x0
14: e1a01004 mov r1, r4
18: 03a04003 moveq r4, #3 ; 0x3
1c: 01a01004 moveq r1, r4
20: e59f001c ldr r0, [pc, #28] ; 44 <.text+0x44>
24: 059f0018 ldreq r0, [pc, #24] ; 44 <.text+0x44>
28: ebfffffe bl 0
2c: e1a02004 mov r2, r4
30: e3a01001 mov r1, #1 ; 0x1
34: e59f000c ldr r0, [pc, #12] ; 48 <.text+0x48>
38: ebfffffe bl 0
3c: e3a00000 mov r0, #0 ; 0x0
40: e8bd8010 ldmia sp!, {r4, pc}
44: 00000000 andeq r0, r0, r0
48: 00000008 andeq r0, r0, r8
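Strange: whether I use likely or unlikely, the generated code is almost the same (the two listings above are in fact identical, instruction for instruction). Switching to MIPS and PowerPC compilers gave the same result... Is compiler optimization for embedded systems just not done well!? That said, in the ARM listings the if/else has been compiled into conditionally executed instructions (moveq, ldreq) with no branch at all, so there may simply be no block layout left for the hint to change.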
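One way to give the hint something to act on is to make the two paths too large to be folded into conditional instructions. The following is only a sketch under that assumption: fast_path, slow_path, and compute are hypothetical names made up for this test, and whether the layout actually differs depends on the compiler version, the target, and the optimization level (for example -O2).

```c
#include <stdio.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical common path, kept out of line (GCC noinline attribute)
 * so that the if/else below stays a real branch between two calls. */
__attribute__((noinline)) static int fast_path(int x)
{
    return x + 1;
}

/* Hypothetical rare path: reports the unexpected value and recovers. */
__attribute__((noinline)) static int slow_path(int x)
{
    fprintf(stderr, "unexpected value: %d\n", x);
    return x + 2;
}

/* x is expected to be nonzero almost always. Swap likely for
 * unlikely, rebuild, and compare the two disassemblies. */
int compute(int x)
{
    if (likely(x))
        return fast_path(x);
    return slow_path(x);
}
```

Building this twice, once with likely and once with unlikely, and comparing the output of objdump -d on the two object files shows whether the compiler reordered the two paths. If the listings are still identical, the compiler decided the hint made no difference for this code.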