14 Oct 2015
assignment-with-use or inline assignment
I am still astonished by persistent folklore and “rules of thumb” whose origins are so old that most people have forgotten why they follow them.
Recently, I heard one of those rules of thumb:
If you want to keep object references in registers, you need to perform an inline assignment inside an if statement, for example:
Cell[] as; if ((as = cells) != null) { // ... }
which is claimed to be more efficient than:
Cell[] as = cells; if (as != null) { // ... }
To back this statement, the JDK source code is cited as a best practice to follow, more precisely the add method of the LongAdder class:
public void add(long x) {
    Cell[] as; long b, v; int m; Cell a;
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[getProbe() & m]) == null ||
            !(uncontended = a.cas(v = a.value, v + x)))
            longAccumulate(x, null, uncontended);
    }
}
A nice example of inline assignment written by Doug Lea (or at least approved by him). Why not do as he does? After all, he knows what he is doing, right?
I searched for an explanation of this style and found this thread, where Rémi Forax found this code “not very java-ish”. Aurélien Gonnay also noticed a difference of style between the JDK code and the JSR one in the method sumThenReset.
So let’s simplify this code while keeping the inline assignment style. I took a similar example from the previous thread with Rémi Forax:
Object[] cs; int n; Object a;
if ((cs = cells) != null && (n = cs.length) > 0) {
    if ((a = cs[(n - 1) & h]) != null) {
        System.out.println(a);
    }
}
A more Java-ish version would be the following:
Object[] cs = cells;
if (cs != null) {
    int n = cs.length;
    if (n > 0) {
        Object a = cs[(n - 1) & h];
        if (a != null) {
            System.out.println(a);
        }
    }
}
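To check that the two styles really behave the same, both variants can be put side by side in one small class. This is only a sketch: the class name InlineAssignDemo, the method names, and the sample values of cells and h are assumptions made for illustration, and the methods return the selected element instead of printing it so that the results can be compared.

```java
class InlineAssignDemo {
    // Assumed sample data; the article does not show how the fields are initialised.
    Object[] cells = { "a", "b", "c", "d" };
    static int h = 6;

    // Inline-assignment style: assign and test in the same expression.
    Object lookupInline() {
        Object[] cs; int n; Object a;
        if ((cs = cells) != null && (n = cs.length) > 0) {
            if ((a = cs[(n - 1) & h]) != null) {
                return a;
            }
        }
        return null;
    }

    // Declare-then-test style: one local per step, then a plain condition.
    Object lookupPlain() {
        Object[] cs = cells;
        if (cs != null) {
            int n = cs.length;
            if (n > 0) {
                Object a = cs[(n - 1) & h];
                if (a != null) {
                    return a;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        InlineAssignDemo t = new InlineAssignDemo();
        // Both variants select index (4 - 1) & 6 == 2, so this prints "c c".
        System.out.println(t.lookupInline() + " " + t.lookupPlain());
    }
}
```

In both methods the field is read once into a local, which is the part that matters when cells can be changed by another thread; only where the assignment is written differs.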
First, let’s compare the generated bytecode. With inline assignments:
0: aload_0
1: getfield #3 // Field cells:[Ljava/lang/Object;
4: dup
5: astore_1
6: ifnull 38
9: aload_1
10: arraylength
11: dup
12: istore_2
13: ifle 38
16: aload_1
17: iload_2
18: iconst_1
19: isub
20: getstatic #4 // Field h:I
23: iand
24: aaload
25: dup
26: astore_3
27: ifnull 38
30: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
33: ldc #6 // String foo
35: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
38: return
With no inline assignments:
0: aload_0
1: getfield #3 // Field cells:[Ljava/lang/Object;
4: astore_1
5: aload_1
6: ifnull 38
9: aload_1
10: arraylength
11: istore_2
12: iload_2
13: ifle 38
16: aload_1
17: iload_2
18: iconst_1
19: isub
20: getstatic #4 // Field h:I
23: iand
24: aaload
25: astore_3
26: aload_3
27: ifnull 38
30: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
33: ldc #6 // String foo
35: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
38: return
Here is a diff as a summary:
3,4c3,4
< 4: dup
< 5: astore_1
---
> 4: astore_1
> 5: aload_1
8,9c8,9
< 11: dup
< 12: istore_2
---
> 11: istore_2
> 12: iload_2
18,19c18,19
< 25: dup
< 26: astore_3
---
> 25: astore_3
> 26: aload_3
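Each hunk of this diff swaps a dup followed by a store for a store followed by a load. A toy model of the operand stack makes it easy to see that both orders end in the same state. This is an illustrative sketch only: the class and method names are hypothetical, and it is not how the JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class DupVsReload {
    // Models "dup; istore_2": returns {top of stack, locals[2], stack depth}.
    static int[] dupThenStore(int value) {
        Deque<Integer> stack = new ArrayDeque<>();
        int[] locals = new int[4];
        stack.push(value);           // value produced by arraylength
        stack.push(stack.peek());    // dup
        locals[2] = stack.pop();     // istore_2
        return new int[] { stack.peek(), locals[2], stack.size() };
    }

    // Models "istore_2; iload_2": returns {top of stack, locals[2], stack depth}.
    static int[] storeThenLoad(int value) {
        Deque<Integer> stack = new ArrayDeque<>();
        int[] locals = new int[4];
        stack.push(value);           // value produced by arraylength
        locals[2] = stack.pop();     // istore_2
        stack.push(locals[2]);       // iload_2
        return new int[] { stack.peek(), locals[2], stack.size() };
    }

    public static void main(String[] args) {
        // Same top of stack, same local, same depth: prints true.
        System.out.println(Arrays.equals(dupThenStore(8), storeThenLoad(8)));
    }
}
```

In both cases the value ends up in the local variable and on top of the stack, ready for the conditional branch that follows.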
The only differences are at bytecode offsets 4, 11 and 25. Instead of dup + store for the inline assignment, we get store + load, which is semantically the same. But is the first form better optimized? Let’s check the assembly:
# {method} {0x000000000a462440} 'bench_c' '()V' in 'main/java/TestJIT'
# [sp+0x30] (sp of caller)
[Verified Entry Point]
0x000000000221adc0: mov DWORD PTR [rsp-0x6000],eax
0x000000000221adc7: push rbp
0x000000000221adc8: sub rsp,0x20 ;*synchronization entry
; - main.java.TestJIT::bench_c@-1 (line 16)
0x000000000221adcc: mov r11d,DWORD PTR [rdx+0xc] ;*getfield cells
; - main.java.TestJIT::bench_c@1 (line 16)
0x000000000221add0: mov r10d,DWORD PTR [r11+0xc] ;*arraylength
; - main.java.TestJIT::bench_c@10 (line 16)
; implicit exception: dispatches to 0x000000000221ae35
0x000000000221add4: test r10d,r10d
0x000000000221add7: jle 0x000000000221ae15 ;*ifle
; - main.java.TestJIT::bench_c@13 (line 16)
0x000000000221add9: mov ebp,r10d
0x000000000221addc: dec ebp
0x000000000221adde: movabs r8,0xd5e7f050 ; {oop(a 'java/lang/Class' = 'main/java/TestJIT')}
0x000000000221ade8: and ebp,DWORD PTR [r8+0x6c] ;*iand
; - main.java.TestJIT::bench_c@23 (line 17)
0x000000000221adec: cmp ebp,r10d
0x000000000221adef: jae 0x000000000221ae06
0x000000000221adf1: mov ebp,DWORD PTR [r11+rbp*4+0x10]
;*aaload
; - main.java.TestJIT::bench_c@24 (line 17)
0x000000000221adf6: test ebp,ebp
0x000000000221adf8: jne 0x000000000221ae29 ;*ifnull
; - main.java.TestJIT::bench_c@27 (line 17)
0x000000000221adfa: add rsp,0x20
0x000000000221adfe: pop rbp
0x000000000221adff: test DWORD PTR [rip+0xffffffffffce51fb],eax # 0x0000000001f00000
; {poll_return}
0x000000000221ae05: ret
Compared to no inline assignments:
# {method} {0x000000000ac52548} 'bench_java' '()V' in 'main/java/TestJIT'
# [sp+0x30] (sp of caller)
[Verified Entry Point]
0x0000000002a09ec0: mov DWORD PTR [rsp-0x6000],eax
0x0000000002a09ec7: push rbp
0x0000000002a09ec8: sub rsp,0x20 ;*synchronization entry
; - main.java.TestJIT::bench_java@-1 (line 24)
0x0000000002a09ecc: mov r11d,DWORD PTR [rdx+0xc] ;*getfield cells
; - main.java.TestJIT::bench_java@1 (line 24)
0x0000000002a09ed0: mov r10d,DWORD PTR [r11+0xc] ;*arraylength
; - main.java.TestJIT::bench_java@10 (line 26)
; implicit exception: dispatches to 0x0000000002a09f35
0x0000000002a09ed4: test r10d,r10d
0x0000000002a09ed7: jle 0x0000000002a09f15 ;*ifle
; - main.java.TestJIT::bench_java@13 (line 27)
0x0000000002a09ed9: mov ebp,r10d
0x0000000002a09edc: dec ebp
0x0000000002a09ede: movabs r8,0xd5e7efd8 ; {oop(a 'java/lang/Class' = 'main/java/TestJIT')}
0x0000000002a09ee8: and ebp,DWORD PTR [r8+0x6c] ;*iand
; - main.java.TestJIT::bench_java@23 (line 28)
0x0000000002a09eec: cmp ebp,r10d
0x0000000002a09eef: jae 0x0000000002a09f06
0x0000000002a09ef1: mov ebp,DWORD PTR [r11+rbp*4+0x10]
;*aaload
; - main.java.TestJIT::bench_java@24 (line 28)
0x0000000002a09ef6: test ebp,ebp
0x0000000002a09ef8: jne 0x0000000002a09f29 ;*ifnull
; - main.java.TestJIT::bench_java@27 (line 29)
0x0000000002a09efa: add rsp,0x20
0x0000000002a09efe: pop rbp
0x0000000002a09eff: test DWORD PTR [rip+0xfffffffffe5860fb],eax # 0x0000000000f90000
; {poll_return}
0x0000000002a09f05: ret
Here are the commands to diff without the addresses:
sed -e "s/0x[0-9a-f]*/0/g" inlineAssign1.txt > inlineAssign1_sed.txt
sed -e "s/0x[0-9a-f]*/0/g" inlineAssign2.txt > inlineAssign2_sed.txt
diff -w inlineAssign1_sed.txt inlineAssign2_sed.txt
1c1
< # {method} {0} 'bench_c' '()V' in 'main/java/TestJIT'
---
> # {method} {0} 'bench_java' '()V' in 'main/java/TestJIT'
7c7
< ; - main.java.TestJIT::bench_c@-1 (line 16)
---
> ; - main.java.TestJIT::bench_java@-1 (line 24)
10c10
< ; - main.java.TestJIT::bench_c@1 (line 16)
---
> ; - main.java.TestJIT::bench_java@1 (line 24)
13c13
< ; - main.java.TestJIT::bench_c@10 (line 16)
---
> ; - main.java.TestJIT::bench_java@10 (line 26)
17c17
< ; - main.java.TestJIT::bench_c@13 (line 16)
---
> ; - main.java.TestJIT::bench_java@13 (line 27)
23c23
< ; - main.java.TestJIT::bench_c@23 (line 17)
---
> ; - main.java.TestJIT::bench_java@23 (line 28)
29c29
< ; - main.java.TestJIT::bench_c@24 (line 17)
---
> ; - main.java.TestJIT::bench_java@24 (line 28)
33c33
< ; - main.java.TestJIT::bench_c@27 (line 17)
---
> ; - main.java.TestJIT::bench_java@27 (line 29)
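For readers without sed at hand, the same normalisation can be sketched in Java. The class name AddressNormalizer and its methods are hypothetical; the regular expression is the one from the sed commands, and the sample lines are copied from the two assembly listings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class AddressNormalizer {
    // Same substitution as the sed command: every lowercase hex literal becomes "0".
    static String normalize(String line) {
        return line.replaceAll("0x[0-9a-f]*", "0");
    }

    static List<String> normalizeAll(List<String> lines) {
        return lines.stream()
                    .map(AddressNormalizer::normalize)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> inline = Arrays.asList(
            "0x000000000221add4: test r10d,r10d",
            "0x000000000221add7: jle 0x000000000221ae15");
        List<String> plain = Arrays.asList(
            "0x0000000002a09ed4: test r10d,r10d",
            "0x0000000002a09ed7: jle 0x0000000002a09f15");
        // Once the addresses are masked, the two excerpts compare equal: prints true.
        System.out.println(normalizeAll(inline).equals(normalizeAll(plain)));
    }
}
```

Register names such as r10d are left alone because the pattern only matches text starting with 0x.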
You can check for yourself, line by line: every instruction is exactly the same in both versions. Only the addresses differ, along with the comments because, in my test, the method names are not the same. Conclusion? Inline assignments bring no performance benefit here: the JIT is smart enough to handle this.
Note: I have used JDK 1.8.0_40 for this article.
Thanks to Aurélien Gonnay for the review.