It's All Relative

Performance Minded

14 Oct 2015

assignment-with-use or inline assignment

I am still astonished by how persistent some folklore and “rules of thumb” are: their origins are so old that most people have forgotten why they follow them.

Recently, I have heard one of those rules of thumb:

If you want object references to be kept in registers, you need to perform an inline assignment inside an if statement, for example:

Cell[] as;
if ((as = cells) != null) {
  // ...
}

which is claimed to be more efficient than:

Cell[] as = cells;
if (as != null) {
  // ...
}

To back up this claim, the JDK source code is cited as a best practice to follow, more precisely the LongAdder class:

    public void add(long x) {
        Cell[] as; long b, v; int m; Cell a;
        if ((as = cells) != null || !casBase(b = base, b + x)) {
            boolean uncontended = true;
            if (as == null || (m = as.length - 1) < 0 ||
                (a = as[getProbe() & m]) == null ||
                !(uncontended = a.cas(v = a.value, v + x)))
                longAccumulate(x, null, uncontended);
        }
    }

A nice example of inline assignment written by Doug Lea (or at least approved by him). Why not do as he does? After all, he knows what he is doing, right?

I searched for an explanation of this style and found this thread, where Rémi Forax finds this code “not very java-ish”. Aurélien Gonnay also noticed a difference in style between the JDK code and the JSR code in the method sumThenReset.

So let’s simplify this code while keeping the inline assignment style. I took a similar example from the thread with Rémi Forax mentioned above:

Object[] cs; int n; Object a;
if ((cs = cells) != null && (n = cs.length) > 0) {
    if ((a = cs[(n - 1) & h]) != null) {
        System.out.println(a);
    }
}

More Java-ish code would be the following:

Object[] cs = cells;
if (cs != null) {
    int n = cs.length;
    if (n > 0) {
        Object a = cs[(n - 1) & h];
        if (a != null) {
            System.out.println(a);
        }
    }
}
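For readers who want to reproduce the comparison, here is a minimal sketch that wraps both snippets in one class. The class name `InlineAssignDemo`, the method names, the value of `h`, the sample array and the choice to return the element instead of printing it are all assumptions made for illustration. This is not the original test code; only the two method bodies mirror the snippets above.

```java
// Hypothetical harness: names and sample data are illustrative assumptions.
class InlineAssignDemo {
    static int h = 5;               // stand-in value for the index mask input
    private final Object[] cells;

    InlineAssignDemo(Object[] cells) {
        this.cells = cells;
    }

    // Inline-assignment style: each local is assigned inside the condition that tests it.
    Object inlineStyle() {
        Object[] cs; int n; Object a;
        if ((cs = cells) != null && (n = cs.length) > 0) {
            if ((a = cs[(n - 1) & h]) != null) {
                return a;
            }
        }
        return null;
    }

    // Plain style: declare and assign first, then test.
    Object plainStyle() {
        Object[] cs = cells;
        if (cs != null) {
            int n = cs.length;
            if (n > 0) {
                Object a = cs[(n - 1) & h];
                if (a != null) {
                    return a;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        InlineAssignDemo demo = new InlineAssignDemo(new Object[] {"a", "b", "c", "d"});
        // Both calls read the same slot, so they print the same element.
        System.out.println(demo.inlineStyle());
        System.out.println(demo.plainStyle());
    }
}
```

Compiling such a class and running `javap -c` on it is one way to obtain a bytecode listing for each method and compare the two styles yourself.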

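First, let’s compare the generated bytecode. Note that the compiled test prints the constant string "foo" rather than a, as the ldc instruction shows. With inline assignments: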

 0: aload_0
 1: getfield      #3                  // Field cells:[Ljava/lang/Object;
 4: dup
 5: astore_1
 6: ifnull        38
 9: aload_1
10: arraylength
11: dup
12: istore_2
13: ifle          38
16: aload_1
17: iload_2
18: iconst_1
19: isub
20: getstatic     #4                  // Field h:I
23: iand
24: aaload
25: dup
26: astore_3
27: ifnull        38
30: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
33: ldc           #6                  // String foo
35: invokevirtual #7                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
38: return

with no inline assignments:

 0: aload_0
 1: getfield      #3                  // Field cells:[Ljava/lang/Object;
 4: astore_1
 5: aload_1
 6: ifnull        38
 9: aload_1
10: arraylength
11: istore_2
12: iload_2
13: ifle          38
16: aload_1
17: iload_2
18: iconst_1
19: isub
20: getstatic     #4                  // Field h:I
23: iand
24: aaload
25: astore_3
26: aload_3
27: ifnull        38
30: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
33: ldc           #6                  // String foo
35: invokevirtual #7                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
38: return

Here is a diff as a summary:

3,4c3,4
<  4: dup
<  5: astore_1
---
>  4: astore_1
>  5: aload_1
8,9c8,9
< 11: dup
< 12: istore_2
---
> 11: istore_2
> 12: iload_2
18,19c18,19
< 25: dup
< 26: astore_3
---
> 25: astore_3
> 26: aload_3
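To see why the two instruction pairs in this diff are interchangeable, here is a toy model of a frame with an operand stack and local variable slots. It is a sketch for illustration only: the `Frame` class and its method names are hypothetical, and this is not how the JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a JVM frame, for illustration only.
class DupStoreDemo {
    static final class Frame {
        // Operand stack (ArrayDeque rejects null, so only non-null values are used here).
        final Deque<Object> stack = new ArrayDeque<>();
        // Local variable slots.
        final Object[] locals = new Object[4];

        void push(Object v) { stack.push(v); }
        void dup()          { stack.push(stack.peek()); } // duplicate the top of the stack
        void store(int i)   { locals[i] = stack.pop(); }  // pop the top into a local slot
        void load(int i)    { stack.push(locals[i]); }    // push a local slot onto the stack
    }

    // Inline-assignment shape: dup, then store into slot 1.
    static Frame dupThenStore(Object value) {
        Frame f = new Frame();
        f.push(value);
        f.dup();
        f.store(1);
        return f;
    }

    // Plain shape: store into slot 1, then load it back.
    static Frame storeThenLoad(Object value) {
        Frame f = new Frame();
        f.push(value);
        f.store(1);
        f.load(1);
        return f;
    }

    public static void main(String[] args) {
        Frame a = dupThenStore("cells");
        Frame b = storeThenLoad("cells");
        System.out.println(a.stack.peek() == b.stack.peek()); // same value left on the stack
        System.out.println(a.locals[1] == b.locals[1]);       // same value in local slot 1
    }
}
```

In both cases the frame ends with one copy of the value on the stack and one in the local slot, so the following conditional jump sees the same state.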

The only differences are at bytecode offsets 4, 11 and 25. Instead of dup + store for the inline assignment we get store + load, which is semantically the same. But is the first form better optimized? Let’s check the assembly:

  # {method} {0x000000000a462440} 'bench_c' '()V' in 'main/java/TestJIT'
  #           [sp+0x30]  (sp of caller)
[Verified Entry Point]
  0x000000000221adc0: mov    DWORD PTR [rsp-0x6000],eax
  0x000000000221adc7: push   rbp
  0x000000000221adc8: sub    rsp,0x20           ;*synchronization entry
                                                ; - main.java.TestJIT::bench_c@-1 (line 16)

  0x000000000221adcc: mov    r11d,DWORD PTR [rdx+0xc]  ;*getfield cells
                                                ; - main.java.TestJIT::bench_c@1 (line 16)

  0x000000000221add0: mov    r10d,DWORD PTR [r11+0xc]  ;*arraylength
                                                ; - main.java.TestJIT::bench_c@10 (line 16)
                                                ; implicit exception: dispatches to 0x000000000221ae35
  0x000000000221add4: test   r10d,r10d
  0x000000000221add7: jle    0x000000000221ae15  ;*ifle
                                                ; - main.java.TestJIT::bench_c@13 (line 16)

  0x000000000221add9: mov    ebp,r10d
  0x000000000221addc: dec    ebp
  0x000000000221adde: movabs r8,0xd5e7f050      ;   {oop(a 'java/lang/Class' = 'main/java/TestJIT')}
  0x000000000221ade8: and    ebp,DWORD PTR [r8+0x6c]  ;*iand
                                                ; - main.java.TestJIT::bench_c@23 (line 17)

  0x000000000221adec: cmp    ebp,r10d
  0x000000000221adef: jae    0x000000000221ae06
  0x000000000221adf1: mov    ebp,DWORD PTR [r11+rbp*4+0x10]
                                                ;*aaload
                                                ; - main.java.TestJIT::bench_c@24 (line 17)

  0x000000000221adf6: test   ebp,ebp
  0x000000000221adf8: jne    0x000000000221ae29  ;*ifnull
                                                ; - main.java.TestJIT::bench_c@27 (line 17)

  0x000000000221adfa: add    rsp,0x20
  0x000000000221adfe: pop    rbp
  0x000000000221adff: test   DWORD PTR [rip+0xffffffffffce51fb],eax        # 0x0000000001f00000
                                                ;   {poll_return}
  0x000000000221ae05: ret    

Compared to no inline assignments:

  # {method} {0x000000000ac52548} 'bench_java' '()V' in 'main/java/TestJIT'
  #           [sp+0x30]  (sp of caller)
[Verified Entry Point]
  0x0000000002a09ec0: mov    DWORD PTR [rsp-0x6000],eax
  0x0000000002a09ec7: push   rbp
  0x0000000002a09ec8: sub    rsp,0x20           ;*synchronization entry
                                                ; - main.java.TestJIT::bench_java@-1 (line 24)

  0x0000000002a09ecc: mov    r11d,DWORD PTR [rdx+0xc]  ;*getfield cells
                                                ; - main.java.TestJIT::bench_java@1 (line 24)

  0x0000000002a09ed0: mov    r10d,DWORD PTR [r11+0xc]  ;*arraylength
                                                ; - main.java.TestJIT::bench_java@10 (line 26)
                                                ; implicit exception: dispatches to 0x0000000002a09f35
  0x0000000002a09ed4: test   r10d,r10d
  0x0000000002a09ed7: jle    0x0000000002a09f15  ;*ifle
                                                ; - main.java.TestJIT::bench_java@13 (line 27)

  0x0000000002a09ed9: mov    ebp,r10d
  0x0000000002a09edc: dec    ebp
  0x0000000002a09ede: movabs r8,0xd5e7efd8      ;   {oop(a 'java/lang/Class' = 'main/java/TestJIT')}
  0x0000000002a09ee8: and    ebp,DWORD PTR [r8+0x6c]  ;*iand
                                                ; - main.java.TestJIT::bench_java@23 (line 28)

  0x0000000002a09eec: cmp    ebp,r10d
  0x0000000002a09eef: jae    0x0000000002a09f06
  0x0000000002a09ef1: mov    ebp,DWORD PTR [r11+rbp*4+0x10]
                                                ;*aaload
                                                ; - main.java.TestJIT::bench_java@24 (line 28)

  0x0000000002a09ef6: test   ebp,ebp
  0x0000000002a09ef8: jne    0x0000000002a09f29  ;*ifnull
                                                ; - main.java.TestJIT::bench_java@27 (line 29)

  0x0000000002a09efa: add    rsp,0x20
  0x0000000002a09efe: pop    rbp
  0x0000000002a09eff: test   DWORD PTR [rip+0xfffffffffe5860fb],eax        # 0x0000000000f90000
                                                ;   {poll_return}
  0x0000000002a09f05: ret    

Here are the commands to diff the two listings without the addresses, followed by their output:

sed -e "s/0x[0-9a-f]*/0/g" inlineAssign1.txt > inlineAssign1_sed.txt
sed -e "s/0x[0-9a-f]*/0/g" inlineAssign2.txt > inlineAssign2_sed.txt
diff -w inlineAssign1_sed.txt inlineAssign2_sed.txt
1c1
<   # {method} {0} 'bench_c' '()V' in 'main/java/TestJIT'
---
>   # {method} {0} 'bench_java' '()V' in 'main/java/TestJIT'
7c7
<                                          ; - main.java.TestJIT::bench_c@-1 (line 16)
---
>                                          ; - main.java.TestJIT::bench_java@-1 (line 24)
10c10
<                                          ; - main.java.TestJIT::bench_c@1 (line 16)
---
>                                          ; - main.java.TestJIT::bench_java@1 (line 24)
13c13
<                                          ; - main.java.TestJIT::bench_c@10 (line 16)
---
>                                          ; - main.java.TestJIT::bench_java@10 (line 26)
17c17
<                                          ; - main.java.TestJIT::bench_c@13 (line 16)
---
>                                          ; - main.java.TestJIT::bench_java@13 (line 27)
23c23
<                                          ; - main.java.TestJIT::bench_c@23 (line 17)
---
>                                          ; - main.java.TestJIT::bench_java@23 (line 28)
29c29
<                                          ; - main.java.TestJIT::bench_c@24 (line 17)
---
>                                          ; - main.java.TestJIT::bench_java@24 (line 28)
33c33
<                                          ; - main.java.TestJIT::bench_c@27 (line 17)
---
>                                          ; - main.java.TestJIT::bench_java@27 (line 29)
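The address normalization done by the sed commands above can also be sketched in Java. The class name `AddressNormalizer` and the method `normalize` are hypothetical; the regular expression is the one from the sed expression, and the two sample lines are taken from the assembly listings above.

```java
// Hypothetical helper mirroring the sed substitution s/0x[0-9a-f]*/0/g.
class AddressNormalizer {
    // Replace every hexadecimal literal (addresses and offsets) with a plain 0.
    static String normalize(String line) {
        return line.replaceAll("0x[0-9a-f]*", "0");
    }

    public static void main(String[] args) {
        String inline = "0x000000000221adcc: mov    r11d,DWORD PTR [rdx+0xc]";
        String plain  = "0x0000000002a09ecc: mov    r11d,DWORD PTR [rdx+0xc]";
        // Once the addresses are masked, the two instructions compare equal.
        System.out.println(normalize(inline).equals(normalize(plain)));
    }
}
```

Like the sed version, this also masks the offsets inside operands, which is fine here because the goal is only to compare the two listings with each other.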

You can check for yourself, line by line: every instruction is exactly the same in both versions. Only the addresses differ, along with the comments because, in my test, the method names are not the same. Conclusion? Inline assignments bring nothing in terms of performance; the JIT is smart enough to handle this.

Note: I have used JDK 1.8.0_40 for this article.

Thanks to Aurélien Gonnay for the review.