Why should I initialise variables close to where they are used?

August 29, 2024

You have probably heard that you should initialise variables close to where you first use them instead of at the top of the class: the code is easier to understand, less likely to break when modified, and so on. But what other benefits do you get? We will look at the compiler to understand why you should care. First, though, we need to understand what your compiler is doing.

Escape Analysis in Compilers:

Escape analysis is a technique that lets your compiler determine whether a variable is accessed outside its scope.

In other words: your compiler checks your variables and looks for those that are only accessed locally. If every access is local, the compiler has an opportunity to optimise how that variable's memory is managed. So what exactly happens to these local variables?
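To make the distinction concrete, here is a minimal sketch of a value that stays local and one that escapes. The class `EscapeDemo`, its methods, and the `REGISTRY` field are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeDemo {
    // Hypothetical holder, used only to show a reference leaving a method.
    private static final List<int[]> REGISTRY = new ArrayList<>();

    // 'pair' never leaves this method: it is not returned, not stored in a
    // field, and not passed to other code, so it can be proven local.
    static int sumLocal(int a, int b) {
        int[] pair = {a, b};
        return pair[0] + pair[1];
    }

    // 'pair' escapes: it is stored in a static field and returned to the
    // caller, so it may still be reachable after the method ends.
    static int[] sumEscaping(int a, int b) {
        int[] pair = {a, b};
        REGISTRY.add(pair);
        return pair;
    }

    public static void main(String[] args) {
        System.out.println(sumLocal(2, 3));          // prints 5
        int[] escaped = sumEscaping(2, 3);
        System.out.println(escaped[0] + escaped[1]); // prints 5
    }
}
```

Both methods compute the same thing; only the first gives the compiler the freedom to treat `pair` as purely local.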

Stack Allocations:

A variable found to be local, with no access from outside its scope, can be allocated on the stack instead of the heap. Fun fact: accessing data in the L1 cache takes about 0.5 nanoseconds (ns) and in the L2 cache about 7 ns, while main memory takes around 100 ns, two orders of magnitude more than L1. The stack is touched constantly, so it tends to stay in cache.

Allocation avoidance:

A variable that is created but never used can be allocated on the stack as well, and in some cases discarded entirely, which saves memory.
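As a rough sketch of the kind of code that can benefit, the loop below creates a short-lived object on every iteration that is never visible outside the loop body. `AllocationDemo` and `Point` are hypothetical names, and whether the allocation is actually removed depends on the JVM and on how hot the method gets, so treat this as an experiment rather than a guarantee. On HotSpot, one way to compare is to run it with and without `-XX:-DoEscapeAnalysis`.

```java
public class AllocationDemo {
    // Hypothetical value type for the illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Each Point is created and read inside the loop body only, so every
    // instance is a candidate for being optimised away by the JIT.
    static long sumOfCoordinates(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.x + p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfCoordinates(1_000_000));
    }
}
```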

Note: Pointer analysis is another key part of compiler optimisation. It is not covered here, but it is very important and is used for dead code elimination, inlining, and other optimisations.

So, with the above to guide us, what can be done better in this code?

public class DijkstraAlgo {
    private PriorityQueue<Vertex> pq; // we can move this inside the function computePath to make it local

    public void computePath(Vertex startVertex){

        startVertex.setDistance(0);
        pq = new PriorityQueue<>();
        pq.add(startVertex);

        while(!pq.isEmpty()){

            Vertex actualVertex = pq.poll();

            for (Edge edge : actualVertex.getAdjacencyList()) {
                Vertex u = edge.getStartVertex();  // can be removed since it's not used
                Vertex v = edge.getTargetVertex();

                double d = actualVertex.getDistance() + edge.getWeight();
                
                if (d < v.getDistance()){
                    pq.remove(v);
                    v.setDistance(d);
                    v.setPredecessor(actualVertex);
                    pq.add(v);
                }
            }
        }
    }

    public List<Vertex> getShortestPathTo(Vertex targetVertex) {
        List<Vertex> shortestPath = new ArrayList<>();

        for (Vertex vertex = targetVertex; vertex!=null; vertex = vertex.getPredecessor()){
            shortestPath.add(vertex);
        } 
        
        Collections.reverse(shortestPath);
        return shortestPath;
    }
    
}

So, to play along with the compiler, we can move the priority queue inside the function and remove the variable u. This gives the compiler less work to do. It is a simple example, but it demonstrates working alongside your compiler. But note: objects like pq (the priority queue) will still be on the heap, since Java allocates objects on the heap and lets the Garbage Collector (GC) manage them. (sad face)
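Applied to the class above, the two changes might look like the sketch below. The post does not show `Vertex` and `Edge`, so the versions here are minimal stand-ins written for this example (including `addNeighbour` and the `compareTo` ordering by distance), not the original classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Assumed stand-in: the real Vertex class is not shown in the post.
class Vertex implements Comparable<Vertex> {
    private final String name;
    private final List<Edge> adjacencyList = new ArrayList<>();
    private double distance = Double.MAX_VALUE;
    private Vertex predecessor;

    Vertex(String name) { this.name = name; }
    void addNeighbour(Edge edge) { adjacencyList.add(edge); }
    List<Edge> getAdjacencyList() { return adjacencyList; }
    double getDistance() { return distance; }
    void setDistance(double distance) { this.distance = distance; }
    Vertex getPredecessor() { return predecessor; }
    void setPredecessor(Vertex predecessor) { this.predecessor = predecessor; }
    @Override public int compareTo(Vertex other) { return Double.compare(distance, other.distance); }
    @Override public String toString() { return name; }
}

// Assumed stand-in: the real Edge class is not shown in the post.
class Edge {
    private final double weight;
    private final Vertex targetVertex;

    Edge(double weight, Vertex targetVertex) {
        this.weight = weight;
        this.targetVertex = targetVertex;
    }
    double getWeight() { return weight; }
    Vertex getTargetVertex() { return targetVertex; }
}

public class DijkstraAlgo {

    public void computePath(Vertex startVertex) {
        startVertex.setDistance(0);
        // The queue is now a local variable, declared where it is first used.
        PriorityQueue<Vertex> pq = new PriorityQueue<>();
        pq.add(startVertex);

        while (!pq.isEmpty()) {
            Vertex actualVertex = pq.poll();
            for (Edge edge : actualVertex.getAdjacencyList()) {
                // The unused start vertex 'u' is gone.
                Vertex v = edge.getTargetVertex();
                double d = actualVertex.getDistance() + edge.getWeight();
                if (d < v.getDistance()) {
                    pq.remove(v);
                    v.setDistance(d);
                    v.setPredecessor(actualVertex);
                    pq.add(v);
                }
            }
        }
    }

    public List<Vertex> getShortestPathTo(Vertex targetVertex) {
        List<Vertex> shortestPath = new ArrayList<>();
        for (Vertex vertex = targetVertex; vertex != null; vertex = vertex.getPredecessor()) {
            shortestPath.add(vertex);
        }
        Collections.reverse(shortestPath);
        return shortestPath;
    }

    public static void main(String[] args) {
        Vertex a = new Vertex("A");
        Vertex b = new Vertex("B");
        Vertex c = new Vertex("C");
        a.addNeighbour(new Edge(1, b));
        b.addNeighbour(new Edge(2, c));
        a.addNeighbour(new Edge(5, c));

        DijkstraAlgo algo = new DijkstraAlgo();
        algo.computePath(a);
        System.out.println(algo.getShortestPathTo(c) + " " + c.getDistance()); // prints [A, B, C] 3.0
    }
}
```

With the queue local to computePath, no reference to it outlives the call through a field, and the class no longer carries state between runs.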

Is there a tool that lets us see what the compiler is doing? Yes!

JITWatch - the only way I know how to study the JVM compiler:

Java's HotSpot JVM has two JIT compilers, called C1 and C2. C1 compiles your code quickly and applies the cheaper optimisations. That leaves inefficient parts of your code that C1 could not optimise, so C2 recompiles the code that runs most often and optimises it more aggressively where possible. To visualise this process you will need a tool called JITWatch, which takes the compilation log of a JVM program and visualises the compilation in a GUI. It is built to run on several platforms, so go ahead and get it from GitHub.

Download JITWatch on GitHub at link.
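If you want something small of your own to load into the tool, a sketch like the following gives the JIT a method that runs often enough to be worth compiling. `HotLoop` and its methods are hypothetical names for this example. On HotSpot, running it with `-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation` should produce a compilation log that JITWatch can open. Seeing assembly additionally needs `-XX:+PrintAssembly` and a disassembler plugin, which is an extra setup step not covered here.

```java
public class HotLoop {
    // A tiny method, called often enough that the JIT is likely to compile
    // and possibly inline it.
    static int square(int x) {
        return x * x;
    }

    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += square(i % 1000);
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Repeat the work so the method becomes "hot" for the JIT.
        for (int round = 0; round < 200; round++) {
            result += sumOfSquares(100_000);
        }
        System.out.println(result);
    }
}
```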

What does this look like?

In the figure below you will see the code for a sorting algorithm, Sort.java. It is one of the samples that ship with the JITWatch tool. Below the code, there is a window that shows the code compiled to bytecode and then to assembly.

Why should you care? Well, now you know what your code looks like to the machine it is running on. You get to see the loads, the returns, the gotos, and the other instructions your imperative code turns into.

Is this low level? For sure.

Where is it useful? For all those who think Java is a slow language, I have news for you: this JIT is as fast as a bullet train.

Figure: the JITWatch tool.

Take away:

In summary, it is useful to follow the variable initialisation recommendations of a great book like "Code Complete", specifically "Part 3 Variables". Initialising variables close to their first use creates a better experience for other developers and makes the code easier to modify. It also lets us developers work alongside our compilers, seeing the compiler more like a team mate who needs some help than a machine that just compiles or rejects code.

Code Complete: A Practical Handbook of Software Construction (ISBN-13: 978-0735619678)
