Helping the Compiler

While compilers are good at optimizing code, they are not magic wands. It is the engineer’s job to write code that the compiler can reason with and better apply it’s optimizations.

“You don’t have to be an engineer to be a racing driver, but you do have to have Mechanical Sympathy.”

- Jackie Stewart

Reducing order dependence

If the compiler cannot discern that the order of operations is unimportant, then it cannot do work ahead of time. When translating code to an intermediate form, some compilers will use Static Single Assignment (SSA) form. SSA is the idea that a variable is never modified after initialization and instead new variables are created for any mutations.

Here is an example of a function that is not conductive to SSA:

func processDataBefore(data []int) int {
  result := 0

  // Temporary variable reused across loop iterations
  temp := 0

  for i := 0; i < len(data); i++ {
    // The use of 'temp' is separated from its definition,
    // and its value changes for each iteration.
    temp = data[i] * 2

    if data[i] % 2 == 0 {
      result += temp - 5
    } else {
      result += temp + 3
    }
  }
  return result
}

This code can be re-written to be more SSA conducive:

func processDataAfter(data []int) int {
  result := 0

  for i := 0; i < len(data); i++ {
    // The lifetime of 'currentValueDoubled' is confined to this iteration.
    currentValueDoubled := data[i] * 2

    if data[i] % 2 == 0 {
      result += currentValueDoubled - 5
    } else {
      result += currentValueDoubled + 3
    }
  }

  return result
}

Reducing memory dependency

Linked lists are expensive due to memory dependency. Pointer driven tree algorithms are slow due to memory look-ups are chained together. Using a wide node algorithm like a B-Tree or a B*-Tree can make map/sets significantly faster. This also goes for using pointer based composition. Requiring memory lookups for each component restricts the compiler from being able to work ahead of time.

Write buffer awareness

The same considerations need to be taken for writing as well as reading. Keep data contiguous and isolate mutable and immutable values. Keeping transforms simple also allows the compiler to bypass the cache entirely. This data streaming can be valuable in avoiding the pollution of the cache.

Aliasing

References