Polytope Model (wikipedia.org)
157 points by rayxi271828 on Dec 2, 2019 | 11 comments



See also: https://polyhedral.info

The ideas here go back a long way. As mentioned on Wikipedia, the Polly project for LLVM uses this approach.

There is also work for MLIR to use the polyhedral model (https://github.com/tensorflow/mlir/blob/master/g3doc/Rationa...).


This is really cool and good to know about. I hadn't heard of this technique before, but I've picked up an interest in SPIR-V lately while learning about WebGPU. Well, I just noticed that Polly is used in the LLVM SPIR-V backend... Makes sense! https://github.com/KhronosGroup/LLVM-SPIRV-Backend/blob/mast...

I'm very excited about the potential for writing GPU code from whatever high-level language is convenient. (Which will not be WSL, by the way.)


It would be news to me that the PolyhedralInfo interface is actually used anywhere. As the file comment says, it was "work in progress" a few years ago, and it shares the same drawbacks as Polly (unfortunately).

Polyhedral development in LLVM stagnated a while ago; maybe we'll find the people and time to actually land the PolyhedralValueInfo (see for example this talk: https://www.youtube.com/watch?v=xSA0XLYJ-G0).


Albert Cohen from Google gave a really good lecture on polyhedral compilation in the real world at PLISS [1] this year: slides [2], lecture videos [3, 4].

A core problem of the polyhedral approach is that the very thing that makes it so appealing (being expressive enough to enable many interesting program transformations) is costly for large programs, making the overall optimisation process, and hence the compiler, hard to scale. Naturally, this difficulty makes for interesting research problems.

[1] https://pliss.org/

[2] https://pliss2019.github.io/albert_cohen_slides.pdf

[3] https://www.youtube.com/watch?v=mt6pIpt5Wk0

[4] https://www.youtube.com/watch?v=3TNT5rFVTUY


> no iteration of the inner loop depends on the previous iteration's results

> a[i - j][j] = ... + a[i - j][j - 1]

I'm not seeing how the iteration for j doesn't depend on the result for j - 1 when it's used right there in the expression.


The example is super confusing because they define the alternative coordinate system (i', j') in the previous paragraph but then use the unprimed i and j in the code example, presumably because C does not allow primes in variable names. p and q would have been better.

But note that if you execute the inner loop for two consecutive values, j' and j' + 1, you get the following computations:

  a[i' - j'    ][j'    ] <- a[i' - j' - 1][j'    ] + a[i' - j'    ][j' - 1]

  a[i' - j' - 1][j' + 1] <- a[i' - j' - 2][j' + 1] + a[i' - j' - 1][j'    ]

You can see from the above that the left-hand side of the first line does not appear on the right-hand side of the second. This is because both indices into a vary as j' changes.

It's a cool optimization, but I think the article fails to clearly explain the general method by which a compiler would arrive at it.

Edit: I completely screwed up the indices in my initial post. They should be fixed now.


  In [1]: N = 5
  
  In [2]: def skewed():
     ...:     for p in range(2, 2*N - 1):
     ...:         tmin = max(1, p - (N - 1))
     ...:         tmax = min((p + 1) // 2, N - 1)
     ...:         for t in range(tmin, tmax + 1):
     ...:             print("p: {} t: {} | a[{}][{}] = a[{}][{}] + a[{}][{}]".format(
     ...:                 p,
     ...:                 t,
     ...:                 p - t,
     ...:                 t,
     ...:                 p - t - 1,
     ...:                 t,
     ...:                 p - t,
     ...:                 t - 1,
     ...:             ))
  
  In [3]: def unskewed():
     ...:     for i in range(1, N):
     ...:         for j in range(1, min(i+2, N)):
     ...:              print("i: {} j: {} | a[{}][{}] = a[{}][{}] + a[{}][{}]".format(
     ...:                  i,
     ...:                  j,
     ...:                  i,
     ...:                  j,
     ...:                  i - 1,
     ...:                  j,
     ...:                  i,
     ...:                  j - 1,
     ...:              ))
  
  In [4]: unskewed()
  i: 1 j: 1 | a[1][1] = a[0][1] + a[1][0]
  i: 1 j: 2 | a[1][2] = a[0][2] + a[1][1]
  i: 2 j: 1 | a[2][1] = a[1][1] + a[2][0]
  i: 2 j: 2 | a[2][2] = a[1][2] + a[2][1]
  i: 2 j: 3 | a[2][3] = a[1][3] + a[2][2]
  i: 3 j: 1 | a[3][1] = a[2][1] + a[3][0]
  i: 3 j: 2 | a[3][2] = a[2][2] + a[3][1]
  i: 3 j: 3 | a[3][3] = a[2][3] + a[3][2]
  i: 3 j: 4 | a[3][4] = a[2][4] + a[3][3]
  i: 4 j: 1 | a[4][1] = a[3][1] + a[4][0]
  i: 4 j: 2 | a[4][2] = a[3][2] + a[4][1]
  i: 4 j: 3 | a[4][3] = a[3][3] + a[4][2]
  i: 4 j: 4 | a[4][4] = a[3][4] + a[4][3]
  
  In [5]: skewed()
  p: 2 t: 1 | a[1][1] = a[0][1] + a[1][0]
  p: 3 t: 1 | a[2][1] = a[1][1] + a[2][0]
  p: 3 t: 2 | a[1][2] = a[0][2] + a[1][1]
  p: 4 t: 1 | a[3][1] = a[2][1] + a[3][0]
  p: 4 t: 2 | a[2][2] = a[1][2] + a[2][1]
  p: 5 t: 1 | a[4][1] = a[3][1] + a[4][0]
  p: 5 t: 2 | a[3][2] = a[2][2] + a[3][1]
  p: 5 t: 3 | a[2][3] = a[1][3] + a[2][2]
  p: 6 t: 2 | a[4][2] = a[3][2] + a[4][1]
  p: 6 t: 3 | a[3][3] = a[2][3] + a[3][2]
  p: 7 t: 3 | a[4][3] = a[3][3] + a[4][2]
  p: 7 t: 4 | a[3][4] = a[2][4] + a[3][3]
  p: 8 t: 4 | a[4][4] = a[3][4] + a[4][3]
Actually, writing it out like that really helped me grok what was going on. If you look at the unskewed function, you can see that for every i there is a value of j that relies on the output from a previous j: e.g. the i: 1 j: 2 iteration uses a[1][1], which is the output of the i: 1 j: 1 iteration. But you can't find a similar example in the skewed function.
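
If you want to check that mechanically rather than by eyeballing the output, here's a small C sketch (my own, not from the article; the loop bounds mirror the Python above) that replays both schedules and counts how often an iteration reads a cell written earlier in the same inner loop:

  #include <stdio.h>
  #include <string.h>

  #define N 5

  static int conflicts;

  /* One statement "a[wr][wc] = a[r1][c1] + a[r2][c2]": flag it if either
     read hits a cell already written earlier in the same inner loop. */
  static void step(int written[N][N], int wr, int wc,
                   int r1, int c1, int r2, int c2) {
      if (written[r1][c1] || written[r2][c2])
          conflicts++;
      written[wr][wc] = 1;
  }

  int main(void) {
      int written[N][N];

      /* Unskewed: the inner loop runs over j at fixed i. */
      for (int i = 1; i < N; i++) {
          memset(written, 0, sizeof written);
          for (int j = 1; j < i + 2 && j < N; j++)
              step(written, i, j, i - 1, j, i, j - 1);
      }
      printf("unskewed: %d inner-loop dependences\n", conflicts);

      /* Skewed: the inner loop runs over t at fixed p, bounds as in skewed(). */
      conflicts = 0;
      for (int p = 2; p <= 2 * N - 2; p++) {
          memset(written, 0, sizeof written);
          int tmin = p - (N - 1) > 1 ? p - (N - 1) : 1;
          int tmax = (p + 1) / 2 < N - 1 ? (p + 1) / 2 : N - 1;
          for (int t = tmin; t <= tmax; t++)
              step(written, p - t, t, p - t - 1, t, p - t, t - 1);
      }
      printf("skewed: %d inner-loop dependences\n", conflicts);
      return 0;
  }

For N = 5 it reports 9 dependences in the unskewed schedule and 0 in the skewed one, which is exactly the property that lets the inner loop run in parallel.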


This is hard to grasp without a diagram. The detailed example is probably easier to understand.

The result computed in the current iteration is a[i - j][j]. The result computed in the iteration before that is a[i - (j - 1)][j - 1], i.e. a[i - j + 1][j - 1]. The current iteration only reads a[i - j - 1][j] and a[i - j][j - 1], and neither matches: the first index of a[i - j][j - 1] is off by one. So the previous iteration's result is never consumed.


I guess the page would benefit from listing the entire replacement loop nest, instead of just the inner line. Then non-mathematical people like me would maybe have a chance at getting it without ramping the caffeine dosage up to dangerous levels just to see the missing parts in the fabric of mathematical possibilities.
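
For what it's worth, here's a sketch of what the full replacement might look like in C. The loop bounds are lifted from the skewed() Python elsewhere in the thread, and the names p and t are my own; the article only gives the inner statement, so treat this as an illustration, not the article's code:

  #include <stdio.h>

  enum { n = 5 };

  int main(void) {
      int a[n][n];

      /* The recurrence reads a few cells it never writes (e.g. a[1][3] for
         n = 5), so initialise the whole array first. */
      for (int r = 0; r < n; r++)
          for (int c = 0; c < n; c++)
              a[r][c] = 1;

      /* Skewed replacement for
             for (i = 1; i < n; i++)
                 for (j = 1; j < i + 2 && j < n; j++)
                     a[i][j] = a[i-1][j] + a[i][j-1];
         with p = i + j (which anti-diagonal) and t = j (position on it).
         Iterations of the inner t loop are mutually independent, so it
         could be run in parallel. */
      for (int p = 2; p <= 2 * n - 2; p++) {
          int tmin = p - (n - 1) > 1 ? p - (n - 1) : 1;
          int tmax = (p + 1) / 2 < n - 1 ? (p + 1) / 2 : n - 1;
          for (int t = tmin; t <= tmax; t++)
              a[p - t][t] = a[p - t - 1][t] + a[p - t][t - 1];
      }

      printf("a[%d][%d] = %d\n", n - 1, n - 1, a[n - 1][n - 1]);
      return 0;
  }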


That's fun: a few years ago, in one of my courses, our teacher explained this model to us. He then basically said, "well, there are maybe 50 of us working on this in the whole world, so it's a really small community and everyone knows everyone" (as everywhere, I guess).

And, indeed, he is on the community page (Christophe Alias).


This might be the most interesting compiler optimization technique I've ever heard of.



