Jump to content
  • Advertisement
Sign in to follow this  
jameszhao00

Packing Order Lookup Table Optimization

This topic is 2604 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

[size="5"]Background:
I'm currently working on some SIMD related stuff and a critical operation is 'packing' all elements in an 8-wide simd variable that satisfy a mask into memory. So let's say I have

simd = [x y A d e B k j]
mask = [0 0 1 0 0 1 0 0]
memory = [A B]



[size="5"]The allowed operations:

simd (operator) simd
or
(operator) simd

Standard math
+/-/*/and/or/!/abs

Permute
8 wide SIMD has 2 lanes.
lanes = [1 1 1 1 2 2 2 2].

In a permute, an output slot's data can come from any input slot, if that slot combination doesn't cross lanes
For example,

in = [1 2 3 4 5 6 7 8]
out = [_ _ _ _ _ _ _ _]


The 1st to 4th slot of out can be any one of [1 2 3 4].
The 5th to 8th slot can be any one of [5 6 7 8]
The selection parameter is an int [A(2 bits) B(2 bits) C(2 bits) D(2 bits)] where each element specifies what slot I select from. The 2 bits means that the left side's permute pattern has to match the right hand permute pattern. For example,

in = [0 5 1 5 2 5 3 5]
selection = [0 0 2 2]
out = [0 0 1 1 2 2 3 3]


Cross
Example,

in = [1 2 3 4 5 6 7 8]
out = [_ _ _ _ _ _ _ _]


The 1st to 4th slot can be [1 2 3 4] or [5 6 7 8]. The 5th to 8th slot can be [1 2 3 4] or [5 6 7 8].

[size="5"]An intuitive way to do the packing
(From http://stackoverflow...-gather-indices)
1. Define a lookup table that
a) maps from a mask to a permute / cross operation
2. At runtime, compute the mask, lookup a permute/cross parameter, permute and then cross.

a = data
b = permute(a, K)
c = cross(a, b, L)



[size="5"]My current lookup table generation is as follows:
1. Let simd in = [Left Right] = [L0 L1 L2 L3 R0 R1 R2 R3]
2. Compute a permute selection that moves all valid elements in Left to as left as possible
3. Compute a permute selection that
a. moves 4 - length(valid elements in Left) valid elements in Right to length(valid elements in Left)
b. moves the rest in Right to as left as possible

For example, in = [0 A B 0 0 C D E]
1. Desired Left is [A B 0 0]
2/3. Desired Right is [E 0 C D]
When crossed, we get [A B C D E 0 0 0]


I can generate the parameter lookup table, but this is currently 2^8 * 3 entries.
1. A permute parameter for Left and Right
2. A mask -> length lookup table. E.g. [0 0 1 0 0 1 0 0] = 2

Are there any patterns that I can exploit to reduce the number of entries?

Share this post


Link to post
Share on other sites
Advertisement
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!