# Transposing an NxM matrix

This topic is 2604 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

I'm trying to transpose an NxM matrix, and I can do it iteratively, but I would like better performance. I saw mention of a cache-oblivious algorithm for doing this on wikipedia. It seemed to claim that one could break up a matrix into smaller ones and transpose those. But that doesn't make any sense. Just because all the submatrices are transposed, doesn't mean the full is. Sure, it'd be ok for going down the diagonal, but everything else would be off.

Is there anyone familiar with this that wouldn't mind explaining how this would work?

##### Share on other sites
transposition means replace rows with cells and cells with rows

to

it all depends on how you describe matrix :

- as an array of array

- or pointers

-or other

you can even make one structure that is row and cell

but then you have 2x more data

##### Share on other sites
@_____: From context, it looks like Homer is representing the matrix as a linear array with index arithmetic, like what C/C++ do under-the-hood for 2d arrays allocated on the stack.

@Homer: I think the idea is that, once you've transposed the scalar elements within the blocks, you can block-transpose those submatrices, and this is faster because those block copies are fast -- in symbols, something very roughly like this(link)*, but take this with a grain of salt as this isn't a subject I've spent any real time looking at.

* I'd like to embed this equation as an image, but keep getting "You are not allowed to use that image extension on this board" errors. Anybody know a workaround?

##### Share on other sites
Can you shed some more light on the problem - i.e. why you need to transpose arbitrary matrices?

The fastest way I can think of would be to not actually transform any data at all, but transform your array-lookup algorithm instead.template<class T, int N, int M, bool RowMajor, bool TransposedStorage> class MatrixAccesesor { public: MatrixAccesesor(T* data) : data(data) {} T& operator()( int n, int m ) { int row = TransposedStorage ? m : n; int column = TransposedStorage ? n : m; int rows = TransposedStorage ? M : N; int columns = TransposedStorage ? N : M; if(RowMajor) return data + column + row*columns; else return data + row + column*rows; } private: T* data; }; void foo() { int data[24][48]; MatrixAccesesor<int, 24,48, true,false> a(data); a(0,2) = 42; MatrixAccesesor<int, 48,24, true,true > b(data); assert( b(2,0) == 42 ); }Thanks haegarr for fixing my typo

##### Share on other sites

@_____: From context, it looks like Homer is representing the matrix as a linear array with index arithmetic, like what C/C++ do under-the-hood for 2d arrays allocated on the stack.

You're correct.

@Hodgman It does have to be an arbitrary matrix, but I do have a large fixed size NxM matrix I'm trying to transform.

[color=#1C2837][size=2]The fastest way I can think of would be to not actually transform any data at all, but transform your array-lookup algorithm instead.[/quote]
[color=#1C2837][size=2]I might end up messing with this. Interesting idea.

##### Share on other sites

[color="#1c2837"]The fastest way I can think of would be to not actually transform any data at all, but transform your array-lookup algorithm instead.

[color="#1c2837"]I might end up messing with this. Interesting idea.
[/quote]

Actually, C++ has support for this type of thing in its valarray class. I've never actually used it myself, but it seems to fit your situation quite well:
#include <iostream> #include <valarray> int main() { std::valarray<double> v(30); for (int i = 0; i < 30; ++i) v = static_cast<double>(i); size_t lengths1[] = {5,6}; size_t strides1[] = {6,1}; std::gslice mygslice1(0, std::valarray<size_t>(lengths1,2), std::valarray<size_t>(strides1,2)); std::valarray<double> matrix1 = v[mygslice1]; size_t lengths2[] = {6,5}; size_t strides2[] = {1,6}; std::gslice mygslice2(0, std::valarray<size_t>(lengths2,2), std::valarray<size_t>(strides2,2)); std::valarray<double> matrix2 = v[mygslice2]; for (size_t i=0; i<matrix1.size(); ++i) std::cout << matrix1 << ' '; std::cout << '\n'; for (size_t i=0; i<matrix2.size(); ++i) std::cout << matrix2 << ' '; std::cout << '\n'; } 

##### Share on other sites

The fastest way I can think of would be to not actually transform any data at all, but transform your array-lookup algorithm instead.

...with the obvious caveat that if you need to do a lot of work with the matrix, the time spent transposing it now might buy you enough cache coherency later to be a net win.

@homer: What's the application? In a lot of linear algebra applications you can avoid explicitly forming transposes (e.g., in least squares, a Grammian matrix G = A^T A shows up, but you can work with its QR decomposition A = Q R instead).

• 48
• 12
• 10
• 10
• 9
• ### Forum Statistics

• Total Topics
631379
• Total Posts
2999666
×