What is vectorization?

Started by fyhuang
5 comments, last by flangazor 19 years ago
I've been seeing a bit lately about compiler optimizations and vectorization and all of that, and so I was wondering - what is vectorization? I just want a definition of what it is or what it does. Thanks in advance!
- fyhuang [ site ]
I think it's to do with using SIMD (Single Instruction, Multiple Data) instructions, so you have 4 or so values, and you use them like a vector, performing one operation on all of the values at once.

But I could well be wrong. Someone else feel free to back me up or tell me I'm talking rubbish [smile]
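
Something like this, for instance - just a rough sketch using x86 SSE intrinsics, with a made-up function name for illustration:

#include <xmmintrin.h>  /* SSE intrinsics */

/* Adds two groups of four floats; the addition itself is a single SIMD instruction */
void add4(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);             /* load four floats from a */
    __m128 vb = _mm_loadu_ps(b);             /* load four floats from b */
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* add all four pairs at once and store */
}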
I think Evil Steve is right: there is a compiler, VectorC, that IIRC focuses on exactly this kind of optimization...
The GCC docs talk about loop vectorization, what with vectorizable loops and unvectorizable loops, so I think it has more to do with loop optimization than SIMD?

Cheers!
- fyhuang [ site ]
Quote:Original post by fyhuang
The GCC docs talk about loop vectorization, what with vectorizable loops and unvectorizable loops, so I think it has more to do with loop optimization than SIMD?

Cheers!


Loop vectorization would be an attempt to make use of SIMD to optimize loops. In particular, if the calculation done in a given loop iteration is independent of the results from previous iterations, the compiler may unroll the loop to a depth of, say, 4, and do the four unrolled copies in parallel via a set of SIMD instructions to a vector processor.
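
Roughly what that might look like if you did the unrolling by hand with SSE intrinsics (assuming float arrays and that n is a multiple of 4; an auto-vectorizing compiler would emit something equivalent on its own):

#include <xmmintrin.h>

/* Scalar loop: each iteration is independent of the previous ones */
void mul_scalar(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; ++i)
        c[i] = a[i] * b[i];
}

/* Unrolled to a depth of 4 and done with SIMD: one multiply
   instruction handles four iterations' worth of data */
void mul_simd(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i += 4)
    {
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_mul_ps(va, vb));
    }
}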
Generally speaking it is any technique by which several pieces of data are dealt with in parallel (i.e. you work with vectors of data) - so, as Evil Steve said, SIMD (of course there is also something called MIMD as well!). It could be separate cores working on adjacent pieces of data, or it could be a single core working on several pieces of data at once.

Edit: of course threads don't count!
There are a lot of kinda-true statements in this thread.

Vectorization [pdf] is the conversion of loops from a non-vectored form to a vectored form. A vectored form is one in which the same operation happens on all members of a range, with no dependencies between them. A lot of documents I've read imply that the ranges should be in contiguous memory, but I think the only reason for that is that it makes it easy for the processor to know where to apply the computation next. Conceptually, it could be any set of data. The other likely reason is that the only built-in collection type in C and Fortran is the array, which uses contiguous memory.

What is done with the vectored form depends on the target machine. On processors with large pipelines, like SPARCs, vectorization means the pipeline can be kept full, resulting in a massive win. Also, since there are no dependencies, the order does not matter, so the processing can be mapped out to several ALUs (or "cores", if they are on the same chip). If the processing is large enough and can be mapped out to several processors effectively, then it is called parallelisation.
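
To make the "same operation on a range, no dependencies" idea concrete, a couple of hypothetical snippets:

/* Vectorizable: the same operation is applied to every element of the
   range, and no iteration depends on any other */
void scale_and_shift(const float *x, float *y, int n)
{
    for (int i = 0; i < n; ++i)
        y[i] = x[i] * 2.0f + 1.0f;
}

/* Not vectorizable as written: each iteration needs the result of the
   previous one (a loop-carried dependency), so the order matters */
void running_sum(const float *x, float *y, int n)
{
    y[0] = x[0];
    for (int i = 1; i < n; ++i)
        y[i] = y[i - 1] + x[i];
}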

