Jump to content
  • Advertisement
Sign in to follow this  
dpadam450

Array vs Vector Discussion :)

This topic is 2894 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I saw someones code at work and wondered why they were using a vector instead of an array since we knew the size, so I almost asked the question on here about speed of access but I pretty much knew the answer anyway.

So this is not a question but to beginners, if you know the size of something before the start of your application, use an array. They are always the fastest, and vectors use an underlying array as well, but they are a lot slower to access.

int array[];
vector<int> vector;

array//not a function call, no stack invoked
vector//function call, stack invoked, variables/stack frame saved

array wins 31ms to 1280 ms

#include <vector>
#include <iostream>
#include <windows.h>
using namespace std;

#define XSIZE 10000000
vector<int> vecNumbers;
int arrayNumbers[XSIZE];
int main()
{
vecNumbers.reserve(10000000);
for(int i = 0; i < XSIZE; i++)
{
vecNumbers.push_back(0);
arrayNumbers = 0;
}

float start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
arrayNumbers++;
}
cout << GetTickCount() - start_time << endl;


start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
vecNumbers++;
}
cout << GetTickCount() - start_time;

system("pause");
return 0;
}






[Edited by - dpadam450 on October 18, 2010 6:01:33 PM]

Share this post


Link to post
Share on other sites
Advertisement
I find those numbers highly suspect unless you compiled in debug mode. The subscript operator will likely be inlined in release.

Share this post


Link to post
Share on other sites
I wouldn't be surprised if the release version just optimized both code blocks out, since they're not really doing anything.

Share this post


Link to post
Share on other sites
Quote:
Original post by Rycross
I find those numbers highly suspect unless you compiled in debug mode. The subscript operator will likely be inlined in release.


Or MSVC in release if SECURE_SCL hasn't been disabled.

Share this post


Link to post
Share on other sites
Well still, how many times do we need to run in debug mode? You still need fast enough debug code.

Release with size = 100,000,000
time = 100ms vs 200ms

yea release is fairly negligible, but if it were a 2D or 3D array/vector it would most likely increase in time gap as well.

Also sometimes people forget to reserve their vectors, so millions of items will be slow if you forget to ask for your vector to be big enough to hold all of those items.

Share this post


Link to post
Share on other sites
The purpose behind a debug build is to catch potential errors. In that case using a raw array rather than a vector is counterproductive since a vector can do nice things like bounds checking. However, there are classes for compile time constant arrays that would do a bit better than vectors performance-wise in release, but do checking in debug, such as boost::array.

Share this post


Link to post
Share on other sites
I haven't messed with boost, any other advantages to boost::array other than bounds checks?

Quote:

since a vector can do nice things like bounds checking


It only does that if you use an iterator though. Maybe .at() checks bounds too?

Share this post


Link to post
Share on other sites
Quote:
Original post by bobofjoe
I wouldn't be surprised if the release version just optimized both code blocks out, since they're not really doing anything.


Yeah, I had to add in an extra cout utilizing the array to avoid it being optimized out by gcc. Gcc didn't seem to want to optimize out the std::vector altogether.

Results on doing the same test on 10000000 elements?

57-60 ms vs 61-64 ms. In other words, basically the same.


I did change the post-increment operator to the pre-increment operator. I suspect that gcc would optimize out the copy and call to operator = for vector, but decided to change it just to be sure.

Quote:
Original post by dpadam450
Well still, how many times do we need to run in debug mode? You still need fast enough debug code.


Debug mode is a concern and one of the reasons EA wrote EASTL (note that they implemented their own containers rather than defaulted to plain arrays), but its still better to go with a vector until you are having a problem. As others have pointed out, part of the point of debug is to make sure you aren't doing things wrong, so using a safe container with a speed hit and runtime checks in debug is preferable to extra speed, unless you're hitting serious bottlenecks.


Quote:
Original post by dpadam450
Also sometimes people forget to reserve their vectors, so millions of items will be slow if you forget to ask for your vector to be big enough to hold all of those items.


If those developers can't use std::vector correctly, what makes you think they'll use raw arrays correctly?

[Edited by - Rycross on October 18, 2010 4:59:58 PM]

Share this post


Link to post
Share on other sites
Quote:

Well still, how many times do we need to run in debug mode? You still need fast enough debug code.

The checks can be disabled in debug mode if you like.


Ok, lets look at the assembly. Here is the source I used:

#include <vector>
#include <cstdio>
#include <windows.h>

using namespace std;

const unsigned int XSIZE = 10000000;

int main()
{
vector<int> vecNumbers;
int arrayNumbers[XSIZE];
vecNumbers.reserve(10000000);
for(int i = 0; i < XSIZE; i++)
{
vecNumbers.push_back(0);
arrayNumbers = 0;
}

DWORD start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
arrayNumbers++;
}
DWORD elapsed = GetTickCount() - start_time;
printf("array: %d\n", elapsed);

start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
vecNumbers++;
}
elapsed = GetTickCount() - start_time;
printf("vector: %d\n", elapsed);

system("pause");
return 0;
}



Assembler output:

; 19 :
; 20 : DWORD start_time = GetTickCount();

mov edi, DWORD PTR __imp__GetTickCount@0
call edi
mov ebx, eax

; 21 : for(int i = 0; i < XSIZE;i++)

xor eax, eax
npad 4
$LL6@main:

; 22 : {
; 23 : arrayNumbers++;

inc DWORD PTR _arrayNumbers$[esp+eax*4+40000056]
inc eax
cmp eax, 10000000 ; 00989680H
jb SHORT $LL6@main

; 24 : }
; 25 : DWORD elapsed = GetTickCount() - start_time;

call edi
sub eax, ebx

; 26 : printf("array: %d\n", elapsed);

push eax
push OFFSET $SG-11
call _printf
add esp, 8

; 27 :
; 28 : start_time = GetTickCount();

call edi
mov ebx, eax

; 29 : for(int i = 0; i < XSIZE;i++)

xor eax, eax
$LL3@main:

; 30 : {
; 31 : vecNumbers++;

inc DWORD PTR [esi+eax*4]
inc eax
cmp eax, 10000000 ; 00989680H
jb SHORT $LL3@main

; 32 : }
; 33 : elapsed = GetTickCount() - start_time;

call edi
sub eax, ebx

; 34 : printf("vector: %d\n", elapsed);

push eax
push OFFSET $SG-12
call _printf

; 35 :
; 36 : system("pause");

push OFFSET $SG-13
call _system
add esp, 12 ; 0000000cH



To simplify, I replaced ostream with printf (ostream adds lots of gunk to asm output). I also stack allocated both objects (I had to increase the stack size to accomodate the large array).

Just to highlight, for an array:

; 23 : arrayNumbers++;

inc DWORD PTR _arrayNumbers$[esp+eax*4+40000056]
inc eax
cmp eax, 10000000 ; 00989680H
jb SHORT $LL6@main

; 24 : }

Whereas for a vector:

; 31 : vecNumbers++;

inc DWORD PTR [esi+eax*4]
inc eax
cmp eax, 10000000 ; 00989680H
jb SHORT $LL3@main

I'm no assembly wizard, but this seems pretty comparable. On my machine I get near identical times, maybe a millisecond favouring one over the other, mine runs in ~16 milliseconds which is too close to the scheduler quantum to be a good measurement.

Upping the size by 10 and looping 10 times shows that the skew is slight in each direction, sometimes favouring vector and other times favouring the array (here both stack and a global array were used, out of curiousity):
Quote:

array: 202 <-- winner
vector: 219
static array: 405
array: 250
vector: 249
static array: 234 <-- winner
array: 219 <-- winner
vector: 296
static array: 296
array: 297
vector: 218 <-- winner
static array: 219
array: 203
vector: 187 <-- winner
static array: 219
array: 202
vector: 188 <-- winner
static array: 202
array: 219
vector: 203
static array: 202 <-- winner
array: 203
vector: 187 <-- winner
static array: 203
array: 203
vector: 202 <-- winner
static array: 203
array: 187 <-- winner (joint)
vector: 187 <-- winner (joint)
static array: 219

In the above run, there is a good mix of winners, the differences more likely due to external issues such as the cache or external processes than any intristic benefit of one technique over the other.

The source for that last test is (aside from the "winner" annotations):

#include <vector>
#include <cstdio>
#include <windows.h>

using namespace std;

const unsigned int XSIZE = 100000000;

int staticArray[XSIZE];

int main()
{
vector<int> vecNumbers(XSIZE, 0);
int arrayNumbers[XSIZE] = {};

DWORD start_time, elapsed;

for(int j = 0 ; j < 10 ; ++j) {

start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
arrayNumbers++;
}
elapsed = GetTickCount() - start_time;
printf("array: %d\n", elapsed);



start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
vecNumbers++;
}
elapsed = GetTickCount() - start_time;
printf("vector: %d\n", elapsed);

start_time = GetTickCount();
for(int i = 0; i < XSIZE;i++)
{
staticArray++;
}
elapsed = GetTickCount() - start_time;
printf("static array: %d\n", elapsed);
}

system("pause");
return 0;
}



Again: stack size is increased to handle the insane array.

My conclusion: bullshit

Share this post


Link to post
Share on other sites
Quote:
Original post by dpadam450
I haven't messed with boost, any other advantages to boost::array other than bounds checks?

boost::array supplies several advantages. For one thing it's a first class type unlike raw arrays, and you can do operations like assigning arrays with operator= in the intuitive manner. boost::array also provides a interface similar to that of the standard library containers. Ex:you can use begin() and end() with boost array to pass to standard library algorithms without manually doing the pointer arithmetic.

Quote:

It only does that if you use an iterator though. Maybe .at() checks bounds too?

No, pretty much every vector implementation will do bounds checks for you in debug mode using operator[]. at() is defined to do bounds checks even in release mode: it will throw an exception if you try to access an invalid element.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!