C++ Gem...getting size of array....

This topic is 3709 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Thought I would post this little C++ template 'Gem' for comment. Where you need to get the size of an array, as in:
size_t num_elements = sizeof(some_array)/sizeof(array_element);
You can use a function template approach, as:
template<typename T, std::size_t N>
inline std::size_t containerSize( T (&)[N] )
{ 
	return N;
}

// Then wherever you need an array size, just...
size_t num_elements = containerSize(some_array);
Template power! --random

I generally find this technique to be too dangerous to use.

It will only work for arrays -- that is, not arrays that have decayed. It can be pretty easy for an uninitiated programmer to use it when it isn't appropriate. It is very easy for any programmer to leave the use in place when refactoring the code such that it's no longer valid.
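To make the decay point concrete, here is a minimal sketch. The function name `demoArrayVersusPointer` is hypothetical and only serves the illustration; `containerSize` is the template from the first post.

```cpp
#include <cassert>
#include <cstddef>

// The template from the first post: only binds to a real array.
template<typename T, std::size_t N>
inline std::size_t containerSize(T (&)[N])
{
    return N;
}

// Hypothetical demo function, not part of the thread.
std::size_t demoArrayVersusPointer()
{
    int some_array[4] = { 4, 9, 7, 3 };
    int* decayed = some_array;  // array-to-pointer decay: the bound is lost
    (void)decayed;              // silence unused-variable warnings

    // containerSize(decayed);  // would not compile: int* cannot bind to T (&)[N]
    //
    // sizeof(decayed) / sizeof(decayed[0]) WOULD compile, but it yields
    // sizeof(int*) / sizeof(int), which has nothing to do with the element count.

    assert(containerSize(some_array) == 4);
    return containerSize(some_array);
}
```

So the template is limited to true arrays, as stated above, but misuse on a decayed pointer shows up as a compile error rather than a wrong number.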

The situations in which I use arrays that this technique is compatible with are so rare in C++ that I've never bothered to employ it in practice.

The technique itself is dangerous -- this specific implementation is a common way of attempting to protect against the obvious misuses of the technique.

The general uselessness of the technique stems from the fact that it only applies when you already know the size -- its primary advantage is that it helps prevent bugs when you change the init size of the array but fail to change any loops or other code that gate on the size of the array. But as SiCrane said, in those cases you can generally use something like boost::array, which is still a nondynamic array and has an even stronger model of the concept, as well as other advantages.

Quote:
Original post by rozz666
Why do you think it's dangerous? This function works only for constant size arrays.
Sometimes it's better/faster to use a local array than dynamic, so this function can be useful.

When? Really, if you can demonstrate any practical instance where an array is measurably faster than a properly-used std::vector, I would be amazed.

If you're talking about allocating a small array on the stack, okay, sure, that can be harmless. But then the size is readily available.

Quote:
Original post by drakostar
Quote:
Original post by rozz666
Why do you think it's dangerous? This function works only for constant size arrays.
Sometimes it's better/faster to use a local array than dynamic, so this function can be useful.

When? Really, if you can demonstrate any practical instance where an array is measurably faster than a properly-used std::vector, I would be amazed.

If you're talking about allocating a small array on the stack, okay, sure, that can be harmless. But then the size is readily available.


Yes, I was thinking about allocating on the stack.

When, in a practical situation, is the size of an array (except char *) not available?

This template can be useful anywhere you use sizeof(v) / sizeof(v[0]), e.g.
You have an array int buf[BUFSIZE1]. But then you decide it has to be int buf[BUFSIZE1 + BUFSIZE2]. You don't have to change the rest of the code, unless you need to add more functionality.
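A small sketch of that scenario, assuming made-up values for BUFSIZE1 and BUFSIZE2 and a hypothetical function `fillAndSum`. The loop never mentions the bound, so editing the declaration of `buf` is the only change needed.

```cpp
#include <cassert>
#include <cstddef>

template<typename T, std::size_t N>
inline std::size_t containerSize(T (&)[N])
{
    return N;
}

// Hypothetical sizes, chosen only for the example.
const std::size_t BUFSIZE1 = 16;
const std::size_t BUFSIZE2 = 8;

// Hypothetical function: fills the buffer with 0, 1, 2, ... and sums it.
int fillAndSum()
{
    int buf[BUFSIZE1 + BUFSIZE2];  // was: int buf[BUFSIZE1];

    // Neither loop names BUFSIZE1 or BUFSIZE2, so they follow the declaration.
    for (std::size_t i = 0; i < containerSize(buf); ++i)
        buf[i] = static_cast<int>(i);

    int sum = 0;
    for (std::size_t i = 0; i < containerSize(buf); ++i)
        sum += buf[i];

    assert(containerSize(buf) == BUFSIZE1 + BUFSIZE2);
    return sum;
}
```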

Quote:
Original post by rozz666
This template can be useful anywhere you use sizeof(v) / sizeof(v[0])...

Which is only in places where you already know the size of the array, because you defined it in the same scope. It's quicker to just store the dimension you statically requested.

Cross a scope or, worse, a function boundary, and all that info is gone.

Quote:
You have an array int buf[BUFSIZE1]. But then you decide it has to be int buf[BUFSIZE1 + BUFSIZE2].

Then adjust your local size variable and stop writing hacky code.

I'm currently playing around with this in an error code system...using a map to store the messages by index, creating error states and then using these to make error policies (ie FatalError, TolerableError, etc.). The beauty of using this combo (in addition to efficiency) is that the only place that needs extending is the errors[] array, the template automatically sets the correct size, so the entire setup is data driven.


#include <cstddef>
#include <cstdlib>
#include <map>
#include <ostream>
#include <stdexcept>
#include <string>
#include <utility>

// Returns the number of elements, pass an array.
template<typename T, std::size_t N>
inline std::size_t containerSize( T (&)[N] )
{
    return N;
}

// Error messages...
struct ErrorMessages
{
    struct MapThunk
    {
        const int idx;
        const char * message;

        // Lets a MapThunk be used directly as a map element.
        operator std::pair<const int,const std::string>()
        {
            return std::make_pair(idx,message);
        }
    };

    static MapThunk errors[];
    static const std::size_t size;
};

// Create map, control and add codes here...
ErrorMessages::MapThunk ErrorMessages::errors[] = {
    {1,"This is error 1"},
    {2,"This is error 2"},
    {3,"This is error 3"},
    {4,"This is error 4"},
};

// Get codes container size...
const std::size_t ErrorMessages::size = containerSize(ErrorMessages::errors);

// Define the codemap...
struct ErrorMap
{
    typedef std::map<const int,const std::string> MapType;
    ErrorMap() :
        data(ErrorMessages::errors,ErrorMessages::errors+ErrorMessages::size)
    { }

    MapType data;
};

// Message accessors...
struct Messages
{
    static std::string get(const int idx)
    {
        // Look the index up once and reuse the iterator.
        ErrorMap::MapType::const_iterator it = map.data.find(idx);
        if (it == map.data.end())
            throw std::runtime_error("Illegal error message index");
        return it->second;
    }

    static ErrorMap map;
};

// Initialize the message map...
ErrorMap Messages::map;

// Error states...
// Note: T must be a default-constructible stream type. std::ostream itself
// has no public default constructor, so the default argument has to be
// replaced before die() or survive() can be instantiated with it.
template <const int idx,typename T=std::ostream>
struct ErrorStates
{
    static void die()
    {
        T stream;
        stream << "\nFATAL ERROR: " + Messages::get(idx) + ", exiting..." << std::endl;
        exit (1);
    }

    static void survive()
    {
        T stream;
        stream << "\nWARNING: " + Messages::get(idx) + ", continuing..." << std::endl;
        // Continue...
    }
};

// Error Policies...


--random

Yes I have tried this technique before, but the amount of time and thought it took to decide whether the array was a worthy choice for this technique was more than it took to create and use a size variable for the array. A size variable is also universal across both static and dynamic arrays. However, my application domain hasn't required the use of C++ in quite some time so this has been a non-issue lately.

Quote:
Original post by Oluseyi
Quote:
Original post by rozz666
This template can be useful anywhere you use sizeof(v) / sizeof(v[0])...

Which is only in places where you already know the size of the array, because you defined it in the same scope. It's quicker to just store the dimension you statically requested.


There are situations where this is a useful technique. The most common place I find it useful is where I have a static constant array whose size is implicit from the number of initializers. In that case it's handy to be able to get the size of the array using this technique without having to duplicate information by creating a size variable that must be updated if you add elements to the array. e.g.


namespace
{
    const char* someConstantStrings[] = { "apple", "orange", "pear" };
}

void doSomethingWithStrings()
{
    // lengthof is the array-size utility (same idea as containerSize above).
    for (std::size_t i = 0; i < lengthof(someConstantStrings); ++i)
    {
        doSomethingWithString(someConstantStrings[i]);
    }
}




[edit] Which now I look at it is pretty much exactly the use the OP was putting it to in his error example above.

Personally, I still find having a macro to do: sizeof( array ) / sizeof( array[ 0 ] ) is better than using the template function (unfortunately).


int arrayTest[ ] =
{
    4,
    9,
    7,
    3
};

STATIC_ASSERT( containerSize( arrayTest ) == 4 ); // error C2975: expected compile-time constant expression
STATIC_ASSERT( sizeof( arrayTest ) / sizeof( arrayTest[ 0 ] ) == 4 ); // ok.

void Foo( )
{
    int someArray[ containerSize( arrayTest ) ];
    // error C2057: expected constant expression
    // error C2466: cannot allocate an array of constant size 0
    // error C2133: 'arraysize' : unknown size

    int otherArray[ sizeof( arrayTest ) / sizeof( arrayTest[ 0 ] ) ]; // ok.
}
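For readers on C++11 or later, the limitation shown above goes away if the template is declared constexpr. The sketch below assumes a C++11 compiler and uses the language's built-in static_assert in place of the STATIC_ASSERT macro; `copyCount` is a hypothetical function added only for the demonstration.

```cpp
#include <cstddef>

// Same template as before, but constexpr so the result is a compile-time constant.
template<typename T, std::size_t N>
constexpr std::size_t containerSize(T (&)[N]) noexcept
{
    return N;
}

int arrayTest[] = { 4, 9, 7, 3 };

// Works at file scope: only the array's type is inspected, never its contents.
static_assert(containerSize(arrayTest) == 4, "arrayTest should have 4 elements");

// Hypothetical demo: the result is also usable as an array bound.
std::size_t copyCount()
{
    int someArray[containerSize(arrayTest)] = {};
    return sizeof(someArray) / sizeof(someArray[0]);
}
```

C++17 later standardized the same idea as std::size in <iterator>.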



Quote:
Original post by mattnewport
There are situations where this is a useful technique. The most common place I find it useful is where I have a static constant array whose size is implicit from the number of initializers. In that case it's handy to be able to get the size of the array using this technique without having to duplicate information by creating a size variable that must be updated if you add elements to the array. e.g.

You should have just determined the size right after initialization, using this little "technique." Your example is contrived, by needing to have the array global with respect to the function definition. And you know full well how much we bash on global variables due to ownership problems (shoving it in an anonymous namespace doesn't make it any less global). So... no. Still close to useless.

Quote:
Original post by Oluseyi
You should have just determined the size right after initialization, using this little "technique."

I disagree. Determining the size after initialization just introduces another variable with redundant information - the compiler knows the size of the array and with this technique you can refer to it directly, so why introduce another name with the possibility (even if slight) of the use of the name getting out of sync with the quantity you care about (the size of a fixed array)?
Quote:
Your example is contrived, by needing to have the array global with respect to the function definition. And you know full well how much we bash on global variables due to ownership problems (shoving it in an anonymous namespace doesn't make it any less global). So... no. Still close to useless.

The example isn't contrived; I've used a utility implementation of exactly this technique on many occasions in code that looks a lot like that example. I don't know why you're dragging global variables into the conversation. All that's required for this technique is that the compiler knows the array size at the point of use. For the sake of simplicity my example used an array in an anonymous namespace, but the array could equally be a local variable in a function, a local static in a function or a class static.

The benefit of this technique in my opinion is that it follows the 'Don't Repeat Yourself' principle - the size of the array is implicit in the array definition and if the compiler knows the size at the point you wish to refer to it there is no benefit and some potential for error involved if you introduce a new variable to hold the size of the array.

Oops!

Just noticed that I could eliminate a variable and a few lines of code using this approach...


#include <cstddef>
#include <cstdlib>
#include <map>
#include <ostream>
#include <stdexcept>
#include <string>
#include <utility>

// Returns the number of elements, pass an array.
template<typename T, std::size_t N>
inline std::size_t containerSize( T (&)[N] )
{
    return N;
}

// Error messages...
struct ErrorMessages
{
    struct MapThunk
    {
        const int idx;
        const char * message;

        // Lets a MapThunk be used directly as a map element.
        operator std::pair<const int,const std::string>()
        {
            return std::make_pair(idx,message);
        }
    };

    static MapThunk errors[];
};

// Create map, control and add codes here...
ErrorMessages::MapThunk ErrorMessages::errors[] = {
    {1,"This is error 1"},
    {2,"This is error 2"},
    {3,"This is error 3"},
    {4,"This is error 4"},
};

// Define the codemap...
struct ErrorMap
{
    typedef std::map<const int,const std::string> MapType;
    // The array size is deduced right here, so no separate size member is needed.
    ErrorMap() :
        data(ErrorMessages::errors,ErrorMessages::errors+containerSize(ErrorMessages::errors))
    {
    }

    MapType data;
};

// Message accessors...
struct Messages
{
    static std::string get(const int idx)
    {
        // Look the index up once and reuse the iterator.
        ErrorMap::MapType::const_iterator it = map.data.find(idx);
        if (it == map.data.end())
            throw std::runtime_error("Illegal error message index");
        return it->second;
    }

    static ErrorMap map;
};

// Initialize the message map...
ErrorMap Messages::map;

// Error states...
// Note: T must be a default-constructible stream type. std::ostream itself
// has no public default constructor, so the default argument has to be
// replaced before die() or survive() can be instantiated with it.
template <const int idx,typename T=std::ostream>
struct States
{
    static void die()
    {
        T stream;
        stream << "\nFATAL ERROR: " + Messages::get(idx) + ", exiting..." << std::endl;
        exit (1);
    }

    static void survive()
    {
        T stream;
        stream << "\nWARNING: " + Messages::get(idx) + ", continuing..." << std::endl;
        // Continue...
    }
};

// Error Policies...

--random


typedef std::map<const int,const std::string> MapType;

namespace Messages
{
    const MapType & Error()
    {
        static MapType the_map;

        if (the_map.empty())
        {
            boost::assign::insert (the_map)
                (1,"This is error 1")
                (2,"This is error 2")
                (3,"This is error 3")
                (4,"This is error 4");
        }

        return the_map;
    }
}

struct ErrorMap
{
    ErrorMap() : data(Messages::Error()) {}
    MapType data;
};

Quote:
Original post by random_thinker
No matter how compact or efficient the code looks, someone here always finds a shorter method. I suppose that if one day I post code and no one responds, it must be pretty good!


... Or it COULD be that I (or someone else) discovered that the code could be replaced with nothing at all, and that I (or someone else) could demonstrate this in a Zen-like manner by not posting ;)

Oh, I get it, you get rated up for those posts you never write, too? [wink]

Generally, there's a couple of thin lines between 'short enough to be obscure', 'elegantly short' and 'too verbose'. The only verbosity that deserves to be accepted, though, is a self-documented one.

Though I always enjoy those if (a == true) return true; else if (a == false) return false; else return false;

Quote:
When? Really, if you can demonstrate any practical instance where an array is measurably faster than a properly-used std::vector, I would be amazed.

Prepare to be amazed by two examples :)
1) a routine that needs small amounts of temporary storage but cannot share it between calls due to reentrancy requirements. (dynamic allocation is more expensive than grabbing some memory from the stack)
2) bin- aka counting sort, where you'd want to resize the destination container without initializing its values (EASTL provides this capability, but you are talking about std::vector)


Quote:
Personally, I still find having a macro to do: sizeof( array ) / sizeof( array[ 0 ] ) is better than using the template function (unfortunately).

The former has a huge problem: if you refactor the code such that "array" is now a pointer, it'll still compile but do the wrong thing. After encountering such a bug and spending quite a while auditing the remainder of the code, the following came to mind:


// (function taking a reference to an array and returning a pointer to
// an array of characters. it's only declared and never defined; we just
// need it to determine n, the size of the array that was passed.)
template<typename T, size_t n> char (*ArraySizeDeducer(T (&)[n]))[n];

// (although requiring C++, this method is much better than the standard
// sizeof(name) / sizeof(name[0]) because it doesn't compile when a
// pointer is passed, which can easily happen under maintenance.)
#define ARRAY_SIZE(name) (sizeof(*ArraySizeDeducer(name)))

You still get a compile-time constant but are insured against the above bug.
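As a usage sketch of the macro above (the array names `arrayTest` and `mirror` are placeholders for the example): because sizeof never evaluates its operand, the declared-but-undefined function is never called, and the result is a constant that can size another array.

```cpp
#include <cstddef>

// Declared only, never defined: sizeof inspects the return type without calling it.
template<typename T, std::size_t n> char (*ArraySizeDeducer(T (&)[n]))[n];

#define ARRAY_SIZE(name) (sizeof(*ArraySizeDeducer(name)))

int arrayTest[] = { 4, 9, 7, 3 };

// The macro result is a constant expression, so it can be an array bound.
int mirror[ARRAY_SIZE(arrayTest)];

// int* p = arrayTest;
// ARRAY_SIZE(p);   // would not compile: a pointer cannot bind to T (&)[n]
```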

If I'm writing my own code, then I don't end up needing to do sizeof(ary_)/sizeof(ary_[0]), but I find that I end up having to do this constantly when writing C++ code that talks to Windows because the Win32 API is a C API and doesn't understand std::vector or any other real container other than a pointer to the first element of the array and a count of the number of elements in the array. I'll use std::vector when the array is dynamic, but when it's a static array of elements, I'll just allocate the elements on the stack and use the sizeof business.

The reason for something like this:


#define NUM_OF(ary_) (sizeof(ary_)/sizeof(ary_[0]))


is so that the number of elements in the array is defined once and only once. If I change the number of elements in the array later, I don't have to chase down all the places where the array and its size are used and update them. Using NUM_OF eliminates the redundancy of the array size. Further, although I may need to supply an explicit count to an API function, I may not have an explicit count anywhere in my source code because I created the array on the stack with an aggregate initializer and an unspecified size:


int data[] = { 1, 2, 3, 4 };
::SomeAPIThatLikesCstyleArrays(data, NUM_OF(data));


To me, this is very idiomatic, and the std::vector form feels much more "busy" because I have to fill the array one element at a time:


std::vector<int> data;
data.push_back(1);
data.push_back(2);
data.push_back(3);
data.push_back(4);
::SomeAPIThatLikesCstyleArrays(&data[0], data.size());


It's also less efficient (not that this demonstrates a huge difference) because the compiler can't determine that my array elements are just a pile of constants and put that in the initialized data segment, boiling down the two lines of code in the first case into just a call to the C API function with the appropriate pointer.
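For comparison, on a C++11 compiler the vector no longer has to be filled one element at a time. The sketch below puts both forms side by side; `sumInts` is a hypothetical stand-in for a C-style API that takes a pointer and a count, and the other function names are likewise invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#define NUM_OF(ary_) (sizeof(ary_)/sizeof(ary_[0]))

// Hypothetical stand-in for a C API: pointer to the first element plus a count.
int sumInts(const int* values, std::size_t count)
{
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

// Stack array with an aggregate initializer, size stated nowhere.
int sumFromArray()
{
    int data[] = { 1, 2, 3, 4 };
    return sumInts(data, NUM_OF(data));
}

// C++11 brace initialization; the elements still live on the heap.
int sumFromVector()
{
    std::vector<int> data = { 1, 2, 3, 4 };
    return sumInts(data.data(), data.size());
}

// Both forms hand the same four values to the API.
void checkBothForms()
{
    assert(sumFromArray() == sumFromVector());
}
```

The efficiency remark still applies: the vector allocates at run time, while the plain array does not.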

Quote:
Original post by Jan Wassenberg
Quote:
When? Really, if you can demonstrate any practical instance where an array is measurably faster than a properly-used std::vector, I would be amazed.

Prepare to be amazed by two examples :)
1) a routine that needs small amounts of temporary storage but cannot share it between calls due to reentrancy requirements. (dynamic allocation is more expensive than grabbing some memory from the stack)
2) bin- aka counting sort, where you'd want to resize the destination container without initializing its values (EASTL provides this capability, but you are talking about std::vector)


Quote:
Personally, I still find having a macro to do: sizeof( array ) / sizeof( array[ 0 ] ) is better than using the template function (unfortunately).

The former has a huge problem: if you refactor the code such that "array" is now a pointer, it'll still compile but do the wrong thing. After encountering such a bug and spending quite a while auditing the remainder of the code, the following came to mind:


// (function taking a reference to an array and returning a pointer to
// an array of characters. it's only declared and never defined; we just
// need it to determine n, the size of the array that was passed.)
template<typename T, size_t n> char (*ArraySizeDeducer(T (&)[n]))[n];

// (although requiring C++, this method is much better than the standard
// sizeof(name) / sizeof(name[0]) because it doesn't compile when a
// pointer is passed, which can easily happen under maintenance.)
#define ARRAY_SIZE(name) (sizeof(*ArraySizeDeducer(name)))

You still get a compile-time constant but are insured against the above bug.

That's great and solves the case where I'm using the constant for the array size in my Foo( ) function. However it doesn't work with the file scope static asserts.

Quote:
Original post by SiCrane
Or just use boost::array


If you don't need dynamic sizing, then just use boost::array. Seriously. It's going to be zero-overhead on non-braindead compilers (because it's just an array wrapped in a struct), and using them consistently means you avoid pointer decay. It's also a lot easier to wrap your head around the syntax for them while *knowing* you aren't going to suffer pointer decay. And you can use them with C APIs just fine; ask them for the .size() (Edit: free, of course - deduced from the template) and pass that along with the address-of the zeroth element.
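A rough sketch of that usage, written with std::array (the C++11 standard-library counterpart of boost::array) so that it compiles without Boost. `largest` is a hypothetical stand-in for a C API; nothing here comes from a real library beyond std::array itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical C-style function: pointer to the first element plus a count.
int largest(const int* values, std::size_t count)
{
    assert(count > 0);
    int best = values[0];
    for (std::size_t i = 1; i < count; ++i)
        if (values[i] > best)
            best = values[i];
    return best;
}

int largestOfFixedArray()
{
    // The size is part of the type, so it cannot decay away.
    std::array<int, 4> data = {{ 4, 9, 7, 3 }};

    // size() is known at compile time; &data[0] is the address of the zeroth element.
    return largest(&data[0], data.size());
}
```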

Quote:
Original post by Jan Wassenberg
2) bin- aka counting sort, where you'd want to resize the destination container without initializing its values (EASTL provides this capability, but you are talking about std::vector)


Um... .reserve()?

Quote:
That's great and solves the case where I'm using the constant for the array size in my Foo( ) function. However it doesn't work with the file scope static asserts.

hm. You'll either want to upgrade your compiler to VC2005, or replace the implementation of STATIC_ASSERT. BOOST_STATIC_ASSERT works fine, as do either of the following:


// generate a symbol containing the line number of the macro invocation.
// used to give a unique name (per file) to types made by cassert.
// we can't prepend __FILE__ to make it globally unique - the filename
// may be enclosed in quotes. PASTE3_HIDDEN__ is needed to make sure
// __LINE__ is expanded correctly.
#define PASTE3_HIDDEN__(a, b, c) a ## b ## c
#define PASTE3__(a, b, c) PASTE3_HIDDEN__(a, b, c)
#define UID__ PASTE3__(LINE_, __LINE__, _)
#define UID2__ PASTE3__(LINE_, __LINE__, _2)

/**
* compile-time debug_assert. causes a compile error if the expression
* evaluates to zero/false.
*
* no runtime overhead; may be used anywhere, including file scope.
* especially useful for testing sizeof types.
*
* this version has a more descriptive error message, but may cause a
* struct redefinition warning if used from the same line in different files.
*
* note: alternative method in C++: specialize a struct only for true;
* using it will raise 'incomplete type' errors if instantiated with false.
*
* @param expression that is expected to evaluate to non-zero at compile-time.
**/

#define cassert(expr) struct UID__ { unsigned int CASSERT_FAILURE: (expr); }

/**
* compile-time debug_assert. causes a compile error if the expression
* evaluates to zero/false.
*
* no runtime overhead; may be used anywhere, including file scope.
* especially useful for testing sizeof types.
*
* this version has a less helpful error message, but redefinition doesn't
* trigger warnings.
*
* @param expression that is expected to evaluate to non-zero at compile-time.
**/

#define cassert2(expr) extern char CASSERT_FAILURE[1][(expr)]



Quote:
If you don't need dynamic sizing, then just use boost::array. Seriously. It's going to be zero-overhead on non-braindead compilers

Let's give both sides of the story - boost et al. are fine and good, but come at the cost of seriously bloating compile times. While clever template code can be written more cheaply than to-the-metal C code, you end up paying for it at every recompile. With "80%" of development costs being spent on maintenance, large projects may find it cheaper in the long run to invest a bit more development effort into simple code.

Quote:
Um... .reserve()?

Nope! Note use of the word "resize". Bin sort requires random access into the (uninitialized) output array; it stores items at their final index.

Quote:
Original post by Jan Wassenberg
Quote:
If you don't need dynamic sizing, then just use boost::array. Seriously. It's going to be zero-overhead on non-braindead compilers

Let's give both sides of the story - boost et al. are fine and good, but come at the cost of seriously bloating compile times. While clever template code can be written more cheaply than to-the-metal C code, you end up paying for it at every recompile. With "80%" of development costs being spent on maintenance, large projects may find it cheaper in the long run to invest a bit more development effort into simple code.


I'm not aware of boost::array pulling in any other dependencies, nor can I think of any that would make sense (except perhaps the Boost assertions). But if you're paranoid about these things, you can write it yourself, too - for this particular case, it's very simple and I can hardly imagine it making a significant impact on compile time.

Quote:
Quote:
Um... .reserve()?

Nope! Note use of the word "resize". Bin sort requires random access into the (uninitialized) output array; it stores items at their final index.


So you want to write to the uninitialized locations (obviously you aren't going to read from them before writing) but not sequentially? How are you going to know which elements are still garbage at the end so as to pick up the sorted results? (Or are there somehow not any such locations? I'm not familiar with this algorithm...)
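For anyone else unfamiliar with the algorithm, here is a minimal counting sort sketch, assuming small unsigned keys in the range [0, keyRange). The function name and signature are made up for the illustration. The counts add up to the input size, so the scatter pass writes every output slot exactly once and no garbage is left behind, even though the writes are not sequential.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts values that all lie in [0, keyRange).
std::vector<unsigned> countingSort(const std::vector<unsigned>& input, unsigned keyRange)
{
    // Pass 1: count how many times each key occurs.
    std::vector<std::size_t> next(keyRange, 0);
    for (unsigned v : input)
    {
        assert(v < keyRange);
        ++next[v];
    }

    // Pass 2: exclusive prefix sum. next[k] becomes the first output index for key k.
    std::size_t running = 0;
    for (unsigned k = 0; k < keyRange; ++k)
    {
        const std::size_t count = next[k];
        next[k] = running;
        running += count;
    }

    // std::vector zero-fills here; that fill is the wasted work mentioned above,
    // because every slot is overwritten in pass 3.
    std::vector<unsigned> output(input.size());

    // Pass 3: scatter each item straight to its final index (random access writes).
    for (unsigned v : input)
        output[next[v]++] = v;

    return output;
}
```

reserve() would not help here, since writing to output[i] beyond size() is not allowed; the container has to be resized first.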
