pointer _from_ member [c++]

Started by
15 comments, last by NotAYakk 17 years, 6 months ago
Is there any legal way to obtain a pointer to an object, given only a pointer to one of its member variables?

struct A : public //...
{
   // ...
   int  m_Member;
   // ...
};

int *pMemb = //...
A *pA = ??? ( pMemb );
I know that there's some pointer twiddling, like below (recall the old offsetof macro), but is this reliable?

(A*)(void*)
(
   ((char*)(void*)pMemb) -
      (size_t)( ((char*)(void*)&((A*)0)->m_Member) - ((char*)0) )
)
typedef int A::*IntMember;
IntMember member=&A::m_Member;
A a;
A* pa=&a;
(a.*member)=3;
cout<<pa->*member;

try this way :)
Veni Vidi Vici
No, there is no safe, general way to do this in C++.

Pointers to members don't allow you to recover an object; that is, if you have a pointer-to-member-of-A you cannot get the specific A that pointer-to-member-of-A came from, because it didn't come from a specific A. Note that lexchou's example initialized "member" without an instance of A. Furthermore, to do anything useful with a pointer-to-member-of-A you need an A, which is what you want to extract; so pointers-to-members cannot solve the problem.
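To make that point concrete, here is a minimal sketch. The struct layout and the helper names `read_member` and `member_pointer_demo` are made up for illustration and are not from the thread. It shows one pointer-to-member being applied to two unrelated objects, which is why it cannot identify either of them.

```cpp
#include <cassert>

// Illustrative layout only; the thread's A elides its real members.
struct A {
    int m_First;
    int m_Member;
};

// A pointer-to-member names "which member", not "which object".
typedef int A::*IntMember;

// Hypothetical helper: reads the selected member from whichever object is supplied.
int read_member(const A& object, IntMember member) {
    return object.*member;
}

// Hypothetical demo: the same pointer-to-member works on two distinct objects.
void member_pointer_demo() {
    A first  = {1, 10};
    A second = {2, 20};
    IntMember member = &A::m_Member;   // created without any instance of A
    assert(read_member(first, member) == 10);
    assert(read_member(second, member) == 20);
}
```

The object always has to be supplied separately, which is exactly the thing the original poster does not have.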

Without pointers to members, you must rely on disgusting and unsafe (usually undefined-behavior-invoking) pointer trickery that requires the function using this trickery to be pathologically coupled to A (that is, it knows about and relies on A's internal layout without having access to or being a part of A).

For example, given a pointer-to-int that I "know" points somewhere inside an instance of A, how do I find out where the beginning of A is? I need to know which specific member the pointer points at. I need to know if A has a vtable. I need to know if A is part of an inheritance chain (especially multiple-inheritance).

Why do you think you need to do this?
Quote:Original post by jpetrie
No, there is no safe, general way to do this in C++.

... ...


C++ provides a way to handle member (function) pointers, and it hides details such as the vtable, memory alignment, etc., just as I wrote above.

If you use code like this to get the offset of a field:

#define OFFSET(T,F) ((int)&(((T*)NULL)->F))
cout<<OFFSET(A,m_Member)<<endl;
cout<<OFFSET(A,m_Member2);

then the offset of the field will be valid when you access it directly through pointer+offset, because the compiler computes the right offset from this code no matter what the memory layout is or what cookies the compiler inserted.
Veni Vidi Vici
Quote:Original post by lexchou
C++ provides a way to handle member (function) pointers, and it hides details such as the vtable, memory alignment, etc., just as I wrote above.

If you use code like this to get the offset of a field:

#define OFFSET(T,F) ((int)&(((T*)NULL)->F))
cout<<OFFSET(A,m_Member)<<endl;
cout<<OFFSET(A,m_Member2);

then the offset of the field will be valid when you access it directly through pointer+offset, because the compiler computes the right offset from this code no matter what the memory layout is or what cookies the compiler inserted.

Actually, that's not at all legal, as you're playing around with bad pointers (which is strictly prohibited by the standard). How the compiler implements offsetof is an implementation detail: neither the C nor the C++ standard states how it operates, only its result, which is simply "an integer constant expression that has type size_t, the value of which is the offset in bytes, to the structure member (designated by member-designator), from the beginning of its structure (designated by type)." Beyond that, the standard says only that it will be a macro and that it accepts a restricted set of types, specifically POD structures or POD unions (18.1 -5). Nothing more. If your class is NOT a POD structure or union, then you cannot use offsetof and expect defined behavior.
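For the restricted case where offsetof is permitted, the usual idiom can be sketched as below. This is only a sketch under assumptions: `Pod`, `pod_from_member` and `pod_demo` are hypothetical names, the `static_assert` and `<type_traits>` check assume a C++11 compiler, and stepping backwards through a `char*` is the conventional practice (as in C's container_of) rather than something this thread establishes as strictly blessed by the standard.

```cpp
#include <cassert>
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout (C++11)

// Hypothetical POD-like type; no virtual functions, no base classes.
struct Pod {
    char m_Tag;
    int  m_Member;
};

// Refuse to compile if the type stops qualifying for offsetof.
static_assert(std::is_standard_layout<Pod>::value,
              "offsetof is only guaranteed for standard-layout types");

// Hypothetical helper: steps back from &object.m_Member to &object.
// The caller must guarantee that 'member' really is the m_Member of a Pod.
Pod* pod_from_member(int* member) {
    char* bytes = reinterpret_cast<char*>(member);
    return reinterpret_cast<Pod*>(bytes - offsetof(Pod, m_Member));
}

// Hypothetical demo: round trip from object to member and back.
void pod_demo() {
    Pod object = {'x', 42};
    int* member = &object.m_Member;
    assert(pod_from_member(member) == &object);
}
```

Nothing here checks that the `int*` actually points into a `Pod`; passing any other pointer gives undefined behavior, which is the coupling problem described earlier in the thread.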

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

Well, here's the thing, lexchou.

First, you did not provide a solution to the original question. A pointer-to-member-of-A does not carry with it any information about which specific A it was constructed from, because, again, it was not constructed from a specific A, which means it cannot be used to recover a pointer to A from a regular pointer or a pointer-to-member (which is what the original poster wanted). Your own example illustrates this quite clearly, since you a) construct a pointer-to-member-of-A without a specific instance of A, and b) use that pointer-to-member-of-A on a specific instance of A, failing to solve the original poster's problem.

Second, your follow-up code
#define OFFSET(T,F) ((int)&(((T*)NULL)->F))
cout<<OFFSET(A,m_Member)<<endl;
cout<<OFFSET(A,m_Member2);


is not reliable; in fact, it is guaranteed to produce undefined behavior. Just because it might work in practice doesn't mean it is correct. Even if it were correct, it still does not answer the original poster's question, since it provides you with (what you hope to be) a byte offset from the beginning of an object of type A to a given member of A; but this is not the data that the original poster has.

Let's examine the ways you've created undefined behavior.

First and foremost, you've relied upon the value-representation (i.e., the bits) of a pointer being well-defined as byte addresses. They are not. The standard explicitly states that the value representation is up to the implementation, and thus unreliable.

Second, you assume that the null pointer is 0. It isn't, necessarily. A constant integral expression that evaluates to zero will be converted to the null pointer value, but that value is implementation defined and need not necessarily have an all-zero bit pattern. It could easily, in some places, be a segment:offset style form with the bit pattern (0x77770000), perhaps, in which case your offset will be quite amusing. And useless as an offset.
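The difference between "compares equal to 0" and "is stored as all-zero bits" can be probed with a small sketch. The function names here are hypothetical, and the result of `null_is_all_zero_bits` is implementation-defined, so no particular answer is claimed.

```cpp
#include <cassert>
#include <cstring>  // std::memcmp

// Hypothetical helper: reports whether this implementation happens to store
// a null int* as all-zero bytes. Either answer is allowed.
bool null_is_all_zero_bits() {
    int* null_pointer = 0;                    // the constant 0 becomes the null pointer value
    unsigned char zeros[sizeof(int*)] = {};   // a buffer of all-zero bytes
    return std::memcmp(&null_pointer, zeros, sizeof null_pointer) == 0;
}

// Hypothetical demo: what the language does guarantee about null.
void null_demo() {
    int* null_pointer = 0;
    assert(null_pointer == 0);  // comparison against the constant 0 always works
    assert(!null_pointer);      // a null pointer converts to false
}
```

On most mainstream platforms the helper is likely to return true, but code that subtracts `(char*)0` is relying on that rather than on anything the language promises.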

Finally, you dereference null. The standard draft I have here is wishy-washy on this at the moment; a number of places indicate that dereferencing null is undefined, but the definition of unary operator * does not, as you might expect. Supposedly this was cleared up for the 2003 draft, or will be cleared up for the 0x draft -- the resolution I'm looking at was dated 2000, and I don't have a newer copy of the standard handy so I can't be sure. However, the proposed clarification is that the dereference is legal... but the subsequent use of the value is undefined. And you use the value. So either way, you've done something naughty.

I will reiterate. Given a pointer to a type, and only a pointer to a type, you cannot safely and sanely discover if what is pointed to is a member of a type A, nor can you recover a pointer to the containing instance of type A even if you make the assumption that what is pointed to is a member.

...and the fact that one would need to do so indicates a design flaw.
Quote:Original post by jpetrie
For example, given a pointer-to-int that I "know" points somewhere inside an instance of A, how do I find out where the beginning of A is? I need to know which specific member the pointer points at. I need to know if A has a vtable. I need to know if A is part of an inheritance chain (especially multiple-inheritance).


Why is that?
I'm not attempting to count the offset on my fingers, adding sizes of preceding members, or something that crazy.

Do I need to know any of such stuff when doing:
A * pA = ...;
int* pMemb = &pA->m_Member;
Even when A is stuffed with padding/vtables/hidden-pointers-to-whatever, that code is reliable, and is just adding some offset to (&a).
Even when A is polymorphically some B that has multiple inheritance and what-not, does that code need to care about it? Not a bit. (a) is a perfectly good instance of A for all the compiler cares at that moment.

If I can do this operation safely anywhere, why can't I do the exact reverse?
int *pMemb = ...;
A *pA = something<A, &A::m_Member>(pMemb);

Quote:Original post by jpetrie
Why do you think you need to do this?

I'm creating an intrusive container. Just for practice and my own curiosity, so don't go nuts. [grin]

So what's exactly bad in the code I originally provided?

Quote:Original post by jpetrie
Finally, you dereference null. The standard draft I have here is wishy-washy on this at the moment; a number of places indicate that dereferencing null is undefined, but the definition of unary operator * does not, as you might expect. Supposedly this was cleared up for the 2003 draft, or will be cleared up for the 0x draft -- the resolution I'm looking at was dated 2000, and I don't have a newer copy of the standard handy so I can't be sure. However, the proposed clarification is that the dereference is legal... but the subsequent use of the value is undefined. And you use the value. So either way, you've done something naughty.


Which chapter?
Quote:Original post by jpetrie
Well, here's the thing, lexchou.
... ...

*Thanks for your reply, hehe
the first reply of this post is my solution to the original question o(>_<)o

>>First, you did not provide a solution to the original question. A pointer-to-member-of-A does not carry with it any information about which specific A it was constructed from, because, again, it was not constructed from a specific A, which means it cannot be used to recover a pointer to A from a regular pointer or a pointer-to-member (which is what the original poster wanted). Your own example illustrates this quite clearly, since you a) construct a pointer-to-member-of-A without a specific instance of A, and b) use that pointer-to-member-of-A on a specific instance of A, failing to solve the original poster's problem.
*A pointer-to-member-of-A indeed does not carry any information about a particular instance of struct A, but it is metadata, just like Delphi's RTTI or .NET's reflection, that can be used with a struct instance to access the concrete member, although it is a naughty way.


>>Second, your follow-up code
*Maybe the code I wrote is wrong, but this trick is used in many places; the first place I saw it was Matthew Wilson's <Imperfect C++> #- -

>>is not reliable; in fact, it is guaranteed to produce undefined behavior. Just because it might work in practice doesn't mean it is correct. Even if it were correct, it still does not answer the original poster's question, since it provides you with (what you hope to be) a byte offset from the beginning of an object of type A to a given member of A; but this is not the data that the original poster has.

*hehe, I thought that if I have the offset of a member in a struct, I can access that member in any instance of that kind of struct. The post's title is pointer_from_member, and pointer_from_member=pointer_of_struct+offset_to_member; that's what I thought.

>>First and foremost, you've relied upon the value-representation (i.e., the bits) of a pointer being well-defined as byte addresses. They are not. The standard explicitly states that the value representation is up to the implementation, and thus unreliable.
*I relied on a POD type, that's because the poster used a POD type #-_-

>> Second, you assume that the null pointer is 0. It isn't, necessarily. A constant integral expression that evaluates to zero will be converted to the null pointer value, but that value is implementation defined and need not necessarily have an all-zero bit pattern. It could easily, in some places, be a segment:offset style form with the bit pattern (0x77770000), perhaps, in which case your offset will be quite amusing. And useless as an offset.
*haha, I'm wrong #-_-, I've never programmed on that sort of platform, just heard of it before

>>Finally, you dereference null. The standard draft I have here is wishy-washy on this at the moment; a number of places indicate that dereferencing null is undefined, but the definition of unary operator * does not, as you might expect. Supposedly this was cleared up for the 2003 draft, or will be cleared up for the 0x draft -- the resolution I'm looking at was dated 2000, and I don't have a newer copy of the standard handy so I can't be sure. However, the proposed clarification is that the dereference is legal... but the subsequent use of the value is undefined. And you use the value. So either way, you've done something naughty.
*Even though I dereferenced a null pointer, that is just a syntactic form; I didn't actually access the memory around the null pointer, I just let the compiler translate that into an offset.

Veni Vidi Vici
Quote:
Why is that?
I'm not attempting to count the offset on my fingers, adding sizes of preceding members, or something that crazy.

Do I need to know any of such stuff when doing:
A * pA = ...;
int* pMemb = &pA->m_Member;
Even when A is stuffed with padding/vtables/hidden-pointers-to-whatever, that code is reliable, and is just adding some offset to (&a).
Even when A is polymorphically some B that has multiple inheritance and what-not, does that code need to care about it? Not a bit. (a) is a perfectly good instance of A for all the compiler cares at that moment.


When you compute pMemb, you are taking the address of pA->m_Member, which means A is a complete type and that the compiler has all of the information available to it (size of A, location of potential v-table, et cetera), and you also know where pA is, so the compiler can calculate everything it needs. Now, you want to do something like:
void recover_object_from_member_pointer(int *memberPtr)
{
    A *theA = owner_from_member_of< A >(memberPtr);
    // do something with theA...
}


correct? We want where A starts. We know where our member pointer is. What we don't know is the offset of the member from the start of an instance of A (i.e., if A had two int members, which one is memberPtr pointing to?), so we have too many unknowns in our equation. If we knew the offset, we could solve, but we don't know the offset because offsetof is for POD types, so in general, we cannot extract that information, so in general, we cannot solve the problem.

In other words, we've lost information by not knowing where the A instance starts, and the language does not provide us with the functionality to safely recover that lost information.

Quote:
I'm creating an intrusive container.

How do you mean "intrusive"? Can you clarify that? Any definition I can think of doesn't require you to be doing what you want to be doing.

Quote:
So what's exactly bad in the code I originally provided?

The same problems that lexchou's OFFSET sample has.

Quote:
Which chapter?

The draft I was referring to mentioned that dereferencing null is undefined in the section on references (decl.ref) where it discusses null references. The section on unary operators (expr.unary.op) is where unary * is discussed and where it does not state that indirection through null is illegal, as you might expect. The 2000 discussion on active issues where the proposed clarification was discussed is here. Again, I am unclear whether or not this clarification was adopted into the 2003 standard, if it was delayed until 0x, or if it was rejected. I don't have access to a new copy of the standard where I am at the moment. However, whether or not the indirection yields undefined behavior, the rest of the code snippet produces it, so it's somewhat of a non-issue.

Quote:
the first reply of this post is my solution to the original question o(>_<)o

But it doesn't answer the original question. A pointer-to-member-of-A doesn't help. Your example created a pointer-to-member-of-A, created a new instance of A, and used that pointer-to-member-of-A to access a field of the new A. That's not what the original poster wanted, which was to use a pointer that happened to point to a member of A and recover the address of the specific A that contains that address.

Quote:
hehe, I thought that if I have the offset of a member in a struct, I can access that member in any instance of that kind of struct. The post's title is pointer_from_member, and pointer_from_member=pointer_of_struct+offset_to_member; that's what I thought.

While true, "pointer from member" can be interpreted in a variety of ways, and the way you've interpreted it here is not the way that the original poster wanted it interpreted.

Quote:
I relied on a POD type, that's because the poster used a POD type #-_-

The type of the pointed-to object doesn't matter, POD or not. The bit pattern of the pointer itself is not guaranteed to be in a form for which your math is valid (again, it could be a segment:offset form).

Quote:
Even though I dereferenced a null pointer, that is just a syntactic form; I didn't actually access the memory around the null pointer, I just let the compiler translate that into an offset.

It is still going to produce undefined behavior. You can't argue with the standard.
Quote:Original post by jpetrie
Now, you want to do something like:
void recover_object_from_member_pointer(int *memberPtr)
{
    A *theA = owner_from_member_of< A >(memberPtr);
    // do something with theA...
}


correct?


Nope. It should be:
A *theA = owner_from_member_of<A, &A::m_member>(memberPtr);

template<typename T, typename TMemb, TMemb T::* Member>
T* owner_from_member_of(TMemb * ptr)
{
  T *t = ...
  return t;
};


We do know which member we have in mind.
So there should be no information lost in the process.
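One way the elided body could be filled in, as a sketch only: measure the offset on a real, default-constructed probe object instead of on a null pointer. Everything beyond the `owner_from_member_of` signature is an assumption for illustration: the helper `member_offset`, the layout given to `A`, the restriction to standard-layout and default-constructible types, and the use of C++11 `static_assert`. The backwards step through `char*` is the conventional idiom; whether it is strictly sanctioned by the standard is the very point under dispute in this thread.

```cpp
#include <cassert>
#include <cstddef>      // std::ptrdiff_t
#include <type_traits>  // C++11 type traits

// Illustrative layout only; the thread's A elides its real members.
struct A {
    int    m_First;
    double m_Other;
    int    m_Member;
};

// Hypothetical helper: byte offset of Member within T, measured on a real
// object rather than by dereferencing a null pointer.
template <typename T, typename TMemb, TMemb T::*Member>
std::ptrdiff_t member_offset() {
    static_assert(std::is_standard_layout<T>::value,
                  "sketch assumes a standard-layout T");
    static_assert(std::is_default_constructible<T>::value,
                  "sketch needs to build a probe object");
    T probe = T();
    const char* base  = reinterpret_cast<const char*>(&probe);
    const char* field = reinterpret_cast<const char*>(&(probe.*Member));
    return field - base;
}

// Same signature as in the post; the member type must be spelled out
// because it precedes the pointer-to-member template argument.
template <typename T, typename TMemb, TMemb T::*Member>
T* owner_from_member_of(TMemb* ptr) {
    char* bytes = reinterpret_cast<char*>(ptr);
    return reinterpret_cast<T*>(bytes - member_offset<T, TMemb, Member>());
}

// Hypothetical demo: recover &a from &a.m_Member.
void owner_demo() {
    A a = A();
    int* pMemb = &a.m_Member;
    A* pA = owner_from_member_of<A, int, &A::m_Member>(pMemb);
    assert(pA == &a);
}
```

Note that the call needs three template arguments, `<A, int, &A::m_Member>`, and that constructing a probe on every call is wasteful; a real implementation would cache the offset.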

Quote:Original post by jpetrie
Quote:
I'm creating an intrusive container.

How do you mean "intrusive"? Can you clarify that? Any definition I can think of doesn't require you to be doing what you want to be doing.


For example, in standard list, we have a list node, which is implemented somewhat like this:
template<typename T>
struct list_node
{
  list_node<T> *prev;
  list_node<T> *next;
  T Data;
};


So dereferencing an iterator (being a wrapped list_node<T>*) is a matter of applying pointer_to_member operator (&list_node<T>::Data) on the list_node pointer.

Whereas in an intrusive list, we are keeping prev/next pointers inside the class T.

struct ilist_node
{
  ilist_node *prev;
  ilist_node *next;
};

struct A
{
  // ...
  ilist_node  m_Node;
  // ...
};

typedef  ilist<A, &A::m_Node>  my_list;


And here's the problem with dereferencing an iterator (being a wrapped ilist_node*, additionally templated with <A, &A::m_Node>) - getting A* from ilist_node*.
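A different design that sidesteps the recovery problem is to make the node a base class of the element instead of a member, so that getting from the node back to the element is an ordinary `static_cast`. The sketch below is an alternative technique, not the member-hook approach described above; `insert_before`, `owner_of`, `ilist_demo` and the `m_Value` field are hypothetical names for illustration.

```cpp
#include <cassert>

// Circular doubly linked node; a fresh node links to itself.
struct ilist_node {
    ilist_node* prev;
    ilist_node* next;
    ilist_node() : prev(this), next(this) {}
};

// Hypothetical element type: the hook is a base class, not a member.
struct A : ilist_node {
    int m_Value;
    explicit A(int value) : m_Value(value) {}
};

// Links 'node' immediately before 'position' in a circular list.
void insert_before(ilist_node* position, ilist_node* node) {
    node->prev = position->prev;
    node->next = position;
    position->prev->next = node;
    position->prev = node;
}

// Base-to-derived conversion; valid only when 'node' is the base subobject
// of an A, so the sentinel must never be passed here.
A* owner_of(ilist_node* node) {
    return static_cast<A*>(node);
}

// Hypothetical demo: build head <-> first <-> second <-> head and walk it.
void ilist_demo() {
    ilist_node head;  // sentinel, not an A
    A first(1);
    A second(2);
    insert_before(&head, &first);
    insert_before(&head, &second);
    assert(owner_of(head.next) == &first);
    assert(owner_of(head.next->next)->m_Value == 2);
    assert(head.next->next->next == &head);
}
```

The trade-off is that an element can only sit in as many lists as it has distinct hook base classes, whereas a member hook can be added freely; that flexibility is what creates the member-to-owner problem in the first place.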

Quote:Original post by jpetrie
Quote:
So what's exactly bad in the code I originally provided?

The same problems that lexchou's OFFSET sample has.


That is:

Quote:Original post by jpetrie
First and foremost, you've relied upon the value-representation (i.e., the bits) of a pointer being well-defined as byte addresses. They are not. The standard explicitly states that the value representation is up to the implementation, and thus unreliable.


I don't get it. Really, I don't.

Quote:Original post by jpetrie
Second, you assume that the null pointer is 0. It isn't, necessarily. A constant integral expression that evaluates to zero will be converted to the null pointer value, but that value is implementation defined and need not necessarily have an all-zero bit pattern. It could easily, in some places, be a segment:offset style form with the bit pattern (0x77770000), perhaps, in which case your offset will be quite amusing. And useless as an offset.


My version does not assume it. According to this FAQ:
Quote:According to the language definition, an ``integral constant expression with the value 0'' in a pointer context is converted into a null pointer at compile time

So I get the actual conversion from the compiler. I'm not doing any tricks to actually get _zero_ pointer.

Quote:Original post by jpetrie
Finally, you dereference null.[...]


Quote:Original post by jpetrie
Quote:
Which chapter?

[...]


The discussion clearly concludes that it is the standard itself that has been mis-worded on that matter. And that we shouldn't be paranoid about it.

Additionally, the following code compiles just fine on VS2005:
struct A
{
  virtual ~A(void) {};
  // insert anything
  int  m_Member;
  // insert some more
};

template<int N>
struct blah
{
  char space[N];
};

blah< (int) ( ((char*)(void*)&((A*)0)->m_Member) - ((char*)0) ) >  bb;

(It doesn't compile on gcc 3.4.6 nor gcc 4.1.1[sad])
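For comparison, the standard `offsetof` macro is specified to yield an integer constant expression, so it can serve as a template argument without any null-pointer arithmetic, provided the type qualifies. The sketch below assumes a hypothetical standard-layout struct `P` standing in for `A` (no virtual destructor), and uses C++11 `static_assert`; the leading `m_Tag` member is there only so the offset is non-zero and `blah` does not get a zero-length array.

```cpp
#include <cassert>
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout (C++11)

// Hypothetical stand-in for A that satisfies offsetof's requirements.
struct P {
    char m_Tag;
    int  m_Member;
};

static_assert(std::is_standard_layout<P>::value,
              "offsetof is only guaranteed for standard-layout types");

template <int N>
struct blah {
    char space[N];
};

// offsetof is a constant expression, so it is usable as a template argument.
blah<static_cast<int>(offsetof(P, m_Member))> bb;

// Hypothetical demo: the array size equals the member's offset.
void offset_demo() {
    assert(sizeof(bb.space) == offsetof(P, m_Member));
}
```

This does not help with the polymorphic `A` in the snippet above, which is exactly the case `offsetof` is not guaranteed to support.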
