What would the optimal pointer syntax look like?

28 comments, last by Mabeline 13 years, 9 months ago
Quote:Original post by Shinkage
Quote:Original post by Fiddler
Do you *really* need pointers? What do they buy you that e.g. references don't?

Pointer arithmetic? The syntax of adding 3 to a pointer doesn't really have a reference equivalent. And array-pointer duality makes perfect sense to me. The array is a pointer to an area of memory, and its elements are offsets from that pointer. It goes hand in hand with pointer arithmetic:
array[3]
// ... is essentially the same as ...
*(array + 3)
In the world of C, array notation is really just a convenient shorthand, though there can be times when using pointer arithmetic is actually a more straightforward way of expressing things (e.g. when you're reseating a pointer by an offset).
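The equivalence claimed here can be checked directly. The following is a minimal sketch in C++; the function names `same_element` and `skip_header` are made up for illustration and do not come from the thread.

```cpp
#include <cassert>
#include <cstddef>

// Compares index notation with explicit pointer arithmetic for element i.
// Both the addresses and the values agree, because array[i] is defined
// as *(array + i).
inline bool same_element(const int* array, std::size_t i) {
    assert(array != nullptr);
    return &array[i] == array + i && array[i] == *(array + i);
}

// "Reseating a pointer by an offset": skip a header of header_len
// elements and return a pointer to whatever follows it.
inline const int* skip_header(const int* buffer, std::size_t header_len) {
    assert(buffer != nullptr);
    return buffer + header_len;
}
```

For an array `int data[5] = {10, 20, 30, 40, 50}`, `same_element(data, 3)` holds and `*skip_header(data, 2)` reads 30.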

Unimpressive. The most common use for pointer arithmetic is in extremely low level operations such as copying a block of arbitrary memory as a set of bytes. Most real code will use dynamic containers rather than arrays or pointers for storing data sets.

Having arrays and references generally nets the language 97% (or more) of the uses for pointers. Most of the other uses are unsafe anyway, and modern languages tend to shy away from giving them to programmers, since there are workarounds (though they may not be as performant, e.g. member-wise serialisation rather than bulk memcpy/fwrite calls).

It is debatable whether these are strictly necessary. Such things are often overused for the rather dubious performance benefits - people worrying about taxing the CPU when their program is in reality waiting for the network or hard drive.
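To make the serialisation trade-off concrete, here is a rough C++ sketch of both approaches. The `Particle` struct and all function names are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct Particle {  // hypothetical record to serialise
    float x;
    float y;
    std::uint32_t color;
};
static_assert(std::is_trivially_copyable<Particle>::value,
              "bulk copying is only valid for trivially copyable types");

// Bulk copy: one memcpy of the raw bytes. Fast, but the output contains
// whatever padding and byte order this compiler and platform use.
inline std::vector<unsigned char> serialize_bulk(const Particle& p) {
    std::vector<unsigned char> out(sizeof(Particle));
    std::memcpy(out.data(), &p, sizeof(Particle));
    return out;
}

// Appends the bytes of a single field to the output.
template <typename T>
void append_field(std::vector<unsigned char>& out, const T& field) {
    const std::size_t old_size = out.size();
    out.resize(old_size + sizeof(T));
    std::memcpy(out.data() + old_size, &field, sizeof(T));
}

// Member-wise copy: field order and the absence of padding are decided
// by this code, not by the struct layout. Byte order is still the host's.
inline std::vector<unsigned char> serialize_members(const Particle& p) {
    std::vector<unsigned char> out;
    append_field(out, p.x);
    append_field(out, p.y);
    append_field(out, p.color);
    assert(out.size() == sizeof(p.x) + sizeof(p.y) + sizeof(p.color));
    return out;
}
```

The member-wise version is the one that carries over to languages without raw pointers, at the cost of one call per field.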
Quote:Original post by rip-off
Unimpressive. The most common use for pointer arithmetic is in extremely low level operations such as copying a block of arbitrary memory as a set of bytes. Most real code will use dynamic containers rather than arrays or pointers for storing data sets.

Having arrays and references generally nets the language 97% (or more) of the uses for pointers. Most of the other uses are unsafe anyway, and modern languages tend to shy away from giving them to programmers, since there are workarounds (though they may not be as performant, e.g. member-wise serialisation rather than bulk memcpy/fwrite calls).

It is debatable whether these are strictly necessary. Such things are often overused for the rather dubious performance benefits - people worrying about taxing the CPU when their program is in reality waiting for the network or hard drive.


Keeping in mind, of course, that one of the stated core design goals of C and C++ is being a systems programming language, where such things are overwhelmingly important? Whether YOU use it is somewhat beside the point, because I assure you, if you're programming a hardware driver or an operating system it can be VERY important, and as such it would be a somewhat silly notion that the languages used to perform those tasks should do without it.

This brings up an interesting point, and I think only serves to illustrate the fact that there is no "best" language, or even best design for a particular feature. A language being used for systems programming is going to make very different design choices than one that's aimed at high level application programming, and that's only sensible.
Quote:Original post by Shinkage
Keeping in mind, of course, that one of the stated core design goals of C and C++ is being a systems programming language, where such things are overwhelmingly important? Whether YOU use it is somewhat beside the point, because I assure you, if you're programming a hardware driver or an operating system it can be VERY important, and as such it would be a somewhat silly notion that the languages used to perform those tasks should do without it.

This brings up an interesting point, and I think only serves to illustrate the fact that there is no "best" language, or even best design for a particular feature. A language being used for systems programming is going to make very different design choices than one that's aimed at high level application programming, and that's only sensible.

I agree entirely. I am curious about what nem123's language design goals are. Without knowing that it is hard to make any real comments on the apparent choice to include pointers in it.
Quote:Original post by rip-off
I agree entirely. I am curious about what nem123's language design goals are. Without knowing that it is hard to make any real comments on the apparent choice to include pointers in it.


Well, assuming the language is for game development among other things, and particularly if it's intended to be portable to embedded systems/consoles, I think including pointers is a sensible (even necessary) idea. Desktop-only game development not so much.
Quote:Original post by rip-off
I'm not sure about that. It depends on exactly how you implemented your class. If your states are an array, you can use an array index to achieve the same effect, for instance. Plus, most checkboxes have two states, checked and unchecked. Besides, nearly every language can implement something like a pointer to a pointer.


For example, my checkbox has 3 states (checked, unchecked, mixed). For each state I have a pointer to the resource / texture. I could keep track of the current state using an enum, but this would require if / else checking, which can become a mess the more states you have. The way I implemented it is with a pointer to the pointer to the resource / texture. That way, whenever the state changes I can change the double pointer, and at draw time just dereference. It is also easy with RAII to make sure the pointer always points to valid data.
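A rough C++ sketch of the double-pointer arrangement described here might look like the following. The `Texture` struct, the `State` enum and the member names are assumptions; the actual class is not shown in the thread.

```cpp
#include <cassert>

struct Texture { int id; };  // hypothetical stand-in for a texture resource

enum State { Checked = 0, Unchecked = 1, Mixed = 2 };

class CheckBox {
public:
    CheckBox(Texture* checked, Texture* unchecked, Texture* mixed)
        : slots_{checked, unchecked, mixed}, current_(&slots_[Unchecked]) {}

    // Not copyable: current_ points into this object's own slots_ array.
    CheckBox(const CheckBox&) = delete;
    CheckBox& operator=(const CheckBox&) = delete;

    // A state change only retargets the pointer-to-pointer.
    void set_state(State s) { current_ = &slots_[s]; }

    // Replacing a slot's texture is picked up without touching current_.
    void replace_texture(State s, Texture* t) { slots_[s] = t; }

    // Drawing dereferences twice; there is no if/else chain over states.
    const Texture& texture_to_draw() const {
        assert(*current_ != nullptr);
        return **current_;
    }

private:
    Texture* slots_[3];   // one texture pointer per state
    Texture** current_;   // points at the slot for the current state
};
```

The array-index alternative mentioned in the quoted post would store a `State` value in place of `current_` and index `slots_` at draw time, which gives the same behaviour without the second level of indirection.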
Quote:Original post by Decrius
Quote:Original post by rip-off
I'm not sure about that. It depends on exactly how you implemented your class. If your states are an array, you can use an array index to achieve the same effect, for instance. Plus, most checkboxes have two states, checked and unchecked. Besides, nearly every language can implement something like a pointer to a pointer.


For example, my checkbox has 3 states (checked, unchecked, mixed). For each state I have a pointer to the resource / texture. I could keep track of the current state using an enum, but this would require if / else checking, which can become a mess the more states you have. The way I implemented it is with a pointer to the pointer to the resource / texture. That way, whenever the state changes I can change the double pointer, and at draw time just dereference. It is also easy with RAII to make sure the pointer always points to valid data.


This is trivial to implement even in languages that don't support pointers at all. E.g. F#:
type Texture (id : int) =
    let Id = id // An OpenGL texture id

type CheckBoxState (icon : Texture) =
    let Icon = icon
    static let _enabled = new CheckBoxState (new Texture(1))
    static let _disabled = new CheckBoxState (new Texture(2))
    static let _mixed = new CheckBoxState (new Texture(3))
    static member Enabled = _enabled
    static member Disabled = _disabled
    static member Mixed = _mixed

type CheckBox (?state : CheckBoxState) =
    let mutable _state = defaultArg state CheckBoxState.Disabled
    member this.State
        with public get () = _state
        and public set (state) = _state <- state

let main () =
    let a = new CheckBox () // Defaults to Disabled
    let b = new CheckBox (CheckBoxState.Enabled)
    a.State <- CheckBoxState.Mixed

main ()

[OpenTK: C# OpenGL 4.4, OpenGL ES 3.0 and OpenAL 1.1. Now with Linux/KMS support!]

I couldn't do some of what I do without pointer arithmetic. Well, I could, but it would include an extra dereference which would negate the reason for using pointers in the first place.

I maintain several large blocks of memory for entity data, and use offsets to access the individual fields. Each data element can be a different type, and so the type sizes are different. Can't use array offsets for that.

This is done so that the data is all in one place for 4 reasons.

1) dramatic increase in overall processing speed.
2) I can share any data instance between the engine and script.
3) Certain data instances serve as data buffers for VBO/IBO operations.
4) The structure simplifies object instancing from script.

(I use pascal mostly, with some C++)

I never write property containers like this:

Obj.MyPropertyHelper.Add('Health',100);
health = Obj.MyPropertyHelper.Get('Health');


I first define classes/fields in script, and use that to set up the custom data instances. Then, to access a given field from script I simply perform pointer arithmetic behind the scenes, adding the field offset to the data instance address.

The script looks nice and clean:

player.health -= monster.damage
if (player.health <= 0)
  player @@ Die
  monster @@ Cheer
end


and behind the scenes, the system performs the pointer arithmetic to surface the appropriate value:

result := psingle(pInstance + var.Offset)^
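For readers who don't use Pascal, roughly the same offset-based field access can be sketched in C++. `FieldInfo`, `read_field` and `write_field` are hypothetical names; the engine's real data structures are not shown in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical field descriptor, as it might be built from a script
// class definition: where the field lives inside an instance.
struct FieldInfo {
    std::size_t offset;  // byte offset from the start of the instance
};

// Reads a T located at instance + offset. std::memcpy is used instead of
// casting the address, which sidesteps alignment and aliasing problems.
template <typename T>
T read_field(const unsigned char* instance, FieldInfo field) {
    assert(instance != nullptr);
    T value;
    std::memcpy(&value, instance + field.offset, sizeof(T));
    return value;
}

// Writes a T at instance + offset.
template <typename T>
void write_field(unsigned char* instance, FieldInfo field, const T& value) {
    assert(instance != nullptr);
    std::memcpy(instance + field.offset, &value, sizeof(T));
}
```

With a `float` health field at offset 0 and an `int` damage field at offset 4, a script statement like `player.health -= monster.damage` boils down to one `read_field<float>`, one `read_field<int>` and one `write_field<float>`.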


Also at the lower system level, in plugins or whatnot, I can sometimes map predefined structures atop a given object's instance and access the fields directly from a record (struct in c).

type
  TObjectMap = record
    Health: integer;
  end;
  PObjectMap = ^TObjectMap;     // pointer declaration in pascal

procedure DisplayHealth(obj: TmObject);
var
  data: PObjectMap;
begin
  data := Obj.Instance;
  ShowMessage('health = ' + inttostr(data.Health));
end;


Instance data can also be traversed if necessary using the same concept:

// simplified - no error checking or whatever
pInstance := Obj.Instance;
pNextInstance := pInstance + Obj.InstanceSize;
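The record mapping and the instance traversal can be sketched in C++ as well. `ObjectMap`, `load_map` and `next_instance` are illustrative names only, and the layout is assumed to be agreed between engine and plugin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed layout shared by the engine and a plugin.
struct ObjectMap {
    int health;
    float speed;
};

// Copies a typed view of the record out of raw instance memory.
inline ObjectMap load_map(const unsigned char* instance) {
    assert(instance != nullptr);
    ObjectMap map;
    std::memcpy(&map, instance, sizeof(ObjectMap));
    return map;
}

// Steps from one instance to the next: address plus instance size.
inline const unsigned char* next_instance(const unsigned char* instance,
                                          std::size_t instance_size) {
    assert(instance != nullptr);
    return instance + instance_size;
}
```

Like the Pascal snippet, this does no bounds checking; the caller has to know how many instances the block holds.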


Best of both worlds, imo. And, all courtesy of pointer arithmetic.

-----------------

Also, notice in the above example how Pascal uses dot notation with pointers. No need for the ->. I like that, too.

The one other thing I would change with pointers would be to allow only 1 pointer type to be declared for any type. And, that the pointer would have to be declared at the same place where the structure is defined.

Node* root (would be illegal)

but you could declare root with the predefined pointer to Node.

pNode root (would be allowed, assuming pNode was previously defined)
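In today's C++ the closest approximation is a type alias declared next to the structure, as in the sketch below. The language cannot forbid `Node*` elsewhere, so this is a convention, not an enforced rule; `length` is a made-up helper.

```cpp
#include <cassert>

struct Node;
using pNode = Node*;  // the single agreed-upon pointer type for Node

struct Node {
    int value;
    pNode next;
};

// Counts the nodes reachable from root.
inline int length(pNode root) {
    int count = 0;
    for (pNode p = root; p != nullptr; p = p->next) {
        ++count;
    }
    return count;
}
```

A three-node list built by hand satisfies `assert(length(root) == 3)`, and an empty list (`nullptr`) has length 0.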
--- "A penny saved never boils."
I like the C/C++ way, except for how pointers are declared. Declaring two pointers on one line is ugly, for example:
int *foo, *bar;
IMHO the type should be separate from the name.
Also, the standard gives too much freedom in how to write it:
int *foo;
int* bar;
int * foobar;

I guess I'd like the second one most, so that this would declare two pointers:
int* foo, bar;

I also like having both pointers and references: references being always safe (never NULL),
and pointers having the advantage of NULL (so you don't have to allocate memory when it isn't needed).
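For reference, in C++ as it stands `int* foo, bar;` does not declare two pointers, because the `*` binds to the name and not to the type. The sketch below checks that at compile time and shows a type alias as one way to get the wished-for behaviour. `IntPtr`, `baz` and `qux` are names invented for this example.

```cpp
#include <cassert>
#include <type_traits>

int* foo, bar;  // as the language works today: foo is int*, bar is int
static_assert(std::is_same<decltype(foo), int*>::value, "foo is a pointer");
static_assert(std::is_same<decltype(bar), int>::value, "bar is a plain int");

using IntPtr = int*;  // hypothetical alias: the type is now a single token
IntPtr baz, qux;      // both names get the pointer type
static_assert(std::is_same<decltype(baz), int*>::value, "baz is a pointer");
static_assert(std::is_same<decltype(qux), int*>::value, "qux is a pointer");
```

Since these are namespace-scope variables they are zero-initialised, so `assert(foo == nullptr && bar == 0)` holds at run time.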
Quote:Original post by DevFred
Make the dereferencing operator postfix instead of prefix.


Are you referring to how C/C++ does it? Like so:
    C/C++: *a
    Your suggestion: a*




Quote:Original post by Decrius
You see, when we define a pointer variable, we use the star *. However, any time after the definition, the star * is used to go from pointer to value. In the definition it reads like "we have a type Type, which we dereference, we go from pointer (Type) to value (*Type)". But this is not true. A more suiting symbol for pointer definition would be the ampersand &: "we have type Type, which we take the address from, we go from value (Type) to pointer (&Type)".

A very good point here. How do you feel about the use of the ampersand and star symbols? For example, if you have lots of dereferencing and address-fetching in your code, does the code not become harder to read with lots of symbols with special meaning (this also applies to symbols like '&&', '||', '^' and so on)? I guess this is something you get used to with time, but still, how do you feel about this?



Quote:Original post by f8k8
Maybe something similar to ->, but perhaps the other way, e.g.:

int Value = <-PointerToValue;

Dunno though, I haven't really thought too much about it :)

I think the Go programming language does something like this, or that might just have been for their 'GoRoutines' or whatever they are calling those 'concurrency enabled functions' they have in there (I'm not too involved with that language).
I'm not sure I like this syntax myself though. Does it make things clearer? Possibly. Is it better than a straight-up star for dereferencing? Doubtful.



Quote:Original post by rip-off
I'm not sure about pointer arithmetic. It is pretty low level. But since you are talking about designing a language with pointers in it, then I suppose that is what you are aiming at.

In reality, it's more me building a compiler (src->asm); the interest in language design is something that has been growing slowly while thinking about how stuff could be implemented and supported by the compiler (makes sense?).

Thank you for the suggestions on alternative syntax here. I like the one about using keywords to describe types. This is what Java does to some degree if I'm not mistaken. A side-effect being the increase in verbosity, perhaps too much verbosity?

Quote:Original post by rip-off
In C, "void foo(int *x)" isn't as expressive as it could be. It doesn't tell you if the function expects NULL. It doesn't say if the function expects an array. It doesn't say if the function will write to the variable. Yes, there are hints we can give by using int [] x or const, but this is insufficient I think. The language should allow us to specify these semantic differences without resorting to comments.


About NULL, perhaps there is some way to avoid its use/existence altogether, by requiring that all references/pointers be initialized (how do you deal with function returns? By not allowing return 0? What effects would this have?).
I agree that the programmer should be required to specify these semantic differences.
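As a point of comparison, C++ can already spell out several of the distinctions listed in the quoted post in the signature itself. The sketch below uses made-up function names and is only one way of carving things up.

```cpp
#include <cassert>
#include <cstddef>

// Reference to const: a value must be supplied, and it is only read.
inline int twice(const int& x) { return 2 * x; }

// Non-const reference: a value must be supplied, and it is written to.
inline void reset(int& x) { x = 0; }

// Pointer to const: the one case where "no value" (nullptr) is expected.
inline int value_or(const int* x, int fallback) { return x ? *x : fallback; }

// Reference to array: the function expects an array, and the element
// count N is part of the parameter type.
template <std::size_t N>
int sum(const int (&xs)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += xs[i];
    }
    return total;
}
```

None of this is enforced as strictly as a dedicated language feature could be, but the caller can tell the four cases apart without reading comments.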

Quote:Original post by rip-off
Also, please include some form of inferred typing!

I've been thinking about inferred typing, but I'm not sure I like the concept. The compiler I'm building targets a statically typed language, and introducing inferred typing could be confusing for the programmer, who might not know what type a particular symbol has. And how would you deal with functions that have return statements of multiple types? Throw an error, I guess :). I might be wrong in this regard though; I don't have much experience with type inference, to be honest.
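For what it's worth, type inference and static typing are not in conflict: C++ `auto` is resolved entirely at compile time. The sketch below assumes C++14 or later (for deduced return types); `half` and `count_positive` are illustrative names.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Return type deduced from the return statement: int / double is double.
inline auto half(int x) { return x / 2.0; }
static_assert(std::is_same<decltype(half(3)), double>::value,
              "the deduced type is fixed at compile time");

// Return statements of different types are rejected by the compiler,
// which matches the "throw an error" guess:
// auto bad(bool b) { if (b) return 1; return 2.0; }  // does not compile

inline int count_positive(const std::vector<int>& xs) {
    auto n = 0;          // deduced as int
    for (auto x : xs) {  // deduced as int
        if (x > 0) ++n;
    }
    return n;
}
```

Every symbol still has exactly one static type; the programmer just doesn't have to write it out.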



Quote:Original post by Fiddler
Do you *really* need pointers? What do they buy you that e.g. references don't?

If we take C++ for example, you cannot refer directly to a reference itself, only to the object that it refers to. There is also the issue of not being able to 'reseat' a reference, something that can be done with pointers. Whether this last point is good or bad is open to discussion.
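The reseating difference is easy to see side by side. This is a small C++ sketch with hypothetical helper names; each function returns the final values of its two locals.

```cpp
#include <cassert>

struct Pair { int a; int b; };

// Pointer: assignment reseats it, so the later write lands in b.
inline Pair reseat_pointer() {
    int a = 1, b = 2;
    int* p = &a;
    p = &b;          // p now points at b; a is untouched
    *p = 9;          // writes b
    return {a, b};   // {1, 9}
}

// Reference: assignment writes through to a; r keeps referring to a.
inline Pair assign_reference() {
    int a = 1, b = 2;
    int& r = a;
    r = b;           // copies b's value into a, does not rebind r
    r = 9;           // writes a again
    return {a, b};   // {9, 2}
}
```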

Quote:Original post by Fiddler
I'd also like nullable parameters to be explicitly marked (rather than vice versa). In other words, make null values the exception rather than the rule:
void foo<T>(T? a)
// or
void foo<T>(T a) where T : nullable

Interesting! This is the next-best thing to banning NULL from pointers/references. I wonder if this could be resolved at compile time or if you need some run-time components to ensure correctness.
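One existing data point on the compile-time versus run-time question is C++17's `std::optional`, which makes "may be absent" opt-in in the signature. It is not the same as the `T?` syntax proposed in the quote, and the function names below are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Default case: a string must be supplied, so no check is needed.
inline std::size_t length_of(const std::string& s) { return s.size(); }

// Opt-in nullable: the signature says the argument may be absent.
inline std::size_t length_or_zero(const std::optional<std::string>& s) {
    // The presence test is a run-time check, but it is only paid for by
    // functions that asked for a nullable parameter.
    return s.has_value() ? s->size() : 0;
}
```

Here the type system decides at compile time which parameters can be absent, and the run-time part shrinks to a single test at the point of use.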

Quote:Original post by Fiddler
Dereference of pointers/references can be handled with the dot operator (.) - there's no real reason for -> that I'm aware of. The compiler should be able to figure out and dereference automatically.

Indeed.



Quote:Original post by rip-off
Quote:Original post by Decrius
I need to have pointers, else I cannot have heterogeneous arrays.


Incorrect. You are thinking purely in terms of C or C++. In general, the above are achieved with some form of referential semantics, to which there are different solutions. True, these solutions typically involve things *like* pointers, but not the full combination of reference/optional/memory arithmetic that C++ pointers offer.

Do you have some example/resource describing different solutions of referential semantics?
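One such solution, sketched in C++ itself, is an owning handle that refers to an object but supports no arithmetic. The `Widget` hierarchy is an assumption made up for this example; note that `std::unique_ptr` can still be empty, so it drops only the arithmetic part of what a raw pointer offers.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Widget {
    virtual ~Widget() = default;
    virtual std::string name() const = 0;
};

struct Button : Widget {
    std::string name() const override { return "button"; }
};

struct Slider : Widget {
    std::string name() const override { return "slider"; }
};

// A heterogeneous collection: every element owns an object of a
// different concrete type, reached through a handle, not a raw address.
inline std::vector<std::unique_ptr<Widget>> make_widgets() {
    std::vector<std::unique_ptr<Widget>> widgets;
    widgets.push_back(std::unique_ptr<Widget>(new Button));
    widgets.push_back(std::unique_ptr<Widget>(new Slider));
    return widgets;
}
```

Object references in garbage-collected languages play the same role: they refer, but cannot be offset.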

Quote:Original post by rip-off
Quote:
How do you want to keep track of large memory files without pointers? I don't find references suiting for that...

For higher level languages, this sort of implementation detail can be wrapped. For example, file offsets can be specified as large integers, and the wrapped native code can deal with turning this into a raw memory offset.
Interesting alternative. Do you think that this could also be implemented in a 'lower-level language'?
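A lower-level take on the same wrapping idea might look like the C++ sketch below: callers pass 64-bit offsets, and the only pointer arithmetic sits inside one bounds-checked method. `MappedBlock` is a hypothetical class that holds its bytes in a vector; a real version would presumably wrap a memory-mapped file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical wrapper: users see integer offsets, never raw addresses.
class MappedBlock {
public:
    explicit MappedBlock(std::vector<unsigned char> bytes)
        : bytes_(std::move(bytes)) {}

    // Bounds-checked read of a T located `offset` bytes into the block.
    template <typename T>
    T read(std::uint64_t offset) const {
        if (offset > bytes_.size() || bytes_.size() - offset < sizeof(T)) {
            throw std::out_of_range("read past end of block");
        }
        T value;
        // The only pointer arithmetic lives here, inside the wrapper.
        std::memcpy(&value,
                    bytes_.data() + static_cast<std::size_t>(offset),
                    sizeof(T));
        return value;
    }

private:
    std::vector<unsigned char> bytes_;
};
```

The price is a range check per access, which a systems language might want to make optional.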

Quote:Original post by rip-off
I agree entirely. I am curious about what nem123's language design goals are. Without knowing that it is hard to make any real comments on the apparent choice to include pointers in it.

Yes, maybe I should have made this clear to begin with, I'll update my original post as well as try to describe it in more detail here.

In essence, with my compiler and language, I'm aiming for a systems programming language with a somewhat better/nicer (subjective?) syntax than C, with support for OOP as far as it can be resolved at compile time. I intend to shy away from constructs that require additional run-time support (meaning, I'd like to be able to generate the fewest CPU operations from a given set of code; this might be more of a compiler optimization issue than a language issue, though). Other than this, I don't have much of a plan for the language. Features will be added ad hoc if I deem them necessary/useful in the domain of systems programming, or if more experienced people talk me into implementing them.



Quote:Original post by Whatz
The one other thing I would change with pointers would be to allow only 1 pointer type to be declared for any type. And, that the pointer would have to be declared at the same place where the structure is defined.


Node* root (would be illegal)

but you could declare root with the predefined pointer to Node.

pNode root (would be allowed, assuming pNode was previously defined)

I like this, having a specific type for pointers, not just tacking on a star or other symbol to indicate that we want to declare a pointer.
Maybe I'm naive about what goes on under the surface, but - by far! - the most important change to C++ for me would be a generic function pointer for any given function arguments. Function pointers are so useful; oftentimes, they are used for time-critical code, and implementing a functor class seems a very roundabout and clunky way to fix the issue. The alternative would seem to be passing dummy arguments to functions so they all look alike, or using switch/conditions to call a particular function. But maybe someone can clue me in to a better way to implement this pattern.. I've been wanting to ask, and, well, here's this thread..
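One candidate answer in standard C++11 is `std::function`, which can hold a plain function, a functor or a lambda behind one type. The sketch below uses invented names (`BinaryOp`, `apply_all`, `Scale`); the wrapper may add some call overhead compared with a raw function pointer, so whether it suits time-critical code would need measuring.

```cpp
#include <cassert>
#include <functional>
#include <vector>

inline int add(int a, int b) { return a + b; }

struct Scale {  // a hand-written functor, for comparison
    int factor;
    int operator()(int a, int b) const { return factor * (a + b); }
};

// One handle type for anything callable as int(int, int).
using BinaryOp = std::function<int(int, int)>;

// Calls every stored operation with the same arguments and sums results.
inline int apply_all(const std::vector<BinaryOp>& ops, int a, int b) {
    int total = 0;
    for (const auto& op : ops) {
        total += op(a, b);
    }
    return total;
}
```

Functions that need extra arguments can be adapted with a lambda that captures them, which avoids both the dummy-argument trick and the switch statement.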

This topic is closed to new replies.
