nem123

What would the optimal pointer syntax look like?


I've been lightly looking into programming language design, and when looking at the subject of pointers and manual memory management I became curious about something. Namely, if you had free rein and were tasked with designing the syntax used for pointer operations ((de)allocation, (de)referencing, arithmetic and so on), what syntax would you settle on, and why? Feel free to make up your own syntax if you feel that whatever is currently out there is lacking.

When replying, please supply examples of your syntax as well as the good points and bad points from your perspective regarding it.

EDIT:
Quote:
Original post by rip-off
I agree entirely. I am curious about what nem123's language design goals are. Without knowing that it is hard to make any real comments on the apparent choice to include pointers in it.


Yes, maybe I should have made this clear to begin with.

In reality, it's more me building a compiler (src->asm), the language design interest part is something that has been growing slowly while thinking about how stuff could be implemented and supported by the compiler (makes sense?).

In essence, with my compiler and language, I'm aiming for a systems programming language with a somewhat better/nicer (subjective?) syntax than C, with support for OOP as far as it can be resolved at compile time. I intend to shy away from constructs that require additional run-time support (meaning, I'd like to be able to generate the fewest CPU operations from a given piece of code; this might be more of a compiler optimization issue than a language issue, though). Other than this, I don't have much of a plan for the language. Features will be added ad hoc if I deem them necessary/useful in the domain of systems programming, or if I can be talked into implementing them by other, more experienced people.

[Edited by - nem123 on July 18, 2010 4:10:38 AM]

When I started learning C I was confused with the way pointers were defined:
Type *pointer; // we define a pointer variable
Type value;

value = *pointer; // we dereference the pointer (* means going from pointer to value)
pointer = &value; // we take the address (& means going from value to pointer)

You see, when we define a pointer variable, we use the star *. However, any time after the definition, the star * is used to go from pointer to value. In the definition it reads like "we have a type Type, which we dereference, so we go from pointer (Type) to value (*Type)". But this is not true. A more fitting symbol for pointer definition would be the ampersand &: "we have a type Type, of which we take the address, so we go from value (Type) to pointer (&Type)".
Type &pointer; // we define a pointer variable
Type value;

value = *pointer; // we dereference the pointer (* means going from pointer to value)
pointer = &value; // we take the address (& means going from value to pointer)

In C++ this clashes with references, but to me this would be a much more fitting syntax. It is more consistent: this way the * has only 2 meanings (dereferencing, multiplying) instead of 3 (pointer definition, dereferencing, multiplying). The & keeps the same meanings across the code (bitwise AND, take address).

This is just my point of view. I remember this made pointers hard for me to understand back then. If someone has an explanation for the use of * in pointer definitions, I'd be glad to hear it.

Maybe something similar to ->, but perhaps the other way, e.g.:

int Value = <-PointerToValue;

Dunno though, I haven't really thought too much about it :)

One thing that could be done would be to separate the concept of NULLness from reference semantics.

For example (based on C for clarity for the moment):

int? x = null; // optional integer, must be initialised

// Must use this:
if(int y = ?x) {
// "y" is the not-null value of x
}
// optional "else" clause for null values.



Another example:

int? foo() { /* ... */ }

while(int x = foo()) {
// use x
}
// Here when x becomes null.


Note that the de-referencing has no explicit operator, it is done as part of the conditional expression.

A reference (again, C syntax not necessarily representative):

int& x = 5; // Must be initialised
int ?y = foo();
int &z = y; // Illegal (y may be null)

std::vector<int> ?vec = /* ... */;
if(std::vector<int> &ref = vec) {
// ...
}



Here we can introduce some syntactic sugar:

int ?foo = /* ... */;
if(foo) {
// foo is now in scope as a reference to int
std::cout << foo << '\n';
}


So that we have a shorthand for the common operation "if X != null, use X". It is equiv

Here optional variables are references, so there is no syntax for an optional reference. This means (for example) that you cannot return an optional temporary. This is more a function of me making most of this up right now rather than any gut belief that this should be the way.
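
For comparison, the "if X != null, use X" idea can be approximated in existing C++17 with std::optional. This is only a sketch of the concept, not the syntax proposed above: next_value and sum_all are hypothetical names, and unlike the proposal the dereference (*x) is still explicit.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for the thread's foo(): yields the next value
// from a list, or an empty optional once the list is exhausted.
std::optional<int> next_value(std::vector<int>& pending) {
    if (pending.empty()) {
        return std::nullopt;   // the "null" case
    }
    int value = pending.back();
    pending.pop_back();
    return value;
}

// Sums every value until next_value() reports "no value".
int sum_all(std::vector<int> pending) {
    int total = 0;
    // The declaration doubles as the null test: the body only runs
    // while the optional actually holds a value.
    while (std::optional<int> x = next_value(pending)) {
        total += *x;           // dereference is still explicit in C++
    }
    return total;
}
```

Note that an optional holding 0 still counts as "has a value", so the loop does not stop early on a zero the way a plain while(int x = foo()) would in C++.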

I'm not sure about pointer arithmetic. It is pretty low level. But since you are talking about designing a language with pointers in it, then I suppose that is what you are aiming at.

Again I would be inclined to separate the concepts, because this is totally different from the above. Again, inspired by C:

int [] allocate(int n);

int [] array = allocate(42);
for(int [] pointer = array ; pointer != array + 42 ; ++pointer) {
int value = pointer[]; // inspired by pointer[0]
}


You could also go for a more C++-like version and have opt<int>, ref<int> and array<int> as the types, which would make them easier to think about in combination:

opt<array<int>>
opt<ref<int>>
ref<array<opt<int>>> // you may want to support typedefs...


Alternatively, "ref" "array" and "opt" could become keywords (in the above, they might only be keywords when used with <>):

opt array int
opt ref int
ref array opt int



I'm not sure if this is what you are looking for. I actually don't care about syntax itself all that much; it's more about how expressive the language is. Plus, as I said, I made up a lot of this on the spot, so maybe it doesn't make much sense.

In C, "void foo(int *x)" isn't as expressive as it could be. It doesn't tell you if the function accepts NULL. It doesn't say if the function expects an array. It doesn't say if the function will write to the variable. Yes, there are hints we can give by using int [] x or const, but this is insufficient, I think. The language should allow us to specify these semantic differences without resorting to comments.
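
As a rough illustration of the distinctions listed above, here is how existing C++17 can spell some of them in the signature itself. All function names are made up for the example, and std::vector is just one assumed way of passing "a sequence with a known length".

```cpp
#include <optional>
#include <vector>

// Reads one value that must exist: a const reference cannot be null.
int doubled(const int& x) { return x * 2; }

// Writes to the caller's variable: a non-const reference signals mutation.
void reset(int& x) { x = 0; }

// Accepts "no value" explicitly instead of through a null pointer.
int value_or_default(std::optional<int> x, int fallback) {
    return x ? *x : fallback;
}

// Expects a sequence: the container carries its own length.
int sum(const std::vector<int>& xs) {
    int total = 0;
    for (int v : xs) total += v;
    return total;
}
```

Each of these would collapse to some form of "int *x" in C, which is the loss of information being described.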

Also, please include some form of inferred typing!

Do you *really* need pointers? What do they buy you that e.g. references don't?

Personally, I'd like a slightly expanded C# syntax for references:

void foo<T>(T a)
void foo<T>(ref T a)
void foo<T>(out T a)
void foo<T>(const ref T a)

I'd also like nullable parameters to be explicitly marked (rather than vice versa). In other words, make null values the exception rather than the rule:

void foo<T>(T? a)
// or
void foo<T>(T a) where T : nullable


For actual pointers, I'd avoid the array-pointer duality of C/C++ (it really makes no sense and introduces a host of other problems). I quite like the C# approach here, which allows the use of pointers only in explicitly defined "unsafe" regions. Outside of C interop you really don't miss them at all.

Dereference of pointers/references can be handled with the dot operator (.) - there's no real reason for -> that I'm aware of. The compiler should be able to figure out and dereference automatically.

Quote:
Original post by Fiddler
Outside of C interop you really don't miss them at all.


Actually, you do. My code is full of pointers, though often hidden (in a custom smart pointer). I need to have pointers, else I cannot have heterogeneous arrays. I even have double pointers, that saves me a lot of if / else statements. For example, a checkbox has many states, you wouldn't want switch / if / else series to draw the correct state (which is a pointer to texture data), a simple (double) pointer is most elegant.

But it also depends on the language, in PHP there's no pointers. The language is pretty limited in what you can do (you can only output text for web stuff, which it is used most for). Most PHP programmers program pretty sloppy, delivering suboptimal code. I do too, simply because PHP doesn't let me use efficient, straightforward ways. Optimal code is also often irrelevant with PHP.

How do you want to keep track of large memory files without pointers? I don't find references suiting for that...

Share this post


Link to post
Share on other sites
Extend -> to cover "multiple layers" eg:

std::vector<Foo*> foos;
for(std::vector<Foo*>::iterator i=foos.begin();i!=foos.end(); ++i)
i-->bar();//not valid C++, you must do "(**i).bar()" or similar.
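
For reference, a small sketch of the two spellings that do work in current C++ for this case. Foo, its calls counter and call_all are placeholders invented for the example.

```cpp
#include <vector>

struct Foo {
    int calls = 0;
    void bar() { ++calls; }
};

// Calls bar() on every element through both currently valid spellings.
void call_all(std::vector<Foo*>& foos) {
    for (std::vector<Foo*>::iterator i = foos.begin(); i != foos.end(); ++i) {
        (*i)->bar();   // dereference the iterator, then use -> on the Foo*
        (**i).bar();   // dereference twice, then use the dot operator
    }
}
```

One wrinkle for the proposed spelling: today's C++ lexes "i-->bar()" as "i--" followed by "> bar()", so a new language would probably have to treat --> as its own token.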







It would be useful if string literals and arrays had a way to exploit the fact that their length is known, eg:

//should this really need to use an strlen type method at runtime?
//The length was known when this was compiled after all...
std::string a = "Hello World";




Maybe some kind of light weight array/string class that keeps the length in a constant which could be filled out at compile time.

string::string(FixedString str)
{
assign(str, str.size);
}




In the event that the literal is immediately converted to a pointer (e.g. there was no overload, or it was just a 'const char *str = "Hello World"' assignment), the compiler could obviously just do it the C++ way.
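
Something close to this can already be sketched in C++11 by taking the literal as a reference to an array, so that the size is deduced as a template parameter. literal_length and make_string are hypothetical helper names, not anything std::string provides.

```cpp
#include <cstddef>
#include <string>

// Deduces N from the array type of the literal, so the length is a
// compile-time constant (N counts the trailing '\0', hence N - 1).
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

// Builds a std::string without scanning for the terminator at run time.
template <std::size_t N>
std::string make_string(const char (&text)[N]) {
    return std::string(text, N - 1);
}

static_assert(literal_length("Hello World") == 11, "length known at compile time");
```

This only works while the argument still has array type. Once the literal has decayed to a const char *, the length information is gone and a run-time scan is needed again.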



Quote:

Fiddler
Dereference of pointers/references can be handled with the dot operator (.) - there's no real reason for -> that I'm aware of. The compiler should be able to figure out and dereference automatically.

What about smart pointers, iterators, or any other object with an overloaded * and ->, where the . isn't too helpful for most code?

Quote:
Original post by Fiddler
Do you *really* need pointers? What do they buy you that e.g. references don't?


Pointer arithmetic? The syntax of adding 3 to a pointer doesn't really have a reference equivalent. And array-pointer duality makes perfect sense to me. The array is a pointer to an area of memory, and its elements are offsets from that pointer. It goes hand in hand with pointer arithmetic:
array[3]
// ... is essentially the same as ...
*(array + 3)
In the world of C, array notation is really just a convenient shorthand, though there can be times when using pointer arithmetic is actually a more straightforward way of expressing things (e.g. when you're reseating a pointer by an offset).
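
A minimal sketch of that equivalence, written so it compiles as C++ (same_element and sum are names made up for the example):

```cpp
#include <cstddef>

// Returns true when indexing and pointer arithmetic name the same element.
bool same_element(const int* array, std::size_t i) {
    return &array[i] == array + i;
}

// Sums n elements by advancing ("reseating") a pointer instead of indexing.
int sum(const int* first, std::size_t n) {
    int total = 0;
    for (const int* p = first; p != first + n; ++p) {
        total += *p;
    }
    return total;
}
```

Calling sum(data + 1, 2) on an int array sums the second and third elements, which is the "offset the pointer" case where arithmetic reads more directly than indexing.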

Quote:
Original post by Decrius
I need to have pointers, else I cannot have heterogeneous arrays.

Incorrect. You are thinking purely in terms of C or C++. In general, the above are achieved with some form of referential semantics, to which there are different solutions. True, these solutions typically involve things *like* pointers, but not the full combination of reference/optional/memory arithmetic that C++ pointers offer.
Quote:

I even have double pointers, that saves me a lot of if / else statements. For example, a checkbox has many states, you wouldn't want switch / if / else series to draw the correct state (which is a pointer to texture data), a simple (double) pointer is most elegant.

I'm not sure about that. It depends on exactly how you implemented your class. If your states are an array, you can use an array index to achieve the same effect, for instance. Plus, most checkboxes have two states, checked and unchecked. Besides, nearly every language can implement something like a pointer to a pointer.
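
To make the array-index suggestion concrete, here is a rough C++ sketch. Texture, CheckboxState and Checkbox are hypothetical stand-ins, since the actual class was never posted.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for texture data; only an id for illustration.
struct Texture { int id; };

enum class CheckboxState : std::size_t { Unchecked, Checked, Disabled, Count };

struct Checkbox {
    std::array<Texture, static_cast<std::size_t>(CheckboxState::Count)> textures;
    CheckboxState state = CheckboxState::Unchecked;

    // The current state selects the texture directly; no switch / if chain.
    const Texture& current_texture() const {
        return textures[static_cast<std::size_t>(state)];
    }
};
```

Changing state is then a single assignment, and drawing always goes through current_texture().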
Quote:

But it also depends on the language, in PHP there's no pointers. The language is pretty limited in what you can do (you can only output text for web stuff, which it is used most for). Most PHP programmers program pretty sloppy, delivering suboptimal code. I do too, simply because PHP doesn't let me use efficient, straightforward ways. Optimal code is also often irrelevant with PHP.

That seems pretty off topic. I think PHP is more powerful than you give it credit for, it can certainly do more than output text for web stuff. Even so, you haven't really linked the lack of pointers to any weaknesses PHP has.
Quote:
How do you want to keep track of large memory files without pointers? I don't find references suiting for that...

For higher level languages, this sort of implementation detail can be wrapped. For example, file offsets can be specified as large integers, and the wrapped native code can deal with turning this into a raw memory offset.
