In-depth programming knowledge

12 comments, last by tstrimp 17 years ago
Inspired by Promit's post in this thread, I decided to start a thread regarding how to get in-depth knowledge on the languages / libraries you're working with.
Quote: Do you know how virtual functions are implemented by the compiler? Do you know how templates are implemented? Are you familiar with the implementation and pitfalls of malloc, free, and the standard new and delete operators? Do you know how to override and overload new and delete? How are strings stored in the std::string class? What are the common implementations and performance characteristics of the popular STL classes and functions? How do your stack and heap lay out in memory? (And on a related note, how do buffer overflows work?) What are the benefits and pitfalls of static and shared libraries? (Related: allocation/freeing rules when working with shared libraries? What causes loader locks?) How do you deal with basic thread safety problems? Can you use SSE and other extensions like atomic compare-and-swap? How do you go about determining if those extensions are available? What affects the compiler's ability to inline a function? How could you write a profiler into your code? What are memory mapped files and why are they useful? What is IOCP and why is it useful? Can you throw and catch exceptions? What about SEH? How is exception handling implemented? (There's so much to be asked here...) Are you comfortable with RAII?
Some of those issues I'm comfortable with and others I have no idea about (IOCP?). If I have no idea about them, then how am I supposed to even know to look them up in the first place? I'm not asking specifically about what these things are, but how to identify these types of things regardless of what language / environment you're working in. The examples listed above are C++ specific; what kinds of things should we be looking for in the .NET framework and its specific languages to move us beyond simply knowing the syntax and common pitfalls in them? I had hoped to find such information in tech blogs, but I must be reading the wrong ones, since all I encounter are common-sense issues that are addressed quite often. What are the resources for identifying these areas and then learning more about them?
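To make two of the quoted questions concrete ("Are you comfortable with RAII?" and "How could you write a profiler into your code?"), here is a minimal sketch of a scope-based timer. It is only one possible approach, not something taken from the quoted post. It assumes a C++11 compiler and a single thread, and the names `ScopedTimer`, `profile_totals`, and `sum_to` are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry: accumulated time per label, in microseconds.
// A plain std::map is not thread-safe; this sketch assumes a single thread.
inline std::map<std::string, long long>& profile_totals() {
    static std::map<std::string, long long> totals;
    return totals;
}

// RAII timer: the constructor records the start time and the destructor
// adds the elapsed time to the registry, even if the scope is left early
// through a return statement or an exception.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        profile_totals()[label_] +=
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    }

    // Non-copyable: a copy would record the same interval twice.
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical workload used only to demonstrate the timer.
inline long long sum_to(int n) {
    ScopedTimer timer("sum_to");
    long long total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;  // the timer's destructor runs as the function exits
}
```

After calling `sum_to(100)`, `profile_totals()` holds an entry named "sum_to" with the accumulated microseconds. A real profiler would also have to deal with threads, nesting, and the overhead of the timer itself.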
(IO completion ports iirc)

I don't know.

You'd think that your local user group would provide such discourse during meetings or presentations, but in my experience they're not too advanced. Tech blogs, MSDN articles, and the like also seem to be aimed more at your regular business developer who just needs to be aware of new things, not necessarily anything 'deep'.

Oddly enough, I find at least cursory descriptions of such things in Developer Journals here. Washu's quizzes indirectly, most of JollyJeffer's research... stuff where people can ignore the size of the target audience and discuss things mainly to sate their own need to do so.

Aiming for a target audience tends to lower the bar. Answering questions almost always lowers the bar since most people don't have 'deep' problems. And often the benefits gained aren't problem solutions as much as time savers or making things more solid.

It'd be interesting to see different opinions/experiences.
Quote:Original post by Telastyn
You'd think that your local user group would provide such discourse during meetings or presentations, but in my experience they're not too advanced.


I don't think such things exist in Iowa.
IMO you shouldn't really seek out the answer to these. If you read a lot about a subject then you will automatically learn about these kinds of things.

Almost all of these questions are answered very well in Herb Sutter's books and articles. The rest is explained in books like Applied C++, MSDN articles, threading books or Dr. Dobb's. Those are just some of the common places you'd get this kind of knowledge. Addison-Wesley has a lot of books covering the more advanced parts of C++.

I don't know which publisher is the best for .NET related books, but there are probably a lot of books out there which describe common issues. A little browsing on Amazon will probably give you a good idea of what to get.
It's been my experience that when it comes to really getting into the guts of a system, one of the best things to do is to research performance. Discussions about performance tend to dwell very heavily on architectural details, and spider outwards from there. I think it's because writing high performance code requires such an intimate understanding of everything that's going on underneath you and the consequences of any given decision.

Here's a similar list of topics for .NET. Again, incomplete, some are more important than others, several are platform dependent.

What optimizations does the C# compiler perform before generating the IL? (Hint: it's a very short list.)
What is the basic behavior of the .NET garbage collector? (Hints: compacting, generational, mark and sweep. Related: how is allocation performed and what is the typical cost of an allocation?)
Why should you avoid GC.Collect(), and when should it actually be used?
How does GC.KeepAlive() work?
Why is WeakReference useful?
What is the IDisposable pattern and why is it used? What are some of the basic rules and guidelines surrounding disposal?
When does object finalization occur?
What can cause an object to become resurrected after its finalizer, and what are the dangers of doing so?
What is GCHandle for?
What does pinning an object do, and why should it be avoided when possible?
How are anonymous delegates implemented in C#, particularly when using them as closures?
How are exceptions implemented, and how do managed and unmanaged exceptions interact?
How can you examine the runtime assembly (e.g. the actual x86 code, not the CIL) that is run by your program?
What are some of the security problems associated with using System.String, and how does SecureString address these issues?
What is boxing, and why is autoboxing useful?
What are the relative costs of various reflection and RTTI features? (Type checks and comparisons, casts, .Invoke, etc)
What does ngen do and why is it useful?
How does the .NET Compact Framework differ from the full one?
How do generics work? (Related: How do they differ from C++ templates?)
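On that last question, one contrast can be sketched from the C++ side. C++ templates are expanded by the compiler for each set of type arguments and can be explicitly or partially specialized, whereas C# generics are instantiated by the runtime and do not allow specialization. The sketch below assumes a C++11 compiler; `TypeName` and `describe` are names made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Primary template: used for any type without a more specific match.
template <typename T>
struct TypeName {
    static std::string get() { return "unknown"; }
};

// Explicit (full) specialization for int.
template <>
struct TypeName<int> {
    static std::string get() { return "int"; }
};

// Partial specialization for std::vector<T> with the default allocator.
template <typename T>
struct TypeName<std::vector<T>> {
    static std::string get() { return "vector of " + TypeName<T>::get(); }
};

// The compiler generates a separate describe<T> for each T that is used.
template <typename T>
std::string describe(const T&) {
    return TypeName<T>::get();
}
```

Here `describe(42)` yields "int", `describe(3.14)` falls back to "unknown", and `describe(std::vector<int>{})` yields "vector of int". The choice between them is made entirely at compile time.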

That just begins to scratch the surface (as does the C++ list) but I suppose it's a good start. If you want, here is some reading material.
SlimDX | Ventspace Blog | Twitter | Diverse teams make better games. I am currently hiring capable C++ engine developers in Baltimore, MD.
One "reading" nobody mentions is a CS course.

It may seem too abstract, but almost each and every question is covered in depth in any respectable CS course (not game development schools). Even better, they provide theory that is applicable regardless of language.

While some technical aspects are language specific, concurrency, assembly, general computational performance, and optimization techniques are all covered. And they cover the past, present and future for every language.

I mention this because of the following:
Quote:I'm not asking specifically about what these things are, but how to identify these types of things regardless of what language / environment you're working in.


Technologies will change in the next 6 months. IO completion ports may be all the rage right now, but they are just an implementation to address one aspect of server scalability.

If you want to know how to find answers to *any* question, then the only reference that will prepare you for that is solid education.

Language knowledge is important. But it will always be just a subset.

But perhaps the most important issue here is that if you answer only these questions, you will classify yourself as a coder and nothing more. This may not seem like a big deal, but as soon as technology changes, you may quickly find yourself obsoleted and replaced while the company pursues the next set of buzz-words.

This is why it's vital to understand the broader picture, not only from a language perspective, but from the theoretical point of view.

C# may implement garbage collection, but so do most new languages. Interpreted code is used extensively, and scripting languages are more viable than ever.

But with all of them, the same issues exist. Knowing every nuance of how C# implements finalizers compared to how Java does them is much less useful than understanding the design and concepts behind garbage collection strategies.

So if you're really looking to further yourself, steer away from languages, but consider the theoretical foundation on which they're built.

It may seem counter-productive, but the stronger your foundations are, the easier it is to switch between languages. For a versed engineer, just saying "language is garbage collected" is enough. They'll know everything else already, and look up the language syntax in a reference manual.
Quote:Original post by Antheus
One "reading" nobody mentions is a CS course.

It may seem too abstract, but almost each and every question is covered in depth in any respectable CS course (not game development schools). Even better, they provide theory that is applicable regardless of language.

While some technical aspects are language specific, concurrency, assembly, general computational performance, and optimization techniques are all covered. And they cover the past, present and future for every language.


Things must be quite different in Austria. My admittedly limited experience with the CS courses in the US (well, Iowa anyway) is that they teach very little other than the syntax of a language. They never touch on anything in Promit's lists, if they even cover C++ or C# at all (a lot of Java going on currently).

Quote:If you want to know how to find answers to *any* question, then the only reference that will prepare you for that is solid education.


I think you missed the point. I'm really after the questions. I can find answers if I know what I'm looking for, but simply searching for "a more complete understanding of the environment I'm working in" isn't enough to generate relevant Google hits.

Quote:But with all of them, the same issues exist. Knowing every nuance of how C# implements finalizers compared to how Java does them is much less useful than understanding the design and concepts behind garbage collection strategies.


Not quite. A general understanding of how garbage collection works will not help you avoid potential pitfalls when dealing with garbage collection within a certain language. Those pitfalls could have a huge performance impact on the software you write.

Quote:So if you're really looking to further yourself, steer away from languages, but consider the theoretical foundation on which they're built.

It may seem counter-productive, but the stronger your foundations are, the easier it is to switch between languages. For a versed engineer, just saying "language is garbage collected" is enough. They'll know everything else already, and look up the language syntax in a reference manual.


Understanding the theory behind something is a poor substitute for actually understanding how it really works in a production environment. You don't write software with theory, and writing good code requires a hell of a lot more than just knowing the syntax.
It's true that the more fundamental knowledge is more important, particularly in the long run. But that doesn't mean that these sorts of architectural details can just be glossed over. Maybe IOCP is a fad, but it's necessary right now in many types of applications, so you need to know it regardless.

Knowing the fundamentals and the generalities of things over the transient details is all well and good, but it's the transient details that are most relevant to the practical task of actually writing software.
This is a hard problem. It is very difficult to expand one's breadth of knowledge without external help. It's something I've ranted about more than once.

The best thing I can suggest, at this point in my own programming career, is to do your best to surround yourself with people who know more than you do. This is the most reliable way to consistently expand the breadth of subjects you're aware of - i.e. to learn what questions you need to ask.

Unfortunately, that method gets exponentially more difficult and expensive as you learn, but such is life.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Quote:Original post by tstrimp
Quote:Original post by Antheus
One "reading" nobody mentions is a CS course.

It may seem too abstract, but almost each and every question is covered in depth in any respectable CS course (not game development schools). Even better, they provide theory that is applicable regardless of language.

While some technical aspects are language specific, concurrency, assembly, general computational performance, and optimization techniques are all covered. And they cover the past, present and future for every language.


Things must be quite different in Austria. My admittedly limited experience with the CS courses in the US (well, Iowa anyway) is that they teach very little other than the syntax of a language. They never touch on anything in Promit's lists, if they even cover C++ or C# at all (a lot of Java going on currently).



Well, perhaps there is a difference, maybe in courses, maybe in type of course.

First year covered the language essentials: procedural and OO languages, Prolog, Lisp, and the like. C and C++ were just two of the languages. After that, you were expected to know them; all courses from then on assumed you did. Nobody cared about syntax since that doesn't really matter. The hardware side covered all the important assemblers of the past 30 years and the ability to code in them. In addition, it involved hardware circuit design with protoboards, extending into actual microcontroller programming (one of the Motorola chips, I think).

Second year was then pure theory, mathematics, physics, algebra, number theory, etc.

From third year on all courses were specialized: databases, simulation, systems, graphics, modeling and management. And from there on specializations were also offered for select topics to follow through to graduation.

This is what I assumed would be the content of a CS-oriented course. The syntax is not taught since that's just a matter of looking it up.

I don't have experience with US universities or curriculums, but I assumed they were similar. The only somewhat general reference I do have is the MIT OCW, which I consider to offer good coverage of topics I'd consider essential.

Quote:Understanding the theory behind something is a poor substitute for actually understanding how it really works in a production environment. You don't write software with theory, and writing good code requires a hell of a lot more than just knowing the syntax.


Theory isn't syntax.

My point was, if you know theory, learning implementations (syntax) is trivial. The other way around is not. And yes, experience is still invaluable.

Real in-depth knowledge of a language will always be tied to both theory and history. Almost all languages today are built on the foundations of older ones, and much of their evolution is based on paradigms that were known before but either weren't viable or weren't necessary.

So perhaps simpler advice for expanding your general understanding of today's languages is to learn old ones. Not that much has changed in the last 20-30 years. Some things just became more viable.


