
Avoiding Global Variables


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

71 replies to this topic

#61 Potatoman   Members   -  Reputation: 108


Posted 24 March 2012 - 11:44 AM

Management examples aren't made up; they're a simple fact of survival. Companies that rely on top-down management either died, exist due to government protection, or will die very soon. The world no longer tolerates single points of failure.

I'm sure this example applies to certain companies and code bases. Not all code needs to be infinitely scalable; indeed, in some cases you know full well the code will be largely discarded in a few years' time. For the same reason you may decide your code only needs to run on a single thread, use of globals may well be suitable (note the term 'suitable', rather than 'appropriate').

Yes, sometimes they're abused, and in some cases promote poor design. Don't think you're making your programmers 'better' by restricting their toolset, though. The thought process of "if I eliminate bad practice X, I'll get better-quality code" is not really a great way to get the changes you would like. The result could well be even more problematic programming habits springing up.


#62 Telastyn   Crossbones+   -  Reputation: 3726


Posted 24 March 2012 - 12:12 PM

I see largely the reverse, though certainly it depends on the situation. In my experience, if you were wrong, you just wasted a potentially nontrivial amount of time on developing and supporting infrastructure you will never need.


What infrastructure? Passing a variable along isn't infrastructure, and if it takes a non-trivial amount of time then you're either doing it wrong or working on some throwaway code. Seriously; it takes no effort at all to do it right in the first place.

Can you give an example of a situation where the rework is more significant than simply changing the singleton to be passed in, that is not the result of poor design in general (as in, it's a painful rework because of the general code structure, not the use of a singleton)?


The rework is always more significant. Not only do you have to find where the singleton is referenced, you need to track down everything that references those bits of code, and so on. For example, Image::Render uses some logging singleton. To pass the logger into that method, you don't just need to make it available to the class, but to the module that owns the class. If you had the references there to begin with, you would perhaps have realized sooner that things were touching bits they shouldn't, or designed things more cleanly. The very existence of a singleton promotes worse design.

What I mean is if we're talking about singleton vs passing in the argument, I see no difference between implementing it now or later. Why is it more painful later? At a coarse level, I imagine the 'pain' of implementing it later is proportional to the amount of time saved by making this simplifying assumption in the first place.


Write more code then. It's more painful later because you need to touch everything that uses the singleton as well as cascading up through everything that uses the classes that use the singleton. And you can't just replace the references either. You need to make sure that assumptions about single instance and/or global access are fixed. This is painful.


making that sort of stuff global prevents doing that effectively.

It doesn't prevent it at all: instead of passing the parameter in, you set the global to that value. It might not be as 'clean' or as intuitive to the reader as you'd like, but I think 'prevents' is a bit strong.

void TestMyContainer()
{
  std::unique_ptr<SharedData> data(new SharedData());
  MyCustomContainer c(data.get());
  c.DoSomething();
}

versus

void TestMyContainer()
{
  std::unique_ptr<SharedData> data(new SharedData());
  MyCustomContainer::setSharedInstance(data.get());

  MyCustomContainer c;
  c.DoSomething();

  MyCustomContainer::setSharedInstance(nullptr);
}

Now again with the disclaimers: I'm not recommending a container class that uses a global shared instance; that's not the point. I'm merely illustrating that it doesn't prevent you from unit testing. Please can we refrain from asking why you would actually have a container taking a shared instance in the manner demonstrated above.


You realise that pretty much every unit testing framework in existence runs tests in parallel, right?

#63 Potatoman   Members   -  Reputation: 108


Posted 24 March 2012 - 09:14 PM


I see largely the reverse, though certainly it depends on the situation. In my experience, if you were wrong, you just wasted a potentially nontrivial amount of time on developing and supporting infrastructure you will never need.


What infrastructure? Passing a variable along isn't infrastructure, and if it takes a non-trivial amount of time then you're either doing it wrong or working on some throwaway code. Seriously; it takes no effort at all to do it right in the first place.


I had a specific example in mind, and I guess there are two facets to the issue. One is working with existing code, and to be fair I should probably always clarify when I'm referring to modifying an existing code base. Certainly when building from scratch it's a lesser issue — even so, logging is a prime example. For the same reasons you assert removing a singleton is painful, when I need to add some logging, I may not be able to add it in trivially because there is no log handle readily available. I might not even be sold on the idea that the logging needs to stay permanently in the code; perhaps I just want to do a quick test? If there are two modules between where I am and where I need to get the logger from, I might instead opt for a less optimal solution, maybe using conditional breakpoints. Some will question 'why do you need to log there'. Seriously? I'm constrained in what I can log, and where? Logging is a design issue? Perhaps on some projects, but not the ones I've worked on.

To give an example when working with existing code, the example was provided in another thread. I have a code base with hundreds, if not thousands, of load methods that already exist, taking an XML node in load calls. I want to add template support, to eliminate duplication within the XML source. This is a somewhat trivial change using a global to store the template database, and nontrivial (in terms of time to implement) if I need to pass a template database to every single load call, along with everything that calls load. Now a template database using GUIDs can be considered global in every practical sense, and I'm finding it tremendously difficult to justify spending days instead of hours to inject the database lookup I need. This is a real-world scenario, and a very real potential outcome is that this support will never be added, if the cost-benefit analysis doesn't add up.

The rework is always more significant. Not only do you have to find where the singleton is referenced, you need to track down everything that references those bits of code, and so on. For example, Image::Render uses some logging singleton. To pass the logger into that method, you don't just need to make it available to the class, but to the module that owns the class. If you had the references there to begin with, you would perhaps have realized sooner that things were touching bits they shouldn't, or designed things more cleanly. The very existence of a singleton promotes worse design.


No matter what the issue, you should know I always get suspicious when someone talks in absolutes: 'always', 'never'. There most certainly exist trivial cases where the rework is not more significant, and I suspect this may well be true in more complex frameworks too.

Not only do you have to find where the singleton is referenced, you need to track down everything that references those bits of code and so on.


Nothing you, or someone, wouldn't have had to do in the first place. Refer back to my comment on logging.

You realise that pretty much every unit testing framework in existence runs tests in parallel, right?


This is an artificial block: turn off threading for unit tests that do not support threading and make them sequential. Though I'm glad you raised this point, as it could be what many people have in mind when they talk about the difficulties of unit testing code that uses global shared data.

I can't help but feel people are missing the point I'm trying to make, though. I'm not advocating that people use global shared data. Conversely, I've seen projects run quite happily using global shared data. For every project that had serious complications through use of singletons, there are 'n' projects for which it was a non-issue; it really depends on how liberally they were used. Where your project sits in that spectrum, I don't know, and I'm certainly not going to just assume (even demand) that you do things one way, and make the claim that you will regret it later if you don't.

#64 Hodgman   Moderators   -  Reputation: 28780


Posted 24 March 2012 - 09:57 PM


The arguments for putting up with singletons because "I only need 1 instance right now" seems like a decent appeal to YAGNI, which I'd normally agree with... but only if you're writing "throw away" code, where you don't care at all about quality, and only if it's actually quicker and easier than writing quality code (which it's not).

I would argue that fewer lines of code is quicker and easier to write, and globals use demonstrably fewer lines of code. I'm not saying to use that metric to decide what approach to take, but it usually is quicker and easier - I feel it devalues your argument to suggest otherwise.

As a beginner, or on small projects, or projects where code doesn't have a long life-span, I'd agree that it can be easier to use globals, and to hard-code values... but in general, no I don't agree that it's easier or quicker to use a global instead of writing good, re-usable code.

To use the previous example of a texture factory requiring a global ID generator, three approaches are listed below. They're all equal effort to write initially, but provide different guarantees to the user.
The first implies that it's valid for different textures to be labelled by different IdGenerators. The second implies that textures should be labelled by the same IdGenerator. The 3rd is the same as the second, but the dependency on an IdGenerator is hidden, and the particular object instance is hard-coded.
In a large-scale project, this makes the 3rd one much harder to reason about (which ultimately means more maintenance time, slower long-term development) -- you're obfuscating a dependency, and hard-coding a pointer value that does not need to be hard-coded. Both of these things are flaws that can and do exist in shipping code, but software engineering exists as a science so that we can learn that they are flaws, and do have consequences.
class TextureFactory
{
public:
  Texture* Create( int foo, IdGenerator& id )
  {//all calls to Create make use of a user-supplied IdGenerator, which may differ with each call.
	return new Texture( foo, id.Generate() );
  }
};

class TextureFactory
{
public:
  TextureFactory( IdGenerator& id ) : id(id) {}
  IdGenerator& id;
  Texture* Create( int foo )
  {//all calls to Create make use of a user-supplied IdGenerator, provided when creating the TextureFactory and fixed to its lifetime.
	return new Texture( foo, id.Generate() );
  }
};

class TextureFactory
{
public:
  Texture* Create( int foo )
  {//all calls to Create make use of a hidden singular IdGenerator instance with global lifetime.
	extern IdGenerator g_id;
	return new Texture( foo, g_id.Generate() );
  }
};
I can also see that some would see the 3rd to be easier to use, because there's one less detail for the user to worry about, but IMO hiding that detail actually makes it harder to use, because as above, you've hard-coded values and obfuscated operations.

#65 Telastyn   Crossbones+   -  Reputation: 3726


Posted 25 March 2012 - 07:45 AM

One is working with existing code, and to be fair I should probably always clarify when I'm referring to modifying an existing code base.


Yes, many of my comments above refer to doing work in the design phase, which is generally where these decisions come up. When doing maintenance work there are plenty more things to consider (as you point out).

Certainly when building from scratch it's a lesser issue - even so, logging is a prime example.


Design should be done for the application, not for the bug at hand. If you're working with existing code that makes it difficult, then maybe you need to take a step back and see what needs to be done in general, instead of in that specific instance. When all you're concerned about is this bug (and you're likely under time pressure because of it), you don't make good decisions for the code base (and that's likely what got it to that point in the first place).

Regardless, globals need not be singletons.

I have a code base with hundreds if not thousands of load methods that already exist, taking an XML node in load calls.


Seems like you have a bigger problem than XML duplication if there are that many touchpoints.

This is an artificial block, turn off threading for unit tests that do not support threading and make it sequential.


Sure, but consider my real-world example for this: We have unit tests automatically run on check-in. You break the tests, you broke the build, so your code doesn't go in. Threading the unit tests is the difference between check-ins taking 10 minutes (which is arguably still too long) and over an hour. We could certainly remove the automatic test running, but that quickly leads to piles of broken tests (and by extension broken code).

#66 Potatoman   Members   -  Reputation: 108


Posted 26 March 2012 - 08:30 AM


Certainly when building from scratch it's a lesser issue - even so, logging is a prime example.


Design should be done for the application, not for the bug at hand.


It just strikes me as designing for the sake of design, disregarding practical considerations. Many projects simply want a log method that takes a string, nothing more. Forcing every method that might log data to accept a logger has a very real potential to be an unnecessary burden. Hodgman made the observation that globals are design flaws. I can accept that, but sometimes a simple cout is all you want, and it serves its purpose just fine. I get the distinct impression some people will drop features before accepting 'flawed code' into their code base, even if in practical terms there is zero risk associated with that code (tag the code with a low-priority 'TODO', if you must). I think that's a shame — extremes at either end of the spectrum have the potential to seriously damage a company.

Seems like you have a bigger problem than xml duplication if there's that many touchpoints.


This is the second time I've heard a comment along these lines, which makes me wonder if I'm missing something. If you have 'n' classes to load (each with its own unique data layout, being fundamentally different objects), where n is a large number, how exactly does one reduce the number of load calls below n? There is no duplication in code; it's just that each class needs the option to be loaded from XML, in a largely data-driven design.

Sure, but consider my real-world example for this: We have unit tests automatically run on check-in. You break the tests, you broke the build, so your code doesn't go in. Threading the unit tests is the difference between check-ins taking 10 minutes (which is arguably still too long) and over an hour. We could certainly remove the automatic test running, but that quickly leads to piles of broken tests (and by extension broken code).


Fair enough, it's going to depend on your project requirements where unit test speed is a practical constraint that needs to be taken into consideration. Some projects would be happy just getting an incremental compile in under one hour.

#67 Telastyn   Crossbones+   -  Reputation: 3726


Posted 26 March 2012 - 09:15 AM

If you have 'n' classes to load (with their own unique data layout, being fundamentally different objects), where n is a large number - how exactly does one reduce the number of load calls below n?


There's a difference between the number of load calls and the number of load call sites.

I gathered that the latter is what you were talking about. This runs into problems because that's a whole bunch of things to refactor (when the refactoring changes what data the method needs to know about).

#68 Slavik81   Members   -  Reputation: 360


Posted 26 March 2012 - 12:10 PM

You realise that pretty much every unit testing framework in existence runs tests in parallel, right?

I've actually never seen a unit testing framework that runs tests in parallel within the same executable. It seems rather unnecessary as the execution of unit tests has never been a significant portion of the build process for me. Compiling and linking tests takes much longer than executing them.

#69 Telastyn   Crossbones+   -  Reputation: 3726


Posted 26 March 2012 - 01:07 PM


You realise that pretty much every unit testing framework in existence runs tests in parallel, right?

I've actually never seen a unit testing framework that runs tests in parallel within the same executable. It seems rather unnecessary as the execution of unit tests has never been a significant portion of the build process for me. Compiling and linking tests takes much longer than executing them.


In C#? I can believe that in C++. The product I'm working on (239k lines of C#) currently builds in 4m52s; the unit tests (1699 of them) take 4m29s (based on my last commit).

#70 davepermen   Members   -  Reputation: 1007


Posted 27 March 2012 - 05:19 AM


I think the moral of the story is "making assumptions, especially implicit assumptions, is really stupid."

Or something to that effect.


I'm seeing assumptions from both sides: one that you will need more than one of these objects in future, and the other a simplifying assumption that chances are you won't.

What of the simplifying assumption that this class doesn't support multithreading? Or do I have to support multithreading for all classes because, if I scale the system up, I will likely need it one day? I just don't understand why everything has to be a black-and-white, all-or-nothing proposition.

Working with the CryEngine SDK recently, I had a bit of a laugh seeing all the globals and #defines in there. But you know what, the world didn't end; I was able to get in there and update the code without a lot of hassle. Sure, it's not ideal, but answer me this: would CryEngine be a better product had it been developed using software design best practices (under the same budget/time constraints)? In my estimation, they wouldn't have a product today had they attempted to take that path.


If you work without globals and shared locals and stuff, you don't have to prepare your classes to work with multithreading. They will just work. Obviously, you have to write the multithreading stuff correctly, but you don't have to change anything in the class, or know any implementation details that you then have to work around.

That's kinda the point: if you write code without globals and singletons and that crap, the code is simpler to write, and as a bonus, much simpler to use in future, unpredicted scenarios.

Just stop doing it and you'll notice.


#71 Slavik81   Members   -  Reputation: 360


Posted 30 March 2012 - 11:29 AM

In C#? I can believe that in C++. The product I'm working on (239k lines C#) currently builds in 4m52s, unit tests (1699) take 4m29s (based on my last commit)

Ah. In C++, I'd expect a build of that sort to take maybe 5 to 10 times that long. Your perspective suddenly makes a lot more sense to me.

#72 Telastyn   Crossbones+   -  Reputation: 3726


Posted 30 March 2012 - 11:38 AM

In C#? I can believe that in C++. The product I'm working on (239k lines C#) currently builds in 4m52s, unit tests (1699) take 4m29s (based on my last commit)

Ah. In C++, I'd expect a build of that sort to take maybe 5 to 10 times that long. Your perspective suddenly makes a lot more sense to me.


*nod* I had assumed as much, but should've spelled it out when I made my point.

