What is a Thread of Execution? (C#)

10 comments, last by Calcious 6 years, 9 months ago

So I was going through Microsoft Virtual Academy C# Fundamental by Bob Tabor and during a video, he said something about the thread of execution leaving the scope and reducing the reference count. What is a "thread of execution" and how can it leave the scope? I also don't understand how that reduces the reference count of an object, I mean "thread of execution" doesn't sound like something related to an instance of a class to me.

The video is here for further information: https://mva.microsoft.com/en-US/training-courses/c-fundamentals-for-absolute-beginners-16169?l=ymM4awQIC_5206218949


A thread of execution is the idea that statements are executed one after another: when one statement is done, the next one is executed, and so on. In a normal application there is only one such thread (only one statement is executed at any time), but you can create more threads if you need them (not encouraged for beginners, as it's easy to make a complete mess with them that is hard to untangle).

When a thread of execution starts computing a method, it jumps to the first statement in that method, then executes the second statement, and so on, until the last statement. Then it's done, and it returns to the point where the method was called. For example:


class X {
    public int a;
}

class C {
    public void f() {
      X x;

      x = new X(); // Do first statement, make a new X
      x.a = 3; // Assign value to "a" field in x.

      // What happens to the value of "x" here?
    }
}

C c = new C();
c.f();  // Call f method

It's a bit fantasy code, but I have a class C, with method f(). At the bottom, I make a new C object, and call c.f(). The thread jumps to the "x = new X()" statement, and executes it. Then it executes "x.a = 3".

At that point, the method is done, all statements have been executed. So the thread returns to the "c.f()" call, to look for the next statement (which I didn't write in the example).
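To see that order for yourself, here is a minimal sketch that records each step as the thread passes it. The Trace list and the ThreadOfExecutionDemo class are names made up for this illustration; they are not part of the example above or of the video.

```csharp
using System;
using System.Collections.Generic;

class X
{
    public int a;
}

class C
{
    // Hypothetical helper: records each step so the order can be inspected.
    public static List<string> Trace = new List<string>();

    public void f()
    {
        Trace.Add("enter f");
        X x = new X();   // first statement inside f
        x.a = 3;         // second statement inside f
        Trace.Add("leave f");
    }
}

static class ThreadOfExecutionDemo
{
    public static List<string> Run()
    {
        C.Trace.Clear();
        C.Trace.Add("before call");
        C c = new C();
        c.f();                       // the thread jumps into f here...
        C.Trace.Add("after call");   // ...and continues here once f is done
        return C.Trace;
    }

    static void Main()
    {
        // Prints: before call, enter f, leave f, after call (one per line)
        foreach (string step in Run())
            Console.WriteLine(step);
    }
}
```

The steps come out in exactly the order the single thread visited them: nothing after the call runs until f has finished.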

A scope is text between { and the matching }. When you start executing f(), you enter that part of the code (enter a scope), and when you return you leave that part of the code (leave a scope). The { and } of the C class is also a scope, but a class as such is never executed, and the scope there only serves to define that f() is part of the C class.

However, inside "f()", the code made a new "X" object and stored it in the variable "x" inside the f method. What happens to that object? As soon as the thread leaves the "f()" method (technically, it "leaves the scope of f"), the variables inside the f method can no longer be reached. To avoid keeping the objects they referred to around forever, the system decrements the reference count of those objects at that moment.

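Since the "new X()" statement set the reference count to 1, and you didn't copy the reference to anything outside the f() method (which would have increased the count), that decrement brings the reference count of the object in "x" to 0, and the object we made inside "f" will be deleted at some time in the future. (Treat the count as a mental model: the .NET garbage collector does not keep a literal counter per object, it periodically works out which objects can still be reached. For this example the effect is the same.)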

Interesting video.

12 hours ago, Calcious said:

What is a "thread of execution" and how can it leave the scope?

Alberth covered this one fairly well.

As far as the video is concerned, they are referring to the program entering a function and then leaving it when done.  When you leave the function you leave the scope.

A program can have more than one set of things running at the same time, called multithreading. In that case each thread runs independently of the others, and multiple threads can be executing the same piece of code at the same time.  As far as the For Beginners forum is concerned, that's an advanced topic that is a source of major bugs, so stay away for now.

12 hours ago, Calcious said:

I also don't understand how that reduces the reference count of an object, I mean "thread of execution" doesn't sound like something related to an instance of a class to me.

It was probably the wrong term to use for teaching beginners.

As it relates to the video, he's talking about how many references there are to an object, and how the flow of a program's execution changes over time.  The thread of execution enters a function, does the work, then exits the function.

In the video he made an object referenced by myCar.  Then he used: myOtherCar = myCar.  Ignoring some internal details, this means there is an additional reference to the object.  Before it was just the myCar value, now it is both the myCar and the myOtherCar that reference the object.

When he changed one (myOtherCar.Model) it changed both, because they still reference the same object. He was demonstrating that there were two references to one object. 

When the code leaves the function, the reference called myOtherCar is destroyed.  Again ignoring the internal details, there is now just one reference, through myCar.  The number of references changes in the same way if you assign a different object, for example myOtherCar = differentCar or myOtherCar = null.  After any of these, myOtherCar stops referencing the object and only myCar still references it.

When there are no more references to the object C# will clean up and destroy the object.
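Here is a small sketch of the two-references situation. The Car class below is an assumption (the video's actual class isn't shown in this thread); only the names myCar, myOtherCar and Model come from the discussion above, and the model strings are made up.

```csharp
using System;

// Assumed stand-in for the Car class used in the video.
class Car
{
    public string Model = "";
}

static class SharedReferenceDemo
{
    public static string ChangeThroughSecondReference()
    {
        Car myCar = new Car();
        myCar.Model = "Cutlass";
        Car myOtherCar = myCar;      // no new object, just a second reference
        myOtherCar.Model = "98";     // change the object through the second reference
        return myCar.Model;          // the first reference sees the change
    }

    public static bool SameObjectAfterReassign()
    {
        Car myCar = new Car();
        Car myOtherCar = myCar;      // two references to one object
        myOtherCar = new Car();      // myOtherCar now points at a different object
        return ReferenceEquals(myCar, myOtherCar);
    }

    static void Main()
    {
        Console.WriteLine(ChangeThroughSecondReference()); // prints "98"
        Console.WriteLine(SameObjectAfterReassign());      // prints "False"
    }
}
```

ReferenceEquals only answers "do these two names point at the same object?", which is the question the video was demonstrating.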

 

 

Thank you, that has helped me a lot, I got one last question though. It's about how these references can reference the same object but I haven't seen other data types do that, like if I do x = y, does it mean that x and y both reference the same thing or is there something different going on?

It is a little difficult to explain while sticking in the 'For Beginners' rules about deep dives into the language.  The point is to educate, not confuse with extra detail.

Languages like C and C++ allow direct manipulations of objects but come with consequences that the code must allocate and free the resources.  Any errors in allocation or release cause severe problems that will eventually crash your program, corrupt data, or worse.   Other languages like C# and Java protect the developers from that for many types of actions by adding more control about how objects are allocated and freed.

Consider the line in the video clip:  Car myCar = new Car();

This does several things.  First, it creates a new object of type Car. It constructs the object with no parameters since there was nothing inside the parentheses.  Then it creates a name called myCar. Finally it points myCar (a reference) to the newly created object.

When the code goes on with Car myOtherCar = myCar, there is no new object created.  Instead a new name called myOtherCar is created, and that is pointed to the same object (a second reference).

When either name is pointed to something else, such as pointed to another Car object or pointed to null, then there is one less reference to the object.  When no names are pointed at the object, the system cleans up the resources automatically and destroys the object.  

In this situation, note that neither myCar nor myOtherCar were the actual object.  Both of them referenced an object that was created by new.

 

The trick is that some things are created by new, and other things are not.

Some objects are not created by new, such as primitive types.  When you type int i = 0; or float x = 0.0f; they are names that ... for lack of better simple terms ... directly contain the object. There is no call to new that creates the object, instead, the names act as values by themselves.  They're called value types in C#.

With a value type each instance is unique.  When you have x of a value type and you write x = y; you are not pointing x at something else. Instead, the value of y is copied into x, so x and y remain two separate values.

 

One of many differences between value types and reference types is touched on above, even if you didn't catch it.  In the first example with myCar and myOtherCar, since they are reference types it is possible to assign them to null. That is, you can declare that they don't reference anything any more.  However, i, x, and y are value types, so they always represent a value. You cannot assign them to null because that isn't an int or a float.

When you pass parameters to functions, the difference between reference types and value types is also important.  If you pass a value type then a copy gets made. If you modify the copy it doesn't make any difference to any other copy, each copy is distinct.  If you pass a reference type then you are passing a reference to the object, and each reference points to the same actual object. If you modify the object through one reference, all the other references see the change.
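A minimal sketch of that difference when calling functions. The Player class and both AddTen methods are hypothetical names invented for this example.

```csharp
using System;

// Hypothetical reference type for the illustration.
class Player
{
    public int Score;
}

static class ParameterPassingDemo
{
    // n is a copy of the caller's int; changing it does not affect the caller.
    static void AddTenToValue(int n)
    {
        n += 10;
    }

    // p is a copy of the reference, but it points at the caller's object.
    static void AddTenToPlayer(Player p)
    {
        p.Score += 10;
    }

    public static int ValueAfterCall()
    {
        int score = 5;
        AddTenToValue(score);
        return score;          // still 5
    }

    public static int ObjectAfterCall()
    {
        Player player = new Player();
        player.Score = 5;
        AddTenToPlayer(player);
        return player.Score;   // now 15
    }

    static void Main()
    {
        Console.WriteLine(ValueAfterCall());   // prints 5
        Console.WriteLine(ObjectAfterCall());  // prints 15
    }
}
```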

As code becomes larger and more complex, the difference between value types and reference types becomes extremely important.  There are concepts like "boxing" and "unboxing", where you take a value type and turn it into a reference type like a box, and you can extract them from the box into a value type.  
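For the curious, boxing and unboxing look roughly like this. It is only a sketch to show the idea; BoxingDemo is a made-up name.

```csharp
using System;

static class BoxingDemo
{
    public static int Run()
    {
        int i = 42;
        object boxed = i;          // boxing: the value is copied into an object
        i = 7;                     // changing i does not touch the boxed copy
        int unboxed = (int)boxed;  // unboxing: the value is copied back out
        return unboxed;
    }

    static void Main()
    {
        Console.WriteLine(Run());  // prints 42
    }
}
```

The box holds its own copy of the value, which is why the later change to i is not visible through it.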

Does that answer the question without diving too deeply into the language details?

Huh, I just realized that I wasn't getting notifications at all.

Now that I'm thinking about it, if an object gets dumped after leaving a function, then how do stuff like save games, etc work? Even if you copy the reference to the main method, it's all going to be dumped after you exit the program right? I may be getting too complex here but I just find it hard to go through any tutorial without knowing basically everything. Second question - When you were talking about primitive types and you said they were a value type, does it mean that a primitive is also a value type? Lastly, what do you mean by passing parameters and having the value type get copied? If my understanding of what "passing" means is correct then wouldn't that mean that I'm just using its value to evaluate something with that method? Why would I need to create a copy of it unless I was passing it to a method to create a second, altered version of it? If I just used it to get a return value for a new variable then I didn't modify the value type that I passed in at all, I just used it to get a result for something else so why would there be a copy?

 

edit: I guess it wouldn't matter if a copy of the value type is made, garbage collection dumps the new value too, unless I store it in a new variable right?

 

edit 2: Curiosity got the best of me and I decided to google value type, I ended up learning a very bare bones definition of what a deep copy is. When you said that the names are the value themselves, did you mean it like it was a deep copy? I may be reading what "deep copy" means wrongly but the names being the value themselves didn't sound "right" to me.

You can easily test this reference stuff yourself:


int i;
int j;

i = 3;
j = i;

Console.WriteLine(i + " " + j);  // print i and j to confirm their values

j = 4;

Console.WriteLine(i + " " + j);  // print i and j to check their values
//
// If i here also changed, j and i are referencing the same object, else they are different objects

 

54 minutes ago, Calcious said:

edit: Actually, now that I'm thinking, if an object gets dumped after leaving a function, then how do stuff like save games, etc work? Even if you copy the reference to the main method, it's all going to be dumped after you exit the program right?

Yes. In fact, if you turn off the computer, literally everything in memory is gone (note that hibernating isn't really switching off here, I'll explain how that works a bit further).

You are fully correct, data in memory is very volatile, it lives only as long as the program lives (and perhaps a lot shorter, depending on what references the data). One obvious solution is to never turn the computer off, but that's a bit impractical :P

That wasn't always the case though, at the very beginning we had magnetic drums https://en.wikipedia.org/wiki/Drum_memory

Computers didn't have memory chips, that drum WAS the memory that was used for running the program. Can you imagine, having a rotating drum next to your computer, literally seeing the computer reading and writing memory? I wasn't born at the time, but wow, talk about physical computing :D

Obviously, this was slow as hell, and people invented computer memory chips https://en.wikipedia.org/wiki/Semiconductor_memory

Much faster access, but it had the problem that you found. When you turn off the power, all its content is gone. This caused all kinds of interesting startup problems, i.e. after power-on, the computer needs instructions to start, but the memory is empty.

Eventually, this led to creation of ROM (read-only memory) https://en.wikipedia.org/wiki/Read-only_memory

 

While it solved the start-up problem, it's read-only, i.e. you cannot write your data to it. That was another journey that started:

Smart people invented offline storage systems. These systems can store new data without needing power to preserve it. At a later date, you can read that data from the storage again, and you can continue working with it.

We had lots of different systems there, it mostly started with paper tape https://en.wikipedia.org/wiki/Paper_tape

(the holes in the paper contain the information, so yep, you can retrieve that many years later, until the paper crumbles or until it is burnt.)

Long strips of paper weren't very convenient, so they switched to punch cards: https://en.wikipedia.org/wiki/Punched_card_input/output

Problem here was if you dropped a stack of cards on the floor, you had a great time putting them all back in the right order :)

 

Unfortunately, I never worked with the above systems, I would love to do that once, to experience how that worked.

 

Then people discovered you could use magnetic tape instead, so audio cassettes were used (you could store music on them, so surely you can store digital information on them) https://en.wikipedia.org/wiki/Compact_Cassette

Professional systems used dedicated tape systems for that https://en.wikipedia.org/wiki/Magnetic_tape_data_storage

Even today, there are tape robots in data storage centers, where you have a bunch of tape readers / writers, a huge number of tapes, and a robot swapping the tapes in the readers. https://en.wikipedia.org/wiki/Tape_library

The reason is that tape storage is extremely cheap and reliable, so if you have a few terabytes (or more), it's a good option.

Closer to home, tapes have the problem that if some data is at the other end of the tape, you have to literally rewind the entire tape, which could easily take a minute or more. So people came up with the idea to put the tape on a circular flat surface, with a moving head, you could just move the head a few inches left or right, that'd be much quicker right? It was, and we got floppy disks https://en.wikipedia.org/wiki/Floppy_disk

Worked great, much better than tape. Of course, people are never happy, and wanted more speed, so they made the disk much less 'floppy', and put everything in a clean environment, so the head could hover above the surface at a distance of less than the thickness of a human hair. That dramatically increased the speed, and we got hard disks https://en.wikipedia.org/wiki/Hard_disk_drive

Nowadays, the head movement is a limiting factor, as is the fact that a disk rotates at a finite speed. Worst case, you need to wait a full rotation of the disk before you can read the data. That is being solved with SSD disks https://en.wikipedia.org/wiki/Solid-state_drive

No mechanical parts any more, so way faster. Also, it doesn't need power to move the head or spin the disk, which is great news for battery lifetime.

 

After this fun but totally unneeded trip through history, back to your question.

The solution to avoid losing data is to store it on a storage system, commonly known as "the disk" or "hard disk". In Windows, it's called "C:\" or "D:\" or so. The basic procedure is to open a file on it for writing, write the data you want to preserve, and then close the file. The next time you want that data, you do the reverse. You open the file for reading, read the data, and close the file again.

All this is explained in much more detail in a chapter called "file input / output" in your tutorial.
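As a rough sketch of that open / write / close procedure, assuming a save game is nothing more than a level number and a score. The file name, the SaveGameDemo class and the one-number-per-line format are all made up for this example; a real game would store much more.

```csharp
using System;
using System.IO;

static class SaveGameDemo
{
    public static void Save(string path, int level, int score)
    {
        // Open the file for writing; the using block closes it at the end.
        using (StreamWriter writer = new StreamWriter(path))
        {
            writer.WriteLine(level);
            writer.WriteLine(score);
        }
    }

    public static int[] Load(string path)
    {
        // Open the file for reading; the using block closes it at the end.
        using (StreamReader reader = new StreamReader(path))
        {
            int level = int.Parse(reader.ReadLine() ?? "0");
            int score = int.Parse(reader.ReadLine() ?? "0");
            return new int[] { level, score };
        }
    }

    static void Main()
    {
        // Hypothetical location: a file in the system's temp folder.
        string path = Path.Combine(Path.GetTempPath(), "savegame-demo.txt");
        Save(path, 3, 1200);
        int[] loaded = Load(path);
        Console.WriteLine(loaded[0] + " " + loaded[1]); // prints "3 1200"
        File.Delete(path);
    }
}
```

The objects holding the level and score in memory are gone once the program exits, but the file stays on disk, so a later run can call Load and carry on.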

 

Finally, I promised to discuss "hibernating". If you start a computer from scratch (a cold start), it takes a long time to get to the point that you can start working with it. People don't want to wait that long, they want to open their laptop, and immediately start working. However, you can't turn off the computer without losing all its data in memory, so what to do?

The trick they use is as simple as it is elegant. They simply save the entire memory in a big file. Modern hard disks are way bigger than memory space, so this is not a problem at all. They called it "hibernating", but it's just "save the entire memory, and turn yourself off". When you open the computer again, it detects that saved file. Instead of starting from scratch, it just loads the entire file back in memory, et voila, back up and running in a few seconds!

 

22 hours ago, Calcious said:

how do stuff like save games, etc work?

Covered by Alberth, probably in more depth and history than you wanted.

22 hours ago, Calcious said:

Second question - When you were talking about primitive types and you said they were a value type, does it mean that a primitive is also a value type?

 

You ask nuanced questions. 

In C# the primitive types are direct number values.  They are: Boolean (a true/false flag), signed and unsigned 8-bit bytes, signed and unsigned integers of 16, 32, and 64 bits, single and double precision floating point numbers, and a character type.

C# value types include all the primitive types, C# structures, and enumerations. 

For each of them, when you create one of them it is a value type. When you write code like int x; or char c; or float f; it creates a new instance.  When you assign another value (int y = x) it also creates a new instance, because they are all value types.  The same is true for enumerations and structures since they are value types. If you create a structure and assign it by new, a new instance is created and the values are copied.

 

 

Beyond that question...

This is an area where C# is somewhat different from many other languages.  A structure in C# is different from a class in C#, since a class is a reference type and a struct is a value type.  It is unfortunately a semi-common source of bugs.  You can use a structure (value type) the same way you use a class (reference type). You can use new to invoke a structure's constructor, but even though it then looks like a class, which is a reference type, it is still a struct and therefore a value type.

Primitive types are obviously value types. Others like System.Drawing.Point are a structure and therefore a value type, even if you see code like: Point p = new Point(x,y);  That code calls a constructor on the value but does not create a new object with a reference to it.
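To make the struct versus class difference concrete, here is a sketch with two hand-written point types. PointStruct and PointClass are hypothetical names used only here, so the example doesn't depend on System.Drawing.

```csharp
using System;

// Hypothetical value type.
struct PointStruct
{
    public int X;
    public int Y;
    public PointStruct(int x, int y) { X = x; Y = y; }
}

// Hypothetical reference type with the same fields.
class PointClass
{
    public int X;
    public int Y;
    public PointClass(int x, int y) { X = x; Y = y; }
}

static class StructVsClassDemo
{
    public static int StructAssignment()
    {
        PointStruct a = new PointStruct(1, 2);  // "new" here does not create a reference
        PointStruct b = a;                      // b is a separate copy of the values
        b.X = 99;
        return a.X;                             // a is untouched
    }

    public static int ClassAssignment()
    {
        PointClass a = new PointClass(1, 2);
        PointClass b = a;                       // b references the same object as a
        b.X = 99;
        return a.X;                             // a sees the change
    }

    static void Main()
    {
        Console.WriteLine(StructAssignment());  // prints 1
        Console.WriteLine(ClassAssignment());   // prints 99
    }
}
```

The two types are written almost identically, which is exactly why mixing them up is a semi-common source of bugs.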

 

 

22 hours ago, Calcious said:

Lastly, what do you mean by passing parameters and having the value type get copied? If my understanding of what "passing" means is correct then wouldn't that mean that I'm just using its value to evaluate something with that method? Why would I need to create a copy of it unless I was passing it to a method to create a second, altered version of it? If I just used it to get a return value for a new variable then I didn't modify the value type that I passed in at all, I just used it to get a result for something else so why would there be a copy?

Each of those questions depends on what you need the code to do.

C# has the ref keyword for parameter passing so you can provide additional guidance about what should be done.

Passing by value makes a duplicate. Many of the primitive types are faster (or at worst, not slower than references) when passed by value. They fit in a CPU register and potentially the compiler can make the duplicate copy at zero cost.

Value types (such as the System.Drawing.Point object discussed earlier) are passed by value. A copy of the structure is made.  All structures are value types, so structures usually have a slight performance cost when passed as parameters.  For a Point structure it will probably only be a fraction of a nanosecond, but it is still a cost.  If you have a large structure or a structure that requires resource management then passing the struct by value can have a potentially high cost.

Passing by reference is good for most situations because a reference is small. Remember a reference is just a pointer to the object, it is not the object itself. When you are passing around big things (like the entire game simulation or a collection of game objects) you'll want to pass a reference and not the actual object.  You don't want to make a copy of the entire game simulation, or of the entire collection of game objects, which is what passing the object itself by value would do.

You can also pass by constant reference.  It is like regular pass by reference so you are not making a copy of the object, but you are telling the code the object should not be modified.  In C# the closest equivalent is the in parameter modifier (C# 7.2 and later), which passes a read-only reference to the variable.

 

Typically the primitive types are passed directly by value unless you need to modify them inside the function. Anything else is typically passed by reference or constant reference, and the same rule of thumb holds in most of the major programming languages.
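A short sketch of the ref keyword, plus the in modifier (C# 7.2 and later) as one way to get a read-only reference. BigStruct and the method names are invented for the example.

```csharp
using System;

// Hypothetical "large" value type; imagine many more fields.
struct BigStruct
{
    public double A, B, C, D;
}

static class RefAndInDemo
{
    // ref: n is the caller's variable itself, not a copy.
    static void DoubleInPlace(ref int n)
    {
        n += n;
    }

    // in: s is passed by reference (no copy requested) but is read-only here;
    // assigning to s or to its fields inside this method would not compile.
    static double Sum(in BigStruct s)
    {
        return s.A + s.B + s.C + s.D;
    }

    public static int RunRef()
    {
        int x = 5;
        DoubleInPlace(ref x);   // the caller must also write "ref"
        return x;               // x is now 10
    }

    public static double RunIn()
    {
        BigStruct big = new BigStruct { A = 1, B = 2, C = 3, D = 4 };
        return Sum(in big);
    }

    static void Main()
    {
        Console.WriteLine(RunRef());  // prints 10
        Console.WriteLine(RunIn());   // prints 10
    }
}
```

Without the ref keyword, DoubleInPlace would receive a copy and x would stay 5.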

Alright, I think I'm starting to understand this. I'm just going to make sure that I truly know this.


static int function(int X)
{
	return X += X;
}

static void Main()
{
	int x = 5;
	function(x);
}

 

x doesn't get changed and the duplicate gets deleted after thread of execution ends unless I do this?

x = function(x)

Correct.

The main function starts.

The x inside main is an integer value initialized to 5.

A copy of the integer value 5 is made and passed to the function named function.

The function named function receives the parameter value and gives it the name X, does the addition so that the value named X becomes 10, and returns the value 10. As the function returns, X (and every other local variable in the function) is automatically released.

Nothing captures the value returned by the function, so the value 10 is lost.

 

If you had  x = function(x) as you wrote, then the value 10 would be assigned to x when the function named function completes.   Because they are value types, conceptually the X inside the function no longer exists when the function returns.

