Many developers have mistaken ideas about what async code and multi-threading are, how they work, and when to use each. Here, you'll learn the differences between the two concepts and implement each one in C#.
ME: "Waiter, this is the first time I've been to this restaurant. Does it normally take 4 hours to get my food?"
WAITER: "Oh yes, sir. This restaurant only has one chef working in the kitchen."
ME: "....only one chef?"
WAITER: "Yes, sir, we have several chefs but only one works in the kitchen at a time."
ME: "So the 10 other people standing around in the kitchen wearing chef uniforms are just... doing nothing? Is the kitchen too small?"
WAITER: "Oh, our kitchen is huge, sir."
ME: "So why don't they all work at the same time?"
WAITER: "That does seem like a good idea, sir, but we just haven't figured out how to manage that."
ME: "Okay, weird. But... hey...where is the current chef, anyway? I don't see anyone doing anything in the kitchen right now."
WAITER: "Indeed, sir. The kitchen has run out of some supplies for one of the orders, so the chef has stopped cooking and is standing out back, waiting for the delivery."
ME: "It seems like he could keep cooking while waiting, and maybe the delivery person could just tell him when they arrive?"
WAITER: "Again a fabulous idea, sir. We even have a doorbell in the back for deliveries, but the chef likes to wait. I'll go get you some more water."
What a terrible restaurant, right? Unfortunately, this is how a lot of programs work.
There are two different ways that the restaurant could have worked a LOT better.
First, it's obvious that each individual dinner order -could- be handled by a different chef. Each one is a list of things that have to happen in a particular order (prepare the ingredients, then mix them together, then cook them, etc...). So if each chef was dedicated to handling that list of things, several dinner orders could be made in parallel.
This is an example of multi-threading in the real world. The computer has the ability to have multiple different threads running at the same time, and each thread is responsible for executing a series of activities in a specific order (A, then B, then C, then D).
Then there's async(hronous) behavior.
To be clear, async is NOT multi-threading. Remember that chef that was just waiting for the delivery? What a waste of time! While he's waiting, he's not doing anything productive, like cooking. And waiting doesn't make the delivery come any faster. Once he makes the call to order the supplies, the delivery will occur whenever it occurs, so why wait for it? Instead, the delivery person can just ring the doorbell, interrupting the chef just long enough to say, "Hey, here are your supplies!"
There are a lot of I/O activities that are handled by something outside your code. For example, consider sending a network request to a remote server. It's just like ordering supplies for the restaurant. The only thing your code does is make the call and receive the results. If you choose to wait for the results, doing absolutely nothing in-between, then that's "synchronous" behavior.
However, if you prefer to just be interrupted/notified whenever the results come back (much like the delivery person ringing the doorbell when he arrives), and you can work on other things in the meantime, then that's "asynchronous" behavior.
You can use async code whenever the work is being done by something that is out of direct control of your current code. For example, when you write a bunch of data to the hard drive, your code isn't doing the actual writing. It's just requesting that the hardware perform that task. So you can use async coding to start the writing and then be notified whenever it finishes, and continue working on other things in the meantime.
The beauty of async is that there are no additional threads required, so it's very efficient.
"But wait!" you say. "If there's no additional thread, then who or what is waiting for the result? How will the code know that the result is back?"
Remember that doorbell? Well, there's a system in your computer called an "interrupt" system and it works a little like that doorbell. When your code starts an asynchronous activity, it basically installs a little virtual doorbell. When that other task (writing to the hard drive, waiting for a network response, etc...) finishes, the interrupt system "interrupts" the currently-running code by ringing that doorbell and letting your app know that there's a delivery waiting! No threads are needed to sit around and wait!
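To make the doorbell idea a little more concrete, here is a minimal sketch rather than a picture of what the operating system really does. It assumes a hypothetical DoorbellDemo class, and it uses a TaskCompletionSource as the "doorbell" and Task.Delay as a stand-in for the delivery truck. The point is only that the awaiting code is parked and resumed later, without a thread of ours sitting blocked in the meantime.

```csharp
using System;
using System.Threading.Tasks;

public static class DoorbellDemo
{
    // Hypothetical example: places an "order" and resumes when the doorbell rings.
    public static async Task<string> OrderAndKeepCookingAsync()
    {
        // The doorbell: a promise that somebody else completes later.
        var doorbell = new TaskCompletionSource<string>();

        // Stand-in for the outside world (disk, network, delivery truck).
        // After roughly 100 ms, a timer callback "rings the bell" with the result.
        _ = Task.Delay(100).ContinueWith(t => doorbell.TrySetResult("flour"));

        Console.WriteLine("Order placed; the chef keeps cooking.");

        // No thread of ours is blocked here. The method is suspended and
        // picked up again once the doorbell task completes.
        string supplies = await doorbell.Task;
        return supplies;
    }

    public static async Task Main()
    {
        Console.WriteLine("Delivered: " + await OrderAndKeepCookingAsync());
    }
}
```

In real I/O code you rarely build the doorbell yourself; methods like ReadAsync hand you an already-wired Task.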
So let's do a quick recap of our two tools here:
Multi-threading: Using an additional thread to perform a sequence of activities/tasks.
Async: Using the same thread and the interrupt system to have some activity be done by some other component outside of your thread, and be notified when it finishes.
Save The UI Thread
There's one other important thing to know about why it's good to use these tools. In a .NET desktop app (Windows Forms or WPF, for example), there is one main thread called the UI thread, which is responsible for updating all the visual parts of your screen. By default, that's where everything is running. So when you click a button and you want to see the button push down briefly and then back up, that's the responsibility of the UI thread. And there is ONLY one UI thread in your app, which means that if your UI thread is busy doing things like heavy calculations or waiting for a network request or something, then it cannot update what you're seeing on the screen until it's finished. The result is that your app looks like it's "frozen" - you can click on a button but nothing will seem to happen because the UI thread is busy doing something else.
So ideally, you want the UI thread to be as unoccupied as possible so that your app always seems to be responding to the user's actions. That's where async and multithreading come into the picture. By using these tools, you can ensure that heavy work is being done elsewhere and the UI thread remains nice and responsive.
Now let's look at how to use these tools in C#.
Async in C#
The code for doing async stuff is very simple. There are two main keywords that you should know: "async" and "await", so people often refer to it all as just async/await. Let's say you currently had this code:
public void Loopy()
{
    var hugeFiles = new string[] {
        "Gr8Gonzos_Home_Movie_In_8k_Res.mkv", // 1 GB
        "War_And_Peace_In_150_Languages.rtf", // 1.2 GB
        "Cats_On_Catnip.mpg" // 0.9 GB
    };
    foreach (var hugeFile in hugeFiles)
    {
        ReadAHugeFile(hugeFile);
    }
    MessageBox.Show("All done!");
}
public byte[] ReadAHugeFile(string bigFile)
{
    var fileSize = new FileInfo(bigFile).Length; // Get the file size
    var allData = new byte[fileSize]; // Allocate a byte array as large as our file
    using (var fs = new System.IO.FileStream(bigFile, FileMode.Open))
    {
        fs.Read(allData, 0, (int)fileSize); // Read the entire file...
    }
    return allData; // ...and return those bytes!
}
In its current form, this all runs synchronously. So if you click a button to run Loopy() from the UI thread, then the app is going to seem to freeze until all 3 huge files have been read, because each "ReadAHugeFile" is going to take a long time to run and will be reading synchronously on the UI thread. That is no bueno! So let's see if we can make "ReadAHugeFile" async so that the UI thread can continue handling other stuff.
Whenever there are async-capable commands, Microsoft usually gives us both sync and async versions of those commands. In our above code, the System.IO.FileStream object has both "Read" and "ReadAsync" methods. So the first step is to simply change "fs.Read" to "fs.ReadAsync":
public byte[] ReadAHugeFile(string bigFile)
{
    var fileSize = new FileInfo(bigFile).Length; // Get the file size
    var allData = new byte[fileSize]; // Allocate a byte array as large as our file
    using (var fs = new System.IO.FileStream(bigFile, FileMode.Open))
    {
        fs.ReadAsync(allData, 0, (int)fileSize); // Read the entire file asynchronously...
    }
    return allData; // ...and return those bytes!
}
If you ran it now, the method would return almost immediately and the "allData" byte array would most likely have little or no data in it. Why?
It's because ReadAsync is STARTING a read and returning a Task object, which is sort of like a bookmark. It's a "promise" by .NET that once the async activity has finished (e.g. reading the data from the hard disk), it will return the result and that Task object can be used to access the results. But if we don't do anything with that Task, then the system will just immediately continue on to the next line of code, which is our "return allData" line, so it's returning an array that hasn't been filled with data yet.
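You can see this "started but not finished" state for yourself. The sketch below is only an illustration: BookmarkDemo and SlowAnswerAsync are hypothetical names, and Task.Delay stands in for the slow disk read. It checks the Task's IsCompleted property right after the call and again after awaiting it.

```csharp
using System;
using System.Threading.Tasks;

public static class BookmarkDemo
{
    // Hypothetical slow operation; Task.Delay stands in for the disk read.
    private static async Task<int> SlowAnswerAsync()
    {
        await Task.Delay(500);
        return 42;
    }

    // Reports whether the task was complete right after the call, and after the await.
    public static async Task<(bool Before, bool After)> StartThenAwaitAsync()
    {
        Task<int> pending = SlowAnswerAsync(); // starts the work and returns right away
        bool before = pending.IsCompleted;     // normally false: the work is still in flight

        int answer = await pending;            // come back here once the result is ready
        bool after = pending.IsCompleted;      // true: the promise has been kept

        Console.WriteLine($"before={before}, after={after}, answer={answer}");
        return (before, after);
    }

    public static async Task Main()
    {
        await StartThenAwaitAsync();
    }
}
```

The "before" check is exactly the situation the un-awaited ReadAsync call leaves you in: you are holding the promise, not the data.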
So it can be useful to tell our code to wait for the result (but in such a way that the original thread can continue on doing other things in the meantime). To do this, we use an "awaiter", which is as simple as adding the word "await" before our async call:
public byte[] ReadAHugeFile(string bigFile)
{
    var fileSize = new FileInfo(bigFile).Length; // Get the file size
    var allData = new byte[fileSize]; // Allocate a byte array as large as our file
    using (var fs = new System.IO.FileStream(bigFile, FileMode.Open))
    {
        await fs.ReadAsync(allData, 0, (int)fileSize); // Read the entire file asynchronously...
    }
    return allData; // ...and return those bytes!
}
Uh-oh. If you try that, you'll see an error on that line. It's because .NET needs to know that the method is going to be async and it's going to EVENTUALLY return a byte array (which means you need to return your own little Task/promise). So the first thing we do is add the word "async" before our return type, and then wrap our return type with Task<...>, like this:
public async Task<byte[]> ReadAHugeFile(string bigFile)
{
    var fileSize = new FileInfo(bigFile).Length; // Get the file size
    var allData = new byte[fileSize]; // Allocate a byte array as large as our file
    using (var fs = new System.IO.FileStream(bigFile, FileMode.Open))
    {
        await fs.ReadAsync(allData, 0, (int)fileSize); // Read the entire file asynchronously...
    }
    return allData; // ...and return those bytes!
}
Alright! Now we're cooking! If we run our code now, it will continue on the UI thread until we get to the "await" of the ReadAsync method. At this point, .NET knows that this is an activity that's going to be performed by the hard drive, so the "await" puts a little bookmark in its current position, and then the UI thread goes back to its normal processing (all the visual updates and all that).
Later on, once the hard drive has read all the data and the ReadAsync method has copied it all into the allData byte array, that Task / promise is now completed, so the system rings the doorbell to let the original thread know that the results are ready. The original thread says, "Great! Let me go back to where I left off!" At its earliest opportunity, it goes back to the "await fs.ReadAsync" line and progresses to the next step, which is the return of the allData array, which is now filled with our data.
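The example above keeps things simple, and there are two details it glosses over: ReadAsync returns the number of bytes it actually read, which may be fewer than you asked for, and a FileStream only uses truly asynchronous file I/O if you request it when opening the file. Here is a slightly more defensive sketch, assuming a hypothetical FileReading helper class. On newer versions of .NET, File.ReadAllBytesAsync does roughly this in one call.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FileReading
{
    // Hypothetical helper: reads a whole file asynchronously, looping until done.
    public static async Task<byte[]> ReadWholeFileAsync(string path)
    {
        // useAsync: true asks the operating system for asynchronous file I/O.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, bufferSize: 4096, useAsync: true))
        {
            var allData = new byte[fs.Length]; // a single array tops out at around 2 GB
            int offset = 0;

            while (offset < allData.Length)
            {
                // ReadAsync reports how many bytes it delivered on this call.
                int read = await fs.ReadAsync(allData, offset, allData.Length - offset);
                if (read == 0)
                {
                    break; // end of file reached earlier than expected
                }
                offset += read;
            }

            return allData;
        }
    }

    public static async Task Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Pass a file path as the first argument.");
            return;
        }
        byte[] data = await ReadWholeFileAsync(args[0]);
        Console.WriteLine($"Read {data.Length} bytes.");
    }
}
```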
If you're following along example-by-example and using a semi-recent version of Visual Studio, you'll notice that this line:
ReadAHugeFile(hugeFile);
...is now underlined in green, and if you hover over it, it says, "Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the 'await' operator to the result of the call."
This is just Visual Studio letting you know that it recognizes that ReadAHugeFile() is an async method and instead of returning a result right away, it's also returning a Task / promise, so if you want to wait for the results, then you can add an "await" before it like this:
await ReadAHugeFile(hugeFile);
...but if we do that, then you also have to update the method signature:
public async void Loopy()
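Notice that if we are on a method that doesn't return anything (a void return type), then we don't need to wrap the return type in a Task<...>. It can just remain void. Be aware, though, that "async void" is really meant for event handlers such as button clicks: the caller can't await the method or catch its exceptions. Anywhere else, returning a plain Task (public async Task Loopy()) is usually the safer choice.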
However, let's not do that. Instead, let's learn a little more about what we can do with async stuff.
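One aside on that method signature before moving on. When a method that returns nothing is declared as "async Task" rather than "async void", the caller gets a Task it can await, and that is also what lets the caller catch exceptions. The following is a small sketch under that assumption; AsyncTaskDemo, LoopyAsync and CallerAsync are hypothetical names, and Task.Delay stands in for the file reads.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncTaskDemo
{
    // Returning Task (instead of void) gives the caller something to await.
    public static async Task LoopyAsync(bool fail)
    {
        await Task.Delay(50); // stand-in for the file reads
        if (fail)
        {
            throw new InvalidOperationException("disk on fire");
        }
    }

    public static async Task<string> CallerAsync()
    {
        try
        {
            await LoopyAsync(fail: true);
            return "finished";
        }
        catch (InvalidOperationException ex)
        {
            // We only get here because LoopyAsync returned a Task that we awaited.
            return "caught: " + ex.Message;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await CallerAsync());
    }
}
```

With an "async void" method, the same exception could not be caught by the caller's try/catch, which is why that form is best kept for event handlers.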
If you don't want to wait for the results of ReadAHugeFile(hugeFile) because maybe you don't care about the final result for some reason, but you don't like that green underline / warning, you can use a special trick to tell .NET exactly that. Just assign the result to the _ character, like this:
_ = ReadAHugeFile(hugeFile);
That's the C# syntax (called a "discard") for saying, "I don't care about the result, but I don't want you to bug me with warnings about it."
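Keep in mind that discarding the Task also discards your chance to hear about failures. The sketch below, using hypothetical names (DiscardDemo, FailLaterAsync), shows that an exception thrown after the first await never reaches the code that started the work.

```csharp
using System;
using System.Threading.Tasks;

public static class DiscardDemo
{
    // Hypothetical operation that fails a little while after it starts.
    public static async Task FailLaterAsync()
    {
        await Task.Delay(50);
        throw new InvalidOperationException("nobody is listening");
    }

    // Returns true only if the exception surfaced at the call site.
    public static bool CallSiteSawException()
    {
        try
        {
            _ = FailLaterAsync(); // fire and forget
            return false;         // we carry on; the failure happens later, unobserved
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Call site saw exception: " + CallSiteSawException());
    }
}
```

So the discard is fine for work you truly don't care about, but it's worth logging errors inside the fire-and-forget method itself.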
Okay, let's try something else. If we use an "await" on that line, then it will wait for the 1st file to be read asynchronously, then wait for the 2nd file to be read asynchronously, and then finally wait for the 3rd file to be read asynchronously. BUT... what if we wanted to read ALL 3 files asynchronously at the same time, and THEN once all 3 were finished, we allowed the code to proceed to the next line (the message box)?
There's a method for that, called Task.WhenAll(), which itself returns a Task that you can await. You pass in a list of other Task objects and then await it, and it will finish once all of the tasks have completed. So the easiest thing to do is create a List<Task> object:
List<Task> readingTasks = new List<Task>();
...then we add our Task / promise from each ReadAHugeFile() call into the list:
foreach (var hugeFile in hugeFiles)
{
    readingTasks.Add(ReadAHugeFile(hugeFile));
}
...and finally we await Task.WhenAll for our list of reading tasks:
await Task.WhenAll(readingTasks);
The final method looks like this:
public async void Loopy()
{
    var hugeFiles = new string[] {
        "Gr8Gonzos_Home_Movie_In_8k_Res.mkv", // 1 GB
        "War_And_Peace_In_150_Languages.rtf", // 1.2 GB
        "Cats_On_Catnip.mpg" // 0.9 GB
    };
    List<Task> readingTasks = new List<Task>();
    foreach (var hugeFile in hugeFiles)
    {
        readingTasks.Add(ReadAHugeFile(hugeFile));
    }
    await Task.WhenAll(readingTasks);
    MessageBox.Show("All done!");
}
Some I/O mechanisms work better than others when it comes to parallel activities (e.g. network requests often work better in parallel than hard drive reads, but it depends on the hardware), but the principle is the same.
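To get a feel for the difference between awaiting one at a time and awaiting everything together, here is a rough timing sketch. It makes some assumptions: WhenAllTiming and FakeReadAsync are hypothetical names, and Task.Delay(200) stands in for a file read, so the numbers say nothing about real disk performance.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllTiming
{
    // Hypothetical stand-in for one slow read.
    private static Task FakeReadAsync() => Task.Delay(200);

    // Await each "read" before starting the next one.
    public static async Task<long> OneAtATimeAsync()
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 3; i++)
        {
            await FakeReadAsync();
        }
        return sw.ElapsedMilliseconds;
    }

    // Start all three "reads" first, then await them together.
    public static async Task<long> AllAtOnceAsync()
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, 3).Select(i => FakeReadAsync()).ToList();
        await Task.WhenAll(tasks);
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        Console.WriteLine($"One at a time: {await OneAtATimeAsync()} ms");
        Console.WriteLine($"All at once:   {await AllAtOnceAsync()} ms");
    }
}
```

The first figure should land near the sum of the three delays and the second near the longest single delay, though exact values will vary from run to run.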
Now, one last thing the "await" operator also does is magically extract the final result of the completed Task / promise. So in our above examples, ReadAHugeFile returns a Task<byte[]>. The magic of "await" will automatically throw out the Task<> wrapper once it's finished and just return the byte[] array, so if you wanted to access the bytes inside Loopy(), you could do that like this:
byte[] data = await ReadAHugeFile(hugeFile);
Again, "await" is a magical little command that makes async programming super easy and takes care of all sorts of little things for you.
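The same unwrapping works with Task.WhenAll. If you keep the tasks in a List<Task<byte[]>> instead of a plain List<Task>, awaiting Task.WhenAll gives you back an array of results in the same order as the tasks you passed in. A small sketch, using a hypothetical FakeReadAsync in place of ReadAHugeFile:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllResults
{
    // Hypothetical stand-in for ReadAHugeFile: "reads" the given number of bytes.
    private static async Task<byte[]> FakeReadAsync(int size)
    {
        await Task.Delay(50);
        return new byte[size];
    }

    public static async Task<int[]> ReadAllSizesAsync()
    {
        var readingTasks = new List<Task<byte[]>>
        {
            FakeReadAsync(10),
            FakeReadAsync(20),
            FakeReadAsync(30)
        };

        // One byte[] per task, in the same order the tasks were added.
        byte[][] files = await Task.WhenAll(readingTasks);

        return files.Select(f => f.Length).ToArray();
    }

    public static async Task Main()
    {
        int[] sizes = await ReadAllSizesAsync();
        Console.WriteLine(string.Join(", ", sizes));
    }
}
```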
Now let's move on to multi-threading.
Multi-Threading in C#
Microsoft sometimes gives you 10 different ways to do the same thing, and that's how it is with multi-threading. You have the BackgroundWorker class, you have Threads, and you have Tasks (and there's a couple variations on them). Ultimately, they're all kind of doing the same thing, but they just have different features. These days, most people use Tasks because they're simple to set up and use and they also interact pretty well with async code, if you want to do that (we'll get to that later). There are lots of articles on the specific differences, if you're curious, but we'll use Tasks here.
To make any method run in a separate thread, you can simply use the Task.Run() method to execute it. For example, let's say you had this method:
public void DoRandomCalculations(int howMany)
{
    var rng = new Random();
    for (int i = 0; i < howMany; i++)
    {
        int a = rng.Next(1, 1000);
        int b = rng.Next(1, 1000);
        int sum = 0;
        sum = a + b;
    }
}
We could call it on our current thread like this:
DoRandomCalculations(1000000); // One MEEELLION calculations!
Or we could just let another thread do the work:
Task.Run(() => DoRandomCalculations(1000000)); // One MEEELLION calculations... on a separate thread.
So the general syntax here is:
Task.Run(() => CODE TO RUN IN ANOTHER THREAD);
Of course, there are some variations on that, but that's the general idea.
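If you'd like to confirm that the work really does run somewhere else, you can print the thread ID from both sides. This is just a quick sketch with a hypothetical TaskRunThreads class; it relies on the fact that Task.Run queues its work to the .NET thread pool.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskRunThreads
{
    // Returns true if the delegate passed to Task.Run executed on a thread pool thread.
    public static async Task<bool> RanOnPoolThreadAsync()
    {
        Console.WriteLine($"Caller thread: {Environment.CurrentManagedThreadId}");

        bool onPool = await Task.Run(() =>
        {
            Console.WriteLine($"Worker thread: {Environment.CurrentManagedThreadId}");
            return Thread.CurrentThread.IsThreadPoolThread;
        });

        return onPool;
    }

    public static async Task Main()
    {
        Console.WriteLine("Ran on a pool thread: " + await RanOnPoolThreadAsync());
    }
}
```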
One nice thing about Task.Run() is that it returns a Task object that we can await. So if you want to run a bunch of code in a separate thread and then wait for it to complete before going on to the next step, you can do that using an await just like you saw in the earlier section:
var finalData = await Task.Run(() => CODE TO RUN IN ANOTHER THREAD);
Easy, right?
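As a concrete version of that pattern, here is a sketch that gets a value back from the background work. It assumes a hypothetical variant of DoRandomCalculations, called SumRandomPairs here, that returns the running total and takes a seed so the result is repeatable.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskRunResult
{
    // Hypothetical variant of DoRandomCalculations that returns the total of all sums.
    public static long SumRandomPairs(int howMany, int seed)
    {
        var rng = new Random(seed);
        long total = 0;
        for (int i = 0; i < howMany; i++)
        {
            total += rng.Next(1, 1000) + rng.Next(1, 1000);
        }
        return total;
    }

    // Runs the calculation on another thread and hands back the result via await.
    public static async Task<long> SumInBackgroundAsync(int howMany, int seed)
    {
        long finalData = await Task.Run(() => SumRandomPairs(howMany, seed));
        return finalData;
    }

    public static async Task Main()
    {
        long total = await SumInBackgroundAsync(1000000, seed: 123);
        Console.WriteLine($"Total of all sums: {total}");
    }
}
```

Because the lambda returns a long, Task.Run gives back a Task<long>, and await unwraps it just as it did for Task<byte[]> earlier.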
Final Warning
Bear in mind that this article talks about how to get started and how the concepts work, but it is in no way thorough about everything. But perhaps with this bit of knowledge, you'll be able to understand more complex articles from others about more advanced varieties of multithreading and async coding.
Copyright © 2021 - Jonathan Hilgeman. All Rights Reserved.
Comments (8)
I've read a few articles on this subject and none are explained as well as this one. One thing I'd like to know: I've read in a few places that when I use await, I should append ConfigureAwait(false) to the call. Numerous articles say you should, a couple say you shouldn't, but none explain what it is, why you use false, what happens internally when you use it, or why you shouldn't. Any input on that?
Great :)
Author commented:
Before I begin, any article that says you ALWAYS should or shouldn't use ConfigureAwait(false) is wrong. There is a purpose to what that statement does and if you blindly use it, then you'll probably end up with some unexpected problems.
Okay, so the first thing is to make sure you know that there is ONE main UI thread. And that UI thread is the only one that is allowed to make changes to the UI. So updating the content of a textbox via code, or using code to move a button somewhere or update the layout, etc, etc, - all of that kind of code HAS to run from the UI thread or else you'll get an error.
The UI thread is basically the "main" thread for your app, but since I used the restaurant analogy in the article above, allow me to continue using that here, and we'll say that each thread is a chef in the kitchen. There's also a concept of a "context" but for simplicity's sake during this explanation, I'm going to treat it the same as a thread for now.
There is ONE main chef in the kitchen who does most of the work. Let's say her name is Chef Maria and Maria is responsible for a lot of stuff, but one of the things that she does is plate the food - arranging it to be visually appealing. Maria is the ONLY chef with the skills to do this - it is exclusively her job and if any other chef tries to do it, they get their hands slapped away.
Now let's go back to the whole async "having groceries delivered" analogy. So Maria makes the call to place the order. But when the grocery delivery arrives, one of the chefs has to answer the door and receive the delivery. There is no guarantee which chef it will be - it doesn't have to be Maria just because she placed the order.
So what if the delivery contains something that should immediately be placed onto the plate?
Well, if Maria was the one who received the delivery, then she could just place the newly-delivered item onto the plate because she's allowed to do that. But if another chef, Chef "Junior", received the delivery, then Junior waits for the first moment that Maria is free, and hands her the delivery and she can finish the food plating process.
The default behavior of async/await is that WHOEVER places the order will be the one to RESUME the process once the delivery arrives. So if Chef X places the order, and Chef Y receives the delivery, then Chef Y will wait for Chef X to become available, and then hand the delivery over to Chef X so Chef X can finish the rest of whatever they were originally doing.
This is really useful default behavior because a lot of apps use async calls to make visual updates, so they start off on the UI thread (e.g. a button click) and then run an async call with an await, and then update something on the screen with the result.
When you use ConfigureAwait(false), then you're saying, "whoever gets the delivery can finish the rest of the process." So Junior (or whoever receives the delivery) will not wait for Maria (or whoever placed the order) and will instead try to resume the original code himself.
So there's an immediate performance advantage to ConfigureAwait(false): you eliminate the automatic step where the "awaiter" waits for the original thread to become available again so that it can hand over the results and let that thread resume the work.
There's one additional benefit that takes some extra explanation.
Basically, each chef can be told to handle only X number of jobs at a time. If one of those chefs is maxed out on their number of current jobs and then places an order, then the default behavior could result in a "deadlock", which is basically where the thread is hung up because whatever it is waiting for will never complete (e.g. maybe job A is waiting for job B, but job B is waiting for the results of job A). Since receiving the hand-off from the delivery results in yet another job, the default behavior could result in a deadlock if there are any concurrency limits imposed on the chef / thread. The classic case is a UI thread that blocks on .Result or .Wait() while the rest of the async method is waiting for that very thread to become free.
This is not usually a problem with most basic async/await usage, but it can be a problem with more advanced usages of it.
With ConfigureAwait(false), you don't have to worry about a deadlock from that hand-off because whichever chef received the delivery can also continue on with the rest of the code.
With all of that said, you can use ConfigureAwait(false) to improve performance and stability when it doesn't matter which thread handles the rest of the code after the async call comes back. For example, an async job that writes the result to a database.
But if you're using async/await as part of UI updates, then don't use ConfigureAwait(false) because that will simply force you to have to manually invoke the UI thread for every update, so you're not gaining anything and just having to write more code.
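As a rough illustration of the "it doesn't matter which thread finishes the job" case, here is a sketch of a library-style method that saves some text. The names (ReportStore, SaveReportAsync) are hypothetical, and a file write stands in for the database write mentioned above. Nothing after the await touches the UI, so any chef may finish the work.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class ReportStore
{
    // Hypothetical library-style method: no UI work happens after the await.
    public static async Task<int> SaveReportAsync(string path, string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);

        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write,
                                       FileShare.None, bufferSize: 4096, useAsync: true))
        {
            // ConfigureAwait(false): whichever thread picks up the completion
            // may carry on; there is no need to hop back to the original context.
            await fs.WriteAsync(bytes, 0, bytes.Length).ConfigureAwait(false);
        }

        return bytes.Length;
    }

    public static async Task Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "report-demo.txt");
        int written = await SaveReportAsync(path, "All done!");
        Console.WriteLine($"Wrote {written} bytes to {path}");
    }
}
```

A button click handler that awaits SaveReportAsync and then updates a label would leave ConfigureAwait(false) off its own await, so that the label update still runs on the UI thread.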