Working with Variable Argument Lists in C/C++

C/C++ provides a means to pass a variable number of arguments to a function. This article shows how to use that to your advantage, and also discusses the potential problems that your program might encounter if you do so.

Coming from an ASM background and knowing how parameters are passed to function calls, this feature of C totally amazed me when I first saw it -- over thirty years ago. How on earth can a compiler know what program code to generate when there is no set number of arguments?

The first, and probably most significant, piece of the puzzle is that under the standard C calling convention (cdecl), the calling function is responsible for fixing the stack after a call; that is, the compiler automatically generates code to do that. This is different from Pascal and some other language conventions. It means that however many arguments are pushed onto the stack before the call, they are automatically removed from the stack after the call.

The calling function easily knows how much stack space was used. But how can the called function know? When using a Variable Argument List, you need to provide some sort of mechanism so that the called function knows how many function arguments to process.

Embedded, Interpreted at Run-time
You are certainly familiar with the best-known example of this type of function: printf (and sprintf, etc.). It knows how many arguments were passed because the first (required) string parameter contains some number of formatting specifiers embedded in the string. For instance, if it contains:

"My name is %s. I am %d years old."

... then printf knows that there will be a string pointer ("%s") and an integer ("%d") on the stack, in that order.

So, one mechanism is that the caller passes a string that can be interpreted at run-time to determine how many (if any) extra arguments are on the stack.
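
To make the idea concrete, here is a minimal sketch of a function that lets its format string drive the argument processing. MiniFormat is a hypothetical name invented for this illustration, it handles only %s, %d and %%, and it is nowhere near what a real printf does. The va_list, va_start() and va_arg() pieces it relies on are explained later in this article.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical helper (not from the article): understands only %s, %d and %%.
std::string MiniFormat(const char* fmt, ...)
{
    assert(fmt != nullptr);
    std::string out;
    va_list args;
    va_start(args, fmt);
    for (const char* p = fmt; *p != '\0'; ++p) {
        if (*p != '%') {            // ordinary character: copy it through
            out += *p;
            continue;
        }
        ++p;                        // look at the character after '%'
        if (*p == 's') {            // the format says a string pointer comes next
            const char* s = va_arg(args, const char*);
            out += (s != nullptr) ? s : "(null)";
        } else if (*p == 'd') {     // the format says an int comes next
            out += std::to_string(va_arg(args, int));
        } else if (*p == '%') {     // "%%" is a literal percent sign, no argument used
            out += '%';
        } else {
            break;                  // unknown specifier or stray '%' at the end: stop
        }
    }
    va_end(args);
    return out;
}
```

The function never learns how many arguments were really passed; it simply trusts the format string, which is exactly why a mismatched format is dangerous.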

Examine Arguments for Specific Value
Another mechanism that might be used is to cycle through the arguments until you hit a specific value. For instance, a function that sums up all of the positive integer values passed to it might stop when it hits a value of -1. A function that concatenates a variable number of string arguments might stop when it hits NULL or "".
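
As a rough sketch of the string-concatenation case, assuming a null pointer as the terminator: ConcatAll is a hypothetical name, and the va_* macros it uses are covered later in this article.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical helper: joins C strings until it meets a null pointer.
std::string ConcatAll(const char* first, ...)
{
    assert(first != nullptr);       // at least one real string is expected
    std::string out;
    va_list args;
    va_start(args, first);
    for (const char* s = first; s != nullptr; s = va_arg(args, const char*)) {
        out += s;
    }
    va_end(args);
    return out;
}
```

A call would look like ConcatAll("ab", "cd", static_cast<const char*>(nullptr)). The cast is deliberate: a bare NULL or 0 may be passed as an int, which is not necessarily the same size as a pointer.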

One Argument Specifically Indicates How Many Other Arguments There Are
Another mechanism is more straightforward: The calling function is required to pass an integer value, as one of the early, required parameters, that says how many optional arguments follow.
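
A small sketch of the count-driven approach, using a hypothetical MaxOf function made up for this illustration (the va_* macros are explained later in this article):

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: returns the largest of `count` int arguments.
int MaxOf(int count, ...)
{
    assert(count > 0);              // the caller promises at least one value
    va_list args;
    va_start(args, count);
    int best = va_arg(args, int);
    for (int i = 1; i < count; ++i) {
        int v = va_arg(args, int);
        if (v > best) {
            best = v;
        }
    }
    va_end(args);
    return best;
}
```

For instance, MaxOf(3, 4, 9, 2) looks at exactly three values. Nothing checks that the count matches what was really passed; that remains the caller's responsibility.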

We'll look at examples of these and get into the specifics of how to "walk the argument list." But first, let's look at an example that you can use without needing to understand the underlying mechanism.

Var Arg List as a Black Box
The ATL/MFC CString data type provides a Format() function that gives you a printf-like capability. For instance:

const char* szName= "Britney Spears";
int   nAge= 28;  // I'm not kidding, born in 1981
CString s;
s.Format( "Name: %s  Age: %d", szName, nAge );

But looking at the CString member functions, you might notice another function, FormatV(), that provides an additional capability: A way to intercept the action before the formatting starts. Here's a function I've used to simplify a task of generating XML. Normally, the Attributes of an XML tag need to be enclosed in quotes. So, using CString::Format(), I might use:

CString sXml;
sXml.Format("<Product color=\"%s\">%s</Product>", sAttrColor, sElemValue );

The escaped quote marks (\") make this line awkward to type and hard to read. So, I wrote an XML-formatting function that would let me use %q to mean "replace with a string surrounded by quotes". Here's the function:

CString XmlPrintf( LPCSTR szFmt,... )
{
    CString sTxt;
    va_list args;  va_start(args, szFmt);

    CString sFmt= szFmt;
    sFmt.Replace("%q","\"%s\"" );

    sTxt.FormatV( sFmt, args );
    va_end(args);   // every va_start() needs a matching va_end()
    return( sTxt );
}

Note the special use of the ellipsis (...) in the function declaration. That signals that the function will receive a variable number of arguments. It must receive at least one, szFmt, but after that, it's anyone's guess. Example of usage:

// CString values are cast to LPCSTR before they go through the ellipsis
m_sOut += XmlPrintf( "<Product size=%q color=%q>%s</Product>",
     (LPCSTR)rc.sProdSize,
     rc.sProdColor=="" ? "None" : (LPCSTR)rc.sProdColor,
     (LPCSTR)rc.sProdName
);

And the output would be, for instance:

<Product size="Large" color="Green">Widget</Product>

In that function, we used va_list and va_start without having to know how they work, or even what they do... We treat them as magic words to put in the function so that we can pre-process the szFmt string before sending it through the CString::FormatV() function. Now let's take a closer look at what these "magic words" do.
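
The same black-box trick works without ATL/MFC. The following is only a sketch under that assumption: XmlPrintfStd is a hypothetical name, std::string stands in for CString, and the standard vsnprintf() plays the role of FormatV() by accepting a va_list directly.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical portable counterpart of XmlPrintf(); not the article's code.
std::string XmlPrintfStd(const char* szFmt, ...)
{
    assert(szFmt != nullptr);

    // Pre-process the format: every %q becomes %s wrapped in quotes.
    std::string sFmt = szFmt;
    const std::string from = "%q";
    const std::string to   = "\"%s\"";
    for (std::string::size_type pos = sFmt.find(from);
         pos != std::string::npos;
         pos = sFmt.find(from, pos + to.size())) {
        sFmt.replace(pos, from.size(), to);
    }

    va_list args;
    va_start(args, szFmt);

    // First pass on a copy of the list, only to learn the output length.
    va_list argsCopy;
    va_copy(argsCopy, args);
    const int len = std::vsnprintf(nullptr, 0, sFmt.c_str(), argsCopy);
    va_end(argsCopy);

    std::string sTxt;
    if (len > 0) {
        sTxt.resize(static_cast<std::string::size_type>(len) + 1); // +1 for '\0'
        std::vsnprintf(&sTxt[0], sTxt.size(), sFmt.c_str(), args);
        sTxt.resize(static_cast<std::string::size_type>(len));     // drop the '\0'
    }
    va_end(args);
    return sTxt;
}
```

The va_copy() call is there because a va_list that has been handed to vsnprintf() should not be reused for a second pass.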

Processing the Argument List
In this article, we won't go into printf-like parsing of a format string. Instead, let's start with the simplest variation: The following function accepts a variable number of positive integer arguments, and returns the integer average. To signal the end of the variable-length argument list, we must pass in a terminating value of -1. Example of usage:

int nAvg= GetAverage( 4,8,3, -1 ); // response is 5: (4+8+3)/3

And here's the function:

int GetAverage( int nVal, ... )
{
    va_list pVarArg;
    va_start( pVarArg, nVal );
    int nCur= nVal;
    int nSum=0;
    int nArgCnt=0;

    while ( nCur != -1 ) {
        nSum += nCur;
        nArgCnt++;
        nCur= va_arg( pVarArg, int );
    }
    va_end( pVarArg ); // required cleanup once the list has been read
    if ( nArgCnt==0 ) { // avoid division by 0
        nArgCnt= 1;
    }
    return( nSum / nArgCnt );
}

In lines 3-4 we set up to access the argument list. The second parameter to va_start() is the name of the last named parameter before the ellipsis (here nVal, the only one); va_arg() then hands back the arguments that follow it. Because nVal is itself the first value to average, the loop starts with it. In the loop, we check for the terminating value (-1) and if it's not there, then we accumulate a sum and increment the count of arguments processed. The average is calculated at the end (making sure not to divide by 0).

The key is in line 12, where the va_arg() macro is used to extract each item from the argument list, one at a time as the loop cycles. Its second parameter is a C datatype; in this case, int. As we cycle the loop using va_arg(), we are stepping through the arguments one after another (classically, a series of sequential bytes on the stack). va_arg knows how many bytes are in an argument and how far to move after each one, based solely on the datatype.

Note:
The mechanism of specifically how va_start() and va_arg() work is too complex to get into here. And frankly, there is no need to know exactly how it is that they process the stack data, only that when used as shown, they do it.

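
Because va_arg() relies entirely on the type you name, that type has to match what the compiler actually passed. One detail worth knowing is that small types are promoted on the way through the ellipsis: char and short arrive as int, and float arrives as double. The sketch below illustrates this; Record and MakeRecord are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical record type, used only for this illustration.
struct Record {
    std::string name;
    char        initial;
    int         age;
    double      height;
};

// Expects exactly three optional arguments: a char, a short or int, and a float or double.
Record MakeRecord(const char* name, ...)
{
    assert(name != nullptr);
    Record r;
    r.name = name;
    va_list args;
    va_start(args, name);
    r.initial = static_cast<char>(va_arg(args, int));  // a char arrives promoted to int
    r.age     = va_arg(args, int);                     // so does a short
    r.height  = va_arg(args, double);                  // a float arrives promoted to double
    va_end(args);
    return r;
}
```

Asking for va_arg(args, char) or va_arg(args, float) here would be an error, even though that is what the caller wrote.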
Danger, Will Robinson!
What happens if you forget to pass in a -1 as the list terminator? This is a very significant issue with using variable argument lists: It is easy for a careless programmer to seriously muck up the works. As written, the function will continue digging through the stack until it, by accident, hits a -1. The returned "average" will be meaningless. (Formally, reading past the arguments that were actually passed is undefined behavior.)

The more likely consequence, however, is that the function will never hit the terminating condition and will eventually try to access a part of memory that is off-limits. At that point, you will get an unhandled exception and you are hosed with an "Access violation reading 0x12345678" crash.

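You can avoid the crash by putting the loop in a try...catch exception handler (with Visual C++, catch(...) sees an access violation only when the code is built with the /EHa option; standard C++ exceptions do not cover it) or maybe add code to enforce a limit to the number of arguments. But those are only band-aids. In any case, the function (and thus your lovely program) fails -- the "average" that you display won't be valid.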
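
One way to reduce the risk, sketched here as an assumption rather than anything from the production code, is to hide the terminator behind a variadic macro so that callers cannot forget it. GetAverageImpl and GET_AVERAGE are hypothetical names.

```cpp
#include <cassert>
#include <cstdarg>

// Same logic as GetAverage(): reads ints until it meets -1.
int GetAverageImpl(int nVal, ...)
{
    va_list args;
    va_start(args, nVal);
    int nSum = 0;
    int nCnt = 0;
    for (int nCur = nVal; nCur != -1; nCur = va_arg(args, int)) {
        assert(nCur >= 0);          // only non-negative values are valid input
        nSum += nCur;
        ++nCnt;
    }
    va_end(args);
    return (nCnt == 0) ? 0 : nSum / nCnt;
}

// Hypothetical wrapper: the caller lists only the real values and the macro
// appends the -1 terminator.
#define GET_AVERAGE(...) GetAverageImpl(__VA_ARGS__, -1)
```

With this in place, GET_AVERAGE(4, 8, 3) expands to GetAverageImpl(4, 8, 3, -1). It does nothing about a -1 that appears among the values, nor about arguments of the wrong type.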

Does this mean that you should not use this technique? Many a pundit has said exactly that -- it's just too dangerous. My opinion? I say, go ahead! Just don't let the summer interns anywhere near your source code :-)

Passing an Argument That Indicates How Many Arguments There Are
Here's a function from production code that uses a different technique to process the argument list. I wrote an XML generating class object and I wanted to be able to pass in an Element (tag name) and any number of Attribute name/value pairs. For instance:

CString s= XmlElemWithAttrs(
    "Product", "Widget", 4,
    "size",     (LPCSTR)rc.m_sSize,      // #1 e.g., "large"
    "color",    (LPCSTR)rc.m_sColor,     // #2 e.g., "blue",
    "userData", (LPCSTR)rc.m_sOtherData, // #3 e.g., "",
    "rating",   (LPCSTR)rc.m_sRating     // #4 e.g., "7"
);

The third parameter is 4, indicating that four name/value pairs will follow. The output of that function call would be something like:

<Product size="large" color="blue" rating="7">Widget</Product>

Here's the function:

CString XmlElemWithAttrs( LPCSTR szTagName, LPCSTR szTagVal, int nAttrCnt, ... )
{
    va_list pVarArg;

    CString sRet, sClose;
    sRet.Format("<%s ", szTagName );
    sClose.Format("</%s>", szTagName );

    CString sAttrName, sAttrVal;
    va_start( pVarArg, nAttrCnt );
    for ( int j=0; j<nAttrCnt; j++ ) {
        try {
            sAttrName= va_arg( pVarArg, LPCSTR);
            sAttrVal=  va_arg( pVarArg, LPCSTR);
        }
        catch( ... ) {
            // LogErr("bad args in XmlElemWithAttrs" );
            ASSERT(0); // catch during debug runs
            sAttrName= sAttrVal="";
        }
        if ( sAttrVal > "" ) {
            sRet += sAttrName + "=\"";
            sRet += sAttrVal + "\" ";
        }
    }
    va_end( pVarArg ); // done reading the optional arguments
    if ( CString(szTagVal) > "" ) { // lazy check for both NULL and ""
        sRet += ">";
        sRet += szTagVal;
        sRet += sClose; // e.g., "</Element>";
    }
    else {
        sRet += "/>";   // e.g., "<Element .../>";
    }
    return( sRet );
}

Notice that there are three required parameters. The first two are the XML tag name and the element value. The third parameter is the key to the handling of the argument list: It indicates how many attribute name/value pairs will follow. Line 11 uses that value as the loop counter. Lines 13-14 break out an attribute name and an attribute value into C-style string pointers which are used in generating the output.

Other than that, the va_arg() handling is similar to that used in the earlier example. One difference is that the loop processes two of the function arguments at a time. Note that all of the optional arguments are (must be) char* values; that is, you must not try to pass in an integer or floating-point value. If you look back at the example function call, you'll see that purely out of habit, I set up the attribute name/value pairs like:

    "size", (LPCSTR)rc.m_sSize, // e.g., "large"
    "color", (LPCSTR)rc.m_sColor, // e.g., "blue"

I find that it helps me avoid errors if I give myself reminders like that.

Note the use of try...catch as an attempt to avoid crashing. Here again, there are ways a sloppy programmer can screw things up when using this function. For instance, if the nAttrCnt value is too small, then the output won't include the final items. If that argument is too large or if a non-char* argument is passed, then the program will try to create CString variables from random memory, resulting in program-crashing memory access exceptions. The try/catch handler puts a band-aid around the code that could fail (in a Visual C++ build it catches an access violation only when the /EHa option is used), and an ASSERT statement will make sure to bring it to your notice during debug runs.

As before, some would say that it's just too dangerous. I, however, have resigned my commission in the Programming Thought Police, so I leave it to you to decide.

Summary:
To write a "printf-like" C/C++ function that allows any number of arguments, use an ellipsis (...) as the final argument in the declaration. To process the anonymous arguments, use va_list, va_start(), and va_arg().

You need to provide a mechanism that will let the function know when to stop processing the arguments. We looked at some different ways to set up such a mechanism. Finally, we talked about the dangers -- what can go wrong when you use this kind of function and what you can do to limit the potential problems.

This is one of those C/C++ programming elements that you may never need to use. But I think it is interesting, powerful, and, well... elegant.

Comments (1)

Zoppo

Commented: 2017-09-21

Hi DanRollins,

great article, but I would like to mention two points:

1. I think it's important to mention how similar types of different sizes are handled. E.g., printf with %f works for both float and double; how is this possible?

The interesting fact is that C/C++ automatically promotes some types to larger types for arguments passed to a function with a variable argument list, e.g. see http://en.cppreference.com/w/cpp/language/variadic_arguments:

std::nullptr_t is converted to void*

float arguments are converted to double as in floating-point promotion

bool, char, short, and unscoped enumerations are converted to int or wider integer types as in integer promotion

2. It would be a great idea to update this article (or to write a new one) about using variadic templates instead of functions with variable argument lists. They are quite powerful and even type safe, e.g.:

#include <cstdlib>   // rand
#include <iostream>  // std::cout

// template for a single argument
template < typename T >
void Output( const T& data )
{
	std::cout << data;
}

// specialized template for type 'bool'
template <>
void Output( const bool& data )
{
	std::cout << ( false == data ? " is not" : " is" );
}

// variadic template passing arguments to above templates one by one
template < typename T, typename... Args >
void Output( const T& data, Args&& ... args )
{
	Output( data );
	Output( args... );
}

// same as 'Output' but appends a newline
template < typename... Args >
void OutputLine( Args&& ... args )
{
	Output( args..., "\n" );
}

void foo()
{
	int f = rand();
	OutputLine( "Hello world\n", "Pi is: ", 3.1415, "\n", "Random number ", f, f > 100, " larger than 100" );
}

Output is, for example:

Hello world
Pi is: 3.1415
Random number 18495 is larger than 100

IMO it's highly recommended to use those variadic templates instead of functions with variable argument lists, which are quite error-prone.