# How fast is your C/C++ Base64 Encoder?

Are you up for a challenge?
Several of us C and ASM programmers had a fun learning experience a few years ago when we tried to outdo each other by writing "The World's Fastest" binary-to-hex converter and output-formatting function.  Now I'm throwing out the gauntlet again:

    See if you can write a C/C++ function that encodes binary data into a
    base64 output string -- and see if yours is faster than everybody else's.

For additional reference, the ASM version of this question is here: http:Q_21983447.html

A Base64 Encoder converts binary data into ASCII codes that can be transmitted/processed by systems that do not normally process raw binary... for instance, in email attachments, in HTTP headers, and as values in HTTP URLs.  The encoder outputs only the following characters:
       ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
plus, the output may end in zero, one, or two equal signs (=).

To encode the raw binary, you access it three bytes at a time, taking the data in groups of six bits.  For instance:
Hex:      0x07 0xFF 0x02
Binary:   0000 0111 1111 1111 0000 0010   (groups of 4 bits)
Binary:   000001 111111 111100 000010     (groups of 6 bits)
          \____/ \____/ \____/ \____/
decimal      1      63     60      2
lookup       B      /       8      C

The length of the output must always be a multiple of four, because when you DEcode the data, you will be taking groups of four base64 "digits" and turning them back into three binary bytes of output.  Therefore, when ENcoding, if your input data is not a multiple of three bytes, you will need to pad the end of the output with one or two equal signs (=).
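To make the grouping and padding rules concrete, here is a minimal, unoptimized reference encoder -- a sketch for checking correctness, not a speed entry. The name ToBase64Ref and the std::string return type are my own choices, not part of any submission:

```cpp
#include <cassert>
#include <string>

// The 64 output digits, indexed 0..63.
static const char kDigits[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Reference encoder: 3 input bytes -> 4 output digits, '=' padding at the end.
std::string ToBase64Ref(const unsigned char* src, int len)
{
    std::string out;
    out.reserve(((len + 2) / 3) * 4);       // output is always a multiple of 4
    int i = 0;
    for (; i + 3 <= len; i += 3) {          // full 3-byte groups
        unsigned d = (src[i] << 16) | (src[i + 1] << 8) | src[i + 2];
        out += kDigits[(d >> 18) & 0x3F];
        out += kDigits[(d >> 12) & 0x3F];
        out += kDigits[(d >> 6) & 0x3F];
        out += kDigits[d & 0x3F];
    }
    int rem = len - i;                      // 0, 1, or 2 leftover bytes
    if (rem == 1) {
        out += kDigits[src[i] >> 2];
        out += kDigits[(src[i] << 4) & 0x30];
        out += "==";
    } else if (rem == 2) {
        out += kDigits[src[i] >> 2];
        out += kDigits[((src[i] << 4) & 0x30) | (src[i + 1] >> 4)];
        out += kDigits[(src[i + 1] << 2) & 0x3C];
        out += '=';
    }
    return out;
}
```

For the worked example above, the bytes 0x07 0xFF 0x02 encode to "B/8C".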

Also, to verify your output, you can use an Encoder at this URL:
   http://www.opinionatedgeek.com/dotnet/tools/Base64Encode/Default.aspx

=-=-=-=-=-=-=-=-=-=-=
I'll be speed-testing the submissions using code like this:

//---------------------------------------- (assume dest buf is large enough)
void ToBase64( BYTE* pSrc, int nSrcLen, char* pszOutBuf )
{
        // your code here!
}

I've already posted one C function here: http:/Assembly/Q_21983447.html#17516191

=-=-=-=-=-=-=-=-=-=-=
I'm betting that the ASM code will out-perform even the most optimized C code.  But as we saw in the Hex-converter challenge, a clever algorithm can often win over even the best hand-optimized ASM code (of course the ASM programmers can then implement the same algorithm and shave a few milliseconds and will surely win :-)

So submit your function and I'll post timing/speed comparisons.  The timings will measure thousands of passes at a large buffer of random input data, so if that changes the way you write your code, then keep it in mind.

Submit your C/C++ function and I'll measure it against everybody else's as well as the ASM versions and any standard library calls and API functions that I can find.  If this works out, I'll start a second challenge for a Base64 DEcoder.

Are you game?  500 points and your reputation as a hotshot C programmer are on the line :-)

-- Dan

Infinity08

Just a question: is this algorithm-only, or can we use a lookup table?
grg99

Hate to be a killjoy, but this is an academic exercise only -- in the real world, CPUs are tens to hundreds of times faster than any I/O device, even gigabit ethernet, so doing base64 conversion is not a significant bottleneck.

Still it's a fun challenge-- go to it!

DanRollins

grg99,
It's not entirely pointless.  Imagine that your server must send thousands of 4MB video clips as email attachments.  The encoding *might* become a throughput bottleneck if the CPU gets maxed out by multiple threads doing various related activities.  Decoding might be more of an issue -- the time spent in transit is a constant, but once it is on your computer, an efficient decoder could prevent delay between clicking the attachment and watching the video.

But, as you say, the practicality of the challenge is not the issue :-)

Infinity08,
There is likely to be at least one lookup table employed by most algorithms.  In fact, it would be interesting to see if there is a way to convert *without* a lookup table.  For instance, binary-to-hex conversion *could* use a simple lookup in a table like

    char table[]= "0123456789ABCDEF";
    c= table[b];   // where b starts as a 4-bit nybble and c ends up as an ASCII char

but most binary-to-hex algorithms do it without a lookup; e.g.,
    if (b>9) b += 7;    // where b is a 4-bit nybble
    c= b + '0';         // c is now a hex digit
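The two hex schemes can be written side by side and cross-checked for all 16 nybbles. This is just an illustrative sketch; the function names are mine:

```cpp
#include <cassert>

// Scheme 1: table lookup.
char HexByTable(unsigned b)             // b is a 4-bit nybble (0..15)
{
    static const char table[] = "0123456789ABCDEF";
    return table[b & 0x0F];
}

// Scheme 2: no table -- adding 7 skips the ":;<=>?@" gap between '9' and 'A'
// in the ASCII chart.
char HexByMath(unsigned b)
{
    b &= 0x0F;
    if (b > 9) b += 7;
    return char(b + '0');
}
```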

-- Dan
Infinity08

>> There is likely to be at least one lookup table employed by most algorithms.  In fact, it would be interesting to see if there is a way to convert *without* a lookup table.
I should have clarified: I wasn't referring to without versus with a lookup table, but whether you can use a lookup table to replace your algorithm. Just a practical question :)
DanRollins

For instance, one calculation scheme:

      switch (b) {                        // b is a 6-bit value (0...63)
          case 0x3e: b='+'; break;        // special cases (not part of any sequence)
          case 0x3f: b='/'; break;
          default:
              if (b>25) b += 6;           // Not one of "ABC...XYZ"
              if (b>57) b -= 75;          // Not one of "abc...xyz" -- must be "012...789"
              b += 'A';
      }
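Wrapped in a function (my own wrapper, for illustration), the switch scheme can be checked against the plain 64-character digit string for every 6-bit input:

```cpp
#include <cassert>

// Dan's calculation scheme as a function: maps a 6-bit value to its base64 digit.
char Base64DigitBySwitch(unsigned b)        // b is a 6-bit value (0..63)
{
    switch (b) {
        case 0x3e: return '+';              // special cases (not part of any sequence)
        case 0x3f: return '/';
        default:
            if (b > 25) b += 6;             // not one of "ABC...XYZ"
            if (b > 57) b -= 75;            // not "abc...xyz": must be "012...789"
            return char(b + 'A');
    }
}
```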
DanRollins

>>... but whether you can use a lookup table to replace your algorithm.

My reply is that a lookup operation (whatever part of the algorithm it "replaces") is not only valid, but likely to be used in the fastest algorithm.  There are no particular constraints other than that the encoder needs to be fast if you want a chance to win the challenge :-)
Infinity08

Perfect :)
Infinity08

Oh, and how much time do we have?
grg99

Ahh, you might not want to use a lookup table.  If you're encoding data, that's going to be flowing thru the cache, very likely flushing out the lookup table all the time.  That means it's often going to take a whole memory access time, approx 3 nsec (like, 18 instructions) to get ONE table value!

Only way to tell for sure is to time the code.

DanRollins

Until we think we've wrung the last ounce of performance out of the code :-)

Here is my timing methodology:
I will generate a 1MB buffer of random data (the same data for everybody)
I will then call the ToBase64 function 1000 times (process 1000 MB of data)
I will calculate the MB-per-second.
I'll run it three times and list the highest score.

Here is my function that does all of that:

void CBase64Dlg::OnButton1()
{
      CWaitCursor c;  // show that the code is running

      int nSrcLen= 1024*1024;       // one true MB
      int nPasses= 1000;

      BYTE* pData=      new BYTE[ nSrcLen ];
      char* pszEncoded= new char[ nSrcLen*2 ]; // really only need * 1.5, but close enough

      //----------- generate the random binary data
      strcpy( (char*)pData, "Hi There Everybody" );  // put recognizable text at the start
      int nFirst= strlen( (char*)pData );

      srand(123);  // seed the random number generator the same for everybody
      for (int j=nFirst; j< nSrcLen; j++ ) {
            pData[j]= rand() & 0xFF;
      }

      Sleep(0);              // start on a new timeslice
      int nStartTick= GetTickCount();
      for (int k=0; k<nPasses; k++) {
            ToBase64( pData, nSrcLen, pszEncoded ); // encode 1MB of data
      }
      int nTicksTotal= GetTickCount() - nStartTick;  // duration of the test in ms (note: +- about 9ms)

      delete[] pData;
      delete[] pszEncoded;

      //--------- output the results
      CString sMsg; sMsg.Format(
            "\r\nIt took %d ms to process %d MB of data\r\n"
            "That is %.2f MB per second\r\n",
            (int) nTicksTotal,
            (int) nPasses,
            (double) nPasses/((double)nTicksTotal/1000)
      );
      m_ctlEdOutput.SetSel(32767, 32767);
      m_ctlEdOutput.ReplaceSel( sMsg );
}

=-=-=-=-==-=-=-=-=-=-=
Notes:
* The test loops enough times to make the inaccuracy of the GetTickCount() timer unimportant.
* The test is likely to be interrupted many times by other processes, but that should be
   relatively consistent.  By running it three times (and verifying *similar* results) that
   should not skew the results.
* I'm currently running on a 2.6 GHz Athlon.  All times will be taken from here.
   If you do your own timings, you should be able to obtain "relative" values (a faster algorithm
   should be faster, regardless of CPU speed).
* I'll eyeball the result output to verify that the first part is encoded correctly:
   Hi There Everybody          (source)
   SGkgVGhlcmUgRXZlcnlib2R5    (encoded)
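For anyone timing on a non-Windows box, the same methodology can be sketched portably with std::chrono in place of Sleep/GetTickCount. This is my own sketch, not Dan's harness; the encoder below handles whole 3-byte groups only and is just a stand-in for whichever candidate function you are measuring:

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <vector>

static const char kDigits[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Stand-in encoder; replace with the function under test.
void ToBase64(const unsigned char* pSrc, int nSrcLen, char* pszOutBuf)
{
    for (int j = 0; j + 3 <= nSrcLen; j += 3) {
        unsigned d = (pSrc[j] << 16) | (pSrc[j + 1] << 8) | pSrc[j + 2];
        *pszOutBuf++ = kDigits[(d >> 18) & 0x3F];
        *pszOutBuf++ = kDigits[(d >> 12) & 0x3F];
        *pszOutBuf++ = kDigits[(d >> 6) & 0x3F];
        *pszOutBuf++ = kDigits[d & 0x3F];
    }
}

// Same plan as the dialog handler: random buffer, many passes, report MB/sec.
double MeasureMBPerSec(int nSrcLen, int nPasses)
{
    std::vector<unsigned char> data(nSrcLen);
    std::vector<char> out(nSrcLen * 2);      // only ~4/3 needed, but close enough
    srand(123);                              // same seed for everybody
    for (int j = 0; j < nSrcLen; j++) data[j] = rand() & 0xFF;

    auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < nPasses; k++)
        ToBase64(data.data(), nSrcLen, out.data());
    std::chrono::duration<double> secs = std::chrono::steady_clock::now() - start;
    return (double(nSrcLen) / (1024.0 * 1024.0)) * nPasses / secs.count();
}
```

steady_clock is monotonic, so the measurement cannot be skewed by wall-clock adjustments the way a tick counter can be by timer granularity.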
DanRollins

Timing results to date:
MB/sec    Description
--------  -----------------------------------------------------------------------------------------------------------
   58.59  Dan's C "standard" encoder  (inner loop at http:/Assembly/Q_21983447.html#17516191)
   53.42  craigwardman's ASM (rolled loops)
   63.36  craigwardman's ASM (unrolled loops)

I can also tell you that I have written an alternative C version that is running at
   97.86 MB/sec
which I'll reveal if nobody else comes up with the same technique.  I suspect that conversion to ASM will put it over 100.

Please!  Don't let these timings discourage you.  Until I've timed your function on my specific computer, there is no way to know how fast it is relative to the other submissions.  Also, there can be great benefit in comparing the merits and problems of slow functions.  It's all about the challenge and the learning experience.
DanRollins

Note: On the C functions, I set the compiler optimization to "maximum speed."  From past experience, that can have a significant advantage over "Debug (no optimization)" but ends up nearly identical to "minimum size."
DanRollins

MB/sec    Description
--------  -----------------------------------------------------------------------------------------------------------
   17.11  The Base64Coder object from http://support.microsoft.com/default.aspx?scid=kb;[LN];Q191239
  104.05  CryptBinaryToStringA (in Crypt32.dll)

The bar is raised!  A standard API function spins through at a perhaps unbeatable rate.
But I think we can beat it.
Infinity08

What's the deadline for submission, and how do we submit?
DanRollins

A submission should be source code of the one function, posted right here in this thread.  Submit when you have a working solution.  Or don't submit at all, just join the discussion and give us your ideas and thoughts.  I'm not stating a specific time limit -- I'll keep this and the ASM thread open as long as interest continues.
DanRollins

Note: I messed up the compiler optimization settings in previous time trials.  Current (corrected) results are:

MB/sec    Internal name // Description
--------  -----------------------------------------------------------------------------------------------------------
  54.42   xToBase64 / craigwardman (ASM, std loops)
  61.81   yToBase64 / craigwardman (ASM, unrolled loops)
  36.80   aToBase64 / Base64Coder http://support.microsoft.com/default.aspx?scid=kb;[LN];Q191239
 104.92   cToBase64 / CryptBinaryToString API call
 156.86   z2ToBase64 / Dan's 'standard' C code (modified, see below)
 188.78   zToBase64 / Dan's 'standard' C code
 198.13   mToBase64 / Dan's "mystery" algorithm (C code)

Note that the ASM and API function timings are consistent with previous measurements, but the C code got much faster (as expected if the compiler optimization was wrong).

It's interesting that both of the C functions are considerably faster than the ASM functions and that both of them are almost twice as fast as Microsoft's CryptoAPI code :-)  The ASM code generated by the compiler is shown in the ASM thread.

=-=-=-=-=-=-=-=-=-=
In my 'standard C' algorithm, I changed just these lines:
      b1= BYTE((d & 0x00FC0000) >> 18);
      b2= BYTE((d & 0x0003F000) >> 12);
      b3= BYTE((d & 0x00000FC0) >> 6 );
      b4= BYTE((d & 0x0000003F) >> 0 );
...to what I thought would be faster... this sequence:
      b4= BYTE(d & 0x0000003F);
      d >>= 6;
      b3= BYTE(d & 0x0000003F);
      d >>= 6;
      b2= BYTE(d & 0x0000003F);
      d >>= 6;
      b1= BYTE(d & 0x0000003F);

It turned out that the modified code was actually slower!  And not by a trivial amount... fully 17% slower!  It just goes to show that sometimes the intuitive approach is not the best/fastest.

Attempting to avoid the use of the intermediate variables b1,b2,b3,b4 I tried:
      p[3]= abDigitLookup[ (d & 0x0000003F) ]; d >>= 6;
      p[2]= abDigitLookup[ (d & 0x0000003F) ]; d >>= 6;
      p[1]= abDigitLookup[ (d & 0x0000003F) ]; d >>= 6;
      p[0]= abDigitLookup[ (d & 0x0000003F) ];
      p += 4;
...which gave a very slight boost.
But there was no additional gain from combining into longer statements like so:
      *p++ = abDigitLookup[ BYTE((d & 0x00FC0000) >> 18) ];
      *p++ = abDigitLookup[ BYTE((d & 0x0003F000) >> 12) ];
      *p++ = abDigitLookup[ BYTE((d & 0x00000FC0) >> 6 ) ];
      *p++ = abDigitLookup[ BYTE((d & 0x0000003F)      ) ];

-- Dan
DanRollins

grg99,
I missed your last post... sorry.

That is a significant point... as you cycle through the large source/input buffer, the CPU will continuously update the L1 cache, possibly to the detriment of ultra-fast access to the lookup table.

I wonder if there is a way to tell the CPU that a particular part of the cache (the lookup table) is more important than the data that you have already processed in the source buffer.  On the other hand, if it is watching the action (e.g., counting accesses), it could notice that the lookup table is used often and it might decide to keep it in cache (these modern CPUs are sometimes very smart about stuff like that :-)

-- Dan
Infinity08

DanRollins

Thanks for the submission!
That's a clever and interesting use of lookup tables, fer sure :-)

>> What are the L1 and L2 cache sizes of your CPU btw?

There is a sticker on my PC indicating an L2 cache of 512K.  I don't know for sure how large the L1 cache is, but a standard size for an AMD Athlon 2800 L1 cache is 64KB, so that's a good guess.  If so, then even these larger lookup tables should fit cleanly in the fastest cache.

The timing results:
MB/sec    Internal name // Description
--------  -----------------------------------------------------------------------------------------------------------
168.43  ToBase64_infinity08_standard / Infinity08 (std)
181.21  ToBase64_infinity08_lookup_b / Infinity08 (multi-table lookup, 6K of tables)
  79.59  ToBase64_infinity08_standard (no optimization)
  90.01  ToBase64_infinity08_lookup_b (no optimization)

188.78  zToBase64 / Dan's 'standard' C code
198.13  mToBase64 / Dan's "mystery" algorithm (C code)

That is, your second version is about 7% faster than your "standard" one, but still slightly slower than "Dan's std" and "Dan's mystery algorithm."

Before you ask... Yes, I was very careful to populate the lookup tables outside of the timer -- I even tried doing it in InitDialog() just to be certain -- and the throughput was no different than if done just before starting the clock.

I also verified the optimization was set at "max speed" and I did a "sanity check" by re-running the other tests -- which came out similar to the previously-posted values.

=--=-=-=-=-=-=-=-=-=
It seems to me that using the more extensive lookup tables *ought* to increase the speed by a larger factor.  Your main loop is very tight, requiring only about 79 bytes of opcodes (compared to 93 for my "std" function).  In trying to analyse the code, I think the bottleneck is that there is still a lot of index calculating going on -- primarily when determining the index values into the 2-d tables (lookup_table_b2 and lookup_table_b3).  Here is the ASM emitted by the compiler for the main loop of the ToBase64_infinity08_lookup_b function:

          >>  *pszOutBuf++ = lookup_table_b1[*pSrc];
    xor     edx,edx
    lea     edi,[esi+1]
    mov     dl,byte ptr [esi]
          >> *pszOutBuf++ = lookup_table_b2[*pSrc & 0x03][*(pSrc + 1)];
    xor     ebx,ebx
    inc     eax
    mov     dl,byte ptr lookup_table_b1 (0040766c)[edx]
    mov     byte ptr [eax-1],dl
    mov     dl,byte ptr [esi]
    mov     bl,byte ptr [edi]
    and     edx,3
    shl     edx,8
    lea     esi,[edi+1]
    inc     eax
    mov     dl,byte ptr [edx+ebx+40626Ch]
          >> *pSrc++;
          >> *pszOutBuf++ = lookup_table_b3[*pSrc & 0x0F][*(pSrc + 1)];
    xor     ebx,ebx
    mov     byte ptr [eax-1],dl
    mov     dl,byte ptr [edi]
    mov     bl,byte ptr [esi]
    and     edx,0Fh
    shl     edx,8
    inc     eax
    mov     dl,byte ptr [edx+ebx+40666Ch]
    mov     byte ptr [eax-1],dl
          >> *pSrc++;
          >> *pszOutBuf++ = lookup_table_b4[*pSrc];
    xor     edx,edx
    mov     dl,byte ptr [esi]
    inc     eax
          >> *pSrc++;
    inc     esi
    dec     ebp
    mov     dl,byte ptr lookup_table_b4 (0040776c)[edx]
    mov     byte ptr [eax-1],dl
    jne     ToBase64_infinity08_lookup_b+2Dh ; (first opcode above)
    pop     edi
    pop     ebp
    pop     ebx

=-=-=-=-=-=-=--=
Although there is out-of-sequence "obfuscation" by the compiler, a statement like:
     lookup_table_b2[*pSrc & 0x03][*(pSrc + 1)];
is a lookup into a 2-d array, which means that two index values need to be calculated...

          >>  *pszOutBuf++ = lookup_table_b1[*pSrc];       // simple 1-d lookup
    xor     edx,edx
    mov     dl,byte ptr [esi]                   // get source byte
    mov     dl,byte ptr lookup_table_b1[edx]    // look it up
    mov     byte ptr [eax-1],dl                 // store the output

          >>  *pszOutBuf++ = lookup_table_b2[*pSrc & 0x03][*(pSrc + 1)]; ;; complex 2-d lookup
    xor     edx,edx
    mov     dl,byte ptr [esi]                   // get src byte
    and     edx,3                               // calc first part of the index
    shl     edx,8
    xor     ebx,ebx
    mov     bl,byte ptr [edi]                   // calc second part of the index
    mov     dl,byte ptr [edx+ebx+40626Ch]       // look up the value
    mov     byte ptr [eax-1],dl                 // store the output

Hoping to cut down some of the work, I tried this:  We all know that...
      char table[4][256];
      z= table[x][y];
...equates to...
      char table[1024];
      z= table[(x*256) + y];
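That equivalence is easy to verify directly; here is a standalone check (the fill pattern is an arbitrary choice of mine):

```cpp
#include <cassert>

char table2d[4][256];

// A [4][256] array and a flat pointer to its first element address the same
// bytes, so table2d[x][y] and flat[(x*256) + y] must always agree.
bool FlattenedIndexingMatches()
{
    for (int x = 0; x < 4; x++)
        for (int y = 0; y < 256; y++)
            table2d[x][y] = char((x * 31 + y) & 0x7F);   // arbitrary fill

    const char* flat = &table2d[0][0];
    for (int x = 0; x < 4; x++)
        for (int y = 0; y < 256; y++)
            if (flat[(x * 256) + y] != table2d[x][y])
                return false;
    return true;
}
```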

So I used this code:
    char* t2= &lookup_table_b2[0][0];
    char* t3= &lookup_table_b3[0][0];

    while (nSrcLen >= 3) {
          BYTE b1= *pSrc++;
          BYTE b2= *pSrc++;
          BYTE b3= *pSrc++;
          *pszOutBuf++ = lookup_table_b1[b1];
          *pszOutBuf++ = t2[((b1 & 0x03)*256)+b2];
          *pszOutBuf++ = t3[((b2 & 0x0F)*256)+b3];
          *pszOutBuf++ = lookup_table_b4[b3];
          nSrcLen -= 3;
    }

But the score came out the same.  The local variables b1,b2,b3 wind up as loads of registers (never getting saved to memory), so that shouldn't cause a slowdown.

After trying several alternatives, I have to surmise that though this particular lookup scheme improves upon a standard 64-byte table lookup, we are not seeing maximal performance.  Yet :-)
Infinity08

>> Before you ask... Yes, I was very careful to populate the lookup tables outside of the timer -- I even tried doing it in InitDialog() just to be certain -- and the throughput was no different than if done just before starting the clock.

Even if you put it inside the timer, it shouldn't change a lot, because we're processing about 1GB of data.

The problem is that using lookup tables that way makes the algorithm very compiler dependent (memory architecture dependent). When I tested it on my laptop (Intel Pentium M 1.5 GHz - 64KB L1, 1MB L2), I got these results:

    My standard code (no optimisations): 80.92 MB/s
    My standard code (with optimisations): 95.74 MB/s
    My lookup code b (no optimisations): 84.84 MB/s
    My lookup code b (with optimisations): 313.09 MB/s

You notice the huuuge increase in performance?  My bet is that the size of my lookup tables is exactly right for my system, but not for yours. Another possibility is different compiler flags. For the record, I used these for g++:

    -Wall -ansi -fexpensive-optimizations -O3

>> It seems to me that using the more extensive lookup tables *ought* to increase the speed by a larger factor.

Yep, but a lot depends on the memory caching that happens. If after 1 pass, all 4 lookup tables are in the L1 cache completely, then the other 999 passes will be lightning fast. If however they get flushed out regularly, the advantage of the lookup tables disappears fast.
My code will be fast when encoding large amounts of data (like in this setup), but for smaller amounts of data, it doesn't win as much and will probably even be slower than regular algorithms.

>> In trying to analyse the code, I think the bottleneck is that there is still a lot of index calculating going on -- primarily when determining the index values into the 2-d tables

I would love to eliminate those 2D tables indeed, and I've tried a bit, but everything I tried until now resulted in worse performance.

>> I have to surmise that though this particular lookup scheme improves upon a standard 64-byte table lookup, we are not seeing maximal performance.  Yet :-)

I'm not giving up :)
DanRollins

Wow, that is a rather dramatic difference for your _d version.
I'll see if I can find a "Genuine Intel" processor to see if that is the difference (I'm testing with an AMD).
Also, there may be a way to force the lookup tables into cache and keep them there (at least in the ASM versions).  I'll see if I can find out anything about that.

=-=-=-=-=-=-=
I love synergy... your code illustrated a possibility that I had not considered... a 256-byte table lets us avoid a mask operation on each lookup.

char chrTbl256[]=
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

void z7ToBase64( BYTE* pSrc, int nSrcLen, char* pszOutBuf )
{
      char* p= pszOutBuf;
      DWORD d;
      for ( int j=0; j< nSrcLen; j+=3 ) {
            d = *pSrc++; d <<= 8;
            d|= *pSrc++; d <<= 8;
            d|= *pSrc++;

            p[3]= chrTbl256[ BYTE(d) ]; d >>= 6;   // note: avoids the & 0x0000003F
            p[2]= chrTbl256[ BYTE(d) ]; d >>= 6;
            p[1]= chrTbl256[ BYTE(d) ]; d >>= 6;
            p[0]= chrTbl256[ BYTE(d) ];

            p+= 4;
      }
      // fixup for final bytes (padding) not shown
}

That avoids 4 billion AND operations across the test and certainly has a measurable effect.
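The "fixup for final bytes" that z7ToBase64 omits could look something like this sketch of mine (not Dan's actual code): after the main loop consumes every full 3-byte group, a 1- or 2-byte tail still produces exactly four output characters.

```cpp
#include <cassert>

static const char kTail64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode the leftover bytes (nRem = nSrcLen % 3) and pad with '='.
// Returns the number of characters written: 0 or 4.
int Base64PadTail(const unsigned char* pSrc, int nRem, char* p)
{
    if (nRem == 1) {
        p[0] = kTail64[pSrc[0] >> 2];
        p[1] = kTail64[(pSrc[0] << 4) & 0x30];
        p[2] = '=';
        p[3] = '=';
        return 4;
    }
    if (nRem == 2) {
        p[0] = kTail64[pSrc[0] >> 2];
        p[1] = kTail64[((pSrc[0] << 4) & 0x30) | (pSrc[1] >> 4)];
        p[2] = kTail64[(pSrc[1] << 2) & 0x3C];
        p[3] = '=';
        return 4;
    }
    return 0;   // input length was a multiple of 3: nothing to do
}
```

Note that the main loop of z7ToBase64 as posted would also need its bound changed to stop before the partial group, e.g. `j+3 <= nSrcLen`, before handing the tail to a fixup like this.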

MB/sec    Internal name // Description
--------  -----------------------------------------------------------------------------------------------------------
188.78  zToBase64 / Dan's 'standard' C code
197.13  zToBase64 / Dan's 'standard' C code with 256-byte lookup table (thanks to Infinity08's idea)

I predict that in the ASM version, that very simple and obvious concept (simple and obvious once somebody shows it to you, lol :-) will have an even greater impact.  That's because in ASM, we can swap bytes around very easily (e.g., just clear EDX, then move, say, BL (source data) to DL and use EDX as the lookup index).

-- Dan
Infinity08

Ok, a small update: made the lookup tables smaller, and saved some operations in the loop:

    void ToBase64_infinity08_lookup_e(BYTE *pSrc, int nSrcLen, char *pszOutBuf) {
      BYTE *end = pSrc + nSrcLen - (nSrcLen % 3);
      while (pSrc < end) {
        *pszOutBuf++ = lookup_table_e1[*pSrc];
        *pszOutBuf++ = lookup_table_e2[(*pSrc & 0x03) | (*(pSrc + 1) & 0xF0)];
        pSrc++;
        *pszOutBuf++ = lookup_table_e3[(*pSrc & 0x0F) | (*(pSrc + 1) & 0xC0)];
        pSrc++;
        *pszOutBuf++ = lookup_table_e4[*pSrc];
        pSrc++;
      }
      if (nSrcLen %= 3) {
        *pszOutBuf++ = lookup_table_e1[*pSrc];
        if (--nSrcLen) {
          *pszOutBuf++ = lookup_table_e2[(*pSrc & 0x03) + (*(pSrc + 1) & 0xF0)];
          pSrc++;
          *pszOutBuf++ = lookup_base64[(*pSrc << 2) & 0x3C];
          *pszOutBuf++ = '=';
        }
        else {
          *pszOutBuf++ = lookup_base64[(*pSrc << 4) & 0x30];
          *pszOutBuf++ = '=';
          *pszOutBuf++ = '=';
        }
      }
      return;
    }

lookup tables are generated like this:

    char lookup_table_e1[256];
    char lookup_table_e2[256];
    char lookup_table_e3[256];
    char lookup_table_e4[256];

    void generate_lookup_table_e() {
      int i = 0;

      for (i = 0; i < 256; i++) {
        lookup_table_e1[i] = lookup_base64[(i >> 2) & 0x3F];
      }
      for (i = 0; i < 256; i++) {
        lookup_table_e2[i] = lookup_base64[((i << 4) & 0x30) | ((i >> 4) & 0x0F)];
      }
      for (i = 0; i < 256; i++) {
        lookup_table_e3[i] = lookup_base64[((i << 2) & 0x3C) | ((i >> 6) & 0x03)];
      }
      for (i = 0; i < 256; i++) {
        lookup_table_e4[i] = lookup_base64[i & 0x3F];
      }
      return;
    }
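As a sanity check (my own, not part of the submission), the four e-tables can be regenerated and compared against a plain shift-and-mask encoder for arbitrary 3-byte groups:

```cpp
#include <cassert>

static const char lookup_base64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

char e1[256], e2[256], e3[256], e4[256];

// Same generation rules as generate_lookup_table_e() above.
void GenerateETables()
{
    for (int i = 0; i < 256; i++) {
        e1[i] = lookup_base64[(i >> 2) & 0x3F];
        e2[i] = lookup_base64[((i << 4) & 0x30) | ((i >> 4) & 0x0F)];
        e3[i] = lookup_base64[((i << 2) & 0x3C) | ((i >> 6) & 0x03)];
        e4[i] = lookup_base64[i & 0x3F];
    }
}

// For one 3-byte group (b1,b2,b3), the four table lookups must reproduce the
// four digits a straightforward shift-and-mask encoder emits.
bool ETablesMatchDirect(unsigned b1, unsigned b2, unsigned b3)
{
    unsigned d = (b1 << 16) | (b2 << 8) | b3;
    return e1[b1] == lookup_base64[(d >> 18) & 0x3F]
        && e2[(b1 & 0x03) | (b2 & 0xF0)] == lookup_base64[(d >> 12) & 0x3F]
        && e3[(b2 & 0x0F) | (b3 & 0xC0)] == lookup_base64[(d >> 6) & 0x3F]
        && e4[b3] == lookup_base64[d & 0x3F];
}
```

The trick the e-tables exploit is that each output digit depends on at most 6 "interesting" bits, so those bits can be packed into a single-byte index with disjoint masks instead of a 2-d lookup.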

Just out of curiosity, what optimisation flags do you use?
Infinity08

>> Wow, that is a rather dramatic difference for your _d version.
I tested the same code on an AMD Athlon64 3700+, and didn't notice the same huge performance increase I had on the Pentium M. I think I'll do the rest of the tests on this one, as it should be closer to yours :)

>> your code illustrated a possibility that I had not considered... a 256-byte table lets us avoid a mask operation on each lookup.
Yep, a lookup table can be a very handy tool when optimizing for speed. But it only really works when there's a lot of data to process, and when you can ensure that the lookup table stays in fast memory (preferably the L1 cache) the whole time. That's the difficulty, especially since there are so many different architectures, and they all have different behaviour when it comes to caching.
DanRollins

I've set my VC++ 6.0 compiler to use /O2 (oh 2, "maximize speed"), which the book says is equivalent to:
   /Og /Oi /Ot /Oy /Ob1 /Gs /Gf /Gy
   /Og   global optimizations, automatic-register allocation, and loop optimization
   /Oi   replaces certain function calls with intrinsic (inline) forms
   /Ot   instructs the compiler to favor speed over size
   /Oy   suppresses creation of frame pointers on the call stack
   /Ob1  inline expansion of functions (irrelevant)
   /Gs   (irrelevant, has to do with stack probes)
   /Gf   puts a single copy of identical strings into the .EXE file
   /Gy   (irrelevant) packages individual functions (COMDATs)
DanRollins

MB/sec    Internal name // Description
--------  --------------------------------------------------------------------------------------
168.43  ToBase64_infinity08_standard / Infinity08 (std)
181.21  ToBase64_infinity08_lookup_b / Infinity08 (multi-table lookup, 6K of tables)

179.79  ToBase64_infinity08_lookup_e / Infinity08 (multi-table lookup, 1K of tables)

By my measurements, it's a bit slower than your previous try (within the margin of error).

It would be tremendously interesting to me to see the opcodes generated for the version that you measured as having such a high rate (for comparison with the opcodes I posted in http:#17547635).  Your other measurements were relatively proportional to mine, but that one was off the scale by comparison.  Can you post that ASM listing?  Whatever tricks your compiler is using would come in handy in the ASM version, at least.

I'd hate to jump to the conclusion that it all boils down to L1 cache-related issues if it is really something else.
Infinity08

The compiler I'm using : g++ (GCC) version 3.3.1 (MinGW32 special 20030804-1)

flags without optimization : -Wall
flags with optimization : -Wall -ansi -fexpensive-optimizations -O3

I've just listed the main loops.

NO OPTIMIZATION:
============

a) standard:
    ----------

L99:
      cmpl      $2, 12(%ebp)
      jg        L101
      jmp       L100
L101:
      movl      16(%ebp), %eax
      movl      %eax, %edx
      movl      8(%ebp), %eax
      movzbl    (%eax), %eax
      shrb      $2, %al
      movzbl    %al, %eax
      andl      $63, %eax
      movzbl    _lookup_base64(%eax), %eax
      movb      %al, (%edx)
      leal      16(%ebp), %eax
      incl      (%eax)
      movl      16(%ebp), %eax
      movl      %eax, %ecx
      movl      8(%ebp), %eax
      movzbl    (%eax), %eax
      sall      $4, %eax
      movl      %eax, %edx
      andl      $48, %edx
      movl      8(%ebp), %eax
      incl      %eax
      movzbl    (%eax), %eax
      shrb      $4, %al
      movzbl    %al, %eax
      andl      $15, %eax
      orl       %edx, %eax
      movzbl    _lookup_base64(%eax), %eax
      movb      %al, (%ecx)
      leal      16(%ebp), %eax
      incl      (%eax)
      incl      8(%ebp)
      movl      16(%ebp), %eax
      movl      %eax, %ecx
      movl      8(%ebp), %eax
      movzbl    (%eax), %eax
      sall      $2, %eax
      movl      %eax, %edx
      andl      $60, %edx
      movl      8(%ebp), %eax
      incl      %eax
      movzbl    (%eax), %eax
      shrb      $6, %al
      movzbl    %al, %eax
      andl      $3, %eax
      orl       %edx, %eax
      movzbl    _lookup_base64(%eax), %eax
      movb      %al, (%ecx)
      leal      16(%ebp), %eax
      incl      (%eax)
      incl      8(%ebp)
      movl      16(%ebp), %eax
      movl      %eax, %edx
      movl      8(%ebp), %eax
      movzbl    (%eax), %eax
      andl      $63, %eax
      movzbl    _lookup_base64(%eax), %eax
      movb      %al, (%edx)
      leal      16(%ebp), %eax
      incl      (%eax)
      incl      8(%ebp)
      leal      12(%ebp), %eax
      subl      $3, (%eax)
      jmp       L99

b) multi-table lookup, 6K of tables (b):
    ---------------------------------------

L115:
      cmpl    $2, 12(%ebp)
      jg      L117
      jmp     L116
L117:
      movl    16(%ebp), %eax
      movl    %eax, %edx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      movzbl  _lookup_table_b1(%eax), %eax
      movb    %al, (%edx)
      leal    16(%ebp), %eax
      incl    (%eax)
      movl    16(%ebp), %eax
      movl    %eax, %ecx
      movl    8(%ebp), %eax
      incl    %eax
      movzbl  (%eax), %edx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      andl    $3, %eax
      sall    $8, %eax
      addl    %edx, %eax
      addl    $_lookup_table_b2, %eax
      movzbl  (%eax), %eax
      movb    %al, (%ecx)
      leal    16(%ebp), %eax
      incl    (%eax)
      incl    8(%ebp)
      movl    16(%ebp), %eax
      movl    %eax, %ecx
      movl    8(%ebp), %eax
      incl    %eax
      movzbl  (%eax), %edx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      andl    $15, %eax
      sall    $8, %eax
      addl    %edx, %eax
      addl    $_lookup_table_b3, %eax
      movzbl  (%eax), %eax
      movb    %al, (%ecx)
      leal    16(%ebp), %eax
      incl    (%eax)
      incl    8(%ebp)
      movl    16(%ebp), %eax
      movl    %eax, %edx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      movzbl  _lookup_table_b4(%eax), %eax
      movb    %al, (%edx)
      leal    16(%ebp), %eax
      incl    (%eax)
      incl    8(%ebp)
      leal    12(%ebp), %eax
      subl    $3, (%eax)
      jmp     L115

c) multi-table lookup, 1K of tables (e):
    ---------------------------------------

L136:
      movl    8(%ebp), %eax
      cmpl    -8(%ebp), %eax
      jb      L138
      jmp     L137
L138:
      movl    16(%ebp), %eax
      movl    %eax, %edx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      movzbl  _lookup_table_e1(%eax), %eax
      movb    %al, (%edx)
      leal    16(%ebp), %eax
      incl    (%eax)
      movl    16(%ebp), %eax
      movl    %eax, %ecx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      movl    %eax, %edx
      andl    $3, %edx
      movl    8(%ebp), %eax
      incl    %eax
      movzbl  (%eax), %eax
      andl    $240, %eax
      orl     %edx, %eax
      movzbl  _lookup_table_e2(%eax), %eax
      movb    %al, (%ecx)
      leal    16(%ebp), %eax
      incl    (%eax)
      incl    8(%ebp)
      movl    16(%ebp), %eax
      movl    %eax, %ecx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      movl    %eax, %edx
      andl    $15, %edx
      movl    8(%ebp), %eax
      incl    %eax
      movzbl  (%eax), %eax
      andl    $192, %eax
      orl     %edx, %eax
      movzbl  _lookup_table_e3(%eax), %eax
      movb    %al, (%ecx)
      leal    16(%ebp), %eax
      incl    (%eax)
      incl    8(%ebp)
      movl    16(%ebp), %eax
      movl    %eax, %edx
      movl    8(%ebp), %eax
      movzbl  (%eax), %eax
      movzbl  _lookup_table_e4(%eax), %eax
      movb    %al, (%edx)
      leal    16(%ebp), %eax
      incl    (%eax)
      incl    8(%ebp)
      jmp     L136

WITH OPTIMIZATION:
==============

a) standard:
    ----------

L171:
      movzbl  (%ecx), %eax
      subl    $3, %esi
      shrb    $2, %al
      andl    $63, %eax
      movzbl  _lookup_base64(%eax), %edx
      movb    %dl, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %edx
      movzbl  1(%ecx), %eax
      incl    %ecx
      sall    $4, %edx
      shrb    $4, %al
      andl    $48, %edx
      andl    $15, %eax
      orl     %eax, %edx
      movzbl  _lookup_base64(%edx), %eax
      movb    %al, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %edx
      movzbl  1(%ecx), %eax
      incl    %ecx
      sall    $2, %edx
      shrb    $6, %al
      andl    $60, %edx
      andl    $3, %eax
      orl     %eax, %edx
      movzbl  _lookup_base64(%edx), %eax
      movb    %al, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %eax
      incl    %ecx
      andl    $63, %eax
      movzbl  _lookup_base64(%eax), %edx
      movb    %dl, (%ebx)
      incl    %ebx
      cmpl    $2, %esi
      jg      L171

b) multi-table lookup, 6K of tables (b):
    ---------------------------------------

L195:
      movzbl  (%ecx), %eax
      subl    $3, %esi
      movzbl  _lookup_table_b1(%eax), %edx
      movb    %dl, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %eax
      movzbl  1(%ecx), %edx
      incl    %ecx
      andl    $3, %eax
      sall    $8, %eax
      movzbl  _lookup_table_b2(%edx,%eax), %eax
      movb    %al, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %eax
      movzbl  1(%ecx), %edx
      incl    %ecx
      andl    $15, %eax
      sall    $8, %eax
      movzbl  _lookup_table_b3(%edx,%eax), %eax
      movb    %al, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %eax
      incl    %ecx
      movzbl  _lookup_table_b4(%eax), %edx
      movb    %dl, (%ebx)
      incl    %ebx
      cmpl    $2, %esi
      jg      L195

c) multi-table lookup, 1K of tables (e):
    ---------------------------------------

L228:
      movzbl  (%ecx), %eax
      movzbl  _lookup_table_e1(%eax), %edx
      movb    %dl, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %edx
      movzbl  1(%ecx), %eax
      incl    %ecx
      andl    $3, %edx
      andl    $240, %eax
      orl     %eax, %edx
      movzbl  _lookup_table_e2(%edx), %eax
      movb    %al, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %edx
      movzbl  1(%ecx), %eax
      incl    %ecx
      andl    $15, %edx
      andl    $192, %eax
      orl     %eax, %edx
      movzbl  _lookup_table_e3(%edx), %eax
      movb    %al, (%ebx)
      incl    %ebx
      movzbl  (%ecx), %eax
      incl    %ecx
      movzbl  _lookup_table_e4(%eax), %edx
      movb    %dl, (%ebx)
      incl    %ebx
      cmpl    %edi, %ecx
      jb      L228
Infinity08

>> I'd hate to jump to the conclusion that it all boils down to L1 cache-related issues if it is really something else.
Well, I've tested the exact same executable on an AMD Athlon64 and an Intel Pentium M, with the results I described earlier.
I know that the comparison is unfair, as they're completely different architectures - but on the other hand, that's exactly what's causing the different behavior imo.

I would be interested to see you try the g++ compiler with the same flags I used, to see what results you get, because there could also be a difference in optimization between Visual C++ and GCC.

These are the optimizations enabled by the -O3 flag :

  -falign-functions : Align the start of functions to the machine dependent byte boundary
  -falign-jumps : Align branch targets to the machine dependent byte boundary
  -falign-labels : Align all branch targets to the machine dependent byte boundary
  -falign-loops : Align loops to the machine dependent byte boundary
  -fcaller-saves : Enable values to be allocated in registers that will be clobbered by function calls, by emitting extra instructions to save and restore the registers around such calls
  -fcprop-registers : Perform a copy-propagation pass to try to reduce scheduling dependencies
  -fcrossjumping : Perform cross-jumping transformation
  -fcse-follow-jumps : In common subexpression elimination, scan through jump instructions when the target of the jump is not reached by any other path
  -fcse-skip-blocks : This is similar to -fcse-follow-jumps, but causes CSE to follow jumps which conditionally skip over blocks
  -fdefer-pop : For machines which must pop arguments after a function call, the compiler normally lets arguments accumulate on the stack for several function calls and pops them all at once
  -fdelayed-branch : If supported for the target machine, attempt to reorder instructions to exploit instruction slots available after delayed branch instructions
  -fdelete-null-pointer-checks : Use global dataflow analysis to identify and eliminate useless checks for null pointers
  -fexpensive-optimizations : Perform a number of minor optimizations that are relatively expensive
  -fforce-mem : Force memory operands to be copied into registers before doing arithmetic on them
  -fgcse : Perform a global common subexpression elimination pass. This pass also performs global constant and copy propagation
  -fgcse-lm : When -fgcse-lm is enabled, global common subexpression elimination will attempt to move loads which are only killed by stores into themselves
  -fgcse-sm : When -fgcse-sm is enabled, a store motion pass is run after global common subexpression elimination
  -fguess-branch-probability : Guess branch probabilities using a randomized model
  -fif-conversion : Attempt to transform conditional jumps into branch-less equivalents
  -fif-conversion2 : Use conditional execution (where available) to transform conditional jumps into branch-less equivalents
  -finline-functions : Integrate all simple functions into their callers
  -floop-optimize : Perform loop optimizations: move constant expressions out of loops, simplify exit test conditions and optionally do strength-reduction and loop unrolling as well
  -fmerge-constants : Attempt to merge identical constants (string constants and floating point constants) across compilation units
  -foptimize-sibling-calls : Optimize sibling and tail recursive calls
  -fpeephole2 : Enable machine-specific peephole optimizations
  -fregmove : Attempt to reassign register numbers in move instructions and as operands of other simple instructions in order to maximize the amount of register tying
  -frename-registers : Attempt to avoid false dependencies in scheduled code by making use of registers left over after register allocation
  -freorder-blocks : Reorder basic blocks in the compiled function in order to reduce the number of taken branches and improve code locality
  -freorder-functions : Reorder functions in the object file in order to improve code locality
  -frerun-cse-after-loop : Re-run common subexpression elimination after loop optimizations have been performed
  -frerun-loop-opt : Run the loop optimizer twice
  -fsched-interblock : Schedule instructions across basic blocks
  -fsched-spec : Allow speculative motion of non-load instructions
  -fschedule-insns : If supported for the target machine, attempt to reorder instructions to eliminate execution stalls due to required data being unavailable
  -fschedule-insns2 : Similar to -fschedule-insns, but requests an additional pass of instruction scheduling after register allocation has been done
  -fstrength-reduce : Perform the optimizations of loop strength reduction and elimination of iteration variables
  -fstrict-aliasing : Allows the compiler to assume the strictest aliasing rules applicable to the language being compiled
  -fthread-jumps : Perform optimizations where we check to see if a jump branches to a location where another comparison subsumed by the first is found
Infinity08

>> (1) Converting two bytes at once by using a 64x64 lookup table.
I've actually tried that (and some variations on it), but it creates other problems that bring down performance:
  a) the total size of the lookup tables would be 8 kB, which could be too big for general usage (or it might not);
  b) you're dealing with 2D tables, which require a lot of operations just to calculate the address;
  c) each entry in the lookup table will be two bytes instead of one, so you're not gaining a lot in that respect.
The code I tried using this "trick" was slower than my standard code.

>> (2) Keeping the conversion table in the cache by periodic uses of cache PREFETCH instructions. Or on the x64, read-specific prefetch.
Can you give a bit more info on those? They sound interesting. Although, I think, if the data to be encoded is sufficiently "random", we won't need to keep the tables in the cache this way, as they should be kept there by the encoding process itself.

>> (3) Unrolling the main loop a few times.
Certain compiler optimizations already do that. It's worth a shot, but in this particular case, I'm not sure it'll win us much.

>> (4) spinning off separate threads to take advantage of multiple cores or CPUs.
We're kind of limited by Dan's test system here :) (2.6 GHz Athlon).

>> (5) Carefully overlapping instructions to take into account the two to four instruction pipes available on Pentium and newer CPUs.
That's usually handled by the compiler (if it's doing a good job), and is very difficult to control from C (you really need assembler to do this well).
And it's system dependent, which we should avoid, imo.

>> (6) Using MMX, SSE, and other instructions that process two to eight operands at one gulp.
Worth a try, I guess. I don't know the MMX and SSE instruction sets too well though ... And again, it's hard to control using C instead of ASM.

Infinity08

It might be fast, but it's not encoding the string correctly (SA==BIAABIAABIAABIAABIAA...) :) A few things:

1) you're depending on the endianness of the machine (running it on a machine with different endianness will mess up the algorithm).
2) the masking operations in the while loop don't do what you expect them to do.
3) you're depending on the sizes of WORDs and LONGs.
4) the *pChar += 3 part will only work to a certain extent, until the byte overflows.
5) the if at the end uses pszOutBuf while that has never been incremented, so it overwrites the start of the string.

Points 1) and 3) are the exact reasons why I didn't use the tricks you did ...
grg99

BTW, all those "*pSrc++;" lines almost certainly should be "pSrc++;". You want to bump the pointer, not the thing it points to.

DanRollins

Infinity08,
Thanks for posting that ASM. Â I'll be looking at it closely soon...

grg99,
The "mystery algorithm" that I mentiond earlier uses the 8096-byte table with the two-byte lookup scheme and a 12-bit index...
Â
I'm glad you hit on it ! (I felt sure it would come up sooner or later :-)

=-=-=-=-=-=-=-=-=-=-=-=
char Lookup12bits[ ((64*64)*2)+1 ]=  // 8192+1
"BABBBCBDBEBFBGBHBIBJBKBLBMBNBOBPBQBRBSBTB ... B6B7B8B9B+B/"
...
"9A9B9C9D9E9F9G9H9I9J9K9L9M9N9O9P9Q9R9S9T9...  5969798999+9/"
"+A+B+C+D+E+F+G+H+I+J+K+L+M+N+O+P+Q+R+S+T+...  5+6+7+8+9+++/"
"/A/B/C/D/E/F/G/H/I/J/K/L/M/N/O/P/Q/R/S/T/...  5/6/7/8/9/+//"
;

//------------------------- Dan's "mystery algorithm" (similar to grg99's code)
void mToBase64( BYTE* pSrc, int nSrcLen, char* pszOutBuf )
{
    char* p= pszOutBuf;
    DWORD d;
    WORD* pw=     (WORD*)pszOutBuf;
    WORD* pawTbl= (WORD*)Lookup12bits;

    while ( nSrcLen >= 3 ) {
        d = *pSrc++; d <<= 8;
        d|= *pSrc++; d <<= 8;
        d|= *pSrc++;
        nSrcLen -= 3;

        *pw++ = pawTbl[ d >> 12 ];  // no need for a mask
        *pw++ = pawTbl[ d & 0x00000FFF ];
    }
    // ... code to fixup for end-of-src logic
}
=-=-=-=-=-=-=-=-=-=-=-=
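For completeness, the elided end-of-src fixup is where the equal-sign padding from the problem statement comes in. Here is one possible sketch (not the actual omitted code; it is written against the plain one-character lookup table rather than the 12-bit pair table, and base64_tail is a made-up helper name):

```c
static const char lookup_base64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Tail fixup after the main 3-byte loop: 0, 1 or 2 source bytes remain.
   One leftover byte yields two digits plus "==", two leftover bytes yield
   three digits plus "=".  Returns a pointer just past the written output. */
char *base64_tail(const unsigned char *pSrc, int nLeft, char *pOut)
{
    if (nLeft == 1) {
        *pOut++ = lookup_base64[ (pSrc[0] >> 2) & 0x3F];
        *pOut++ = lookup_base64[ (pSrc[0] << 4) & 0x30];
        *pOut++ = '=';
        *pOut++ = '=';
    } else if (nLeft == 2) {
        *pOut++ = lookup_base64[ (pSrc[0] >> 2) & 0x3F];
        *pOut++ = lookup_base64[((pSrc[0] << 4) & 0x30) | ((pSrc[1] >> 4) & 0x0F)];
        *pOut++ = lookup_base64[ (pSrc[1] << 2) & 0x3C];
        *pOut++ = '=';
    }
    *pOut = '\0';
    return pOut;
}
```

Either way, the output length stays a multiple of four, which is what the decoder relies on.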

Infinity08 is right... it's all about endianness... For this algorithm to work, the table has to be arranged with the pairs in the opposite of the expected order -- as built by generate_lookup_table_grg12() or as in my static table.

My version is running a little faster than the others (on my box, by my measure, I'm getting 208.50 MB/s), so it's the fastest of all except for grg99's...

However, I can't post an "official" timing for grg99's code because the output is not yet correct. It's sure to be fast code, but some of what's missing will require a bit of additional code to fix (for instance, the endianness of the INPUT bytes needs to be rectified, and the source pointer needs to be updated -- you are currently reading and re-reading the same bytes over and over). The required changes will affect the timing, of course.

-- Dan
DanRollins

Re: grg99's other suggestions:

(1) 8192-byte (64*64*2) lookup table
I think this will be the fastest technique short of a 1,048,576-byte (64*64*64*4) lookup table -- which seems like overkill, and due to cache thrashing, may not be faster, anyway.

(2) PREFETCH instructions.
I could try this out on the ASM side, but I'm not sure how much difference it will make. Most L1 caches are far larger than the 8192-byte lookup table, so once it's been fetched, it gets accessed constantly -- so it seems likely to stay there.
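From the C side, the experiment is cheap to set up. This is just an untimed sketch of mine, using GCC's __builtin_prefetch builtin (other compilers would need their own intrinsic); the hint here is on the *source* data one cache line ahead, since the table should already be resident:

```c
static const char lookup_base64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Standard 3-bytes-in / 4-digits-out loop, plus a software prefetch of the
   source data one cache line (64 bytes) ahead of the current position.
   __builtin_prefetch is a GCC builtin; the #if keeps the code portable. */
void to_base64_prefetch(const unsigned char *pSrc, int nSrcLen, char *pszOut)
{
    while (nSrcLen >= 3) {
#if defined(__GNUC__)
        __builtin_prefetch(pSrc + 64);   /* hint: this line is needed soon */
#endif
        *pszOut++ = lookup_base64[ (pSrc[0] >> 2) & 0x3F];
        *pszOut++ = lookup_base64[((pSrc[0] << 4) & 0x30) | ((pSrc[1] >> 4) & 0x0F)];
        *pszOut++ = lookup_base64[((pSrc[1] << 2) & 0x3C) | ((pSrc[2] >> 6) & 0x03)];
        *pszOut++ = lookup_base64[  pSrc[2]       & 0x3F];
        pSrc += 3;
        nSrcLen -= 3;
    }
    *pszOut = '\0';   /* end-of-src padding logic omitted, as elsewhere */
}
```

Whether the hint wins anything here is exactly the open question; a hardware prefetcher may already be doing the same job for a sequential read pattern.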

(3) Â Unrolling the main loop a few times.
This *could* have a significant effect. I know that processing 6 input bytes into 8 output bytes (and running the loop nSrcLen/6 times) gave me a boost in some ASM code I was trying...
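In C, that kind of 6-in/8-out unroll might be sketched like this (an illustrative sketch only, not the ASM code referred to above, and the function name is made up; the leftover 0..5 bytes still need the ordinary loop plus '=' padding):

```c
static const char lookup_base64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* One iteration consumes 6 source bytes and emits 8 digits -- the standard
   3->4 body written out twice, halving the per-iteration loop overhead. */
void to_base64_unroll2(const unsigned char *pSrc, int nSrcLen, char *pOut)
{
    while (nSrcLen >= 6) {
        pOut[0] = lookup_base64[ (pSrc[0] >> 2) & 0x3F];
        pOut[1] = lookup_base64[((pSrc[0] << 4) & 0x30) | ((pSrc[1] >> 4) & 0x0F)];
        pOut[2] = lookup_base64[((pSrc[1] << 2) & 0x3C) | ((pSrc[2] >> 6) & 0x03)];
        pOut[3] = lookup_base64[  pSrc[2]       & 0x3F];
        pOut[4] = lookup_base64[ (pSrc[3] >> 2) & 0x3F];
        pOut[5] = lookup_base64[((pSrc[3] << 4) & 0x30) | ((pSrc[4] >> 4) & 0x0F)];
        pOut[6] = lookup_base64[((pSrc[4] << 2) & 0x3C) | ((pSrc[5] >> 6) & 0x03)];
        pOut[7] = lookup_base64[  pSrc[5]       & 0x3F];
        pSrc += 6;  pOut += 8;  nSrcLen -= 6;
    }
    /* 0..5 remaining bytes: fall through to the ordinary 3->4 loop + padding */
    pOut[0] = '\0';
}
```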

(4) take advantage of multiple cores or CPUs.
I'd certainly explore this if there was a $1,000,000 prize... It would probably nearly double the throughput to have two cores working simultaneously... say one on the first half and the other on the last half.

(5) Â overlapping/pairing instructions (optimize for dual pipeline architecture)...
As Infinity08 mentioned, the compiler will probably do a pretty good job of this. I did see some modest gains in some ASM code I was playing around with. But the inner loop ends up being very tight, so there aren't a lot of opportunities... you pretty much need the result of each register as soon as you can set its value.

(6) Â Using MMX ... two to eight operands at one gulp.
My most recent ASM try (unpublished) builds an 8-byte output into an MMX register (from six bytes of input) and dumps a quadword at each pass. There is even a "non-temporal" version of the MOVQ opcode (MOVNTQ) that does not fill the cache with the output data (which never gets used again). This technique is, indeed, the fastest I've timed yet... I'll post it in the ASM thread in due time.
Infinity08

>> The "mystery algorithm" that I mentiond earlier uses the 8096-byte table with the two-byte lookup scheme and a 12-bit index...
The only problem with this kind of algorithm, is that you need to take into account machines with different endianness. That's still doable, just use different code on different machines. But it gets harder on machines that have 1byte words for example.
It was my first thought too, but I abandoned the idea for the reasons I mentioned.
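Picking the right table layout at run time only takes a trivial check, e.g. this sketch (is_little_endian is a made-up helper name):

```c
/* One common way to choose a table variant at startup: test how the machine
   lays out a multi-byte integer in memory.  Returns 1 on little-endian
   machines (the low-order byte comes first), 0 on big-endian ones. */
int is_little_endian(void)
{
    unsigned int one = 1;
    return *(unsigned char *)&one == 1;
}
```

The check runs once, so its cost is irrelevant next to the encoding loop; the remaining cost is maintaining two table-generation paths.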

>> BTW, all those "*pSrc++;" lines, they almost certainly should be "pSrc++;" You want to bump the pointer, not the thing it points to.
heh, that's a code editing problem (copy-paste). But the operator precedence still makes sure that the ++ is done before the *. So, the dereference operation is useless, but doesn't create a problem either, as it's optimized out by the compiler.
grg99

>But the operator precedence still makes sure that the ++ is done before the *.

Huh? To my knowledge, *p++ means, and has always meant, dereference p, then increment p.

But you're right, most compilers will optimize away the useless load, unless p is declared volatile.

And I was wrong, it doesn't increment what p points to. Dang language.

I will try modifying my code to handle both endians, hope that doesn't slow it down too much.
Also it would be fun to try a 3-hextet lookup table, although that's going to be wildly cache-unfriendly, all for a 50% gain at most. Oh, what fun!

Also an interesting point: if I comment out the line that stores the output data in the output array (two bytes at a time), the speed DOUBLES, which implies the code is going just about as fast as memory can accept 16-bit words of data. One possibility is to buffer up 32 bits of output at a time and see if that helps a bit.
Infinity08

>> Huh? To my knowledge, *p++ means, and has always meant, dereference p, then increment p.
I meant that the ++ has higher precedence, so *p++ is the same as *(p++). I might have formulated it a bit incorrectly by using the word "done", though :) (the "done" I used referred to the compilation, not the execution)
DanRollins

I saw the *p++ and (because it worked :-) figured the same thing: It increments the pointer, which now points to a new location, but since the code does not use the data at the pointer, the effect is the same as p++. I guess that if you want to increment the value that is AT the pointer, you need to use (*p)++ to override operator precedence.
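A tiny sketch makes the precedence behavior easy to verify:

```c
/* *p++ parses as *(p++): the pointer advances, and the dereferenced value is
   read and (here) discarded.  (*p)++ is the form that increments the pointee. */
int demo_ptr_increment(void)
{
    int a[2] = { 10, 20 };
    int *p = a;

    *p++;        /* advances p; a[0] is untouched           */
    (*p)++;      /* p now points at a[1]; increments it to 21 */

    return (p == a + 1) && a[0] == 10 && a[1] == 21;
}
```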

Other notes:
The Microsoft ATL Server includes a header file (atlenc.h) that contains a Base64 Encoder. In case you don't have Visual Studio 2003, I found the source code here:

The code does a number of things, such as inserting line breaks every 76 bytes (which is mentioned in the RFC, and might have made this a slightly different challenge, but it's trivial enough that I guess we can ignore it). But when you pare out such superfluous stuff, the Microsoft code has a main loop that looks like this:

int nLoopCnt(nSrcLen/3);

for ( int j=0; j< nLoopCnt; j++ ) {
    DWORD dwCurr(0);
    for (int n=0; n<3; n++) {
        dwCurr |= *pSrc++;
        dwCurr <<= 8;
    }
    for (int k=0; k<4; k++) {
        BYTE b = (BYTE)(dwCurr>>26);
        *pszOutBuf++ = abDigitLookup[b];
        dwCurr <<= 6;
    }
}
//--- end-of-source logic omitted

This comes in at 166 MB/s. When you unroll the loops, it goes to 188 MB/s (about the rate of "Dan's std" code).

Tell me if I'm wrong, but it looks like this will work regardless of endianness...

They used a clever technique to break out the 6-bit indexes, that interestingly enough, matches with the trick used in craigwardman's technique on the ASM side:

   1) Accumulate three source bytes, shifting to the left, 8 bits at a time
      (the first byte ends up in the top of the DWORD accumulator)
   2) The lookup index is the accumulator value, shifted right by 26 (the upper 6 bits)
      (this eliminates one masking operation)
   3) Look up the output byte and save it
   4) Move the remaining bits to the left by 6 bits
   5) ... rinse and repeat (steps 2-4, four times)

I think it's an interesting way to do it, mainly because it never would have occurred to me! In my heyday of ASM programming on the 8086 (circa 1985), a SHIFT operation took a specific number of clock cycles *plus* some cycles per bit position shifted. So a >> 26 still *looks like* an expensive operation to me :-)
DanRollins

It does seem that using a single DWORD read to grab the three bytes could be more efficient than three separate 1-BYTE reads... But the endian thing makes it a hassle. There are several easy solutions in ASM, but I can't work out an efficient way to code it in C...

// ---------- the 'standard' way to collect the three bytes
      DWORD d;
      d = *pSrc++; d <<= 8;
      d|= *pSrc++; d <<= 8;
      d|= *pSrc++;

// ---------- a try that reads a full DWORD and rearranges...
      DWORD d, e= *(DWORD*)pSrc;
      d = ((e & 0x0000FF00))
        | ((e & 0x000000FF ) << 16)
        | ((e & 0x00FF0000 ) >> 16);
      pSrc += 3;

The second version ran considerably slower than the first in my time trials.

Does anybody have any ideas on how to get a potential speedup by using full-DWORD reads without losing time rearranging the bytes for subsequent processing?
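One candidate idea (a sketch of mine, not timed on the test box): on x86 the rearrangement is exactly a byte swap, so the triple can be fetched as bswap(e) >> 8. Newer GCC versions expose __builtin_bswap32 for exactly this; the portable shift form below should compile to similar code on a good compiler. Note the assumptions: a little-endian load, and a read of one byte past the final triple, so the source buffer needs a byte of slack.

```c
#include <string.h>

/* Build d = b0<<16 | b1<<8 | b2 from one 32-bit load instead of three
   byte loads: byte-swap the word and drop the low byte.  On x86 the swap
   can be a single BSWAP instruction.  Assumes a little-endian machine and
   one readable byte of slack after the triple. */
unsigned int load_triple_le(const unsigned char *pSrc)
{
    unsigned int e;
    memcpy(&e, pSrc, 4);                        /* little-endian DWORD load */
    e = (e >> 24) | ((e >> 8) & 0x0000FF00)     /* portable 32-bit bswap    */
      | ((e << 8) & 0x00FF0000) | (e << 24);
    return e >> 8;                              /* b0<<16 | b1<<8 | b2      */
}
```

With the hex example from the top of the thread, bytes 0x07 0xFF 0x02 should come back as 0x0007FF02, ready for the 6-bit extraction.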
Infinity08

>> Tell me if I'm wrong, but it looks like this will work regardless of endianness...
It will, but really it's not much different than this (my standard code in this case - but any similar approach works too) :

      while (nSrcLen >= 3) {
        *pszOutBuf++ = lookup_base64[ (*pSrc >> 2) & 0x3F];
        *pszOutBuf++ = lookup_base64[((*pSrc << 4) & 0x30) | (((*(pSrc + 1)) >> 4) & 0x0F)];
        pSrc++;
        *pszOutBuf++ = lookup_base64[((*pSrc << 2) & 0x3C) | (((*(pSrc + 1)) >> 6) & 0x03)];
        pSrc++;
        *pszOutBuf++ = lookup_base64[ (*pSrc      ) & 0x3F];
        pSrc++;
        nSrcLen -= 3;
      }

When we purely count operations (without compiler optimizations), we get :

    lookup : 6
    assign : 4
    in/decrement : 10
    shift : 5
    and : 6
    or : 2
    array lookup : 4
    comparison : 1

And for this code :

    int nLoopCnt(nSrcLen/3);
    for ( int j=0; j< nLoopCnt; j++ ) {
      DWORD dwCurr(0);
      for (int n=0; n<3; n++) {
        dwCurr |= *pSrc++;
        dwCurr <<= 8;
      }
      for (int k=0; k<4; k++) {
        BYTE b = (BYTE)(dwCurr>>26);
        *pszOutBuf++ = abDigitLookup[b];
        dwCurr <<= 6;
      }
    }

we get :

    lookup : 3
    assign : 8
    in/decrement : 15
    shift : 11
    and : 0
    or : 3
    array lookup : 4
    comparison : 8

Comparing these two tables (I know they're not an exact measurement of performance, but they give a good idea), we can make these observations :

1) the lookups and assigns can be ignored, as the compiler will optimize them so it comes down to 3 lookups and 4 assigns anyway. (without optimization, we're looking at 10 vs. 11 lookups and assigns)

2) the increments and decrements can be easily optimized by the compiler, and are of similar magnitude anyway : 10 vs. 15

3) the number of "real" operations (shift, and, or) is about the same : 13 vs. 14

4) the array lookups are the same (4), which is to be expected, since the same lookup table is used

5) the comparisons are easily optimized by the compiler too, so they can be ignored

Which makes 3) the most important observation : both codes will take about the same time.

The thing is, if you want to keep your code endianness and word-size friendly, then you can't get much better code than the ones we already came up with, because you'll always end up with similar operation counts as I gave above.
The only real improvement I can see so far is to increase the size of the lookup table, and take advantage of that. But the problem is that the performance of code making use of that will change a lot between different architectures.

>> Does anybody have any ideas on how to get a potential speedup by using full-DWORD reads without losing time rearranging the bytes for subsequent processing?
I've been trying to look for a way since you posted this challenge, but I always end up with the same amount of ands, ors and shifts.

The only approach that would work in theory is to have a 4 * 2^24 byte lookup table, which maps 3 bytes to 4 bytes. But in practice that doesn't work, since the size of the L1 cache is too limited (I still have to find a 67 MB L1 cache heh). But just for argument's sake, this would allow code like this :

      unsigned int src = 0;
      while (nSrcLen >= 3) {
        src = *pSrc++ << 16;
        src |= *pSrc++ << 8;
        src |= *pSrc++;
        *((unsigned int*) pszOutBuf) = lookup_base64[src];
        pszOutBuf += 4;
        nSrcLen -= 3;
      }

Granted, it also depends on 4-byte unsigned ints, but it doesn't depend on endianness (as long as the lookup table is generated on the same machine), and it's as fast as you can get (minus some minor optimizations probably) IF you have a sufficiently large L1 cache ...
DanRollins

It's instructional to test the idea of using a Monster Table:

DWORD adwTbl[256*256*256];   // a 64MB lookup table!

void GenMonsterTbl() {
    BYTE* pb = (BYTE*) &adwTbl[0];
    for ( int j=0; j<256*256*256; j++ ) {
        DWORD d = j;
        *pb++ = abDigitLookup[ BYTE((d & 0x00FC0000) >> 18) ];
        *pb++ = abDigitLookup[ BYTE((d & 0x0003F000) >> 12) ];
        *pb++ = abDigitLookup[ BYTE((d & 0x00000FC0) >> 6 ) ];
        *pb++ = abDigitLookup[ BYTE((d & 0x0000003F)      ) ];
    }
}

void ToBase64_MonsterTbl( BYTE* pSrc, int nSrcLen, char* pszOutBuf )
{
    DWORD* pdw = (DWORD*)pszOutBuf;
    for ( int j=0; j < nSrcLen; j+=3 ) {
        DWORD d;
        d = *pSrc++; d <<= 8;
        d |= *pSrc++; d <<= 8;
        d |= *pSrc++;
        *pdw++ = adwTbl[d];  // output four bytes
    }
}

As predicted, the speed is abysmal... around 13MB/s.

Since the code itself is so concise (a mere 42 bytes of opcodes in the loop), the difference can ONLY be memory accesses, and though it makes FAR FEWER such accesses (a third as many table lookups and a fourth as many output writes), virtually every lookup is a cache miss.  Since the accesses jump around so much, it's effectively like having no cache at all!

And please don't tell me "I told you so" ... I already knew it. Â But I'm a pragmatic type of guy and it's reassuring to see predicted results :-)

=-=-=-=-=-=-=-=-=
I'm still mystified by the ultra-fast times measured on Infinity08's "6K of tables" code... the ASM output will take some deciphering before I can understand it.  I'll take another look if I can find time later today.

-- Dan
DanRollins

OK!
I analyzed Infinity08's Xxx_b function's pseudo-ASM listing and converted it to an _asm block.
The GCC compiler used some opcode techniques that saved time... for instance, it uses
   movzx eax, (...any byte)
to avoid needing
   xor   eax, eax
   mov   al, (...any byte)
when it will need to use all of eax as an index.  Other than that, the low-level instruction streams are basically similar.

My timing results were similar to earlier:

MB/sec   Internal name // Description (on AMD Athlon 2800)
-------- --------------------------------------------------------------------------------------
180.45   ...infinity08_lookup_b / Infinity08 (6K of tables)  Compiled C
181.81   ...infinity08_lookup_b / Infinity08 (6K of tables)  My ASM conversion

208.43   mToBase64 / Dan's "mystery algorithm" (8096-byte lookup table)

HOWEVER!!!!!

I then ran the exact same executables on 2.5 GHz Celeron (genuine Intel) with these results:

MB/sec   Internal name // Description (on 2.5 GHz Celeron)
-------- --------------------------------------------------------------------------------------
306.53   ...infinity08_lookup_b / Infinity08 (6K of tables)  Compiled C
432.21   ...infinity08_lookup_b / Infinity08 (6K of tables)  My ASM conversion

453.01   mToBase64 / Dan's "mystery algorithm" (8096-byte lookup table)

The times are all significantly better on the Intel architecture, even though it's running a Celeron (which is known to have smallish caches).

The GCC-generated output (my ASM conversion) was significantly faster than the same algorithm compiled/optimized by the VC++ compiler.  I think that tells us that the GCC optimizer is smarter (at least when the output is executed on Intel).  The difference between the two (26%) is dramatic... just as Infinity08 said, and this confirms his measurements.

I really can't see enough difference in the opcodes to justify such a dramatic difference, but there it is.  The Intel architecture must be better at cache prediction and pipelining instructions -- a LOT better.

I also ran the 2-byte lookup version for comparison... it's still the fastest (on either architecture), but only by a negligible 0.5% on the Intel Celeron (1.5% on my Athlon).
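For anyone playing along at home, here's a bare-bones sketch of the 2-byte-lookup idea (a simplified reconstruction for illustration, NOT my actual mToBase64 source; padding is omitted and the names are my own): a 4096-entry table maps each 12-bit group straight to its two output characters, so every 3-byte input group costs two table hits instead of four single-digit lookups.

```cpp
#include <cstdint>
#include <cstring>

static const char b64digits[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// 4096 entries x 2 chars = 8 KB: each 12-bit value maps directly
// to its two base64 output characters.
static char twoCharTbl[4096][2];

static void GenTwoCharTbl() {
    for (int i = 0; i < 4096; ++i) {
        twoCharTbl[i][0] = b64digits[i >> 6];    // high 6 bits
        twoCharTbl[i][1] = b64digits[i & 0x3F];  // low 6 bits
    }
}

// Encodes whole 3-byte groups only ('=' padding left out for brevity):
// two table hits per group instead of four.
static void ToBase64_TwoCharTbl(const uint8_t* pSrc, int nSrcLen, char* pszOut) {
    while (nSrcLen >= 3) {
        uint32_t d = (pSrc[0] << 16) | (pSrc[1] << 8) | pSrc[2];
        memcpy(pszOut,     twoCharTbl[d >> 12],   2);  // top 12 bits
        memcpy(pszOut + 2, twoCharTbl[d & 0xFFF], 2);  // bottom 12 bits
        pSrc += 3; pszOut += 4; nSrcLen -= 3;
    }
    *pszOut = '\0';
}
```

Note the table is stored as char pairs rather than packed 16-bit words, so it works the same on either endianness.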

-- Dan
Infinity08

>> The times are all significantly better on the Intel architecture, even though it's running a Celeron (which is known to have smallish caches).
Well, we can't forget which CPUs we tried :

Intel : Pentium M and Pentium Celeron
AMD : Athlon XP and Athlon64

There's a serious difference in architecture between the two. But more importantly : both Intel processors are in a lower performance class than the AMD processors. It would be interesting to see how the code performs on a Pentium 4 or a Duron or something similar.

It could be that encoding algorithms that use small lookup tables have an advantage on systems in the low performance class (Pentium M, Celeron, Duron), however contradictory that may sound.
grg99

You have to watch out: some of the newer instructions like movzx cause pipeline stalls on the early Pentiums, so doing the two simpler instructions might be as fast or faster!

It's interesting that the 24-bit lookup table is so dang slow.  Shows how slow inexpensive memory really is compared to the CPU these days.

I tried some experiments with using MMX instructions to process EIGHT six-bit quantities at a time.  It only took about eight MMX instructions to process all eight, BUT that time is swamped by the SIXTEEN shifts it takes to position each six-bit quantity in a separate byte.  Sigh.
If there were an MMX instruction to pack/unpack 6-bit fields into 8-bit bytes, this would be a huge winner, but alas no, not even in the latest SSE4 or AMD extensions.

One last possible trick -- investigate non-sequential processing, such as processing all the first bytes, then the seconds, then the thirds.  That way you or the compiler could keep the mask and shift values in registers, AND position all the bits in 4 or 8 bytes with just ONE shift. Wow, gotta try this!

DanRollins

Innovative idea there, grg99... it would be "paradoxical" if taking multiple passes through the data actually saved time, but there might be some other advantage that overrides everything else.

In my MMX tests, I was able to avoid a lot of shift ops by moving back and forth between MMX registers and general registers (where BSWAP and XCHG are available).  The big win seems to be in using a non-temporal 8-byte move to output the results.  If you care to share, I'd love to see any MMX stuff you've tried in the ASM version of this thread (http:Q_21983447.html :-)
DanRollins

Â  Â https://www.experts-exchange.com//Assembly/Q_21983447.html#17711344
I wrote a version of the 8096-byte lookup algorithm that confirms the assertion that "caching is king" -- i.e., that all of the algorithms are losing time to the cache pollution that occurs when we plow through 1MB of output.

In that example, I changed just one opcode from MOVQ to MOVNTQ and saw the fastest results to date: a nearly 30% speed increase from changing that *one opcode.*  The MOVNTQ opcode tells the CPU to store the 8-byte MMX register directly to main memory -- bypassing the caches (and thus allowing other data to stay cached).

I think that it's safe to say that when processing large chunks of memory, such as streaming video or this 1MB block of input, this is an overriding concern.

It might be a more valid test of the encoding algorithms to try a variation of my timing benchmark... a variation in which a much smaller buffer (say 10K) is encoded 100,000,000 times.  In such a case, all of the input, all of the output, and all of the lookup tables would be in cache for the entire run.

But... I think this thread is running down.  I'll leave this open for a while longer to see if we can get some more submissions or even some general interest.  If nothing pops, I'll close up and award the points.

-- Dan
DanRollins

This has been an interesting and thoroughly enjoyable intellectual exercise.  I -- and I hope the other participants -- have learned a lot.  For one thing, the poor performance of the monster lookup table (cache thrashing) was enlightening.  For another, the significant speed advantage of a genuine Intel CPU (in this particular scenario) surprised me.  Infinity08's multiple-table scheme was innovative -- I'd never have thought of it :-)

We all put more into this than we would for a standard EE question, so the point awards are not true "compensation" -- the real reward here has been the participation itself.  Infinity08 posted the code that ran fastest with correct output.  grg99's contribution *showed the way* to the fastest algorithm measured here (the 8096-byte lookup table), but he did not submit a debugged version for actual timing comparison.  grg99's other contributions, including the no-table-at-all code, were interesting and his comments insightful, and I am choosing to award a share of the points to him.

For what it's worth: every submission beat all of the base64 encoders in Microsoft's APIs and other published utilities.  So that's a bonus :-)

Thanks to everyone who participated.  And whether you just "lurked" or are reading this at some time in the future, I hope that you enjoyed this as much as I did :-)

-- Dan
Infinity08

>> the real reward here has been the participation itself.
Wise words :)

>> For what it's worth: every submission beat all of the base64 encoders in Microsoft's APIs and other published utilities.  So that's a bonus :-)
Heh.

I enjoyed this discussion/challenge, Dan! Thanks for that!