asked on

Square root routine -- standard libraries vs local functions

Hey, we are writing an embedded C program, and we need a square root. I know math.h has a standard implementation of square root function, but when I google square root functions I get plenty of hits, one example below.

My question is: what is the drawback, if we only need a square root, to just use something like the function below? Would techniques like the below be less accurate than what the standard libraries provide?

Background: I have a slight preference to use our own square root function rather than invoking standard libraries, since we are making safety code and there is an advantage to having the code be visible/reviewable (as far as I know we can't code review the IAR math library code, can't eliminate the remote possibility of a bug). Also it would make it slightly more portable, the possibility of different behavior from one compiler to another is lessened, since no library change (I know standard libraries are standard libraries, but it's the difference between "should behave the same" and "can prove absolutely will behave the same").

Thanks for any thoughts.

    float  SquareRoot(float number)
    {
    long int inter_value = 0;
    float x_value = 0.0,
    y_value = 0.0;
    float f_value = 1.5F;
    x_value = number * 0.5F;
    y_value = number;
    inter_value = * ( long int * ) &y_value;
    inter_value = 0x5f3759df - ( inter_value >> 1 );
    y_value = * ( float * ) &inter_value;
    y_value = y_value * ( f_value - ( x_value * y_value * y_value ) );
    y_value = y_value * ( f_value - ( x_value * y_value * y_value ) );
    return number * y_value;
    }

Open in new window

SOLUTION

ozo

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

RonMexico

ASKER

Don't understand, you're saying using floating point libraries is a portability sticking point anyway?

Good point, but on second thought I think I should've left portability out of it. It's more about visibility, guaranteed consistency, and eliminating a dependency.

RonMexico

ASKER

And mostly... is there any *drawback* to the local function?

ASKER CERTIFIED SOLUTION

evilrix

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

RonMexico

ASKER

"a) You are guaranteed the implementation is standard on all platforms"

Pardon if I misunderstand, but this is up to the skill level and QA discipline of, for example, the AVR (if you're using their compiler) implementers, correct? You obviously have a lot of experience, have you *never* heard of a bug in an implementation of standard C libraries (honest question, if you haven't I would find that compelling).

I will read your essay, it looks good. But with your overall philosophy about not reinventing the wheel, it seems like you're overlooking (or at least shorting) the middle classes of mistakes, which I think are more common -- not a bug in whatever generic library we're talking about, but (a) a mistake in your understanding of the library and (b) a mistake in the code you often have to create to massage your data into and out of a generic library (sometimes this is zero, sometimes it can be quite large).

(Not talking about any particular library anymore -- just the general question of third-party library use vs roll your own)

Take the standard string routines. I am sure they work exactly like they are designed to 99.9999999% of the time. When people run into errors with them it is almost always because they misunderstood the order of the parameters, or forgot an index was 0-based, or forgot what a formatting character does to some odd input (not gurus like you two I'm sure, but *everyone* else). Or they messed up the code that prepares their data for input into the generic interfaces of the standard libraries.

The thing is, that *counts*. Misunderstanding the interface can be blamed on the human, but the library is complicit.

The advantage of rolling my own little local function, instead of using a standard function, is that I know EXACTLY what it does, I know what it does when it receives bad input, i know that it works perfectly for the input I am going to throw at it, even accidentaly. I'm not saying its more watertight, it's just more known to me and more predictable. It's not typesafe, it doesn't require strict aliasing, but guess what? It doesn't need to be. It will never be used anywhere else. Know what else? I can QA it faster. Because when using a standard library, it's not so much QAing the library, it's QAing the use of it

Now, I'm sure you gurus will say "oh, well just don't be stupid, learn how to use the standard libraries correctly." Well, fine, I still use lots of standard libraries with general success but they occasionally zig when I expect them to zag -- sometimes I forget what the "%x" formatting character does, reread the doc, and misread it -- and I don't know if I believe anyone who says otherwise. But this function my friends, is PERFECT! BEHOLD:

char convert_int_to_asciihexcchar(int v)
{
if ((v>=0) && (v<=9))
{
return v+48;
}
else if ((v>=10) && (v<=15))
{
return v+55;
}
else
{
return '?';
}
}

Because I define perfection, and perfection need not include any complexity in the name of genericism. And mostly/consequently: it's trivial, and takes me 0.7 seconds to know that it will do exactly what I want it to do, always, including error paths. That's just not the case for a call to sprintf -- for the developer or the reviewer. Again, for people that don't spend ALL of their time writing C code, which is most engineers.

RonMexico

ASKER

Two clarifications:

1) I didn't know what strict aliasing was, not trying to pretend otherwise -- but I can still QA that squareroot routine for my use (not generic use).
2) I'm not saying rejecting the generic library in favor roll your own is always the way to go, So please keep that in mind if formulating coutnerexamples. I will probably agree with the standard library in a lot of cases. But probably less often than most folks.

Dave Baldwin

My 2 cents... since you have both codes available, why don't you just code up a test program and see what the differences in results are? I would.

evilrix

>> Pardon if I misunderstand, but this is up to the skill level and QA discipline of, for example, the AVR (if you're using their compiler) implementers, correct?
Indeed it does but the point is this, a compiler vendor of a well establish product will have had their implementation tested in the field by millions of lines of production code. Further, various security analysts (such as CERT) will have tested their implementation over and over. If you write something yourself it will have had no exposure and no testing so you are starting from scratch. What is the point of that, especially as you're still dependent on the C Runtime library, regardless? If we follow your logic to conclusion you might as well implement your own version of the C Runtime and maybe the OS that it runs on. Focus your efforts on the business logic of the program, leave the implementation detail of the building blocks to the compiler venders.

>> have you *never* heard of a bug in an implementation of standard C libraries
Of course. I found a nice memory leak in Microsofts implementation of iostream in Visual Studio 2005 (fixed in SP1). I've also found a couple of nice overly aggressive optimisation issues in gcc (one of them caused me a real headache). There is no such thing as bug free code; however, code that has been in the wild for a long time is far more likely to have had its defects fixed that new code you write yourself.

>> (a) a mistake in your understanding of the library
As opposed to a mistake in your understanding of implementing the library? If you're not capable of reading the documentation for a library and then implementing code to use that library what makes you think you'd be any better at implementing what it does? Besides, if a library has a difficult to understand API or the API isn't a good fit for your problem domain you're probably looking at the wrong library - pick another.

For example, I'm currently working on feature vector extraction and classification and I needed a tool that can perform structured tokenising in a format that can be used by an SVM. I could code one up but what's the point? There are plenty of tools/libraries that already exist. I spent about 1/2 a day evaluating a few and discovered that one fit my needs perfectly whereas the others didn't. Sure, I could have spent that 1/2 a day writing the tool; however, the one I have chosen to work with requires about 3 lines of code to set it up and invoke it. If I wrote my own I'd probably end up writing a hundred or so lines of code, I'd then have to write all the unit tests and QA would then have to write test scripts and test it. Every time I fix a bug I'd then have to resubmit the change to QA so they could (a) check it's fixed and (b) check for any regressions.

Statistically you create 1 bug for every 10 lines of code you write so I've just saved myself 10 defects, hours of work and my company money be not having to pay me and a QA member to test and fix defects in my code. Sure, the library could contain defects but it's a tried and tested, well establish library and is used by many people (both academic and commercially) and I suspect it is likely to be far more robust than anything I could knock up quickly because it is specifically focused on that one task; written by people who specialise in that problem domain and has been used by hundreds (if not thousands) of people who would have reported a problem if one was found.

As a developer my job is to deliver a solution for a business case. I know my business domain very well. Whilst I am more than capable of implementing my own tokeniser it isn't a problem domain I specialise in and, frankly, I've got more important things to worry about than wasting time writing code that already exists.

>> (b) a mistake in the code you often have to create to massage your data into and out of a generic library

Ok, so if - for example - you wrote your own class to implement a linked list how would it differ so much from the STL implementation that you'd have to massage your code? Or, say I need to implement a product that needs a key/value store. What would I gain from writing my own that I wouldn't get from using the plethora that already exist? My own implementation would still need to provide an API to get/set values. What massaging would I need to do that I wouldn't need to do with my own code?

>> The thing is, that *counts*. Misunderstanding the interface can be blamed on the human, but the library is complicit.

If you are not capable of understanding the library you are almost certainly not capable of implementing what the library does. Let me put it another way, I can drive. I've had a driving licence for over 23 years. I couldn't build a car from scratch even if my life depended on it. If I did the chances of it being any good are pretty low. I don't know if you've every seen the program "Top Gear"? There are a few episodes where they have tried to do just that... the consequences are quite hilarious :)

>> The advantage of rolling my own little local function, instead of using a standard function, is that I know EXACTLY what it does
You also know exactly what the standard implementation does. That's clearly documented in the standards documentation. I guess what you really mean is you know how it does it now what it does. Who cares? I care that the input vs. the output are deterministically correct. If you write your own version you'll need to write tests to prove that. You might as well just write the tests and plug the standard library version into them. You'll then have the confidence you need and will have saved yourself some effort.

>> I know what it does when it receives bad input, i know that it works perfectly
You only know that if you write a test case for each expected input. If you don't have a test case you are, at best, assuming.

>> it's just more known to me and more predictable
Really? But, surely if it was that simple no one would every write code with bugs in it? You're starting from a false premise that your code will be bug free. Without meaning to sound disrespectful - it's not! No one's is (not even mine - hard to believe, but true).

>> Know what else? I can QA it faster
What makes you think that? If you have a known set of inputs and a known set of expected outputs why is it quicker to run those tests on your code vs. the standard library? Surely the testing effort is the same. For QA testing to be effective it should be black box (engineers make poor testers because they know how the code is written and so they tend to only test that they expect it to do - in other words, they forget edge cases).

>> it's not so much QAing the library, it's QAing the use of it
But, that is exactly what QA is all about. If you're testing the implementation of the library rather than the contract it defines you are white box testing and that isn't really QA (that's dev testing, see my comment in the previous paragraph as to why that's generally a bad idea).

>> I'm sure you gurus will say ...
And why do you think we'd say that? Because, from experience we've realised (trust me, I've been exactly where you are and I've learned the hard way) that there are some things that just don't work (even if they seem like a good idea). Trying to re-implement standard functions is one of those things that might seem like a good idea but it's a rare case when it is. Trouble is, you can either take my word for it or you can find out for yourself. The former saves you time, money and effort. The latter delivers the same lesson but at a cost to both you and your company in terms of time, money and effort.

>> sometimes I forget what the "%x" formatting character does, reread the doc, and misread it
With respect, that's more user problem than a library problem. The fact you struggle to follow documentation isn't a good reason to re-write the standard library. You'll end up with code you know and understand but no one else does but because you've littered the code with your version others will then start introducing defects at your espense.

>> I don't know if I believe anyone who says otherwise
Heh. Ok :)

>> this function my friends, is PERFECT! BEHOLD:

Without knowing the use case for your function it's hard to qualify it as perfect or otherwise; however:

a) It doesn't handle negative values, yet it's contract alludes that it should.

b) It appears to assume the implementations default characterset is ASCII

c) Returning '?' for invalid conversion defines any standard convension

Let's address these points:

a) if your use case specifically states it only takes positive numbers your contract should support that. In other words, it should take an unsigned type (if nothing more, this is a hint to the user of the function that it won't handle negative numbers).

b) There is nothing in the C Standard to support your assertion that the implementations character set is ASCII. If it's not the results from this function will be undefined (and wrong).

c) Standard convention is to return a non-zero value on error. In this case, because your result set has valid positive values the non-zero it returns on error should be negative (-1, for example). Whilst this isn't a rule, it is a convention that is generally accepted and returning anything else is generally confusing and should be avoided without good reason

This is a perfect example of how a simple "I'll do this myself" function can be responsible for introducing very hard to identify defects because you either were not aware of the possible edge cases or you just didn't consider them. Sure, the issues I've found are easy to fix and may never cause you a problem; however, if you roll your own enough times you will, eventually, introduce a stonking defect that'll cost you (and your company) and lot of time, money and effort).

A safer way to implement your function would be something like this:

char convert_int_to_asciihexcchar2(unsigned int v)
{
switch(v)
{
case 0: return '0';
case 1: return '1';
case 2: return '2';
case 3: return '3';
case 4: return '4';
case 5: return '5';
case 6: return '6';
case 7: return '7';
case 8: return '8';
case 9: return '9';
case 10: return 'A';
case 11: return 'B';
case 12: return 'C';
case 13: return 'D';
case 14: return 'E';
case 15: return 'F';
default: return -1;
}
}

Yes, it's a little more code; however, it addresses:

a) it takes an unsigned int, which helps to enforce the API contract
b) it makes no assumption about the implementations characterset
c) it returns -1, which is likely to be far less confusing to other developers

NB. I could have used drop-through cases to reduce the code but that would then add an assumption about the characterset that this avoids.

That said, I wouldn't implement this function, period. I'd just use one of the standard functions to perform this conversion.

>> And mostly/consequently: it's trivial, and takes me 0.7 seconds to know that it will do
Wonderful, it took me about 7 seconds because I've never seen this code before and so had to read through it to make sure I understood it properly, including what it's preconditions are and what its post conditions are as well and it's invariants. It takes me 0 seconds to know what sprintf will do.

>> it will do exactly what I want it to do
Not always. See my commentary, above.

>> That's just not the case for a call to sprintf
Actually, I know that sprintf will always return exactly the same output for exactly the same input. I know this because the standards documentation tells me it does. I also know that this will be the same, regardless of the implementations characterset.

>> I didn't know what strict aliasing was, not trying to pretend otherwise
And that's the problem - you don't always know what you don't know.

"There are known knowns; there are things we know that we know.
There are known unknowns; that is to say there are things that, we now know we don't know.
But there are also unknown unknowns – there are things we do not know, we don't know.”

—United States Secretary of Defense, Donald Rumsfeld

Anyway, ultimately it's your decision; however, you asked for opinion and so I've given mine :)

Best of luck with whatever way you choose to go.

RonMexico

ASKER

"If you're not capable of reading the documentation for a library and then implementing code to use that library what makes you think you'd be any better at implementing what it does? "

Because I am doing something exceedingly simple, compared to it. Contrast my convert_int_to_asciihexcchar with sprintf. With genericism comes complexity, sometimes relatively massive complexity. That's actually a big part of my thesis here.

"If you are not capable of understanding the library you are almost certainly not capable of implementing what the library does. "

See point above. Also as I promised, if you are saying you never misunderstand the standard libraries, or have never created a bug in fitting data to the libraries, if they never, ever zig when you meant it to zag and you figure that out somewhere downstream... I'm not sure I believe you. My little roll-your-owns never zigzag on me (if we're being precise, then just far less frequently than use of generic libraries), when called by my callers (which is all that will ever call them). That's all I'm saying -- and I actually think that wouldn't be controversial -- and occasionally that makes them the wiser choice.

"I can QA it faster" "What makes you think that? "
My convert_int_to_asciihexcchar vs sprintf example made this obvious, I thought.

"it returns -1, which is likely to be far less confusing to other developers"
Well, this suggests you're missing one of my *key* points: no other developers will use this. No future functions will call this. In my example I am not going for genericism or reusability (and I do recognize those as good things generally.) It is a private helper function inside a unit only for that unit, will never be called by any functions other than those it was designed for. So your findings wouldn't bite (you'll have to trust me since I haven't postulated the only caller that would ever call it :)).

[Genericism is good, but is sometimes a tradeoff with complexity; clarifying my thesis]

"It takes me 0 seconds to know what sprintf will do."
Hm, it would be very challenging to predict everything sprintf it will do every time for all input on review, i.e., you would *never* not notice an incorrect format specifier -- there are 10 or 15 of them, and a page of rules that govern them. In any case if you're that good you are in a small minority. But I'm still sure that sprintf would bite you sooner than *your* version of convert_int_to_asciihexcchar2 (because you wrote it and you know it very well), and that bite will be similarly stonking.

"Anyway, ultimately it's your decision; however, you asked for opinion and so I've given mine :)"

Thank you, I do appreciate that. I happen to have written a white paper recently that takes the opposite stance of yours, so I enjoy this argument. I also realize it's against the grain, but the grain isn't perfect. I like to sharpen these points for use against my coworkers. :)

RonMexico

ASKER

By the way, it's funny you should use that Rumsfeld quote (which I like and also use) -- since I'm the one being more cautious of unknowns (the standard library is well documented and well traveled, but ultimately a bunch of black boxes).

RonMexico

ASKER

Actually I wonder what you'd think of this example, since you mention lists in your essay.

In .NET I used to use built-in collections and generic lists a lot. One one project I was dealing with a massive amount of data, so instead of putting them in lists, I used a makeshift singly linked list like below:

I timed my singly linked list next to the built-in lists (List, Collection, etc), and it was TWICE as fast, and my application blazed. I never went back, I use this pattern all the time now. The speed increase comes with the lack of capability of my built-in list compared to the very functional .NET lists: no random access, no sorting, etc, etc, etc. But 90% of the time I don't need random access -- I just need to save a list for future processing, and then process the list. (I'll bet we would all find this in our use of collections.)

So: I have abandoned the standard classes, and reinvented the wheel, except it is a simpler wheel with only the stuff I need, and as a result it is twice as fast. I have to write about 10-20 lines of code more than I would otherwise (pretty much the whole thing below -- add a "next" member and an Add method), but it is pretty simple and never caused a bug.

Have I reinvented the wheel, and if so was it worth it for a 2x speed increase?

****

Namespace ProcessorOfSavedGroupOfData

private class cItem
{
   int data;
   cItem * ptr_NextItem;
}

private cItem * ptr_FirstItem = null;
private cItem * ptr_LastItem = null;

public AddItem(data)
{

   cItem * item = new cItem(data);

   if (ptr_FirstItem == null)
   {
      ptr_FirstItem = item;
      ptr_LastItem = item;
   }
   else
   {
      ptr_LastItem->ptr_Next = item;
      ptr_LastItem = item;
   }
}

public sub ProcessItems()
{
   cItem * item = ptr_FirstItem;

   while (item != null)
   {
      // do something
      item = item->ptr_Next;
   }
}

public sub ClearItems()
{
   // dereferenced; garbage collection clears
   ptr_FirstItem = null;
   ptr_LastItem = null;
}

End Namespace

Open in new window

evilrix

>> I happen to have written a white paper recently that takes the opposite stance of yours, so I enjoy this argument.

I'd very much enjoy reading that if you are able/happy to share it.

>> so I enjoy this argument.
Ditto. :)

>> Actually I wonder what you'd think of this example
It's slightly hard for me to comment on that as I don't really know/code in .Net. What I do know; however, is that the C++ standard library list implementation (in fact, all containers) has pre-defiend time/space complexities that are laid down in the standard that a compiler must adhere to.

Of course, Big O complexities are relative not absolute and each implementation may differ in absolute performance but, in the general case I'd find it hard to justify creating my own implementation vs. that of the standard library without first performing some profiling and identifying a real problem that needed to be fixed. Then, I'd still consider looking for an alternative implementation and would only consider rolling my own as a last resort.

I think I said in my essay that there is rarely a good reason to roll your own, which does leave the door open for those very special cases - I just think those cases are very rare indeed.

Thanks for a great discussion and, again, if you don't mind sharing your white paper I'd enjoy reading it. If you don't want to post it here you are welcome to e-mail it to me (e-mailing it wouldn't violate the site rules as it isn't directly related to this question); my contact details are in my profile.

Best regards.

evilrix

Oh, and if you're up for it, I'd be quite interested to see and constructively critique the end result of your square root function. :)