Solved

how to extract numbers from string

Posted on 2013-06-04
Last Modified: 2013-06-07
I have a string

OWNER_5477854

I would like to simply return the sum of just the digits in it. If my string does not contain any digits, I would like any two-digit number returned.
Question by:edvinson
12 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39220548
Does 00 count as "any 2 digit number"?
 
LVL 1

Author Comment

by:edvinson
ID: 39220660
Sure, it doesn't matter what.
 
LVL 32

Accepted Solution

by:
phoffric earned 500 total points
ID: 39221072
#include <stdio.h>   // sprintf
#include <stdlib.h>  // atoi
#include <ctype.h>   // isdigit
#include <string>
#include <iostream>
using namespace std;

int sumOfDigits( string text ) {
   int sum = 0;
   
   for( size_t i=0; i<text.size(); ++i ) { // loop over each char
      const char character = text[i];      // save the char
      if( isdigit( character ) ) {             // if the char is a digit, then we can
         char digitBuf[2] = {character, '\0'}; // convert it to a c-style string, so that
         sum += atoi( digitBuf );              // we can convert it to an int.
      }                                        // Could have just done a (character - '0')
   }                                           // to convert to int if standard ASCII code used
   return sum;
}

int main() {
   string text = "OWNER_5477854";
   char sumString[20]; // string that holds the sum
   sprintf( sumString, "%02d", sumOfDigits(text) ); // %02d pads the sum to at least 2 digits
   cout << "Sum = " << sumString << endl;
   sprintf( sumString, "%02d", sumOfDigits("OWNER_NO_DIGITS") );
   cout << "Sum No Digits Present:  " << sumString << endl;
}


Output is:
Sum = 40
Sum No Digits Present: 00
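
The "%02d" format is what turns a sum of 0 into the two-digit "00" the question asks for. The same padding can be sketched with iostream manipulators instead of sprintf; twoDigits is a hypothetical helper name, not part of the accepted solution:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): format a non-negative sum
// with at least two digits, padding with a leading zero when needed.
std::string twoDigits(int value) {
    std::ostringstream out;
    out << std::setw(2) << std::setfill('0') << value;
    return out.str();
}
```

Calling twoDigits on the result of sumOfDigits would avoid the fixed-size char buffer, at the cost of a string allocation.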

 
LVL 1

Author Closing Comment

by:edvinson
ID: 39221381
Wow, that is a fantastic solution! Very well documented, thank you very much. More important than working code, I understand it because of the way you presented it.
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 39221593
Points already assigned - this is just for interest.

Just because I am old school and this was an interesting problem, here is a solution that is about 100x faster than the one posted. Actually there are two solutions, depending on how the question is read.

If
a) from OW_49_NER_5477854 you want the answer to be 53 (every digit is summed)
OR
b) from OW_49_NER_5477854 you want the answer to be 13 (i.e. stop at the first non-digit after the first run of digits)

Option 1
int SoD(const char * text)
{
	const char *s = text;
	int sum = 0;
	while(*s) {
		char d = *(s++) - 48; // Get digit value
		sum += (d >= 0 && d <=9) ? d : 0; // add to sum only if between 0-9
	}

	return sum;
}


Option 2
int SoD2(const char * text)
{
	const char *s = text;
	int sum = 0;
	bool flag = false; // Flag to tell us if we have found a number yet
	while(*s) {
		char d = *(s++) - 48;
		if (d >=0 && d<=9){
			flag = true; // found one
		}
		else if (flag) break; // this char is not a number and we already found a number so break
		// If we are in a number string then add to sum (the range check below can probably be dropped)
		if(flag){
			sum += (d >= 0 && d <=9) ? d : 0;
		}
	}

	return sum;
}
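
To make readings (a) and (b) concrete, here is a minimal sketch of both written with standard algorithms rather than raw pointers. The names isDigitChar, sumRange, sumAllDigits and sumFirstRun are hypothetical and not from the thread; the logic is meant to match Option 1 and Option 2 for ordinary input:

```cpp
#include <algorithm>
#include <cctype>
#include <numeric>
#include <string>

// isdigit needs a value representable as unsigned char, so cast first.
bool isDigitChar(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Add the digit values in [first, last); non-digits contribute nothing.
int sumRange(std::string::const_iterator first, std::string::const_iterator last) {
    return std::accumulate(first, last, 0, [](int sum, char c) {
        return isDigitChar(c) ? sum + (c - '0') : sum;
    });
}

// Reading (a): every digit anywhere in the string.
int sumAllDigits(const std::string& text) {
    return sumRange(text.begin(), text.end());
}

// Reading (b): only the first contiguous run of digits.
int sumFirstRun(const std::string& text) {
    auto runStart = std::find_if(text.begin(), text.end(), isDigitChar);
    auto runEnd = std::find_if_not(runStart, text.end(), isDigitChar);
    return sumRange(runStart, runEnd);
}
```

For "OW_49_NER_5477854", sumAllDigits gives 53 and sumFirstRun gives 13, matching the two answers listed above.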

 
LVL 32

Expert Comment

by:phoffric
ID: 39228008
>>that is about 100x faster than the one posted
Is there a typo here? If not, try measuring and let me know what you actually discover. Why don't you post the measuring driver that you used - I kind of think your performance factors are off.
Also, which one posted are you referring to, since I posted two algorithms?
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 39228200
Ok here is what I did.

I put the accepted solution and the code I posted above into separate functions.

Then for each of the functions I ran a loop of 100,000 iterations surrounded by

long start = GetTickCount();
... loop ...
long end = GetTickCount();

I then dumped the difference between the start and end times for each of the loops.

For the accepted solution the ranges in time were between 1500ms and 2100ms

For the code posted above the times ranged between 15ms and 20ms, which indicates a speedup of roughly a factor of 100.

Of course I may be totally off here because I knocked this together in a couple of minutes but it makes sense - the accepted solution is doing a lot of unnecessary work to achieve the same result. Also, in the grand scheme of things it makes absolutely no difference because you have to call the function 100,000 times before it makes a significant difference. I am just an old school programmer who looks for the optimal solution - which is why I posted it for interest.

Full source here

// ee1.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "windows.h"
#include <stdio.h>   // sprintf
#include <stdlib.h>  // atoi
#include <ctype.h>   // isdigit
#include <string>
#include <iostream>
using namespace std;

int SoD(const char * text)
{
	const char *s = text;
	int sum = 0;
	while(*s) {
		char d = *(s++) - 48;
		sum += (d >= 0 && d <=9) ? d : 0;
	}

	return sum;
}

int SoD2(const char * text)
{
	const char *s = text;
	int sum = 0;
	bool flag = false;
	while(*s) {
		char d = *(s++) - 48;
		if (d >=0 && d<=9){
			flag = true;
		}
		else if (flag) break;

		if(flag){
			sum += (d >= 0 && d <=9) ? d : 0;
		}
	}

	return sum;
}

int sumOfDigits( string text ) {
   int sum = 0;
   
   for( size_t i=0; i<text.size(); ++i ) { // loop over each char
      const char character = text[i];      // save the char
      if( isdigit( character ) ) {             // if the char is a digit, then we can
         char digitBuf[2] = {character, '\0'}; // convert it to a c-style string, so that
         sum += atoi( digitBuf );              // we can convert it to an int.
      }                                        // Could have just done a (character - '0')
   }                                           // to convert to int if standard ASCII code used
   return sum;
}

void test1()
{
	const char * input  = "OWNER_5477854";
	char sumString[20];

	int sum;
	long end, start = GetTickCount();
	for(int i = 0; i < 100000; i++) sum = SoD(input);
	end = GetTickCount();
	sprintf( sumString, "%02d (%ld)", sum, (end - start));	
	cout << "Sum = " << sumString << endl ;
}

void test2()
{
   string text = "OWNER_5477854";
   int sum;
   char sumString[20]; // string that holds the sum
	long end, start = GetTickCount();
	for(int i = 0; i < 100000; i++) sum = sumOfDigits("OWNER_5477854");
	end = GetTickCount();
   sprintf( sumString, "%02d (%ld)", sum, (end - start) );
   cout << "Sum = " << sumString << endl;
   sprintf( sumString, "%02d", sumOfDigits("OWNER_NO_DIGITS") );
   cout << "Sum No Digits Present:  " << sumString << endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
	test1();
	test2();

	int r ;
	cin >> r;
	return 0;
}
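
The harness above depends on GetTickCount and stdafx.h, so it only builds on Windows. A portable sketch of the same start/loop/end measurement, assuming C++11's std::chrono::steady_clock as a stand-in for GetTickCount, could look like this. elapsedMs and the sink parameter are hypothetical names, and SoD is copied here only to keep the block self-contained:

```cpp
#include <chrono>

// Copy of Option 1 so this block compiles on its own.
int SoD(const char* text) {
    int sum = 0;
    for (const char* s = text; *s; ++s) {
        int d = *s - '0';                      // digit value, if *s is a digit
        sum += (d >= 0 && d <= 9) ? d : 0;     // ignore everything else
    }
    return sum;
}

// Hypothetical helper: call fn() `iterations` times and return the elapsed
// milliseconds. Results are accumulated into `sink` so the calls are
// observable and less likely to be optimized away.
template <typename Fn>
long long elapsedMs(Fn fn, int iterations, long long& sink) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink += fn();
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

A call such as elapsedMs([] { return SoD("OWNER_5477854"); }, 100000, sink) would time one candidate; the absolute numbers will depend on compiler, optimization level and machine.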

 
LVL 32

Expert Comment

by:phoffric
ID: 39228263
What results did you get for my other algorithm? See lines 14-15
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 39228327
This is the output from the code above. Which other algorithm are you referring to? I am not sure what the 14-15 is pointing at.

For 100,000 iterations (elapsed milliseconds in parentheses)

Sum = 40 (16)
sumOfDigits = 40 (1515)

For 1,000,000 iterations

Sum = 40 (156)
Sum = 40 (13812)

If you want to take this offline mail me - address is in my profile.
 
LVL 32

Expert Comment

by:phoffric
ID: 39229639
           // we can convert it to an int.
           // Could have just done a (character - '0')
This is the other algorithm that I suggested to simplify things. I showed general usage of library functions for educational purposes, but then added this extra option. And I do not believe the numeric representation has to be ASCII, as I mentioned.
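
As a sketch of that alternative, here is the accepted solution's loop with the atoi step replaced by the subtraction. The name sumOfDigitsDirect is hypothetical. The subtraction does not rely on ASCII specifically: the C and C++ standards require the characters '0' through '9' to have consecutive values, which the static_assert spot-checks:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Holds on any conforming implementation, ASCII or not.
static_assert('9' - '0' == 9, "decimal digit characters are contiguous");

// Hypothetical variant of sumOfDigits using (character - '0') instead of atoi.
int sumOfDigitsDirect(const std::string& text) {
    int sum = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Cast before isdigit: passing a negative char is undefined behavior.
        const unsigned char character = static_cast<unsigned char>(text[i]);
        if (std::isdigit(character)) {
            sum += character - '0';  // digit character to its numeric value
        }
    }
    return sum;
}
```

This version was not part of the timing run in the thread, so no speed claim is made for it here.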
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 39229858
Confused as to where this is going - I posted a solution (out of interest - after points were assigned) making the point that the solution was potentially 100x faster than the code in the accepted solution. You questioned where I got that result from - I posted code for that. What are we trying to achieve here?
 
LVL 32

Expert Comment

by:phoffric
ID: 39230621
I was wondering whether you also measured a 100x speedup against the alternative algorithm I proposed, in which the atoi conversion is replaced by the "Could have just done a (character - '0')" shortcut to get the int. You explained that you only tested the first algorithm presented. Thanks for including your test program. Concluded.