Solved

General algorithm for whitespace normalization

Posted on 2003-11-23
Last Modified: 2010-04-16
Hi all. I often need to normalize the whitespace in a string in the following way:

1. Trim left and right
2. Replace all tabs (0x9) and line breaks (0xA and 0xD) with spaces (0x20)
3. Collapse every sequence of spaces into a single space

The third step is always a problem. The solutions I've come up with so far are very cumbersome. This must be a common problem, though, so I hope there's a general algorithm for it. Show me, please.
Question by:liljegren
2 Comments
 
LVL 48

Accepted Solution

by:
AlexFM earned 250 total points
ID: 9805668
Pseudo-code for step 3 (apply steps 1 and 2 first):

Input: string1
Output: string2 (initially empty)

For each character in string1, at index i
{
    // Copy every character except a space that directly follows another space
    if ( character != space  or  i == 0  or  previous character != space )
    {
        add character to string2
    }
}

return string2
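
For anyone who wants something runnable, here is a minimal Python sketch of the same loop, with steps 1 and 2 from the question done up front. The function name `normalize_whitespace` is made up for this example, and Python is just one possible choice; the idea carries over to any language.

```python
def normalize_whitespace(text: str) -> str:
    """Trim, map tab/LF/CR to space, and collapse runs of spaces."""
    # Step 2: replace tab (0x9), line feed (0xA) and carriage return (0xD)
    # with a plain space (0x20).
    table = str.maketrans({"\t": " ", "\n": " ", "\r": " "})
    # Step 1: trim both ends. Done after the mapping so that tabs and
    # line breaks at the ends are removed as well.
    cleaned = text.translate(table).strip(" ")

    out = []
    for i, ch in enumerate(cleaned):
        # Step 3: skip a space that directly follows another space.
        if ch != " " or i == 0 or cleaned[i - 1] != " ":
            out.append(ch)
    return "".join(out)


print(normalize_whitespace("  a\t\tb \r\n c  "))
```

Collecting the characters in a list and joining once at the end avoids building a new string on every iteration.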
 
LVL 10

Expert Comment

by:ptmcomp
ID: 9805700
Use regular expressions. They are very powerful for matching and manipulating text.

string result = Regex.Replace(input, @"((?<=\S)(?<1>(\s))\s*(?=\S))|(\s*)", "$1",  RegexOptions.Multiline | RegexOptions.ExplicitCapture);

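For each run of whitespace that has a non-whitespace character on both its left and its right, this keeps only the first whitespace character. Leading and trailing whitespace is matched by the second alternative and removed. Note that the character kept is whatever the run starts with, so replace tabs and line breaks with spaces first (step 2) if you need plain spaces.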
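
A simpler pattern may be enough if you do not need lookarounds. The sketch below is in Python rather than C#, and it is not a translation of the expression above: it collapses each run of the four characters named in the question into one space and then trims. The function name `normalize_whitespace_re` is made up for this example.

```python
import re


def normalize_whitespace_re(text: str) -> str:
    # Replace each run of space, tab, CR or LF with a single space,
    # then trim the one space that may remain at either end.
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


print(normalize_whitespace_re("  a\t\tb \r\n c  "))
```

The character class is spelled out instead of using `\s` so that only the characters from the question are touched; `\s` would also match other whitespace such as form feeds.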