Solved

Remove whitespace from stringlist

Posted on 2006-11-27
Last Modified: 2008-01-27
Hi,

I have a function that removes whitespace from a memo, but it doesn't work in VCL.NET. Something about unsafe code and invalid typecasts...

Could somebody rewrite it without using pointers, and adjust it to work on a stringlist instead of a memo?

procedure StripWhiteSpaces(AMemo: TMemo);
var p1,p2: pchar;
    s: string;
    i: integer;
begin
  s:= AMemo.Lines.Text;
  if s = '' then exit;

  p1:= pchar(s);
  p2:= p1;
  while p2^ <> #0 do
  begin
    if (p2^ = ' ') AND ((pchar(p2+1)^ = ' ') or (pchar(p2-1)^ in [#13,#10]))
      then Delete(s, p2 - p1 + 1,1)
      else inc(p2);
  end;
  AMemo.Lines.Text:= trim(s);
  for i:= AMemo.Lines.Count-1 downto 0 do
    if AMemo.Lines[i] = ''
      then AMemo.Lines.Delete(i);
end;

Thanks
Question by:zattz
13 Comments
 
LVL 28

Accepted Solution

by:2266180
2266180 earned 125 total points
ID: 18017834
Well, judging by your other question, you should have read through that thread a little. It has valuable information and, better yet, quite a few procedures you can use.

For example, the first one from Alex:

function StripMultipleSpaces(const Value: string): string;
var
  I, J: Integer;
  Max: Integer;
begin
  Result := Value;
  I := Pos(#32#32, Result);
  if (I > 0) then
  begin
    Max := Length(Result);
    repeat
      Inc(I);
      J := Succ(I);
      while (J <= Max) and (Result[J] = #32) do
        Inc(J);
      while (J <= Max) do
      begin
        if (Result[Pred(I)] = #32) then
        begin
          while (J <= Max) and (Result[J] = #32) do
            Inc(J);
        end;
        if (J <= Max) then
        begin
          Result[I] := Result[J];
          Inc(I);
          Inc(J);
        end;
      end;
    until (J > Max);
    SetLength(Result, Pred(I));
  end;
end;

Call it like this:
var s: TStrings;  // must already point to a created list, e.g. a TStringList
begin
  s.Text := StripMultipleSpaces(s.Text);

Or use any other solution that doesn't use pointers.
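
For the stringlist part of the question, here is a minimal sketch that works line by line and uses only string indexes, no pointers. The names CollapseSpaces and StripWhiteSpaces(AList) are made up for illustration and are not from the linked thread. It assumes the intent of the original procedure: collapse runs of spaces, trim each line, and drop lines that end up empty. It has not been compiled under VCL.NET. The block is laid out as the top of a small program, which is why the uses clause comes first.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}  // only relevant when compiling with Free Pascal
uses SysUtils, Classes;

// Collapses every run of spaces in S into a single space, using indexes only.
function CollapseSpaces(const S: string): string;
var
  I, Len: Integer;
begin
  SetLength(Result, Length(S));
  Len := 0;
  for I := 1 to Length(S) do
    // Copy the character unless it is a space that directly follows a space.
    if (S[I] <> ' ') or (I = 1) or (S[I - 1] <> ' ') then
    begin
      Inc(Len);
      Result[Len] := S[I];
    end;
  SetLength(Result, Len);
end;

// Hypothetical helper: cleans each line and deletes lines that end up empty.
procedure StripWhiteSpaces(AList: TStrings);
var
  I: Integer;
begin
  // Walk backwards so that Delete does not shift the lines still to visit.
  for I := AList.Count - 1 downto 0 do
  begin
    AList[I] := Trim(CollapseSpaces(AList[I]));
    if AList[I] = '' then
      AList.Delete(I);
  end;
end;
```

With a memo it could still be called as StripWhiteSpaces(Memo1.Lines), since Lines is a TStrings as well.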
 
LVL 15

Expert Comment

by:mikelittlewood
ID: 18017946
Is there no StringReplace function you can use to remove white space?
 
LVL 4

Assisted Solution

by:David_Ward
David_Ward earned 125 total points
ID: 18019080
Function StripMultipleSpaces(FromThis: String): String;
begin
  result:=fromthis;
  while pos(#32#32,result)<>0 do
    delete(result,pos(#32#32,result),1);
end;

// example usage:

// YourString:=StripMultipleSpaces(ThisMemo.Text);
// YourString:=StripMultipleSpaces(AnyTStringList.Text);

 
LVL 28

Expert Comment

by:2266180
ID: 18019150
just in case anyone wonders about which thread I was talking about :)
http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21980883.html

@David: you might want to take a look in there ;)
 
LVL 17

Assisted Solution

by:Wim ten Brink
Wim ten Brink earned 125 total points
ID: 18083306
Guys, nice to point to my examples but the question is for a .NET solution. :-)

Okay, the BETTER link is http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21982971.html which was the .NET version of my challenge. Something about using StringBuilder.Replace instead of the normal Delphi string-handling functions. Keep in mind that .NET is by itself a huge library with all kinds of useful functions. Using the "old-fashioned" Delphi solutions is actually a bit inappropriate in these cases. If you're lucky, the Delphi solution will merely be slower than the .NET solution; in the worst case, it won't even compile in .NET.

Also keep in mind that Delphi for .NET might look very similar to the Delphi that we ourselves are used to, but on a lower level it is very different. For example, in WIN32 we know a string as just an array of characters, with each character taking up one byte. In .NET a string is an object and accessing each and every character of the string would create a new object for every character! Thus, solutions that are highly optimized for WIN32 will actually generate a huge amount of overhead in .NET, which is why you had better use the .NET libraries for string manipulation. And don't worry, because those .NET libraries are fast enough.

StringBuilder.Replace seems to be the proper way.
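
As a rough sketch of that idea: the code below uses TStringBuilder from SysUtils, which later native Delphi versions (and Free Pascal) provide and which mirrors the .NET class. That substitution is an assumption made so the example compiles outside .NET. Under Delphi for .NET the equivalent would presumably be System.Text.StringBuilder, which has the same Replace, Length and ToString members, but this was not tested there. The function name StripMultipleSpacesSB is made up for illustration.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}  // only relevant when compiling with Free Pascal
uses SysUtils;

// Sketch: collapse runs of spaces with a string builder instead of Pos/Delete.
// TStringBuilder (SysUtils) stands in here for .NET's System.Text.StringBuilder.
function StripMultipleSpacesSB(const Value: string): string;
var
  SB: TStringBuilder;
  OldLen: Integer;
begin
  SB := TStringBuilder.Create(Value);
  try
    // Each pass turns every pair of spaces into one space;
    // stop as soon as a pass no longer shortens the buffer.
    repeat
      OldLen := SB.Length;
      SB.Replace('  ', ' ');
    until SB.Length = OldLen;
    Result := SB.ToString;
  finally
    SB.Free;
  end;
end;
```

A long run of spaces needs several passes, because one Replace call only shortens each run; it does not reduce it to a single space in one go.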
 
LVL 28

Expert Comment

by:2266180
ID: 18083393
hm... seems I ought to get my hands on the delphi .net internals or stop reading questions related to it :))

>accessing each and every character of the string would create a new object for every character!
I wouldn't have imagined something this stupid in my life. A programmer going from delphi to delphi .net would logically still be doing things the delphi way, so it would make sense to store the string as an array of chars and pass around references to the char objects, instead of creating a char object on every read request. Think of it as a tobjectlist descendant, for example. But who knows what optimizations were behind that decision.
 
LVL 17

Expert Comment

by:Wim ten Brink
ID: 18084949
I agree that it seems stupid to consider each character in a string as a single object, but don't forget that the string itself is still just a single object containing several bytes of data. However, when you access the characters of the string one by one, each and every character will be converted to a Char object with its additional overhead. That's basically because the whole .NET environment is 100% object-oriented. You just don't have any data types that aren't objects.

It's actually one of the nastier pitfalls when you start converting from WIN32 to .NET code. In WIN32 you have data types that are pure Data. In .NET you won't have any of those...

Still, it's not that stupid if you do things the smart way. For example, you have a String class but within this class you can add a lot of methods that modify the internal data of this class. And internally, the string class might have raw data access and thus be able to access every character without using the object wrappers for them. It just depends on how they optimized it and which functionality they have added to the string class in .NET. Unfortunately I don't have access to any .NET help files right now, but the principle is quite simple. Everything is an object. Only on the deeper lower levels you might get at a point where you're accessing the raw data.

And to be honest, Borland had to try and keep everything VCL-compatible, but when you're doing .NET development with Delphi, you had better just forget about the VCL for any new projects and only use it to convert existing projects to .NET. The VCL is still supported, of course, but it has been defeated by .NET itself.
 
LVL 28

Expert Comment

by:2266180
ID: 18084985
>I agree that it seems stupid to consider each character in a string as a single object
You didn't understand me: not "consider", but "create". You said that a new object is created for each access. Or maybe you meant "consider" in your initial post as well?

I know what makes Java and .NET so OO (I have an EE master certificate in C# :P ), it's just that what you said initially might be incorrect (though I didn't think of this aspect until now; I took it for granted, every single word :P ).
 
LVL 17

Expert Comment

by:Wim ten Brink
ID: 18086145
Actually, if you would use MyString[1], it would create a new object of type Char to hold that single character. If you therefore walk through a string of 30 characters like this:
for I := 1 to 30 do AChar := MyString[I];
You would be creating 30 character objects this way, which the garbage collector will have to free again. However, the string itself isn't made of 30 objects. It's just a single object which has reserved 30 bytes in some buffer for processing. And it has some built-in functionality for faster processing. Thus:
* A string is a single object.
* Accessing individual characters of a string will generate a new object for every character, thus creating additional overhead.
* Internally, the string class is better optimized to handle its data without creating those additional objects.
 
LVL 28

Expert Comment

by:2266180
ID: 18088334
got it. thanks for the extra info ;) starts to sound logical :)
 
LVL 17

Expert Comment

by:Wim ten Brink
ID: 18091607
Well, it is all very logical. It's just that even very experienced .NET developers might forget about these things. :-)
And to be honest, it's more a matter of optimization. No matter which programming language or environment you're using, there will always be lots of possible solutions. It's just that most of them aren't optimal, speed-wise. Even in WIN32 Delphi programming I see lots of code where the programmer uses simple string concatenation to add characters to a string from within a loop. Something like this:

AString := '';
for I := 1 to Length(BString) do begin
  if (BString[I] in ['A'..'Z']) then AString := AString + BString[I];
end;

You might actually find quite a few functions that work like this, even in the Delphi VCL itself. Of course, while it looks okay, it starts to become very slow when you have to execute this loop about a million times or so. And in those cases you need to use more optimized solutions. This example could be improved a lot by changing it into this:

SetLength(AString, Length(BString));
J := 0;
for I := 1 to Length(BString) do begin
  if (BString[I] in ['A'..'Z']) then begin
    J := J + 1;
    AString[J] := BString[I];
  end;
end;
SetLength(AString, J);

More code, so it might seem a bit slower. But since we don't have to allocate new blocks of string data for the string concatenations, it tends to be faster, especially with large loops and huge numbers of runs. (I hope I didn't make any typos, btw. I didn't compile the above code. :-)

With .NET you have the same dilemma, but with one additional bite: if you access the string data character by character, then you're creating a new object for each and every character! So in this case you would like to use a more optimized solution too, for example by using a buffer. And this is where you might want to use a StringBuilder class. See also http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconusingstringbuilderclass.asp for more about the StringBuilder. That would mean using the Replace method of the StringBuilder class, which should be the most useful for this case. :-)

Of course, there's nothing wrong with using the old-fashioned Delphi ways to walk through a string character by character, but as I already said, it's not the most optimal way.

Btw, also keep in mind that every time you use a string method to manipulate the string, it will create a new string object and won't modify the existing string object. It's just one of the minor things people might forget. :-)
 
LVL 3

Assisted Solution

by:fjocke
fjocke earned 125 total points
ID: 18167665
Memo1.Lines.Text := StringReplace(Memo1.Lines.Text, ' ', '',
  [rfReplaceAll, rfIgnoreCase]);

Or, if you need it as a function:

function RemoveWhiteSpace(Source : String) : String;
begin
  Result := StringReplace(Source, ' ', '',
    [rfReplaceAll, rfIgnoreCase]);
end;

It would be used like this:
Memo1.Lines.Text := RemoveWhiteSpace(Memo1.Lines.Text);
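
Note that replacing ' ' with '' removes every space, including the single ones between words. If the goal is only to collapse runs of spaces, as in the original question, a StringReplace-based variant could look like the sketch below. The name CollapseSpacesSR is made up for illustration, and the code was not tested under VCL.NET.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}  // only relevant when compiling with Free Pascal
uses SysUtils;

// Sketch: reduce every run of spaces to a single space using StringReplace.
function CollapseSpacesSR(const Source: string): string;
begin
  Result := Source;
  // One rfReplaceAll pass only shortens each run, so repeat
  // until no pair of adjacent spaces is left.
  while Pos('  ', Result) > 0 do
    Result := StringReplace(Result, '  ', ' ', [rfReplaceAll]);
end;
```

It would be called the same way, for example Memo1.Lines.Text := CollapseSpacesSR(Memo1.Lines.Text);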

Hope that helps you.

Cheers

Jocke
 

Author Comment

by:zattz
ID: 18168485
Thanks guys,

will take a look soon
