Solved

Remove whitespace from stringlist

Posted on 2006-11-27
Last Modified: 2008-01-27
Hi,

I have a function that removes whitespace from a memo, but it doesn't work in VCL.NET. Something about unsafe code and invalid typecasts...

Could somebody rewrite it without using pointers, and adjust it to work on a stringlist instead of a memo?

procedure StripWhiteSpaces(AMemo: TMemo);
var p1,p2: pchar;
    s: string;
    i: integer;
begin
  s:= AMemo.Lines.Text;
  if s = '' then exit;

  p1:= pchar(s);
  p2:= p1;
  while p2^ <> #0 do
  begin
    if (p2^ = ' ') AND ((pchar(p2+1)^ = ' ') or (pchar(p2-1)^ in [#13,#10]))
      then Delete(s, p2 - p1 + 1,1)
      else inc(p2);
  end;
  AMemo.Lines.Text:= trim(s);
  for i:= AMemo.Lines.Count-1 downto 0 do
    if AMemo.Lines[i] = ''
      then AMemo.Lines.Delete(i);
end;

Thanks
Question by:zattz
13 Comments
 
LVL 28

Accepted Solution

by:
2266180 earned 125 total points
ID: 18017834
Well, judging by your other question, you should have read through that thread a little. It has valuable information and, better yet, quite a few procedures for you to use.

For example, the first one from Alex:

function StripMultipleSpaces(const Value: string): string;
var
  I, J: Integer;
  Max: Integer;
begin
  Result := Value;
  I := Pos(#32#32, Result);
  if (I > 0) then
  begin
    Max := Length(Result);
    repeat
      Inc(I);
      J := Succ(I);
      while (J <= Max) and (Result[J] = #32) do
        Inc(J);
      while (J <= Max) do
      begin
        if (Result[Pred(I)] = #32) then
        begin
          while (J <= Max) and (Result[J] = #32) do
            Inc(J);
        end;
        if (J <= Max) then
        begin
          Result[I] := Result[J];
          Inc(I);
          Inc(J);
        end;
      end;
    until (J > Max);
    SetLength(Result, Pred(I));
  end;
end;

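call it like this (s has to be a created list, since TStrings itself is abstract):
var s: TStrings;
begin
  s := TStringList.Create;
  // ... fill s ...
  s.Text := StripMultipleSpaces(s.Text);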

or any other solution that doesn't use pointers.
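
For the stringlist itself, a pointer-free variant might look like the sketch below. It is only a rough illustration, not tested under VCL.NET: the helper name CollapseSpaces and the demo program are made up for this example, and the behaviour is slightly stricter than the original procedure because every line is trimmed at both ends before empty lines are dropped.

```pascal
program StripDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: collapses each run of spaces into a single space,
// using plain string indexing instead of PChar arithmetic.
function CollapseSpaces(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    // Keep the character unless it is a space directly after another space.
    if (S[I] <> ' ') or (I = 1) or (S[I - 1] <> ' ') then
      Result := Result + S[I];
end;

// Works on any TStrings descendant (TStringList, TMemo.Lines, ...).
procedure StripWhiteSpaces(AList: TStrings);
var
  I: Integer;
  Line: string;
begin
  AList.BeginUpdate;
  try
    // Walk backwards so deleting a line does not shift the unvisited ones.
    for I := AList.Count - 1 downto 0 do
    begin
      Line := CollapseSpaces(Trim(AList[I]));
      if Line = '' then
        AList.Delete(I)
      else
        AList[I] := Line;
    end;
  finally
    AList.EndUpdate;
  end;
end;

var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    List.Add('  hello    world  ');
    List.Add('');
    List.Add('second   line');
    StripWhiteSpaces(List);
    Write(List.Text);
  finally
    List.Free;
  end;
end.
```

With the three sample lines above, the list should end up holding just "hello world" and "second line". The concatenation in CollapseSpaces is the simple way, not the fast one, so for very large lists one of the single-pass functions from the other thread would be the better inner helper.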
 
LVL 15

Expert Comment

by:mikelittlewood
ID: 18017946
Is there no StringReplace function you can use to remove white space?
 
LVL 4

Assisted Solution

by:David_Ward
David_Ward earned 125 total points
ID: 18019080
Function StripMultipleSpaces(FromThis: String): String;
begin
  result:=fromthis;
  while pos(#32#32,result)<>0 do
    delete(result,pos(#32#32,result),1);
end;

// example usage:

// YourString:=StripMultipleSpaces(ThisMemo.Text);
// YourString:=StripMultipleSpaces(AnyTStringList.Text);
 
LVL 28

Expert Comment

by:2266180
ID: 18019150
just in case anyone wonders about which thread I was talking about :)
http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21980883.html

@David: you might want to take a look in there ;)
 
LVL 17

Assisted Solution

by:Wim ten Brink
Wim ten Brink earned 125 total points
ID: 18083306
Guys, nice to point to my examples but the question is for a .NET solution. :-)

Okay, the BETTER link is http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21982971.html which was the .NET version of my challenge. Something about using StringBuilder.Replace instead of the normal Delphi string-handling functions. Keep in mind that .NET is by itself a huge library with all kinds of useful functions. Using the "old-fashioned" Delphi solutions is actually a bit inappropriate in these cases. If you're lucky, the Delphi solution will merely be slower than the .NET solution; in the worst case, it won't even compile in .NET.

Also keep in mind that Delphi for .NET might be very similar to the Delphi that we ourselves are used to, but on a lower level it is very different. For example, in WIN32 we know a string as just an array of characters, with each character taking up one byte. In .NET a string is an object and accessing each and every character of the string would create a new object for every character! Thus, solutions that are highly optimized for WIN32 will actually generate a huge amount of overhead in .NET. Which is why you'd better use the .NET libraries for string manipulations. And don't worry, because those .NET libraries are fast enough.

StringBuilder.Replace seems to be the proper way.
 
LVL 28

Expert Comment

by:2266180
ID: 18083393
hm... seems I ought to put my hands on the delphi .net internals or stop reading questions related to it :))

>accessing each and every character of the string would create a new object for every character!
I wouldn't have imagined something this stupid in my life. it is logical that going from delphi to delphi .net a programmer would still be using the delphi way, so it would be logical to store the string as array of chars and passing references to the char object instead and not creating a char object at every read request. looking at it as a tobjectlist descendant for example. but who knows what optimizations were at the base of that decision.
 
LVL 17

Expert Comment

by:Wim ten Brink
ID: 18084949
I agree that it seems stupid to consider each character in a string as a single object but don't forget that the string itself is still just a single object containing several bytes of data. However, when you're going to access the characters of the string character by character, then each and every character will be converted to a Char object with its additional overhead. Basically just because the whole .NET environment is 100% object-oriented. You just don't have any data types that aren't objects.

It's actually one of the nastier pitfalls when you start converting from WIN32 to .NET code. In WIN32 you have data types that are pure Data. In .NET you won't have any of those...

Still, it's not that stupid if you do things the smart way. For example, you have a String class but within this class you can add a lot of methods that modify the internal data of this class. And internally, the string class might have raw data access and thus be able to access every character without using the object wrappers for them. It just depends on how they optimized it and which functionality they have added to the string class in .NET. Unfortunately I don't have access to any .NET help files right now, but the principle is quite simple. Everything is an object. Only on the deeper lower levels you might get at a point where you're accessing the raw data.

And to be honest, Borland had to try and keep everything VCL-compatible but when you're doing .NET development with Delphi, you better just forget about the VCL for any new projects and only use it to convert existing projects to .NET. The VCL is still supported, of course, but it has been defeated by .NET itself.
 
LVL 28

Expert Comment

by:2266180
ID: 18084985
>I agree that it seems stupid to consider each character in a string as a single object
you didn't understand me: not consider: create. you said that a new object is created for each access. or maybe you meant consider in your initial post as well?

I know about what makes java and .net to be so OO (I have an EE master certificate in C# :P ), it's just that what you said initially might be incorrect (though I didn't think of this aspect until now; I took it for granted, every single word :P ).
 
LVL 17

Expert Comment

by:Wim ten Brink
ID: 18086145
Actually, if you would use MyString[1], it would create a new object of type Char to hold that single character. If you therefore walk through a string of 30 characters like this:
for I := 1 to 30 do AChar := MyString[I];
You would be creating 30 character objects this way, which the garbage collector will have to free again. However, the string itself isn't made of 30 objects. It's just a single object which has reserved 30 bytes in some buffer for processing. And it has some built-in functionality for faster processing. Thus:
* a string is a single object.
* Accessing individual characters of a string will generate a new object for every character, thus creating additional overhead.
* Internally, the string class is better optimized to handle its data without creating those additional objects.
 
LVL 28

Expert Comment

by:2266180
ID: 18088334
got it. thanks for the extra info ;) starts to sound logical :)
 
LVL 17

Expert Comment

by:Wim ten Brink
ID: 18091607
Well, it is all very logical. It's just that even very experienced .NET developers might forget about these things. :-)
And to be honest, it's more a matter of optimization. No matter which programming language or environment you're using, there will always be lots of possible solutions. It's just that most of them aren't optimal, speed-wise. Even in WIN32 Delphi programming I see lots of code where the programmer uses simple string concatenation to add characters to a string from within a loop. Something like this:

AString := '';
for I := 1 to Length(BString) do begin
  if (BString[I] in ['A'..'Z']) then AString := AString + BString[I];
end;

You might actually find quite a few functions that work like this, even in the Delphi VCL itself. Of course, while it looks okay, it starts to become very slow when you have to execute this loop about a million times or so. And in those cases you need to use more optimized solutions. This example could be improved a lot by changing it into this:

SetLength(AString, Length(BString));
J := 0;
for I := 1 to Length(BString) do begin
  if (BString[I] in ['A'..'Z']) then begin
    J := J + 1;
    AString[J] := BString[I];
  end;
end;
SetLength(AString, J);

More code, so it might seem a bit slower. But since we don't have to allocate new blocks of string data for the string concatenations, it tends to be faster, especially with large loops and huge numbers of runs. (I hope I didn't make any typos, btw. I didn't compile the code above. :-)
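
Applied to the space-stripping question, the same preallocate-then-truncate idea could look roughly like this. It is a WIN32-style sketch under the assumption that only runs of the space character need collapsing; the function name CollapseSpacesFast and the little demo program are invented for illustration.

```pascal
program BufferDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical example: single pass, one allocation up front,
// one truncation at the end, no string concatenation in the loop.
function CollapseSpacesFast(const S: string): string;
var
  I, J: Integer;
begin
  // Worst case nothing is removed, so reserve the full length once.
  SetLength(Result, Length(S));
  J := 0;
  for I := 1 to Length(S) do
    // Copy the character unless it is a space following a copied space.
    if (S[I] <> ' ') or (J = 0) or (Result[J] <> ' ') then
    begin
      Inc(J);
      Result[J] := S[I];
    end;
  // Cut off the unused tail of the buffer.
  SetLength(Result, J);
end;

begin
  Writeln(CollapseSpacesFast('a    b  c'));
end.
```

For the sample input the output should be "a b c". Leading and trailing single spaces are left alone here, so combine it with Trim if those have to go as well.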

With .NET you have the same dilemma, with one additional catch: if you access the string data character by character then you're creating a new object for each and every character! So in this case you would want to use a more optimized solution too, for example by using a buffer. And this is where you might want to use the StringBuilder class. See also http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconusingstringbuilderclass.asp for more about the StringBuilder. That would mean using the Replace method of the StringBuilder class, which should be the most useful for this case. :-)

Of course, there's nothing wrong by using the old-fashioned Delphi ways to walk through a string character by character but as I already said, it's not the most optimal way.

Btw, also keep in mind that every time you use a string method to manipulate the string, it will create a new string object and won't modify the existing string object. It's just one of the minor things people might forget. :-)
 
LVL 3

Assisted Solution

by:fjocke
fjocke earned 125 total points
ID: 18167665
Memo1.Lines.Text := StringReplace(Memo1.Lines.Text, ' ', '',
 [rfReplaceAll, rfIgnoreCase]);

Or, if you need it as a function:

function RemoveWhiteSpace(Source : String) : String;
begin
  Result := StringReplace(Source, ' ', '',
    [rfReplaceAll, rfIgnoreCase]);
end;

It would be used like this:
Memo1.Lines.Text := RemoveWhiteSpace(Memo1.Lines.Text);

Hope that helps you.

Cheers

Jocke
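
Note that replacing ' ' with '' takes out every space, including the single ones between words. If the goal is only to get rid of the redundant spaces, a StringReplace-based variant might look like the sketch below. The function name CollapseSpacesWithReplace is made up for this example, and it assumes only the space character counts as whitespace.

```pascal
program ReplaceDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical example: collapse runs of spaces using StringReplace only.
function CollapseSpacesWithReplace(const Source: string): string;
begin
  Result := Source;
  // A single rfReplaceAll pass only halves each run of spaces,
  // so repeat until no double space is left.
  while Pos('  ', Result) > 0 do
    Result := StringReplace(Result, '  ', ' ', [rfReplaceAll]);
end;

begin
  Writeln(CollapseSpacesWithReplace('a    b  c'));
end.
```

For the sample input this should print "a b c". rfIgnoreCase is left out because case does not matter when the search text is a space.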
 

Author Comment

by:zattz
ID: 18168485
Thanks guys,

will take a look soon