• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 2173
  • Last Modified:

VBA - Convert all Special Characters to Text

Hello,

There seems to be many ways of doing this - but I'm after a smart function that will forfill all the following string manipulation.

Background : I have a table of Word Documents that are going to sucked out, using stream read - the content becomes a String, which will be indexed by SQL Server indexing.

I have successful got the 'string' but when inserted into the 'pure text' column, contains lots of special characts (appear as square) and such like. Don't really want a clumsy 'replace(str,vbxxx," ")' over and over.

I would like

1. Anything more that a double space to be a single
2. No tabs, lf, crlf or any form of formatting like that
3. Character 1-9, A-Z + "*",".","£" etc ... to be ALLOWED (as people will search on .Net, C++ etc)

So basically, any 'normal' ascii should be fine - but ALL those strange formatting characters to be replaced by single space.

A nice callable function is what I'm after ... myString = CleanMyString(myString)

Thanks y'awl !

S.S.
0
SpencerSteel
Asked:
SpencerSteel
  • 4
1 Solution
 
manthaneinCommented:
Function CleanMyString(strwhat As String) As String

Dim icount As Integer
Dim iFirst As Integer
CleanMyString = ""
iFirst = 0
For icount = 1 To Len(strwhat)
    If Asc(Mid(strwhat, icount, 1)) > 32 And Asc(Mid(strwhat, icount, 1)) < 127 Then
        CleanMyString = CleanMyString & Mid(strwhat, icount, 1)
        iFirst = 0
    ElseIf Asc(Mid(strwhat, icount, 1)) = 32 Then
        iFirst = iFirst + 1
        If iFirst < 3 Then
            CleanMyString = CleanMyString & Mid(strwhat, icount, 1)
        End If
    End If
Next
CleanMyString = Replace(CleanMyString, "  ", " ")
End Function
0
 
SpencerSteelAuthor Commented:
this looks great ... i say looks as my front-end *&*($ing database app has just got irreverseabley corrupt and guess who doesn't have a backup from the last day. (it opens - but no objects)

Gimme some time to get back to where I was and I'll close this question asap.

Thanks

S.S.
0
 
burtdavCommented:
If you Add Reference to something like Microsoft VBScript Regular Expressions, you can implement CleanMyString with just a few calls to RegExp.Replace:

set re = new RexExp
' rule 2 - delete tabs, newlines, carriage returns, form feeds
re.Pattern="[\t\r\n\l]"
myString=re.Replace(myString, "")
' rule 1 - adjacent whitespace becomes a single space
re.Pattern="\s+"
myString=re.Replace(myString, " ")
' rule 3 - delete everything except spaces, numbers, word chars (a-z, A-Z), *s, .s, #s
re.Pattern="[^ \d\w\*\.\#]"
myString=re.Replace(myString, "")
CleanMyString=myString
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
SpencerSteelAuthor Commented:
Burtdav - don't understand that at all really (sorry!)

The other solution is working great, except one minor flaw - here's an example of the output

...Curriculum Vitae ProfileAn IT professional with over nine years functional...

In the original doc, the CV Profile would of been a heading, so you see we kind need an extra space of those CTRLs and I'm being REEEEEEALY lazy here - but i'm up to my eyeballs ... finish it off and the points are yours

Many thanks

S.S.
0
 
SpencerSteelAuthor Commented:
Doh...all I needed was

   CleanMyString = Replace(CleanMyString, Chr(13), " ")

at the end, just before

   CleanMyString = Replace(CleanMyString, "  ", " ")

Thanks

Great routine

Points awarded !

Take care

S.S.
0
 
SpencerSteelAuthor Commented:
Just what I wanted - quickly !

Thanks

S.S.
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 4
Tackle projects and never again get stuck behind a technical roadblock.
Join Now