Solved

Getting rid of Word control characters before writing text to another application

Posted on 2004-08-08
Last Modified: 2010-05-19
I'm developing a script for work using VBA that reads information from an MS Word document, uses Split to create an array, then writes the contents of the array into an Excel file. The Word document is a highly formatted form, and I need the data placed in Excel so that it is delimited by the text in each section (one cell per line of text, basically).

Using Chr(13) as my Split delimiter almost works: the text splits correctly, but I still get some square-looking characters in the output to the Excel sheet. I want to suppress these control characters but have so far had no success. I need them gone so that I can next use ADODB to create a recordset from the Excel sheet, and I only want text in the recordset. I've marched through every ASCII character from 0 to 255 with no success. How do I get rid of the remaining control characters before they go into the array or before they go into the Excel sheet?

Here is an example of the output:

PUT IN  VERIFIED    PUT IN  VERIFIED    PUT IN  VERIFIED    PUT IN  VERIFIED
  BY       BY         BY       BY         BY       BY         BY       BY

ALDA


KYZF









ALDB


KYZG








Question by:philTN
10 Comments
 
LVL 22

Expert Comment

by:DarkoLord
ID: 11748387
Read the data back from Excel and use Asc() function to find out which character is that

Darko
 
LVL 7

Expert Comment

by:Burbble
ID: 11749292
(what he said)...and then use strData = Replace(strData, Chr(x), "")

-Burbble
 

Author Comment

by:philTN
ID: 11798719
Tried that but it won't return an ASC code for the character. Instead it spits out the first two codes (horizontal tab and space), then returns an error saying "invalid procedure or argument" when it encounters one of the square looking characters.

By saving to a text file using FSO and splitting the array on Chr(13), I've been able to figure out that the characters in question seem to be placeholders for blank cells within the original Word table. The table is composed of 3 columns, only one of which currently contains data. Although there is no data or extraneous spaces in the other two cells, these square looking characters get written to the text file and match up exactly with where there are blank cells in the table. The current output to the text file looks like this:

\ALDA\\\KYZF\\\\\\\\\\ALDB\\\

It also seems to be saving a delimiter of sorts between tables (I have 4 tables across with 3 columns each).
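
The "invalid procedure or argument" message is consistent with what VBA's Asc raises when it is handed a zero-length string, which is what an empty table cell can leave behind once the text has been split. A minimal sketch of a safer inspection routine follows. DumpCharCodes is a hypothetical helper name, not something from this thread, and it uses AscW so that characters outside the ANSI range are reported too.

```vba
' Hypothetical helper: returns the numeric code of every character in s,
' separated by spaces, so invisible characters can be identified.
Public Function DumpCharCodes(ByVal s As String) As String
    Dim i As Long
    Dim result As String

    ' A zero-length string has no first character; calling Asc on it fails,
    ' so report it explicitly instead.
    If Len(s) = 0 Then
        DumpCharCodes = "(empty)"
        Exit Function
    End If

    For i = 1 To Len(s)
        If i > 1 Then result = result & " "
        ' AscW returns a signed Integer; the mask maps it to 0..65535.
        result = result & CStr(AscW(Mid$(s, i, 1)) And &HFFFF&)
    Next i

    DumpCharCodes = result
End Function
```

For example, `Debug.Print DumpCharCodes("A" & Chr$(13) & Chr$(7))` prints `65 13 7` in the Immediate window. Running the helper on each array element, rather than on the first character only, shows every code that survives the Split.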

Author Comment

by:philTN
ID: 11801658
I can further clarify, after turning on the formatting marks in the original Word 2000 document, that the characters are the paragraph marks. These marks are not eliminated by the Split command. So how do I remove them either before or after saving to Excel or plain text? I've already tried Trim and Replace, but that was because I thought they were RTF codes. All I got in return was a type mismatch for "\" or "{".
 
LVL 7

Expert Comment

by:Burbble
ID: 11804787
Paragraph marks... maybe Chr(10) and Chr(13)?

Try...

strData = Replace(strData, Chr$(10), "")
strData = Replace(strData, Chr$(13), "")

-Burbble
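
One possible reason the two Replace calls are not enough: Word normally ends the text of each table cell with Chr(13) followed by Chr(7), so splitting on Chr(13) can leave a Chr(7) behind in every element. The sketch below takes a broader approach and drops every character with a code below 32 from each element after the Split. StripControlChars and SplitAndStrip are hypothetical names, and the assumption that the leftover characters are all below code 32 has not been confirmed in this thread.

```vba
' Hypothetical helper: returns s without any character whose code is below 32
' (tab, line feed, carriage return, Chr(7), and the other control characters).
Public Function StripControlChars(ByVal s As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        ' Mask the signed Integer from AscW so high code points stay positive.
        If (AscW(ch) And &HFFFF&) >= 32 Then
            result = result & ch
        End If
    Next i

    StripControlChars = result
End Function

' Hypothetical wrapper: splits on Chr(13) first, then cleans each element,
' so the delimiter is still available when Split runs.
Public Function SplitAndStrip(ByVal strData As String) As String()
    Dim parts() As String
    Dim i As Long

    parts = Split(strData, Chr$(13))
    For i = LBound(parts) To UBound(parts)
        parts(i) = StripControlChars(parts(i))
    Next i

    SplitAndStrip = parts
End Function
```

The order matters here: stripping before the Split would remove the Chr(13) delimiters as well and leave a single element.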
 

Author Comment

by:philTN
ID: 12178430
Thanks, I tried that but it still didn't work. When I tried reading in the ASCII code for the character it kept telling me (13) but it wouldn't Replace it.

I found what I was looking for through further research. By calling Excel's CLEAN function from VB, I was able to make another copy of the spreadsheet with all non-printing characters deleted, with one line of code. Thanks again to both of you for your help.

phil
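
The line of code itself was not posted, so the following is only a sketch of one way to call CLEAN from VBA, through the WorksheetFunction object. CleanUsedRange is a hypothetical name, and unlike the approach described above it cleans the cells in place rather than writing a second copy. The typed parameters assume a reference to the Excel object library when the code runs outside Excel, for example from Word.

```vba
' Hypothetical helper: applies Excel's CLEAN worksheet function to every
' text cell in the used range of the given worksheet, in place.
Public Sub CleanUsedRange(ByVal ws As Excel.Worksheet)
    Dim cell As Excel.Range

    For Each cell In ws.UsedRange.Cells
        ' Only touch text cells; numbers, dates and empty cells are left alone.
        If VarType(cell.Value) = vbString Then
            ' ws.Application is the Excel Application object, so this also
            ' works when the macro is hosted in Word and automates Excel.
            cell.Value = ws.Application.WorksheetFunction.Clean(CStr(cell.Value))
        End If
    Next cell
End Sub
```

From inside Excel it could be called as `CleanUsedRange ActiveSheet`. CLEAN is documented as removing the non-printing characters with codes 0 through 31, which would cover a stray Chr(7) or Chr(13).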
 
LVL 7

Expert Comment

by:Burbble
ID: 12183246
Ah, ok. Glad you got it fixed :)

-Burbble
 

Accepted Solution

by:
modulo earned 0 total points
ID: 12503580
PAQed, with points refunded (75)

modulo
Community Support Moderator
