September 5, 2008 08:38pm pdt
 

Extracting text with special characters from MS Word to VB.NET

Tags: vb.net
I am trying to automatically extract text from Word files (see the attached example) into VB using the code snippet below. The data contains special characters; my main concern is the inequality signs, such as the greater-than-or-equal sign (≥, Unicode hex 2265).

The problem is that when the text is pulled from the Word file and turned into a string, the special characters are lost: either they are turned into a square or they are "converted down", i.e. the greater-than-or-equal sign is turned into the greater-than sign.
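
One plausible cause of this kind of loss (an assumption; the thread does not confirm it) is that the text passes through an encoding that has no slot for U+2265. The sketch below is a minimal illustration using only `System.Text`: ASCII substitutes "?" for the character, while UTF-8 round-trips it intact. Some legacy code pages may instead map it to a look-alike, which would match the "converted down" behaviour. The module name is hypothetical.

```vbnet
Imports System
Imports System.Text

Module EncodingLossDemo
    Sub Main()
        ' U+2265 GREATER-THAN OR EQUAL TO, the character from the question.
        Dim original As String = "x " & ChrW(&H2265) & " 5"

        ' ASCII cannot represent U+2265, so the encoder substitutes "?".
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(original)
        Dim viaAscii As String = Encoding.ASCII.GetString(asciiBytes)

        ' UTF-8 can represent every Unicode character, so nothing is lost.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)
        Dim viaUtf8 As String = Encoding.UTF8.GetString(utf8Bytes)

        Console.WriteLine(viaAscii = original)     ' False
        Console.WriteLine(viaAscii.Contains("?"))  ' True
        Console.WriteLine(viaUtf8 = original)      ' True
    End Sub
End Module
```

A .NET `String` is UTF-16 internally, so the character survives as long as no lossy encoding sits between the file and the string.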

My goal is to extract the contents of the Word file into a manageable string that I can parse further without losing the special characters. I would prefer to do this without going through Microsoft.Office.Interop -> RichTextBox, because that route seems very hard to manage.
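
For comparison, here is a minimal sketch of reading the document text straight into a `String` through Word automation, skipping the RichTextBox step. It is not the thread's accepted solution (that is not visible here). It assumes Word is installed and the project references the `Microsoft.Office.Interop.Word` assembly; `WordTextExtractor` and `ReadWordText` are hypothetical names.

```vbnet
Imports Word = Microsoft.Office.Interop.Word

Module WordTextExtractor
    ' Returns the document's main text as a .NET String (UTF-16 internally),
    ' so characters such as U+2265 are not forced through a narrower encoding.
    Function ReadWordText(path As String) As String
        Dim app As New Word.Application()
        Try
            ' Documents.Open takes its file name as an Object.
            Dim fileName As Object = path
            Dim doc As Word.Document = app.Documents.Open(fileName)
            Try
                Return doc.Content.Text
            Finally
                ' Close without saving; the document is only being read.
                doc.Close(False)
            End Try
        Finally
            app.Quit()
        End Try
    End Function
End Module
```

Calling `ReadWordText("c:\file.doc")` would return the whole document as one string, with paragraphs separated by carriage returns, which can then be split and parsed as needed.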

Please use the attached Word file as an example.
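Imports System.IO

' Reads the first line of the .doc file as if it were plain text.
' StreamReader decodes the bytes as UTF-8 unless told otherwise.
Dim filepath As String = "c:\file.doc"
Dim currentline As String
Dim currentfilelocation As SeekOrigin = SeekOrigin.Begin
Dim fs As New FileStream(filepath, FileMode.Open, FileAccess.Read)
Dim d As New StreamReader(fs)
currentline = d.ReadLine()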
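
The snippet above reads a binary .doc file as text without naming an encoding. If the document can first be saved from Word as plain text with Unicode encoding (an assumption; the thread does not say whether that fits the workflow), the same `StreamReader` approach can keep the special characters by naming the encoding explicitly. The sketch below writes a small stand-in file so it runs on its own; the temp file and module name are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module UnicodeTextDemo
    Sub Main()
        ' Stand-in for a Unicode (UTF-16) text export of the Word document.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllText(tempFile, "x " & ChrW(&H2265) & " 5", Encoding.Unicode)

        ' Name the encoding explicitly instead of relying on the reader's default.
        Dim currentLine As String
        Using reader As New StreamReader(tempFile, Encoding.Unicode)
            currentLine = reader.ReadLine()
        End Using
        File.Delete(tempFile)

        ' The greater-than-or-equal sign survives the read.
        Console.WriteLine(currentLine.IndexOf(ChrW(&H2265)) >= 0)  ' True
    End Sub
End Module
```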
Attachments:
 
Test file containing greater than or equal char
 
Question Stats
Zone: Microsoft
Question Asked By: PaulNovel
Solution Provided By: PaulNovel
Participating Experts: 1
Solution Grade: A
Views: 43
Assisted Solution by cmrnp:

Author Comment by PaulNovel:

Accepted Solution by PaulNovel:
