Link to home
Start Free TrialLog in
Avatar of wchestnut
wchestnutFlag for United States of America

asked on

Display SQL Word VARBINARY(MAX) Binary Document in Visual Studio Form - REVISITED

I already asked this question in this post.  It was recommended to me not to continue with the original plan of importing Word documents into VARBINARY(MAX) fields:

Original EE Post

But after doing some research, I believe this is the better way to go.  According to a whitepaper I found from Microsoft, since the vast majority of our Resume Word documents are around 50KB (well under 1MB), it was recommended to import them into a VARBINARY(MAX) instead of using FILESTREAM.  We have about 77,000 Resume files which total about 4GB so I think we'll be fine once they are all loaded.

So, now I'm back to my original question.  In a Visual Studio Form using Visual Basic, how do I display the Word document in it's original format?  I have a table where I manually inserted 10 Resumes rows including the Word binary file into a VARBINARY(MAX) column.  I dropped all of the fields from the Data Source in Details mode so it looks like a record edit screen complete with the usual record options -- except the resume_binary field which is showing a red circle with a white horizontal line (indicating "no").
Avatar of wchestnut
wchestnut
Flag of United States of America image

ASKER

BTW, I did try and change the field type to Rich TextBox, but only "Byte[] Array" is displayed:
User generated image
Avatar of Alexandre Simões
Do you already know how to read Word files and their different formats?

Basically what you have there is the Byte array representing the file.
You need to use a MemoryStream and a tool to actually read the doc or docx files.

What exactly do you have now?
Have you considered using a browser object and allow that to open your document

There are a couple of 3rd party tools as well... (though dwindling in number) a friend has suggested : http://www.gemboxsoftware.com/document/overview there is a free version but really limited to size of doc (20 paragraphs) but means you can test and play.

Might be worthwhile reading : http://support.microsoft.com/kb/257757
I have nothing now.  I'm just starting again with Visual Basic and SQL after being away from them for a number of years.

mark_willis, that is an AWESOME idea and I think it's exactly what we need.  I'm going to try it out and let you guys know.
I already used GemBox on a project some years ago and highly recommend it :)
Okay, bare with me guys... I installed SetupGemBoxDocument23.msi, used Add Reference and checked GemBox.Document 23 for .NET 3.0 - 4.5.  Then I found sample code to "Get Content" and modified it:

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click

        Imports GemBox.Document
        Imports System.Text

        ' If using Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document As DocumentModel = DocumentModel.Load(File_linkLinkLabel.Text)

        Dim sb = New StringBuilder()

        ' Get content from each paragraph
        For Each paragraph As Paragraph In document.GetChildElements(True, ElementType.Paragraph)
            sb.AppendFormat("Paragraph: {0}", paragraph.Content.ToString())
            sb.AppendLine()
        Next

        ' Get content from each bold run
        For Each run As Run In document.GetChildElements(True, ElementType.Run)
            If (run.CharacterFormat.Bold) Then
                sb.AppendFormat("Bold run: {0}", run.Content.ToString())
                sb.AppendLine()
            End If
        Next

        '        Console.WriteLine(sb.ToString())

        WebBrowserResume.DocumentText = sb.ToString

Open in new window


I created a Button1 on the Form which I was going to use to populate the WebBrowserResume object.  But I received a bunch of errrors:

User generated image
What did I do wrong or not do?
I'm also having difficulty finding some sample code to try and use WebBrowser to load Word documents.  I've searched all over but didn't find anything current with the code.
But wait... isn't just transforming the byte array to a file and send it to the user enough?
Do all the users have Word installed or you have to provide a way to show the contents on the website?

If this is enough just use:
System.IO.File.WriteAllBytes(filepath, bitearray);

Open in new window

After this you know that filepath holds your doc, just send it to the user!
He will click save or open with Word and it's done... no? :)
No, not really.  Our goal is to create a system to import the 77,000 Resumes we have in Word documents (and all new ones moving forward) into an SQL database and be able to search and view them.  My first task as proof of concept is to create a user interface and provide basic keyword search abilities.  The end game would not use external files at all after they're imported and include some intelligence to narrow-down potential candidates faster and easier than the Recruiters do now.

I've already manually created and populated a 10 row sample in a table with file information as well as imported the Word document into a VARBINARY(MAX) column which would be used for Full Text searching as well as (hopefully) displaying the document on a form:
User generated image
The large white box at the bottom is where the Resume contents would be displayed in it's original Word format -- either from the VARBINARY(MAX) column or the actual Word file itself.

I've been trying to get GemBox Document to work, but I'm not as advanced in .Net now as I want to be.  I created some Projects in Visual Studio but have to learn how to get the Modules to work from a Form.

I do have a backup plan to use VBA code in Word to Open, Select All, Copy the text in each of the 77,000 Word documents then insert new rows into SQL using a large Text field instead and provide a link on the Form to the external document.  But that's more of a last-resort option if I can't get this to work.
Can use something like : WebBrowser1.Navigate(@"C:\myDocment.doc");

but that assumes the doc can be loaded from disk...

as for gembox, it was a recommendation from a developer mate of mine. So, cant really help all that much other than following the examples. I can always ask him though...

also might be worth running the sample from their online documentation : http://www.gemboxsoftware.com/SampleExplorer/Document/CommonUses/Reading?tab=vb
Yeah, tried the Navigate method... it just opens the document externally in Word.

I've been up and down through their documentation.  I think I figured out that I needed to create a Module in my VS Project.  I just don't know how I can use it from my Form yet.
ASKER CERTIFIED SOLUTION
Avatar of Mark Wills
Mark Wills
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I read Microsoft's white paper on FILESTREAM... they recommend loading the binary into a VARBINARY(MAX) if the file sizes are less than 1MB -- which virtually all of them are.

I managed to figure out the GemBox to a point where I hit the 20-paragraph limit with the free version on my own resume (didn't think it really had 20 paragraphs).  Now I just have to figure out the error event handler and let it work with those first 20 paragraphs and display the results somewhere just for proof of concept.

I will also check our your streaming from disk link, too.  Thanks for the feedback so far!