Solved

How to read the contents of a .pdf/.zip file?

Posted on 2002-07-29
Last Modified: 2010-08-05
Hello all

Any pointers on how to:

1. Get the text contents of files such as *.html (or any URL), *.pdf, etc. and store them in a DB.
I have the VBA/Word code to get the contents of a Word file, but I haven't decided how to approach .pdf and .html files. Any pointers here?

2. Programmatically unpack a .zip file's contents to a folder of choice, and then go to step 1.

On another note (you get the points even if the following is not answered): I need to do programmatic searches on the contents of files (that's why I store the contents in a DB and run a SQL Server full-text search), but I might also need to do regular expression searches.
Any leads here?

TIA

Question by:dkjnkm
4 Comments
 
LVL 4

Accepted Solution

by:
AlonHirsch earned 200 total points
ID: 7187274
Hi,

For HTML and other Text based files - it's very easy. Simply read the file into a string variable and write that variable into a Text field in SQL Server using AppendChunk.
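As a rough illustration of that idea (read the file into a string, then write the string to a text column), here is a minimal sketch in Python rather than VB. It is only an assumption-laden stand-in: it uses the standard-library sqlite3 module instead of SQL Server, so there is no AppendChunk step, and the table name `documents` and the helpers `html_to_text` and `store_text` are hypothetical. Since a raw .html file also contains markup, the sketch strips tags first.

```python
import sqlite3
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Hypothetical helper: return the text of an HTML string, tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def store_text(conn, name, text):
    """Hypothetical helper: save one document's text under its file name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (name TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, body) VALUES (?, ?)",
        (name, text),
    )
    conn.commit()
```

For plain text files the `html_to_text` step can be skipped and the file contents passed straight to `store_text`.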

For PDF and other binary file types, you would need a third-party control or library that can read those file types, and then do the same kind of thing: convert them to text and AppendChunk the result to the database.

To unzip the files in a ZIP archive you would need some sort of unzip control or DLL. Info-ZIP has a freeware (I think) DLL with that capability. Go to http://www.infozip.com or http://www.infozip.org and search from there.
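For comparison, the unpack-to-a-folder step can be sketched without any external DLL in a language that ships ZIP support. The following is a minimal Python illustration, not the Info-ZIP DLL's API; the function name `unzip_to` is hypothetical. It extracts an archive into a chosen folder and returns the extracted file paths, which could then be fed to the text-extraction step.

```python
import zipfile
from pathlib import Path


def unzip_to(archive_path, dest_dir):
    """Hypothetical helper: extract archive_path into dest_dir, return file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
        # Directory entries end with "/"; keep only real files.
        names = [n for n in zf.namelist() if not n.endswith("/")]
    return [dest / n for n in names]
```

Each returned path can be checked by extension (for example .html or .pdf) to decide how to read it.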

HTH,
Alon
 
LVL 70

Expert Comment

by:Éric Moreau
ID: 7187535
To unzip, you may use this free component: http://vbaccelerator.com/codelib/zip/zipvb.htm
 
LVL 49

Expert Comment

by:DanRollins
ID: 8049086
Hi dkjnkm,
It appears that you have forgotten this question. I will ask Community Support to close it unless you finalize it within 7 days. I will ask a Community Support Moderator to:

    Accept AlonHirsch's comment(s) as an answer.

dkjnkm, if you think your question was not answered at all or if you need help, just post a new comment here; Community Support will help you.  DO NOT accept this comment as an answer.

EXPERTS: If you disagree with that recommendation, please post an explanatory comment.
==========
DanRollins -- EE database cleanup volunteer
 

Expert Comment

by:SpideyMod
ID: 8095929
per recommendation

SpideyMod
Community Support Moderator @Experts Exchange

Question has a verified solution.