
How to read the contents of a .pdf/.zip file?

Hello all,

Any pointers on how to:

1. Get the text contents of files such as *.html (or any URL), *.pdf, etc. and store them in a DB?
I have the Word VBA code to get the contents of a Word file, but I haven't decided how to approach .pdf and .html files. Any pointers here?

2. Programmatically unpack a .zip file's contents to a folder of choice, and then go to step 1?
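
For illustration only, step 1 for an .html file could look roughly like the sketch below. It uses Python's standard library rather than VBA; `TextExtractor`, `html_to_text`, `store_text`, and the `documents` table are made-up names, and an SQLite database stands in for SQL Server. PDF text extraction needs a third-party library and is not covered here.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def store_text(conn, name, text):
    """Save one document's text under its name (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (name TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, body) VALUES (?, ?)",
        (name, text),
    )
    conn.commit()
```

Typical use would be `store_text(conn, "page.html", html_to_text(raw_html))`, with `conn` being any open `sqlite3` connection.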

On another note (you get the points even if the following is not answered): I need to do programmatic searches on the contents of files (that's why I store the contents in a DB and use SQL Server full-text search), but I might also need to do regular expression searches. Any leads here?
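
One possible approach to the regular expression part, sketched in Python for illustration: let the full-text search narrow down the candidate documents, then run the regex over their stored text in application code. `regex_search` and the `documents` mapping (document name to text) are hypothetical names, not anything from SQL Server.

```python
import re


def regex_search(documents, pattern, flags=re.IGNORECASE):
    """Return {name: [matched strings]} for every document with a match."""
    compiled = re.compile(pattern, flags)
    hits = {}
    for name, body in documents.items():
        found = [m.group(0) for m in compiled.finditer(body)]
        if found:
            hits[name] = found
    return hits


docs = {
    "a.txt": "Invoice INV-1234 and inv-99",
    "b.txt": "nothing here",
}
print(regex_search(docs, r"inv-\d+"))
```

Compiling the pattern once and reusing it keeps the loop cheap when many documents are scanned.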

TIA

ASKER CERTIFIED SOLUTION

Alon Hirsch


Éric Moreau

To unzip, you may use this free component: http://vbaccelerator.com/codelib/zip/zipvb.htm
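
If a scripting language is an option, the same unpack step can also be sketched with Python's standard `zipfile` module. This is only an illustration and is unrelated to the component linked above; `unzip_to` and its parameters are made-up names.

```python
import zipfile
from pathlib import Path


def unzip_to(zip_path, target_dir):
    """Extract every member of zip_path into target_dir.

    Returns the paths of the extracted files (directories are skipped),
    so each one can then be handed to the text-extraction step.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    return [target / name for name in names]
```

The returned list can be looped over to pick out the .html, .pdf, or .doc files by extension before reading their contents.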

DanRollins

Hi dkjnkm,
It appears that you have forgotten this question. I will ask Community Support to close it unless you finalize it within 7 days. I will ask a Community Support Moderator to:

Accept AlonHirsch's comment(s) as an answer.

dkjnkm, if you think your question was not answered at all or if you need help, just post a new comment here; Community Support will help you. DO NOT accept this comment as an answer.

EXPERTS: If you disagree with that recommendation, please post an explanatory comment.
==========
DanRollins -- EE database cleanup volunteer

SpideyMod

per recommendation

SpideyMod
Community Support Moderator @Experts Exchange