How to read the contents of a .pdf/.zip file?
Posted on 2002-07-29
Any pointers on how to:
1. Get the text contents of files such as *.html (or any URL), *.pdf, etc. and store them in a database?
I have the Word VBA code to get the contents of a Word file, but I haven't decided how to approach .pdf and .html files. Any pointers here?
2. Programmatically unpack a .zip file's contents to a folder of my choice, and then go to step 1?
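For illustration only, here is a minimal Python sketch of the unpack-then-extract flow for the .html case. It is not a recommendation for a particular toolchain. Everything in it is an assumption: the helper names (`TextExtractor`, `html_to_text`, `unpack_and_store`) and the `docs` table are hypothetical, SQLite stands in for SQL Server, and .pdf files are not handled because the Python standard library has no PDF reader.

```python
import sqlite3
import zipfile
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, joined with single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def unpack_and_store(zip_path, dest_dir, conn):
    """Unpack zip_path into dest_dir, then store the text of each HTML file.

    Returns the number of documents stored. The `docs` table is a
    hypothetical schema used only for this sketch.
    """
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)

    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (path TEXT PRIMARY KEY, body TEXT)"
    )
    stored = 0
    for path in sorted(Path(dest_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in (".html", ".htm"):
            html = path.read_text(encoding="utf-8", errors="replace")
            conn.execute(
                "INSERT OR REPLACE INTO docs VALUES (?, ?)",
                (str(path), html_to_text(html)),
            )
            stored += 1
    conn.commit()
    return stored
```

Calling `unpack_and_store("archive.zip", "unpacked", sqlite3.connect("docs.db"))` would unpack the archive and index its HTML files; the file names here are placeholders. Only extract archives you trust, since `extractall` writes whatever paths the archive contains.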
On another note (you get the points even if the following is not answered): I need to do programmatic searches on the contents of files (that's why I store the contents in a database and use SQL Server full-text search), but I might also need to do regular expression searches.
Any leads here?
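As a rough sketch of the regular expression part, assuming the stored text has already been fetched from the database as (id, body) rows, a regex pass in Python could look like this. The function name `regex_search` and the sample rows are hypothetical and exist only for this example.

```python
import re


def regex_search(rows, pattern, flags=re.IGNORECASE):
    """Yield (doc_id, matched_text) for every regex hit in (doc_id, body) rows."""
    compiled = re.compile(pattern, flags)
    for doc_id, body in rows:
        # Treat a NULL/None body as empty text so it simply yields no hits.
        for match in compiled.finditer(body or ""):
            yield doc_id, match.group(0)


# Made-up sample rows standing in for a database result set.
rows = [
    ("a.html", "Invoice INV-1001 and inv-2002"),
    ("b.pdf", "no ids here"),
]
hits = list(regex_search(rows, r"INV-\d{4}"))
```

One possible design, not the only one, is to let the full-text search narrow the candidate rows first and run the regex only over those, since scanning every stored document with a regex can be slow.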