zahziah

tons of text on web !!

Hi there,
I need some help with a problem. I have a client here in Pakistan who already has a Novell 4.1 setup with web server support, and they want to make use of it. The manager asked me to develop a website so that staff no longer have to read through the 1000-page Taxations & Laws book. They told me to scan the books and make web pages so users can read and search the text from the web server. But I don't know how to do that, because there is a lot of information in a single book (around 1200 pages). If I export it to HTML, it will take a lot of space and produce thousands of HTML documents. I read about Acrobat Reader and I think it could be useful, because the .pdf file format stores tons of information in a very compressed form, but I don't know how to access a .pdf file on the web. Or do you have any other idea for storing tons of information in a compressed format that is easily accessible through the web?

So please give me suggestions and ideas. Nothing is impossible in this world, I guess.

Thanks
Zahid
ASKER CERTIFIED SOLUTION
MorFF

zahziah

ASKER

Well, first of all, my client is a bank, and copyright is no problem for them because the taxation books are the property of the bank. They don't have an electronic version available.

I think I should go with Acrobat Reader, but downloading is a problem because they want to read from the web, not download the files. I want to make indexes, divide the information into sections, and then put hyperlinks there so users can search or jump between sections easily. But I really don't know how to do that. Please tell me the method so I can set the whole thing up.

Thanks
Zahid

You should be able to scan and OCR the book in a reasonably short time. If you haven't got the facility, there must be someone in Islamabad who can provide the service, such as a CAD bureau or someone similar. Anyway, get the document into Microsoft Word format and then download Microsoft's Internet Publishing Assistant from their website. Then use Word to format the document, assign bookmarks, etc., which are then published as .HTML files by the Assistant. The download is free; the only things that should cost anything are the scanning and your time.
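To illustrate the bookmark idea above, here is a minimal Python sketch. It is not the Internet Publishing Assistant; it only shows what "sections with bookmarks published as HTML" can look like. It assumes the OCR'd text has already been split into (heading, text) pairs, and the names `slugify` and `build_page` are made up for this example.

```python
import html
import re


def slugify(title):
    """Turn a heading into a lowercase string usable as an HTML anchor."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_page(book_title, sections):
    """Build one HTML page with a linked table of contents.

    sections is a list of (heading, body_text) pairs. How the OCR'd text
    gets split into those pairs is left out (an assumption of this sketch).
    """
    toc = []
    body = []
    for heading, text in sections:
        anchor = slugify(heading)
        # Table-of-contents entry that jumps to the section on the same page.
        toc.append(f'<li><a href="#{anchor}">{html.escape(heading)}</a></li>')
        # The section itself, carrying the anchor ("bookmark") it is linked by.
        body.append(f'<h2 id="{anchor}">{html.escape(heading)}</h2>')
        body.append(f"<p>{html.escape(text)}</p>")
    parts = [
        "<html><head>",
        f"<title>{html.escape(book_title)}</title>",
        "</head><body>",
        f"<h1>{html.escape(book_title)}</h1>",
        "<ul>",
        *toc,
        "</ul>",
        *body,
        "</body></html>",
    ]
    return "\n".join(parts)
```

Calling `build_page("Taxation Laws", [("Chapter 1: Income Tax", "...")])` returns the page as a string, which you can then write to a file on the web server. For a 1200-page book you would probably produce one page per chapter rather than a single huge page.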
I'm going to have to agree with MorFF. PDF is the best bet for keeping the formatting and keeping the size down. PDF would be considerably smaller in file size than HTML, even without graphics. Also, PDF conversion would take about 1/1000 of the time it would take to do it in HTML.
zahziah

ASKER

Okay buddies, I agree, and I know that the .PDF format is smaller in size, but I need to know the steps for doing this. Suppose I have already converted the book into PDF files: what's next? What do I do after that? Please tell me the steps.

After you've converted to PDF, simply place links to those files on a web page. When the user clicks a link, it will "spawn" their Adobe Acrobat Reader. Your users can get Adobe Acrobat Reader for free at http://www.adobe.com
The PDF file will launch inside their web browser if they are using a version 4.0 browser; with a lower version, it will just open in a different window. It is very simple. Hope this helps.
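As a rough sketch of the "place a link to those files" step, the Python snippet below generates a simple index page with one link per PDF. It assumes you have split the book into one PDF per chapter; the file names and the function name `build_pdf_index` are only examples, not anything your server requires.

```python
import html
from urllib.parse import quote


def build_pdf_index(title, pdf_files):
    """Build an HTML index page that links to each PDF file.

    pdf_files is a list of (label, filename) pairs, for example one PDF
    per chapter. The file names are hypothetical placeholders.
    """
    items = []
    for label, filename in pdf_files:
        # Percent-encode spaces and other unsafe characters in the file name.
        href = quote(filename)
        items.append(f'<li><a href="{href}">{html.escape(label)}</a></li>')
    parts = [
        "<html><head>",
        f"<title>{html.escape(title)}</title>",
        "</head><body>",
        f"<h1>{html.escape(title)}</h1>",
        "<ul>",
        *items,
        "</ul>",
        "</body></html>",
    ]
    return "\n".join(parts)
```

Save the returned string as something like `index.html` in the same web server directory as the PDF files. Clicking a link then hands the PDF to the reader's browser or Acrobat Reader, as described above.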
zahziah,

You said  'my client is a bank and copyright is no problem for them because the taxation books are the property of bank...and they dont have electronic version available there.... '

How did they get the printers to print the book without an electronic copy? I suggest you investigate this avenue a bit more thoroughly, as it would save you the time and hassle of reformatting etc. You may even find that a set of PDF files was sent to the printers in the first place!

Cheers - MorFF