Solved

Changing Word Docs and PDFs to HTML

Posted on 2014-01-25
Last Modified: 2014-01-26
I am talking to some people who write Word docs and routinely publish them as PDF attachments to blog posts. They want to convert these documents (letters, files, etc.) to HTML so they are readable directly on the site. Some of these documents are really long, over 50 pages.
My first reaction is that this is a BAD idea. Sure, you can save a Word doc as HTML, but in my experience the HTML is bloated and crappy.
I know there are programs that convert Word to HTML, but how good are they? Do they work? Certainly the site's CSS won't apply if the correct classes and IDs are not present in the markup.
Also, what is the matter with PDFs? Everyone has Adobe Reader.
Question by:nanharbison
6 Comments
 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 100 total points
ID: 39809076
Using Word's save to HTML feature is usually a bad idea, but why aren't they just pasting from Word into the blog's editor?
If the blog uses something like TinyMCE (the default on WordPress), then they can learn pretty quickly how to do basic formatting, if needed.

HTH,
Dan
 
LVL 58

Accepted Solution

by:Gary
Gary earned 100 total points
ID: 39809078
My reaction would be to stick with PDF: it is easy to maintain a single format throughout, easy to read offline, and small in size (assuming there are not lots of images etc).
But maybe first try Google Docs (export to HTML) or one of the free online converters and see how the .doc exports. The result will depend a lot on the makeup of the doc.
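If an exported file turns out to be bloated, one possible cleanup step is to strip the Word-specific markup before pasting the HTML into the blog. The following is only a rough sketch using Python's standard html.parser module; the names clean_word_html, WordHtmlCleaner, KEEP_TAGS, KEEP_ATTRS and DROP_CONTENT are made up for this illustration, and the tag whitelist is an assumption that would need adjusting to whatever the site's CSS expects.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: an illustrative whitelist of tags worth keeping in a blog post.
# Any other tag (span, div, font, o:p, ...) is unwrapped: its text is kept.
KEEP_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "li",
             "a", "strong", "em", "b", "i", "br", "table", "tr", "td", "th"}
# Attributes kept per tag; Word's class, style and lang attributes are dropped.
KEEP_ATTRS = {"a": {"href"}}
# Elements whose entire content is discarded.
DROP_CONTENT = {"style", "script", "head", "xml"}


class WordHtmlCleaner(HTMLParser):
    """Collects a simplified copy of the HTML it is fed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP_TAGS:
            return
        allowed = KEEP_ATTRS.get(tag, set())
        kept = ""
        for name, value in attrs:
            if name in allowed:
                kept += ' {}="{}"'.format(name, escape(value or ""))
        self.out.append("<{}{}>".format(tag, kept))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        # <br> is a void element, so no closing tag is written for it.
        if self.skip_depth or tag not in KEEP_TAGS or tag == "br":
            return
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments, including Word's conditional comments, are dropped because
    # the default handle_comment() does nothing.


def clean_word_html(html):
    """Return html with Word-specific tags and attributes removed."""
    parser = WordHtmlCleaner()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = ("<p class=MsoNormal style='margin:0'>"
              "<span lang=EN-US>Hello <b>world</b></span><o:p></o:p></p>")
    print(clean_word_html(sample))
```

This only removes markup. It does not add the classes and IDs the site's CSS relies on, and it will not reproduce layout that HTML cannot easily express, so the cleaned output still needs to be checked by eye.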
 
LVL 83

Assisted Solution

by:Dave Baldwin
Dave Baldwin earned 100 total points
ID: 39809108
Word and PDF can display things that HTML pretty much can't.  Some formatting, fonts, and layered presentations are difficult if not impossible to duplicate easily in HTML.  In addition, Word and PDF are oriented towards paper and not the screen.

 
LVL 38

Assisted Solution

by:BillDL
BillDL earned 200 total points
ID: 39810231
I agree with the others on this one.  PDF (Portable Document Format) is being used exactly as was originally intended, which is sharing of documents in a common format that should be displayed to all readers in a consistent layout.  It is no surprise that most user manuals come as PDF files for this reason.  A Blog (Web Log) is really intended for short entries, not the equivalent of 50 pages in a PDF document.  I would not particularly want to have to scroll down through 50 pages worth of content on a single blog entry.

To say that "Everyone has Adobe Reader" is probably not completely accurate, however.  Malformed PDF files have been used to spread viruses because of the default behaviour of opening inside the browser window via the browser plugin.  This can be a System Administrator's nightmare.  My preference is to always prompt for a "Save" or "Open With Acrobat Reader", but not everybody knows how to configure Acrobat Reader and disable all the potentially risky functionality like allowing JavaScript and allowing executables to be opened from links.  In addition, Google Chrome, which is in use on many smart phones these days, uses its own internal PDF Viewer within the browser.  Saving to the hard drive and then opening is a nuisance to some impatient or uninformed people who prefer to just click a link and have the document open right in the browser, but rendering it (especially with a long document full of images) can be slow and jerky, leading to complaints.

What this comes down to is common sense.  If somebody has a lengthy document that doesn't contain a bunch of cross-links to other resources, and has embedded images that are large enough to see without having a link to pop up a larger image, then PDF is the logical choice over MS Word or posting a long HTML blog entry.

There will always be people who will decide to use something just because it is there, and because recent versions of Word allow you to Save As PDF, they will use this even for short documents that could have just as easily been posted and formatted in the blog's editor.  These people will never be changed and they will always be there ... walking amongst us.
 
LVL 17

Author Closing Comment

by:nanharbison
ID: 39810779
Thanks for all this feedback! Very helpful.
 
LVL 38

Expert Comment

by:BillDL
ID: 39810983
Thank you nanharbison

Question has a verified solution.
