Solved

export mediawiki pages to individual text files

Posted on 2010-09-06
Last Modified: 2013-12-14
I have a MediaWiki server with hundreds of pages that I need to export to Word, PDF, RTF (or maybe TXT). Each MediaWiki page needs to go into its own file, not all into one file. The export also needs to be of the rendered content, not the MediaWiki markup: formatted text, pictures and all, instead of tags. How can I do that?

Because there are hundreds of pages, manual copy-paste or Save As is not an option.

ta.
Question by:KristjanLaane
9 Comments
 
LVL 40

Expert Comment

by:Richard Quadling
ID: 33613429
From the client side (i.e. a browser), the pages will be rendered using HTML.

So, an HTML to xxx (PDF, Word, RTF, etc.) converter would certainly do the job.

Combine that with some sort of scripting engine to navigate the links and you should be done.

If you have all the links already on one page (or an index or something), then that would certainly save some time.
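As a rough illustration of "scripting the navigation", here is a minimal Python sketch that collects every page title through the MediaWiki web API (`action=query&list=allpages`) instead of scraping an index page. The `API_URL` value and all function names are assumptions made up for this example, and the sketch does not log in, so a wiki with login-only content would need an authenticated session on top of this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical address; replace with your own wiki's api.php.
API_URL = "http://wiki.example.com/api.php"


def allpages_url(api_url, extra=None, limit=500):
    """Build a list=allpages query URL, merging any continuation parameters."""
    params = {"action": "query", "list": "allpages",
              "aplimit": limit, "format": "json"}
    params.update(extra or {})
    return api_url + "?" + urlencode(params)


def titles_from_response(data):
    """Pull the page titles out of one decoded API response."""
    return [p["title"] for p in data.get("query", {}).get("allpages", [])]


def continuation_params(data):
    """Return the parameters for the next batch, or None when finished.

    Newer MediaWiki versions answer with a top-level "continue" object,
    older ones with "query-continue"; both are handled here.
    """
    if "continue" in data:
        return dict(data["continue"])
    legacy = data.get("query-continue", {}).get("allpages")
    return dict(legacy) if legacy else None


def fetch_json(url):
    """Download a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def list_all_titles(api_url, fetch=fetch_json):
    """Walk through all batches and return every page title."""
    titles, extra = [], {}
    while extra is not None:
        data = fetch(allpages_url(api_url, extra))
        titles.extend(titles_from_response(data))
        extra = continuation_params(data)
    return titles


if __name__ == "__main__":
    for title in list_all_titles(API_URL):
        print(title)
```

The `fetch` parameter is only there so the download step can be swapped out, for example for one that sends session cookies.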

Just googling around ...

A commercial offering ... http://html2pdf.seven49.net/Web/

There are also apps available on SourceForge and other places that do HTML 2 PDF.

What do you intend to do with the end results?

If you've got hundreds of good pages being edited/maintained, then why bother with a frozen snapshot? If you need a snapshot, clone the DB and lock it away.


There is also http://www.mediawiki.org/wiki/Extension:Pdf_Export

This looks promising. You would need to script the navigation (I think), but it certainly looks like the way I would go.
 
LVL 51

Expert Comment

by:Ted Bouskill
ID: 33613732
Any way you look at the problem it's a lot of hard work.

At our company a MediaWiki server was created to store documentation, and over time the stakeholders have realized it's not what they want.  My team inherited maintenance of the server and we've been tasked with trying to find a way to move the content into other formats. All the solutions require a lot of time.
 
LVL 40

Expert Comment

by:Richard Quadling
ID: 33613783
@Ted. Not for the faint hearted.
 

Author Comment

by:KristjanLaane
ID: 33613908
Thanks for the replies! I have the Special:AllPages page, which collects links to all the pages and will probably come in very handy. I should have specified this earlier, but my first preference is to output Word files: do you know of any HTML to Word tools that can be scripted to spit out a Word file per link?
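One scriptable option, not mentioned elsewhere in this thread, is the pandoc command-line converter, which reads HTML and writes .docx. The sketch below is only an illustration under some assumptions: pandoc is installed and on the PATH, each wiki page has already been saved as its own .html file in one folder, and the function and folder names are made up for the example. Conversion quality for wiki layouts is not guaranteed.

```python
import subprocess
from pathlib import Path


def pandoc_command(html_path, out_dir):
    """Build the pandoc call that turns one HTML file into one .docx file."""
    html_path = Path(html_path)
    docx_path = Path(out_dir) / (html_path.stem + ".docx")
    return ["pandoc", "--from", "html", "--to", "docx",
            "--output", str(docx_path), str(html_path)]


def convert_all(html_dir, out_dir, run=subprocess.run):
    """Convert every .html file in html_dir, one Word file per page."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    outputs = []
    for html_path in sorted(Path(html_dir).glob("*.html")):
        cmd = pandoc_command(html_path, out_dir)
        # check=True raises if pandoc reports a failure for this page.
        run(cmd, check=True)
        outputs.append(cmd[cmd.index("--output") + 1])
    return outputs


if __name__ == "__main__":
    # Hypothetical folder names.
    for path in convert_all("wiki_html", "wiki_docx"):
        print("wrote", path)
```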
 
LVL 40

Expert Comment

by:Richard Quadling
ID: 33614256
What version of Word?

You could probably just use MSWord to open the URLs directly.

Can you try doing one manually?

Load MSWord.
File|Open
Enter URL of page.

Does it look OK?

You are going to get some differences. Word is NOT natively an HTML rendering engine (or did that all change in Office 2007+ - yuech!).

If it does look OK, then a simple VBA macro would probably do the trick.
 

Author Comment

by:KristjanLaane
ID: 33616011
I tried opening the pages directly, but 1) the rendering differences are big and ugly, and 2) I don't know how to log in to my wiki from Word to access the content, most of which is login-only.

I think what is needed is something that can export the main content of any given wiki page, without the MediaWiki navigation stuff etc. and without any MediaWiki tags, and then convert that "pure" export to Word somehow. My thinking is to try to convert all the pages to PDF files (I might try http://www.mediawiki.org/wiki/Extension:Pdf_Export ), but only if I know of a way to then convert those PDFs into Word afterwards.

P.S. I also need to work out how to script http://www.mediawiki.org/wiki/Extension:Pdf_Export

P.P.S. It's harder than I thought it would be, I agree!
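For the "main content only" part, MediaWiki itself can help: requesting a page through index.php with `action=render` returns the rendered HTML of the article body without the sidebar, tabs and other skin elements. Below is a minimal Python sketch that saves one such page per file. The `INDEX_URL` value, the function names and the filename rule are assumptions made up for this example, and it does not handle the login problem mentioned above, so login-only pages would need an authenticated session.

```python
import os
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical address; replace with your own wiki's index.php.
INDEX_URL = "http://wiki.example.com/index.php"


def render_url(index_url, title):
    """URL that returns only the rendered article body of one page."""
    return index_url + "?" + urlencode({"title": title, "action": "render"})


def safe_filename(title, ext=".html"):
    """Turn a page title into a filename that is safe on common file systems."""
    name = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
    return (name or "untitled") + ext


def fetch_bytes(url):
    """Download a URL and return the raw response body."""
    with urlopen(url) as response:
        return response.read()


def export_page(index_url, title, out_dir, fetch=fetch_bytes):
    """Save the rendered body of one page as its own HTML file."""
    body = fetch(render_url(index_url, title))
    path = os.path.join(out_dir, safe_filename(title))
    with open(path, "wb") as f:
        # action=render returns a fragment, so wrap it in a minimal document.
        f.write(b"<html><head><meta charset='utf-8'></head><body>\n")
        f.write(body)
        f.write(b"\n</body></html>\n")
    return path


if __name__ == "__main__":
    os.makedirs("wiki_html", exist_ok=True)
    print(export_page(INDEX_URL, "Main Page", "wiki_html"))
```

The resulting HTML files are free of wiki markup and navigation, which should make them a cleaner starting point for whichever converter is used next.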
 
LVL 40

Expert Comment

by:Richard Quadling
ID: 33616161
What do you intend to do with all those word documents?
 

Author Comment

by:KristjanLaane
ID: 33616316
The content does not need to be shared anymore, so only I need local access. I need offline access to this content in an editable form, so Word is good for that, in addition to providing WYSIWYG.
 
LVL 40

Accepted Solution

by:
Richard Quadling earned 2000 total points
ID: 33616361
I think getting the exporter working would be the best way to go. Once it is in PDF format, there are any number of PDF 2 Word converters.