Solved

Extract site content

Posted on 2009-05-16
Last Modified: 2013-12-20
One of my clients has a rather large site with many pages, links to PDFs, and external links.

They are currently using a CMS called MySource Matrix. I am redeveloping the new site (no CMS) on a dedicated server. I do have access to the back end of the current CMS but this thing is impossible to extract anything meaningful from. I also have no root access or any access to the server itself. It's a complete mess.

So, my question is... Is there any way to extract or build a hierarchy of each and every page (in essence a site map) and also extract all linked PDFs (hopefully maintaining some form of link to the parent page)?

Is there any method or software package to perform such a task?

I am desperate. Please, any ideas at all.

Thanks
Question by:rgoggins
LVL 8

Expert Comment

by:paololabe
ID: 24402893
I think you could use a utility or library to generate a sitemap.xml and then parse it to extract the PDF links.
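
As a rough illustration of the parsing half of that idea, here is a minimal Python sketch. It assumes you already have a sitemap.xml in the standard sitemaps.org format and that the generator you used included PDF URLs in it (not every tool does). The function names and the example.com URLs are made up for illustration. Note that a sitemap only lists URLs; it does not record which page links to which PDF, so grouping by folder is only an approximation of the parent page.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> URL listed in a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def pdf_links(urls):
    """Keep only URLs whose path ends in .pdf (case-insensitive)."""
    return [u for u in urls if urlparse(u).path.lower().endswith(".pdf")]


def group_by_folder(urls):
    """Group URLs by the folder part of their path (a rough hierarchy)."""
    groups = defaultdict(list)
    for u in urls:
        folder = urlparse(u).path.rsplit("/", 1)[0] or "/"
        groups[folder].append(u)
    return dict(groups)


# Hypothetical sample data; replace with the contents of a real sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/about/index.html</loc></url>
  <url><loc>http://www.example.com/about/report.PDF</loc></url>
  <url><loc>http://www.example.com/docs/guide.pdf</loc></url>
</urlset>"""

if __name__ == "__main__":
    urls = urls_from_sitemap(SAMPLE)
    for folder, members in group_by_folder(pdf_links(urls)).items():
        print(folder, members)
```

If you need the exact parent page for each PDF, you would have to crawl the pages themselves and record where each link was found, which is what a dedicated site downloader does.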

 
 
LVL 1

Author Comment

by:rgoggins
ID: 24402906
Thanks paololabe. Could you elaborate a little more on your suggestion?
 
LVL 1

Author Comment

by:rgoggins
ID: 24406475
Anyone?
LVL 17

Expert Comment

by:selvol
ID: 24406482
Offline Explorer Enterprise.

It will do all that and more.
 
LVL 17

Accepted Solution

by:
selvol earned 500 total points
ID: 24407344
As I stated, MetaProducts Offline Explorer is an excellent data extractor.

I have used it many times to do almost exactly what you need to do.
This app is not a cheap program, as many are.
The best thing is that they have a free 30-day trial, which I believe is unrestricted.

http://dl.filekicker.com/send/file/167627-YJTV/eesetup.exe
 
LVL 1

Author Closing Comment

by:rgoggins
ID: 31582231
Thank you selvol.

OEE is absolutely perfect for what I need.

Thanks again,
Rob

Question has a verified solution.
