Extracting the complete generated HTML from a web application

I want to make an HTML prototype from a web application that is currently running.
For every page in the application, I need to collect the generated HTML.
Please let me know if there is a tool for this, or how I can go about doing it.
kssreelakshmiAsked:

cmalakarCommented:
Is this what you are looking for?

http://www.programurl.com/software/store-web-pages-offline.htm

Web spider would be the better one.
evguenCommented:
Hello kssreelakshmi,

I see that "web spider" is shareware.

I can recommend a free alternative (GPL license) that I have used for years: HTTrack - http://www.httrack.com/page/1/en/index.html

Hope this helps

Best regards
kssreelakshmiAuthor Commented:
This is the error I am facing. Please let me know what I need to change.


HTTrack3.42+htsswf+htsjava launched on Tue, 12 Feb 2008 10:43:44 at http://10.18.8.91:9000/admin +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar
(winhttrack -qwC2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2007], %s -->" -P 10.18.1.7:80 -%l "en, en, *" http://10.18.8.91:9000/admin -O1 "d:\crbt\My Web Sites\b" +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive information,
 such as username/password authentication for websites mirrored in this project
 do not share these files/folders if you want these information to remain private
10:43:44 Warning:  link is probably looping, type unknown, aborting: 10.18.8.91:9000/admin
10:43:44 Info:  No data seems to have been transfered during this session! : restoring previous one!

kssreelakshmiAuthor Commented:
The problem above was solved by changing the proxy username and password. The problem now is that the entire website is not translated, even after changing the filters as described in the FAQ. Please let me know what else is missing.
evguenCommented:
Hello kssreelakshmi,

Do you mean that your website has several languages, and nothing changes when you try to switch the language on the pages downloaded with HTTrack?

If so, I expect that you have dynamic pages and the displayed language is a session variable.
HTTrack (and equivalent software) is designed to download each page only once, identified by its page name, even if several links point to it.

This means that if your page, named index.php, is displayed in English by default and contains a button to switch the interface to Russian, then when HTTrack follows the button's link it arrives at the same page (index.php). Even though the content has changed (i.e. the interface language has changed), the page name is still the same (index.php), so HTTrack will not download the page again. If it did, it would download the same page an infinite number of times.

This is why you cannot re-create a dynamic website by downloading it with this sort of tool (or any other tool).
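
To illustrate the point, here is a simplified sketch in Python of a crawler that remembers pages by URL only. It is not HTTrack's actual code, and the page names and the in-memory SITE table are made up for the example. The "switch language" link on index.php points back to index.php, so the crawler treats it as already downloaded:

```python
from collections import deque

# Hypothetical in-memory "site": URL -> links found on that page.
# The language button on index.php links back to index.php itself,
# because the language is kept in the server-side session, not in the URL.
SITE = {
    "/index.php": ["/about.php", "/index.php"],  # second link = language button
    "/about.php": ["/index.php"],
}


def crawl(start, get_links):
    """Breadth-first crawl that fetches each URL once, keyed by URL only."""
    seen = {start}
    queue = deque([start])
    fetched = []
    while queue:
        url = queue.popleft()
        fetched.append(url)  # a real mirroring tool would save the HTML here
        for link in get_links(url):
            if link not in seen:  # same URL => considered already downloaded
                seen.add(link)
                queue.append(link)
    return fetched


pages = crawl("/index.php", lambda url: SITE.get(url, []))
print(pages)  # ['/index.php', '/about.php']
```

The Russian version of index.php never appears in the result, because nothing in the URL distinguishes it from the English one.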

Best regards
kssreelakshmiAuthor Commented:
Thanks for the extensive description and analysis.
You are right, I have a multilingual web application. Basically, I want to get offline content for a web application developed in JSP on a Struts-like framework, with many *.do URLs in place.
Please let me know if there is a tool that can generate this content.
evguenCommented:
Hello,

As I worked with Struts for a year, maybe I can help you. But could you please specify what you mean by "to generate the content"?

Best regards
kssreelakshmiAuthor Commented:
I meant offline content. I wanted to know whether these tools crawl each page automatically and store the HTML content for me.
evguenCommented:
Hello,

I'm sorry, but English is not my native language, and I am not sure I understand your last question.
Yes, HTTrack will store the pages for you in the selected folder.

Please describe your problem in more detail.

Best regards
kssreelakshmiAuthor Commented:
It is storing only the first login page and no further pages, even though I specified the login username and password in the configuration.
evguenCommented:
Ok, I see.

Have you seen this tutorial? http://httrack.kauler.com/help/Authentication

Best regards
kssreelakshmiAuthor Commented:
Yes, I tried all of this. My application is running on localhost only :-)
evguenCommented:
After your authentication page, are you redirected to a new page (with a new URL), or do you stay on the same URL? If you stay on the same URL, I think you will have trouble automating this, and you may need to reconsider your approach. In that case HTTrack would not be the solution.
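
If you want to check this outside HTTrack, a small script can submit the login form, keep the session cookie, and print the URL it lands on. The following is only a rough sketch using Python's standard library. The function names are hypothetical helpers, and any URLs or form field names you pass in (for example "username" and "password") are assumptions that must be replaced with the ones your login form really uses:

```python
import http.cookiejar
import urllib.parse
import urllib.request
from pathlib import Path


def make_session():
    """Build an opener that keeps cookies (such as a session ID) between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, fields):
    """Encode the form fields as a POST body, as a browser does for a login form."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(login_url, data=body)


def mirror_after_login(login_url, fields, page_urls, out_dir):
    """Log in once, then save each page fetched with the same session to out_dir."""
    opener, _jar = make_session()
    with opener.open(build_login_request(login_url, fields)) as resp:
        landed_on = resp.geturl()  # equal to login_url if there was no redirect
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, url in enumerate(page_urls):
        with opener.open(url) as resp:
            (out / f"page{i}.html").write_bytes(resp.read())
    return landed_on
```

Calling mirror_after_login with your login URL, your form fields, and a list of page URLs returns the address reached after login. If that address is the login URL again, the application did not redirect, which is the difficult case described above.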

You want a solution for making an HTML prototype. What is the context? For example, do you need to make a demo for a customer? If so, why not create a video demo? You can use Camtasia Studio, for example. (Camtasia v3 became free a few months ago, but you have to pay for the latest version.) I use it to make demos from time to time.

Best regards

kssreelakshmiAuthor Commented:
You got the context exactly right. I need a demo for a customer. But a video demo would not behave the same way my application does.

How about BackStreet Browser?
kssreelakshmiAuthor Commented:
I am using WinHTTrack. I was able to reach the page that says the link has been captured.

Then I followed the instructions in the manual, but I am facing the following error:

HTTrack3.42+htsswf+htsjava launched on Tue, 19 Feb 2008 17:34:14 at
URL info...........
Information, Warnings and Errors reported for this mirror:

note: the hts-log.txt file, and hts-cache folder, may contain sensitive information,
 such as username/password authentication for websites mirrored in this project
 do not share these files/folders if you want these information to remain private
17:34:14 Warning:  Cache: damaged cache, trying to repair
17:34:14 Warning:  Cache: 0 bytes successfully recovered in 0 entries
17:34:14 Warning:  Cache: error trying to open the cache
17:34:15 Warning:  link is probably looping, type unknown, aborting: URL ....
17:34:15 Info:  No data seems to have been transfered during this session! : restoring previous one!

kssreelakshmiAuthor Commented:
Please close it