Save URL as Text (not HTML)

How can I save the content of a URL to a local file in plain text format (no HTML tags)?
friberg asked:

gwalters commented:
I'm assuming you wanted this in the Browser category...

For Netscape, choose File, Save As.  Type a name like "foo.txt" (the important part is the ".txt" extension).

For Internet Explorer, pretty much the same thing.


Now, if you wanted to save the content of a URL to a local file in plain text using Java, that's a different story.
friberg (author) commented:
Of course I want to save it using Java, otherwise I wouldn't have posted it here. :-)

shogi commented:
When you say "local", do you mean on the client or on the server?
On the client you can't, because of the applet security restrictions. However, if the applet is granted sufficient permissions and the client accepts them, it becomes possible. On the server you won't have any problem, unless it's not your server :)


friberg (author) commented:
I'm talking about a Java application, not an applet. Let's say I want to capture all the text on the www.yahoo.com main page and save it as a text file (no HTML tags) on my HD.
Saving it in HTML format is no problem, but is there a way to remove the HTML tags automatically?


shogi commented:
No, but you can simply rename your file with a .TXT extension, e.g. xxx.TXT
KE commented:
As far as I know, there is no built-in way to just get rid of the tags.
Use a FilterInputStream and discard anything between "<" and ">".
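
A minimal sketch of that idea, for illustration only. It swaps in a FilterReader (the character-based counterpart of FilterInputStream) because HTML is text; the class name TagStrippingReader and the strip helper are hypothetical names, not part of any library. It only drops what sits between "<" and ">", so script and style bodies, comments containing ">", and entities such as &amp; are left as they are.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper: a Reader that discards everything between '<' and '>'. */
public class TagStrippingReader extends FilterReader {
    private boolean inTag = false; // true while we are inside <...>

    public TagStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (c == '<') {
                inTag = true;            // tag starts: begin discarding
            } else if (c == '>' && inTag) {
                inTag = false;           // tag ends: resume passing text through
            } else if (!inTag) {
                return c;                // ordinary text outside any tag
            }
        }
        return -1;                       // end of the underlying stream
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    /** Convenience wrapper (an assumption for this example): strip tags from a String. */
    public static String strip(String html) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new TagStrippingReader(new StringReader(html))) {
            int c;
            while ((c = r.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

To apply it to a page, you could wrap the reader for the URL, for example new TagStrippingReader(new InputStreamReader(url.openStream())), and copy the characters it returns into a FileWriter. A real HTML parser would be more robust if the pages contain scripts or malformed markup.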
