Solved

Extract links from provided webpage

Posted on 2016-07-19
Last Modified: 2016-08-01
Hi All,
There is a requirement to download the PDF and HTML links from a given web page. If the linked HTML pages contain further PDF links, those need to be downloaded as well.
Is there a sample or reference Perl script, or any other script, that can be run on a Linux machine?

Thanks,
Shail
Question by:Shailesh Shinde
5 Comments
 
LVL 12

Accepted Solution

by:
Benjamin Voglar earned 2000 total points
ID: 41718268
If I understand this correctly, you would like to download all PDF files from a site.

You can try it with PowerShell.

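# Fetch the page (ParsedHtml is available in Windows PowerShell)
$psPage = Invoke-WebRequest "http://www.powertheshell.com/cookbooks/"
# Collect the href of every anchor whose target ends in .pdf
$urls = $psPage.ParsedHtml.getElementsByTagName("A") | Where-Object {$_.href -like "*.pdf"} | Select-Object -ExpandProperty href

# Save each PDF to the current directory under its original file name
$urls | ForEach-Object {Invoke-WebRequest -Uri $_ -OutFile ($_ | Split-Path -Leaf)}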
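
Since the question asks for something that runs on Linux and also follows the linked HTML pages, here is a rough Python sketch of the same idea using only the standard library. It is not part of the original answer. The names `LinkCollector`, `extract_links`, `fetch` and `download_pdfs` are made up for illustration, and the sketch assumes the pages are UTF-8, that PDF links end in `.pdf`, and that following links one level deep is enough.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return (pdf_urls, page_urls) found in html, resolved against base_url."""
    collector = LinkCollector()
    collector.feed(html)
    pdfs, pages = [], []
    for href in collector.hrefs:
        url = urljoin(base_url, href)
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript: and similar links
        # Assumption: a PDF link is one whose path ends in .pdf
        target = pdfs if parsed.path.lower().endswith(".pdf") else pages
        if url not in target:
            target.append(url)
    return pdfs, pages


def fetch(url):
    """Download a page and decode it as text (assumes UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def download_pdfs(start_url, follow_pages=True):
    """Save PDFs linked from start_url and, optionally, from pages it links to."""
    pdfs, pages = extract_links(fetch(start_url), start_url)
    if follow_pages:
        for page in pages:  # one level deep only
            try:
                more, _ = extract_links(fetch(page), page)
            except OSError:
                continue  # unreachable page: skip it
            pdfs.extend(u for u in more if u not in pdfs)
    for url in pdfs:
        # Save into the current directory under the original file name.
        urlretrieve(url, PurePosixPath(urlparse(url).path).name)
    return pdfs
```

To try it, save the code to a file and call `download_pdfs("http://www.powertheshell.com/cookbooks/")` from a Python 3 prompt. Note that it follows every linked page, including links to other sites, so you may want to filter `pages` by host name first.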

 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41718749
Hi,
I tried running this script. However, I get the error below at the command prompt:
links.ps1 cannot be loaded because the execution of scripts is disabled on this system.

Thanks,
Shail
 
LVL 12

Expert Comment

by:Benjamin Voglar
ID: 41718959
Open PowerShell as administrator and enter:

 Set-ExecutionPolicy -ExecutionPolicy Unrestricted

Then try the script again.
 
LVL 12

Expert Comment

by:Benjamin Voglar
ID: 41718961
Or you can use "Windows PowerShell ISE".
 
LVL 3

Author Closing Comment

by:Shailesh Shinde
ID: 41738491
Thanks

Question has a verified solution.