[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

How to download all the product details from all the webpages of a particular website  into excel

Posted on 2010-01-08
23
Medium Priority
?
268 Views
Last Modified: 2012-05-08
Hi All
I wish to download all the products details from all the pages of a particulare website into an excel file. This is to minimize the time to copy paste every product details into excel file please.

Attached is the image of the webpage how it looks like.
The solution can be either a code, or a free software please.

Thanks
products.jpg
0
Comment
Question by:Raheman M. Abdul
  • 11
  • 7
  • 5
23 Comments
 
LVL 45

Expert Comment

by:patrickab
ID: 26212857
marahman3001,

Please provide the full URL.

Patrick
0
 
LVL 19

Author Comment

by:Raheman M. Abdul
ID: 26214574
0
 
LVL 45

Expert Comment

by:patrickab
ID: 26216379
marahman3001,

The macro below is in the attached file. Press the button to download the details/prices of the mobiles.

If you want the data in a different format it will need to be parsed to get more easily read.

Hope it helps

Patrick


Sub get_data()
With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://search.next.co.uk/search?w=*&af=cat%3Amobilephones&nxti=0&nxtv=000&filter=subset%3A4201&ts=v8&isort=score&cnt=96" _
    , Destination:=Range("A1"))
    .Name = "Mobiles"
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With
End Sub

Open in new window

web-download-01.xls
0
Free Backup Tool for VMware and Hyper-V

Restore full virtual machine or individual guest files from 19 common file systems directly from the backup file. Schedule VM backups with PowerShell scripts. Set desired time, lean back and let the script to notify you via email upon completion.  

 
LVL 19

Author Comment

by:Raheman M. Abdul
ID: 26273104
Thanks for this. I am actually looking for downloading only the products which are displayed in pages 1, 2, 3 etc  not the whole page please.
0
 
LVL 45

Expert Comment

by:patrickab
ID: 26273151
You cannot download only part of a webpage. What can be done is to download the whole webpage and then remove all the data that is not wanted.

0
 
LVL 45

Expert Comment

by:patrickab
ID: 26273393
marahman3001,

The code below is in the attached file.

1st press the 'Get web data' button and when the data is visible on the worksheet, press the 'Sort out data' button. That hopefully will do what want.

Patrick
Sub get_data()
With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://search.next.co.uk/search?w=*&af=cat%3Amobilephones&nxti=0&nxtv=000&filter=subset%3A4201&ts=v8&isort=score&cnt=96" _
    , Destination:=Range("A1"))
    .Name = "Mobiles"
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With
End Sub

Sub data_parser()
Dim rng As Range
Dim celle As Range
Dim firstrow As Long
Dim lastrow As Long
Dim endrow As Long
Dim i As Long

Application.ScreenUpdating = False

With Sheets("Sheet1")
    Set rng = Range(.Cells(1, "A"), .Cells(.Rows.Count, "A").End(xlUp))
    endrow = .Cells(.Rows.Count, "A").End(xlUp).Row
    
    For Each celle In rng
        If celle = "You searched for" Then firstrow = celle.Row
        If celle = "Next departments" Then lastrow = celle.Row - 2
    Next celle
    
    For i = endrow To 2 Step -1
        If i <= endrow And i >= lastrow Then
            .Rows(i).EntireRow.Delete
        End If
        If i < firstrow Then
            .Rows(i).EntireRow.Delete
        End If
        If .Cells(i, 1) = "" Then
            .Rows(i).EntireRow.Delete
        End If
        If .Cells(i, 1) = .Cells(i - 1, 1) Then
            .Rows(i).EntireRow.Delete
        End If
        If LCase(Left(.Cells(i, 1), 3)) = "now" Then
            .Cells(i, 1) = Mid(.Cells(i, 1), 4, Len(.Cells(i, 1)))
        End If
    Next i
    Set rng = .Range(.Cells(1, "A"), .Cells(.Rows.Count, "A").End(xlUp))
    
    For Each celle In rng
        If IsNumeric(celle) Then
            celle.Offset(0, 1) = celle.Offset(1, 0) & " - £" & celle
        End If
    Next celle
    .Columns("A").EntireColumn.Delete
    endrow = .Cells(.Rows.Count, "A").End(xlUp).Row
    For i = endrow To 1 Step -1
        If .Cells(i, 1) = "" Then
            .Rows(i).EntireRow.Delete
        End If
    Next i
    .Columns("A").AutoFit
    Set rng = Range(.Cells(1, "A"), .Cells(.Rows.Count, "A").End(xlUp))
    rng.Sort Key1:=Range("A1"), Order1:=xlAscending, Header:=xlNo, _
        OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
        DataOption1:=xlSortNormal
End With

Application.ScreenUpdating = True

End Sub

Open in new window

web-download-02.xls
0
 
LVL 6

Expert Comment

by:TomSchreiner
ID: 26276712
PatrickB.  Your code throws an error on the sort after import.
0
 
LVL 6

Expert Comment

by:TomSchreiner
ID: 26276722
Mara.  The link you provided does not bring up a web page that looks like your picture.
0
 
LVL 45

Expert Comment

by:patrickab
ID: 26276827
Tom,

>PatrickB.  Your code throws an error on the sort after import.

Oh no it doesn't. I am not in the habit of offering solutions that I have not tried several times and in this instance that is certanly the case. I have just run my macros yet again and they worked perfectly.

Please leave it to questioners to comment on solutions offered by someone other than yourself. Perhaps in other websites criticising other people's solutions is acceptable behaviour but we do our best to avoid that here in Experts Exchange. Instead if you can offer a better solution then by all means do that.

Thanks.

Patrick

0
 
LVL 19

Author Comment

by:Raheman M. Abdul
ID: 26277323
Patrickab: There is some data mismatch after sorting mate. There must be some sorting error.  The prices are not matching with the item.
Cant we do this way.
Download the page source and search for the specific block of HTML and terminate the remaining HTML tags.
With the above result, strip the HTML tags and we will have the result.
This results the location of the images, name of the item and prices into excel sheet?
thanks mate
0
 
LVL 45

Expert Comment

by:patrickab
ID: 26277366
marahman3001,

If you know of a better way to do it then do please let us have sight of your VBA code. I'm always ready to learn new ways.

Patrick

ps Not 'mate' please
0
 
LVL 6

Expert Comment

by:TomSchreiner
ID: 26277753
It's still throwing an error BTW on rng.Sort.

0
 
LVL 45

Expert Comment

by:patrickab
ID: 26277764
marahman3001,

The data_parser macro in the attached file removes duplicate entries in the final result. BTW now that I know that an 'LG Cookie' is some sort of mobile I have hopefully got the models and prices in sync.

Patrick
web-download-03.xls
0
 
LVL 6

Expert Comment

by:TomSchreiner
ID: 26277791
Mara.  If you are looking for all of the product details listed in your image, the url you provided does not open that page.  What information are you looking for?  What details exactly?
0
 
LVL 19

Author Comment

by:Raheman M. Abdul
ID: 26278547
Patrickab:
Looks like its working. But the order they are displayed in is not satisfied. Might be the order is lost in sorting or whatever. Can you have a look into this please.
thanks





0
 
LVL 45

Accepted Solution

by:
patrickab earned 2000 total points
ID: 26278569
marahman3001,

OK, I have removed the sorting of the final result. It's in the attached file.

Patrick
web-download-04.xls
0
 
LVL 6

Expert Comment

by:TomSchreiner
ID: 26278740
I wonder why I am not getting the product details.  The only thing I can see when I go to the ULR is an item name and price.  Including Patrick's web query.
0
 
LVL 19

Author Closing Comment

by:Raheman M. Abdul
ID: 31674717
Many thanks Patrickab  You done great job and thanks for all your effort to solve this problem.
thanks again.
0
 
LVL 45

Expert Comment

by:patrickab
ID: 26279625
marahman3001 - Pleased I was able to help. Thanks for the grade. - Patrick
0
 
LVL 19

Author Comment

by:Raheman M. Abdul
ID: 26279733
Patrickab:
If i wish the URL value to be read from the cell value, what is the change in the following statement?

With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://search.next.co.uk/search?w=*&af=cat%3Amobilephones&nxti=0&nxtv=000&filter=subset%3A4201&ts=v8&isort=title&cnt=96" _
    , Destination:=Range("A1"))
0
 
LVL 45

Expert Comment

by:patrickab
ID: 26279844
marahman3001,

Try the attached file. Do please remember that the data_parser macro is specific that URL and the data that it downloads today - it might not hold good tomorrow for example if they change the webpage.

The URL is on Sheet2.

Patrick


web-download-05.xls
0
 
LVL 19

Author Comment

by:Raheman M. Abdul
ID: 26279914
Excellent Patrickab
Thanks again
0

Featured Post

Vote for the Most Valuable Expert

It’s time to recognize experts that go above and beyond with helpful solutions and engagement on site. Choose from the top experts in the Hall of Fame or on the right rail of your favorite topic page. Look for the blue “Nominate” button on their profile to vote.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Microsoft's Excel has many features that most people will never need nor take advantage of.  Conditional formatting is one feature that you may find a necessity once you start using it.
In real business world data are crucial and sometimes data are shared among different information systems. Hence, an agreeable file transfer protocol need to be established.
The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question