• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 237
  • Last Modified:

Multiple HTML to Excel

Hello to you all,

I have appoximately 50 html files that I would like to copy/import 1 at a time into a worksheet called "raw".  Each time a file is imported I would like to remove rows 1, 3 to 5 and 9 to 21.  Once these rows have been removed I would like to copy this data to a worksheet called "complete" and move onto the next html file.  

Each time the "raw" worksheet would need to be blanked and every time I copy the data from "raw" to "complete" I would like the data to be pasted at the end of the previous copied data.  Therefore the "complete" worksheet would end up with data from all 50 html files.

I hope I have been clear in my explanation and any help would be appreciated, many thanks, Alan.

0
Canders_12
Asked:
Canders_12
  • 4
  • 2
1 Solution
 
Arno KosterCommented:
You could automate this using a VBA macro.
looking at the 3 steps :
 - importing html file to "raw" tab
 - removing rows
 - copy to "complete" tab

this can be easily done :
Sub import_html_files()


    '-- import all html files in the same path as this excel file
    res = Dir(Path & "\*.html")
    Do
        process (res)
        res = Dir
    Loop Until res = ""

End Sub

Sub process(file As String)
Dim buffer As String
Dim pos As Double

    Application.StatusBar = "Processing " & file
    '-- clean raw worksheet
    Worksheets("raw").UsedRange.Clear
    
    '-- import file
    pos = 1
    Open file For Input As #1
    While Not EOF(1)
        Line Input #1, buffer
        Worksheets("raw").Cells(pos, 1) = buffer
        pos = pos + 1
    Wend
    Close #1

    '-- clear rows 1, 3 to 5 and 9 to 21
    '-- take care to remove bottom-up !!
    Worksheets("raw").Range("9:21").Delete
    Worksheets("raw").Range("3:5").Delete
    Worksheets("raw").Range("1:1").Delete
    
    
    
    '-- copy processed info to "complete" worksheet
    pos = Worksheets("complete").UsedRange.Rows.Count + 1
    If pos = 2 Then pos = 1
    Worksheets("raw").UsedRange.Copy Destination:=Worksheets("complete").Range("A" & pos)
    

End Sub

Open in new window


0
 
Canders_12Author Commented:
It works a treat pulling in the data but the format shows the actual HTML code.  For example:

<TD align="left" width="371">00381642</TD>

Every cell is similar to the above.  If I right click any of the files in Explorer and click open with Excel it looks perfectly formatted.  Any cunning ideas?
0
 
Arno KosterCommented:
That's only a small change, it even makes the process somewhat more straightforward:

Sub process(file As String)
Dim buffer As String
Dim pos As Double
Dim wb As Workbook

    Application.StatusBar = "Processing " & file
    '-- clean raw worksheet
    Worksheets("raw").UsedRange.Clear
    
    '-- open html file
    Application.ScreenUpdating = False
    Set wb = Workbooks.Open(file)
    wb.ActiveSheet.UsedRange.Copy Worksheets("raw").Range("A1")
    wb.Close False
    Set wb = Nothing
    Application.ScreenUpdating = True
    
    '-- clear rows 1, 3 to 5 and 9 to 21
    '-- take care to remove bottom-up !!
    Worksheets("raw").Range("9:21").Delete
    Worksheets("raw").Range("3:5").Delete
    Worksheets("raw").Range("1:1").Delete
    
    '-- copy processed info to "complete" worksheet
    pos = Worksheets("complete").UsedRange.Rows.Count + 1
    If pos = 2 Then pos = 1
    Worksheets("raw").UsedRange.Copy Destination:=Worksheets("complete").Range("A" & pos)
    

End Sub

Open in new window

0
Cloud Class® Course: MCSA MCSE Windows Server 2012

This course teaches how to install and configure Windows Server 2012 R2.  It is the first step on your path to becoming a Microsoft Certified Solutions Expert (MCSE).

 
Arno KosterCommented:
or even better, forget the raw sheet :
Sub process(file As String)
Dim buffer As String
Dim pos As Double
Dim wb As Workbook

    Application.StatusBar = "Processing " & file
    '-- clean raw worksheet
    Worksheets("raw").UsedRange.Clear
    
    '-- open html file
    Application.ScreenUpdating = False
    Set wb = Workbooks.Open(file)

    '-- clear rows 1, 3 to 5 and 9 to 21
    '-- take care to remove bottom-up !!
    wb.ActiveSheet..Range("9:21").Delete
    wb.ActiveSheet..Range("3:5").Delete
    wb.ActiveSheet..Range("1:1").Delete

    '-- copy processed info to "complete" worksheet
    pos = Worksheets("complete").UsedRange.Rows.Count + 1
    If pos = 2 Then pos = 1
    wb.ActiveSheet.UsedRange.Copy  Destination:=Worksheets("complete").Range("A" & pos)

    '-- close html source without saving
    wb.Close False
    Set wb = Nothing
    Application.ScreenUpdating = True

End Sub

Open in new window

0
 
Arno KosterCommented:
you can remove line 7 & 8 from comment 35490311
0
 
Canders_12Author Commented:
Didn't work straight off but snuck in:

pos = Workbooks("EDNData.xls").Worksheets("complete").UsedRange.Rows.Count + 1

and

wb.ActiveSheet.UsedRange.Copy Destination:=Workbooks("EDNData.xls").Worksheets("complete").Range("A" & pos)

With the direct reference to the Workbook it was happy.

Many thanks, Alan.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: SQL Server Core 2016

This course will introduce you to SQL Server Core 2016, as well as teach you about SSMS, data tools, installation, server configuration, using Management Studio, and writing and executing queries.

  • 4
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now