smagnus1

Save webpage to a file

I may have met my match on this one.  I am attempting to write a VB program to:
1.) Open a webpage and load its contents, possibly with a slight delay of 3 seconds or so for the entire page to load
2.) Once the page loads, save the page to a file (preferably .htm or .html)
3.) Send the saved file in an e-mail message as an attachment.

I can handle step 3 with no issues.  I envision this running as a Console Application that is fired off using the Windows Task Scheduler so that it can run at scheduled times, but am flexible in how this works.  Can someone help steer me in the right direction, please?  Thanks in advance.
PaulHews

Here's the basics of it.  If you need more help, let us know.
Imports System.Net
Imports System.IO
 
Module Module1
 
    Sub Main()
        Dim URL As String = "http://www.yahoo.com"
        Dim request As WebRequest = WebRequest.Create(URL)

        ' Using blocks close the response, stream, reader and writer
        ' even if an exception is thrown part-way through.
        Using response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
            Using dataStream As Stream = response.GetResponseStream()
                Using reader As New StreamReader(dataStream)
                    Using sw As New StreamWriter("C:\temp\myfile.html")
                        ' Copy the raw HTML of the response into the file.
                        sw.Write(reader.ReadToEnd())
                    End Using
                End Using
            End Using
        End Using
    End Sub
 
End Module
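
If you only need the raw HTML on disk, a shorter variant is possible with WebClient.DownloadFile. This is a minimal sketch, not a replacement for the code above: the SavePage helper and the module name are made up for illustration, and the URL and path are just the same placeholders. Like the WebRequest version, it fetches markup only and does not run scripts or plugins.

```vbnet
Imports System.Net

Module DownloadSketch

    ' Hypothetical helper: downloads the raw HTML at url and writes it to path.
    Sub SavePage(ByVal url As String, ByVal path As String)
        Using client As New WebClient()
            client.DownloadFile(url, path)
        End Using
    End Sub

    Sub Main()
        ' Placeholder URL and output path, same as the sample above.
        SavePage("http://www.yahoo.com", "C:\temp\myfile.html")
    End Sub

End Module
```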


smagnus1

ASKER

I do appreciate the quick response, and all I can really say is "Wow" on the efficiency of it.  I am now, however, faced with a new issue.  The page that I am attempting to load has a lot of ActiveX plugins and whatnot that are loaded the first time this page is run on a PC.  Granted, my PC has these components installed, but the file that is created simply does not have anything but a blank space where these components should be.  Needless to say, this is an internal website behind a firewall that I simply can't show to you, but the file that is generated is attached.  Is there perhaps a way to take a screenshot of the page once it has loaded, or is there a better way to pull this off?  Thanks again.
ActiveX plugins don't play well with most email HTML renderers.  If it isn't necessary, I would simply strip out the tags from the source.  If the information you need is in those plugins, then you would have to open a browser window, take a snapshot of that, then send the snapshot instead of the HTML.  Let us know which approach you would need.
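
For the first approach, stripping the plugin tags could look roughly like the sketch below. It assumes the plugins are embedded with object or embed tags and that object tags are not nested; the StripPluginTags name and the sample markup are made up for illustration. A regular expression is a blunt tool for HTML, so treat this as a starting point rather than a robust parser.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripPlugins

    ' Hypothetical helper: removes <object>...</object> blocks and <embed> tags.
    ' Assumes <object> tags are not nested inside each other.
    Function StripPluginTags(ByVal html As String) As String
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        ' Lazy match from each <object ...> to the first following </object>.
        Dim result As String = Regex.Replace(html, "<object\b.*?</object\s*>", "", opts)
        ' <embed> is usually a standalone tag, optionally followed by </embed>.
        result = Regex.Replace(result, "<embed\b[^>]*>(\s*</embed\s*>)?", "", opts)
        Return result
    End Function

    Sub Main()
        ' Made-up sample markup with one ActiveX object between two paragraphs.
        Dim sample As String = "<p>Seats</p><object classid=""clsid:0""><param name=""a"" value=""b""></object><p>End</p>"
        Console.WriteLine(StripPluginTags(sample))
    End Sub

End Module
```

You would run the downloaded HTML through a function like this before writing it to the file.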
The information I need is in the plugins, so I am going with the second approach.  I have no problem with going this way, and granted, it isn't HTML, but I am OK with that.  Thanks again.
Because it requires a web browser window, we have to move away from a console type of application (this is a WinForms sample).  You should know that the code will have problems unless it is run under an interactive Windows user (because it has to create a window to run).  If you use Task Scheduler, you may have to run it under the same user ID that is logged in.
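
One possible guard for the non-interactive case is to check Environment.UserInteractive at startup and bail out before any window is created. This is only a sketch: the CanShowWindow name is made up, and I am assuming the property reports False for a scheduled task that runs without a logged-in desktop session, so verify that on your own machine before relying on it.

```vbnet
Imports System

Module InteractiveCheck

    ' Hypothetical helper: True when the process can display UI to a user.
    Function CanShowWindow() As Boolean
        Return Environment.UserInteractive
    End Function

    Sub Main()
        If Not CanShowWindow() Then
            ' No desktop to draw on, so a screen capture would come out blank or fail.
            Console.Error.WriteLine("No interactive session; skipping the screen capture.")
            Return
        End If
        Console.WriteLine("Interactive session detected; OK to open the browser form.")
    End Sub

End Module
```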

Add a WebBrowser control to the default form in a new Windows Forms application and paste this code:
'Here's a simple VB.NET sample based on an example by Mike Gold:
'http://www.vbdotnetheaven.com/UploadFile/mgold/ScreenCapturing04202005045050AM/ScreenCapturing.aspx
 
Public Class Form1
    <System.Runtime.InteropServices.DllImportAttribute("gdi32.dll")> _
    Private Shared Function BitBlt(ByVal hdcDest As IntPtr, ByVal nXDest As Integer, ByVal nYDest As Integer, ByVal nWidth As Integer, ByVal nHeight As Integer, ByVal hdcSrc As IntPtr, ByVal nXSrc As Integer, ByVal nYSrc As Integer, ByVal dwRop As System.Int32) As Boolean
    End Function
 
 
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        If WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
          
            ' Copy the rendered browser surface into an in-memory bitmap.
            Dim WebBrowserGraphics As Graphics = WebBrowser1.CreateGraphics()
            Dim ScreenCap As Bitmap = New Bitmap(WebBrowser1.ClientRectangle.Width, WebBrowser1.ClientRectangle.Height, WebBrowserGraphics)
            Dim MemoryGraphics As Graphics = Graphics.FromImage(ScreenCap)
            Dim WebBrowserDevice As IntPtr = WebBrowserGraphics.GetHdc()
            Dim MemoryDevice As IntPtr = MemoryGraphics.GetHdc()
            ' 13369376 is &HCC0020, the SRCCOPY raster operation (a straight copy).
            BitBlt(MemoryDevice, 0, 0, WebBrowser1.ClientRectangle.Width, WebBrowser1.ClientRectangle.Height, WebBrowserDevice, 0, 0, 13369376)
            WebBrowserGraphics.ReleaseHdc(WebBrowserDevice)
            MemoryGraphics.ReleaseHdc(MemoryDevice)
            ' Release the GDI+ objects once the pixels have been copied.
            MemoryGraphics.Dispose()
            WebBrowserGraphics.Dispose()
            ScreenCap.Save("c:\test.jpg", System.Drawing.Imaging.ImageFormat.Jpeg)
            ScreenCap.Dispose()
            'Do your mailing here
            Me.Close()
        End If
    End Sub
 
    Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        WebBrowser1.Navigate("http://www.experts-exchange.com")
    End Sub
 
    
End Class
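
For the "Do your mailing here" step, a rough sketch with System.Net.Mail might look like the following. The BuildMessage helper, the addresses and the SMTP host are all placeholders I made up; substitute whatever your existing mail code already uses.

```vbnet
Imports System.Net.Mail

Module MailSketch

    ' Hypothetical helper: builds a message with the saved screenshot attached.
    Function BuildMessage(ByVal fromAddr As String, ByVal toAddr As String, ByVal imagePath As String) As MailMessage
        Dim msg As New MailMessage(fromAddr, toAddr)
        msg.Subject = "Page snapshot"
        msg.Body = "The captured page is attached."
        ' Attachment opens the file, so it must exist at this point.
        msg.Attachments.Add(New Attachment(imagePath))
        Return msg
    End Function

    Sub Main()
        ' Placeholder addresses and SMTP host; replace with your own.
        Using msg As MailMessage = BuildMessage("reports@example.com", "me@example.com", "c:\test.jpg")
            Dim smtp As New SmtpClient("smtp.example.com")
            smtp.Send(msg)
        End Using
    End Sub

End Module
```

Disposing the message also releases the attachment's file handle, which matters if the next scheduled run overwrites the same image file.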


ASKER CERTIFIED SOLUTION
PaulHews
THANKS!  You are da man!

Here is the code with some tweaking...

Imports System.Drawing
Imports System.IO
Imports System.Net
Imports System.Text
Imports System.Windows.Forms
Imports Microsoft.VisualBasic

Public Class Form1
    Inherits System.Windows.Forms.Form
    <System.Runtime.InteropServices.DllImportAttribute("gdi32.dll")> _
    Private Shared Function BitBlt(ByVal hdcDest As IntPtr, ByVal nXDest As Integer, ByVal nYDest As Integer, ByVal nWidth As Integer, ByVal nHeight As Integer, ByVal hdcSrc As IntPtr, ByVal nXSrc As Integer, ByVal nYSrc As Integer, ByVal dwRop As System.Int32) As Boolean
    End Function

#Region " Windows Form Designer generated code "

    Public Sub New()
        MyBase.New()

        'This call is required by the Windows Form Designer.
        InitializeComponent()

        'Add any initialization after the InitializeComponent() call

    End Sub

    'Form overrides dispose to clean up the component list.
    Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            If Not (components Is Nothing) Then
                components.Dispose()
            End If
        End If
        MyBase.Dispose(disposing)
    End Sub

    'Required by the Windows Form Designer
    Private components As System.ComponentModel.IContainer

    'NOTE: The following procedure is required by the Windows Form Designer
    'It can be modified using the Windows Form Designer.  
    'Do not modify it using the code editor.
    Friend WithEvents WebBrowser1 As AxSHDocVw.AxWebBrowser
    Friend WithEvents Timer1 As System.Windows.Forms.Timer
    Friend WithEvents TextBox1 As System.Windows.Forms.TextBox
    <System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
        Me.components = New System.ComponentModel.Container
        Dim resources As System.Resources.ResourceManager = New System.Resources.ResourceManager(GetType(Form1))
        Me.WebBrowser1 = New AxSHDocVw.AxWebBrowser
        Me.Timer1 = New System.Windows.Forms.Timer(Me.components)
        Me.TextBox1 = New System.Windows.Forms.TextBox
        CType(Me.WebBrowser1, System.ComponentModel.ISupportInitialize).BeginInit()
        Me.SuspendLayout()
        '
        'WebBrowser1
        '
        Me.WebBrowser1.Enabled = True
        Me.WebBrowser1.Location = New System.Drawing.Point(64, 72)
        Me.WebBrowser1.OcxState = CType(resources.GetObject("WebBrowser1.OcxState"), System.Windows.Forms.AxHost.State)
        Me.WebBrowser1.Size = New System.Drawing.Size(680, 432)
        Me.WebBrowser1.TabIndex = 0
        '
        'Timer1
        '
        Me.Timer1.Interval = 5000
        '
        'TextBox1
        '
        Me.TextBox1.Location = New System.Drawing.Point(136, 16)
        Me.TextBox1.Name = "TextBox1"
        Me.TextBox1.Size = New System.Drawing.Size(208, 20)
        Me.TextBox1.TabIndex = 1
        Me.TextBox1.Text = "TextBox1"
        '
        'Form1
        '
        Me.AutoScaleBaseSize = New System.Drawing.Size(5, 13)
        Me.ClientSize = New System.Drawing.Size(808, 574)
        Me.Controls.Add(Me.TextBox1)
        Me.Controls.Add(Me.WebBrowser1)
        Me.Name = "Form1"
        Me.Text = "Form1"
        CType(Me.WebBrowser1, System.ComponentModel.ISupportInitialize).EndInit()
        Me.ResumeLayout(False)

    End Sub

#End Region

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Try

            WebBrowser1.Navigate("http://10.21.16.14/webhmi/seatinfo.htm")
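            'The capture code in TakeShot is based on an example by Mike Gold:
            'http://www.vbdotnetheaven.com/UploadFile/mgold/ScreenCapturing04202005045050AM/ScreenCapturing.aspx
            'Start the timer; TakeShot runs when it ticks (Timer1.Interval is 5000 ms), giving the page time to load.
            Timer1.Enabled = True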

        Catch ex As Exception
            MsgBox("Form Load - " & ex.Message)

        End Try

    End Sub
    Public Sub TakeShot()
        ' Copy the rendered browser surface into an in-memory bitmap.
        Dim WebBrowserGraphics As Graphics = WebBrowser1.CreateGraphics()
        Dim ScreenCap As Bitmap = New Bitmap(WebBrowser1.ClientRectangle.Width, WebBrowser1.ClientRectangle.Height, WebBrowserGraphics)
        Dim MemoryGraphics As Graphics = Graphics.FromImage(ScreenCap)
        Dim WebBrowserDevice As IntPtr = WebBrowserGraphics.GetHdc()
        Dim MemoryDevice As IntPtr = MemoryGraphics.GetHdc()
        ' 13369376 is &HCC0020, the SRCCOPY raster operation (a straight copy).
        BitBlt(MemoryDevice, 0, 0, WebBrowser1.ClientRectangle.Width, WebBrowser1.ClientRectangle.Height, WebBrowserDevice, 0, 0, 13369376)
        WebBrowserGraphics.ReleaseHdc(WebBrowserDevice)
        MemoryGraphics.ReleaseHdc(MemoryDevice)
        MemoryGraphics.Dispose()
        WebBrowserGraphics.Dispose()
        ScreenCap.Save("c:\test.jpg", System.Drawing.Imaging.ImageFormat.Jpeg)
        ScreenCap.Dispose()
        'Do your mailing here
        Me.Close()

    End Sub

    Private Sub Timer1_Tick(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Timer1.Tick
        ' Stop the timer first so the capture runs only once.
        Timer1.Enabled = False
        TakeShot()

    End Sub
End Class