Solved

VB.NET - Parsing HTML within an email (.eml)

Posted on 2002-06-19
Last Modified: 2008-02-01
Hi, I have a folder full of saved emails, all saved as .eml files. All of the emails have HTML text embedded in them. I want to write a piece of VB.NET to parse these .eml files and search for text strings.

I have tried using the FileSystemObject and opening the file as a TextStream, but this doesn't work right.

Can someone show me what I should be doing to parse the HTML correctly? Some sample code would be a great help.

Thanks
Question by:Molko
8 Comments
 
LVL 22

Expert Comment

by:CJ_S
ID: 7093355
To parse HTML correctly you will need to emulate the web browser. This is easily done by placing a WebBrowser control on the form and loading the .eml files into it. You can then automate it and read the innerHTML property.

If you need more info just say so.

CJ
 

Author Comment

by:Molko
ID: 7093404
Hmm, I'm not too sure I need to go that far. Essentially all I am trying to do is sequentially read a folder full of .eml files (each file contains HTML) and strip out an email address. The email address is in the HTML.

I wanted to store each email address in a DataGrid for use later on.

I have never used the WebBrowser control before. Are you saying it would help me with some of the above?

Thanks
 
LVL 20

Expert Comment

by:Silvers5
ID: 7093585
Use the classes in the System.IO namespace to read files in .NET. You can look up the methods and properties for reading text files with the wincv.exe utility that comes with VS.NET.

Here's a little code that plays around with files, in VB.NET:


Imports System.IO
Public Class Form1
    Inherits System.Windows.Forms.Form

#Region " Windows Form Designer generated code "

    Public Sub New()
        MyBase.New()

        'This call is required by the Windows Form Designer.
        InitializeComponent()

        'Add any initialization after the InitializeComponent() call

    End Sub

    'Form overrides dispose to clean up the component list.
    Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            If Not (components Is Nothing) Then
                components.Dispose()
            End If
        End If
        MyBase.Dispose(disposing)
    End Sub

    'Required by the Windows Form Designer
    Private components As System.ComponentModel.IContainer

    'NOTE: The following procedure is required by the Windows Form Designer
    'It can be modified using the Windows Form Designer.  
    'Do not modify it using the code editor.
    Friend WithEvents Button1 As System.Windows.Forms.Button
    Friend WithEvents TextBox1 As System.Windows.Forms.TextBox
    Friend WithEvents Label1 As System.Windows.Forms.Label
    Friend WithEvents Label2 As System.Windows.Forms.Label
    Friend WithEvents TextBox2 As System.Windows.Forms.TextBox
    Friend WithEvents Button2 As System.Windows.Forms.Button
    Friend WithEvents Button3 As System.Windows.Forms.Button
    Friend WithEvents Button4 As System.Windows.Forms.Button
    <System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
        Me.Button1 = New System.Windows.Forms.Button()
        Me.TextBox1 = New System.Windows.Forms.TextBox()
        Me.Label1 = New System.Windows.Forms.Label()
        Me.Label2 = New System.Windows.Forms.Label()
        Me.TextBox2 = New System.Windows.Forms.TextBox()
        Me.Button2 = New System.Windows.Forms.Button()
        Me.Button3 = New System.Windows.Forms.Button()
        Me.Button4 = New System.Windows.Forms.Button()
        Me.SuspendLayout()
        '
        'Button1
        '
        Me.Button1.Location = New System.Drawing.Point(16, 88)
        Me.Button1.Name = "Button1"
        Me.Button1.Size = New System.Drawing.Size(88, 24)
        Me.Button1.TabIndex = 0
        Me.Button1.Text = "Create da filos"
        '
        'TextBox1
        '
        Me.TextBox1.Location = New System.Drawing.Point(64, 56)
        Me.TextBox1.Name = "TextBox1"
        Me.TextBox1.Size = New System.Drawing.Size(344, 20)
        Me.TextBox1.TabIndex = 1
        Me.TextBox1.Text = ""
        '
        'Label1
        '
        Me.Label1.Location = New System.Drawing.Point(8, 56)
        Me.Label1.Name = "Label1"
        Me.Label1.Size = New System.Drawing.Size(48, 16)
        Me.Label1.TabIndex = 2
        Me.Label1.Text = "Text"
        '
        'Label2
        '
        Me.Label2.Location = New System.Drawing.Point(8, 24)
        Me.Label2.Name = "Label2"
        Me.Label2.Size = New System.Drawing.Size(56, 16)
        Me.Label2.TabIndex = 3
        Me.Label2.Text = "File Name"
        '
        'TextBox2
        '
        Me.TextBox2.Location = New System.Drawing.Point(64, 24)
        Me.TextBox2.Name = "TextBox2"
        Me.TextBox2.Size = New System.Drawing.Size(208, 20)
        Me.TextBox2.TabIndex = 4
        Me.TextBox2.Text = ""
        '
        'Button2
        '
        Me.Button2.Location = New System.Drawing.Point(112, 88)
        Me.Button2.Name = "Button2"
        Me.Button2.Size = New System.Drawing.Size(88, 24)
        Me.Button2.TabIndex = 5
        Me.Button2.Text = "Read da filos"
        '
        'Button3
        '
        Me.Button3.Location = New System.Drawing.Point(208, 88)
        Me.Button3.Name = "Button3"
        Me.Button3.Size = New System.Drawing.Size(64, 24)
        Me.Button3.TabIndex = 6
        Me.Button3.Text = "Copy it"
        '
        'Button4
        '
        Me.Button4.Location = New System.Drawing.Point(280, 88)
        Me.Button4.Name = "Button4"
        Me.Button4.Size = New System.Drawing.Size(80, 24)
        Me.Button4.TabIndex = 7
        Me.Button4.Text = "Get File info"
        '
        'Form1
        '
        Me.AutoScaleBaseSize = New System.Drawing.Size(5, 13)
        Me.ClientSize = New System.Drawing.Size(480, 126)
        Me.Controls.AddRange(New System.Windows.Forms.Control() {Me.Button4, Me.Button3, Me.Button2, Me.TextBox2, Me.Label2, Me.Label1, Me.TextBox1, Me.Button1})
        Me.Name = "Form1"
        Me.Text = "Form1"
        Me.ResumeLayout(False)

    End Sub

#End Region

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim sTextos As String = ""
        Dim sFilos As String = "C:\Documents and Settings\Micha\Desktop\"
        If Me.TextBox2.Text <> "" Then
            sFilos += Me.TextBox2.Text.ToString
        Else
            sFilos += "default.txt"
        End If
        'File exposes only Shared members, so call them on the type itself.
        'Clear the read-only flag set by a previous run before appending.
        If File.Exists(sFilos) Then
            File.SetAttributes(sFilos, FileAttributes.Normal)
        End If
        Dim fStreamWriter As StreamWriter = File.AppendText(sFilos)
        sTextos = Me.TextBox1.Text
        With fStreamWriter
            .WriteLine(sTextos)
            .WriteLine("--End textos---")
            .Flush()
            .Close()
        End With
        File.SetAttributes(sFilos, FileAttributes.ReadOnly)
    End Sub

    Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
        Dim sFile As String = "C:\Documents and Settings\Micha\Desktop\"
        If Me.TextBox2.Text = "" Then
            sFile += "default.txt"
        Else
            sFile += Me.TextBox2.Text
        End If
        'File's members are Shared, so call them on the type, not on a variable
        If File.Exists(sFile) Then
            Dim sFReader As StreamReader = File.OpenText(sFile)
            MessageBox.Show(sFReader.ReadToEnd())
            sFReader.Close()
            MessageBox.Show("Created on:" & File.GetCreationTime(sFile).ToString())
        End If
    End Sub

    Private Sub Button3_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button3.Click
        'File.Copy is a Shared method, so no File variable is needed
        File.Copy("C:\Documents and Settings\Micha\Desktop\default.txt", "C:\Documents and Settings\Micha\Desktop\copy of default.txt")
    End Sub

    Private Sub Button4_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button4.Click
        Dim infFile As FileInfo = New FileInfo("C:\Documents and Settings\Micha\Desktop\default.txt")
        MessageBox.Show(infFile.Length.ToString & " bytes")
    End Sub
End Class
 
LVL 20

Expert Comment

by:Silvers5
ID: 7095077
Use the DirectoryInfo class too, to get the file names in your directory and parse them. BTW, once you have the email body in a variable you can search it with variable.IndexOf (we used to use the InStr function in VB6).
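
As a rough sketch of that suggestion (not Silvers5's own code): DirectoryInfo.GetFiles lists the .eml files and String.IndexOf reports where a search string first occurs, or -1 if it is absent. The folder path and search text are placeholders, and the FindFilesContaining name is made up for illustration.

```vbnet
Imports System
Imports System.Collections
Imports System.IO

Module EmlSearchSketch

    'Hypothetical helper: returns the names of the .eml files in folderPath
    'whose raw text contains searchText.
    Function FindFilesContaining(ByVal folderPath As String, ByVal searchText As String) As ArrayList
        Dim hits As New ArrayList()
        Dim folder As New DirectoryInfo(folderPath)

        For Each info As FileInfo In folder.GetFiles("*.eml")
            Dim body As String = ""
            Dim reader As StreamReader = info.OpenText()
            Try
                body = reader.ReadToEnd()
            Finally
                'Release the file even if reading fails
                reader.Close()
            End Try

            'IndexOf returns -1 when the text is not found (InStr returned 0 in VB6)
            If body.IndexOf(searchText) >= 0 Then
                hits.Add(info.Name)
            End If
        Next

        Return hits
    End Function

    Sub Main()
        'Placeholder folder and search string
        For Each name As String In FindFilesContaining("c:\temp", "@example.com")
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

This searches the raw file text, so it only finds strings that appear literally in the saved .eml file.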

 
LVL 4

Accepted Solution

by:
saar2 earned 50 total points
ID: 7095185
The following code does the work.
It reads each .eml file and adds every email address found in the files to the list.

After you have the list you can do anything with it.

Imports System.IO
Imports System.Text.RegularExpressions
Imports System.Collections

Module Module1


    Sub Main()
        'Named folderPath because VB is case-insensitive: a constant called
        '"directory" would hide the Directory class used below.
        Const folderPath As String = "c:\temp"
        Dim filename As String

        Dim emailPattern As Regex
        emailPattern = New Regex("\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*")
        Dim emailsList As New ArrayList()

        'Loop through each eml file in the directory
        For Each filename In Directory.GetFiles(folderPath, "*.eml")
            Dim reader As StreamReader

            'Open the specified file and wrap its stream with a StreamReader
            reader = New StreamReader(File.OpenRead(filename))

            Dim content As String

            'Read the whole file into memory
            content = reader.ReadToEnd()

            'Release the file
            reader.Close()

            Dim m As Match
            For Each m In emailPattern.Matches(content)
                emailsList.Add(m.Value)
            Next
        Next
    End Sub
End Module
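
Since the goal was to show the addresses in a DataGrid, one possible follow-up (a sketch, not part of the code above) is to copy the matches into a DataTable, skipping duplicates, and bind that table to the grid. The BuildEmailTable name, the "Email" column name, the sample input and the DataGrid1 control name are assumptions for illustration.

```vbnet
Imports System
Imports System.Collections
Imports System.Data
Imports System.Text.RegularExpressions

Module EmailTableSketch

    'Hypothetical helper: extracts the distinct email addresses found in content
    'and returns them as a one-column DataTable.
    Function BuildEmailTable(ByVal content As String) As DataTable
        'Same pattern as in the code above
        Dim emailPattern As New Regex("\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*")
        Dim seen As New Hashtable()
        Dim table As New DataTable("Emails")
        table.Columns.Add("Email", GetType(String))

        For Each m As Match In emailPattern.Matches(content)
            'Compare in lower case so the same address is not listed twice
            Dim key As String = m.Value.ToLower()
            If Not seen.ContainsKey(key) Then
                seen.Add(key, Nothing)
                table.Rows.Add(New Object() {m.Value})
            End If
        Next

        Return table
    End Function

    Sub Main()
        Dim sample As String = "<a href=""mailto:bob@example.com"">bob@example.com</a> or alice@example.org"
        Dim table As DataTable = BuildEmailTable(sample)
        For Each row As DataRow In table.Rows
            Console.WriteLine(row("Email"))
        Next
        'On a form, the table could then be bound with: DataGrid1.DataSource = table
    End Sub
End Module
```

In the loop above, the content string read from each file could be passed to a helper like this instead of adding every match to the ArrayList.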

Good luck.

Saar Carmi
Israel .Net Developer
saar@bigfoot.com
 

Author Comment

by:Molko
ID: 7116738
Cheers thanks v much
 
LVL 22

Expert Comment

by:CJ_S
ID: 7121769
Molko,
Can you finish this question by accepting any of the above comments as the answer (please select the one you used)

CJ
 

Author Comment

by:Molko
ID: 7130491
Cheers thanks very much