Solved

pulling html code out of the body of the message.

Posted on 2003-10-23
8
433 Views
Last Modified: 2013-12-18
I am trying to script something or export all the messages in my mailbox to give me the body of the message unaltered by Notes.  If you look at document properties the body tag can contain HTML code but when opened Notes strips that out.  :-(  I am trying to take 5000 some email and put them into one text document and pull out any HTML tags for further processing and adding to our spam filter,  This is becoming a big pain in the rear.  

-Brian
0
Comment
Question by:bkirk007
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
8 Comments
 
LVL 24

Accepted Solution

by:
HemanthaKumar earned 125 total points
ID: 9611091
First of all there is no straight forward method in R5. R6 I am not sure, but it has enhanced classes to handle mime type fields more elegantly.

So, to make it easy I would suggest you to use HTTP stream to obtain the content in html. How ?

Two possibilities, you can use java io stream to do that and second use GetDocumentByURL method to get html. For both these methods http task should be up and running.

Read this forum's question for more details on this technique.
http://experts-exchange.com/Applications/Email/Lotus_Notes/Q_20772364.html
~Hemanth
0
 
LVL 31

Expert Comment

by:qwaletee
ID: 9611610
Do you need just the bodies?  You canuse the MIME_PART type on Body (only available when you set Convertime=False).  Then, just recursively pul all the parts and spit them back out. I have some sample code...
0
 
LVL 31

Assisted Solution

by:qwaletee
qwaletee earned 125 total points
ID: 9611621
This is hard coded for teh currently open NotesUiDocument, and hard-coded for c:\mime.txt.

It should not be difficult to adapt it to work with a NotesDocument cllection, and to name the otput file for each document based on the NoteID or UNID.

I would not dump all to a single file, since you will run into MIME parsing problems.


      Dim ws As New notesUiWorkspace
      Set doc = ws.currentDocument.document
      
      Dim s As New notesSession
      s.ConvertMime = False
      
      Dim body As notesItem
      Set body = doc.getFirstItem("Body")
      
      If body.type = MIME_PART Then
            Open "c:\mime.txt" For Output As #1
            Print #1 , "Mime-Version 1.0"
            Print #1 ,
            Print #1 , {This is a MIME-formatted message.  If you see this text it means that your
E-mail software does not support MIME-formatted messages.  Notes RTF MIME dump by Dovid Gross.}
            Print #1 ,
            Print #1 , "--" HeadersWIthBoundary ( body.getMimeEntity , Null , "" , doc.noteID) 'mime, headers-for-return, default boundary, filename
            DumpMIME body.getMimeEntity , "" , doc.universalID 'mime , level.level , boundary
            Close #1
      Else
            isComplexMIME = False
      End If
      
      
      Stop
      s.ConvertMime = True
      Delete body
      
      If Not isComplexMIME Then
            Dim text As String
            Set body = doc.getFirstItem("Body")
            If body.type = Richtext Then
                  Dim rtf As notesRichTextItem
                  Set rtf = Body
                  text = rtf.getFormattedText(True,72)
            Else
                  text = body.text
            End If
            Dim pos As Integer , nl As String
            nl = Chr$(13) + Chr$(10)
            pos = Instr(text,nl)
            While pos
                  Dim two As String*2
                  two = Mid$(text,pos+2,2)
                  If two <> nl And two <> "==" Then
                        two = Mid$(text,pos-2,2)
                        If two <> "==" Then
                              Mid$(text,pos,2) = "  "
                        End If
                  End If
                  pos = Instr(pos+2,text,nl)
            Wend
            Set doc = New notesDocument ( doc.parentDatabase )
            doc.form = "memo"
            doc.body = text
            ws.editDocument True , doc
      End If


Sub DumpMIME ( mime As notesMimeEntity , Byval level As String , boundary As String)
      
      Dim count As Integer
      
      If mime Is Nothing Then
            Print #1 ,
            Print #1 , ";(Empty MIME part - " & level & "x)"
            Print #1 ,
            Exit Sub
      End If
      
      If level <> "" Then
            isComplexMIME = True
      End If
      
      Do Until mime Is Nothing
            count = count + 1
            
            If (count > 1) Then
                  isComplexMIME = True
            End If
            
            Dim headers As String
            boundary = HeadersWithBoundary ( mime , headers , boundary , level & count)
            If (mime.contentType <> "text") Or (mime.contentSubType <> "plain") Then
                  isComplexMIME = True
            End If
            
            Print #1 ,
            Print #1 , "--" boundary
            Print #1 , ";Part " & level & count
            Print #1,  ";Content type:" & Chr(9) & mime.ContentType
            Print #1 , ";Content subtype:" & Chr(9) & mime.ContentSubtype
            Print #1 , ";Headers..."
            
            Print #1 ,  Headers
            Print #1 ,
            Print #1 , mime.ContentAsText
            Print #1 ,
            
            Print #1 , ";Children..."
            Print #1 ,
            DumpMIME mime.getFirstChildEntity , level & count & "." , boundary
            Set mime = mime.getNextSibling
      Loop
      Print #1 , "--" boundary "--"
End Sub
Function HeadersWithBoundary ( mime As notesMimeEntity , headers As Variant , Byval boundary As String , id As String) As String
      doc.tempEvaluate = mime.Headers
      Dim newHeaders As Variant
      newHeaders = Evaluate({@ReplaceSubString(tempEvaluate; @Char(13); @Char(13) + @Char(10))} , doc )
      doc.removeItem "tempEvaluate"
      headers = newHeaders(0)
      Const boundMark = {Boundary="}
      Dim boundStart As Integer , boundEnd As Integer, boundSize
      boundStart = Instr(headers,boundMark)
      If boundStart Then
            boundStart = boundStart + Len(boundMark)
            boundEnd = Instr(boundStart , headers , {"})
            If boundEnd Then
                  boundSize = boundEnd - boundStart
                  boundary = Mid$(headers , boundStart , boundSize )
            Else
                  boundary = Mid$(headers , boundStart )
            End If
      Else
            boundary = boundary & "___" & doc.universalID
      End If
      
      Const typeMark = {Content-type: }
      boundStart = Instr(headers,typeMark)
      If boundStart Then
            boundStart = boundStart + Len(typeMark)
            boundEnd = Instr(boundStart,headers,"; ")
            boundSize = Instr(boundStart,headers," ")
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize - 1
                  Print "space as marker " id
            End If
            boundSize = Instr(boundStart,headers,Chr$(10))
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize -1
                  Print "lf as marker " id
            End If
            boundSize = Instr(boundStart,headers,Chr$(13))
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize - 1
                  Print "cr as marker " id
            End If
            If boundEnd = 0 Then
                  headers = headers & "; name=" & id & ".txt"
                  Print "no marker " id
            Else
                  headers = Left$(headers, boundEnd) & " ; name=" & id & ".txt; " & Mid$(headers, boundEnd + 1)
            End If
      Else
            Print "Unable to insert " id
      End If
      
      HeadersWithBoundary = boundary
End Function
0
 
LVL 24

Expert Comment

by:HemanthaKumar
ID: 9614249
Wow, now somebody writes code for free !
0
 
LVL 31

Expert Comment

by:qwaletee
ID: 9617655
Nope, pulled that out of a test file I created to debug a MIME problem.
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

For beginners of Lotus Notes user this is important to know about the types of files and their location supported by IBM Notes. Mostly users are unaware about how many file types are created and what their usages are. This Article is fully dedicated…
Notes Document Link used by IBM Notes is a link file which aids in the sharing of links to documents in email and webpages. The posts describe the importance and steps to create a Lotus Notes NDL file in brief.
In this video, viewers are given an introduction to using the Windows 10 Snipping Tool, how to quickly locate it when it's needed and also how make it always available with a single click of a mouse button, by pinning it to the Desktop Task Bar. Int…
Do you want to know how to make a graph with Microsoft Access? First, create a query with the data for the chart. Then make a blank form and add a chart control. This video also shows how to change what data is displayed on the graph as well as form…

724 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question