?
Solved

pulling html code out of the body of the message.

Posted on 2003-10-23
8
Medium Priority
?
435 Views
Last Modified: 2013-12-18
I am trying to script something or export all the messages in my mailbox to give me the body of the message unaltered by Notes.  If you look at document properties the body tag can contain HTML code but when opened Notes strips that out.  :-(  I am trying to take 5000 some email and put them into one text document and pull out any HTML tags for further processing and adding to our spam filter,  This is becoming a big pain in the rear.  

-Brian
0
Comment
Question by:bkirk007
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
8 Comments
 
LVL 24

Accepted Solution

by:
HemanthaKumar earned 500 total points
ID: 9611091
First of all there is no straight forward method in R5. R6 I am not sure, but it has enhanced classes to handle mime type fields more elegantly.

So, to make it easy I would suggest you to use HTTP stream to obtain the content in html. How ?

Two possibilities, you can use java io stream to do that and second use GetDocumentByURL method to get html. For both these methods http task should be up and running.

Read this forum's question for more details on this technique.
http://experts-exchange.com/Applications/Email/Lotus_Notes/Q_20772364.html
~Hemanth
0
 
LVL 31

Expert Comment

by:qwaletee
ID: 9611610
Do you need just the bodies?  You canuse the MIME_PART type on Body (only available when you set Convertime=False).  Then, just recursively pul all the parts and spit them back out. I have some sample code...
0
 
LVL 31

Assisted Solution

by:qwaletee
qwaletee earned 500 total points
ID: 9611621
This is hard coded for teh currently open NotesUiDocument, and hard-coded for c:\mime.txt.

It should not be difficult to adapt it to work with a NotesDocument cllection, and to name the otput file for each document based on the NoteID or UNID.

I would not dump all to a single file, since you will run into MIME parsing problems.


      Dim ws As New notesUiWorkspace
      Set doc = ws.currentDocument.document
      
      Dim s As New notesSession
      s.ConvertMime = False
      
      Dim body As notesItem
      Set body = doc.getFirstItem("Body")
      
      If body.type = MIME_PART Then
            Open "c:\mime.txt" For Output As #1
            Print #1 , "Mime-Version 1.0"
            Print #1 ,
            Print #1 , {This is a MIME-formatted message.  If you see this text it means that your
E-mail software does not support MIME-formatted messages.  Notes RTF MIME dump by Dovid Gross.}
            Print #1 ,
            Print #1 , "--" HeadersWIthBoundary ( body.getMimeEntity , Null , "" , doc.noteID) 'mime, headers-for-return, default boundary, filename
            DumpMIME body.getMimeEntity , "" , doc.universalID 'mime , level.level , boundary
            Close #1
      Else
            isComplexMIME = False
      End If
      
      
      Stop
      s.ConvertMime = True
      Delete body
      
      If Not isComplexMIME Then
            Dim text As String
            Set body = doc.getFirstItem("Body")
            If body.type = Richtext Then
                  Dim rtf As notesRichTextItem
                  Set rtf = Body
                  text = rtf.getFormattedText(True,72)
            Else
                  text = body.text
            End If
            Dim pos As Integer , nl As String
            nl = Chr$(13) + Chr$(10)
            pos = Instr(text,nl)
            While pos
                  Dim two As String*2
                  two = Mid$(text,pos+2,2)
                  If two <> nl And two <> "==" Then
                        two = Mid$(text,pos-2,2)
                        If two <> "==" Then
                              Mid$(text,pos,2) = "  "
                        End If
                  End If
                  pos = Instr(pos+2,text,nl)
            Wend
            Set doc = New notesDocument ( doc.parentDatabase )
            doc.form = "memo"
            doc.body = text
            ws.editDocument True , doc
      End If


Sub DumpMIME ( mime As notesMimeEntity , Byval level As String , boundary As String)
      
      Dim count As Integer
      
      If mime Is Nothing Then
            Print #1 ,
            Print #1 , ";(Empty MIME part - " & level & "x)"
            Print #1 ,
            Exit Sub
      End If
      
      If level <> "" Then
            isComplexMIME = True
      End If
      
      Do Until mime Is Nothing
            count = count + 1
            
            If (count > 1) Then
                  isComplexMIME = True
            End If
            
            Dim headers As String
            boundary = HeadersWithBoundary ( mime , headers , boundary , level & count)
            If (mime.contentType <> "text") Or (mime.contentSubType <> "plain") Then
                  isComplexMIME = True
            End If
            
            Print #1 ,
            Print #1 , "--" boundary
            Print #1 , ";Part " & level & count
            Print #1,  ";Content type:" & Chr(9) & mime.ContentType
            Print #1 , ";Content subtype:" & Chr(9) & mime.ContentSubtype
            Print #1 , ";Headers..."
            
            Print #1 ,  Headers
            Print #1 ,
            Print #1 , mime.ContentAsText
            Print #1 ,
            
            Print #1 , ";Children..."
            Print #1 ,
            DumpMIME mime.getFirstChildEntity , level & count & "." , boundary
            Set mime = mime.getNextSibling
      Loop
      Print #1 , "--" boundary "--"
End Sub
Function HeadersWithBoundary ( mime As notesMimeEntity , headers As Variant , Byval boundary As String , id As String) As String
      doc.tempEvaluate = mime.Headers
      Dim newHeaders As Variant
      newHeaders = Evaluate({@ReplaceSubString(tempEvaluate; @Char(13); @Char(13) + @Char(10))} , doc )
      doc.removeItem "tempEvaluate"
      headers = newHeaders(0)
      Const boundMark = {Boundary="}
      Dim boundStart As Integer , boundEnd As Integer, boundSize
      boundStart = Instr(headers,boundMark)
      If boundStart Then
            boundStart = boundStart + Len(boundMark)
            boundEnd = Instr(boundStart , headers , {"})
            If boundEnd Then
                  boundSize = boundEnd - boundStart
                  boundary = Mid$(headers , boundStart , boundSize )
            Else
                  boundary = Mid$(headers , boundStart )
            End If
      Else
            boundary = boundary & "___" & doc.universalID
      End If
      
      Const typeMark = {Content-type: }
      boundStart = Instr(headers,typeMark)
      If boundStart Then
            boundStart = boundStart + Len(typeMark)
            boundEnd = Instr(boundStart,headers,"; ")
            boundSize = Instr(boundStart,headers," ")
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize - 1
                  Print "space as marker " id
            End If
            boundSize = Instr(boundStart,headers,Chr$(10))
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize -1
                  Print "lf as marker " id
            End If
            boundSize = Instr(boundStart,headers,Chr$(13))
            If boundEnd < 1 Or (boundSize < boundEnd And boundSize > 0) Then
                  boundEnd = boundSize - 1
                  Print "cr as marker " id
            End If
            If boundEnd = 0 Then
                  headers = headers & "; name=" & id & ".txt"
                  Print "no marker " id
            Else
                  headers = Left$(headers, boundEnd) & " ; name=" & id & ".txt; " & Mid$(headers, boundEnd + 1)
            End If
      Else
            Print "Unable to insert " id
      End If
      
      HeadersWithBoundary = boundary
End Function
0
 
LVL 24

Expert Comment

by:HemanthaKumar
ID: 9614249
Wow, now somebody writes code for free !
0
 
LVL 31

Expert Comment

by:qwaletee
ID: 9617655
Nope, pulled that out of a test file I created to debug a MIME problem.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

  In today’s Arena we can’t imagine our lives without Internet as we are highly used to of it. If we consider our life style just for only 2 min we found that face to face communication is swapped by e-communication.  Every Where from Works place to…
Sometimes clients can lose connectivity with the Lotus Notes Domino Server, but there's not always an obvious answer as to why it happens.   Read this article to follow one of the first experiences I had with Lotus Notes on a client's machine, my…
Michael from AdRem Software explains how to view the most utilized and worst performing nodes in your network, by accessing the Top Charts view in NetCrunch network monitor (https://www.adremsoft.com/). Top Charts is a view in which you can set seve…
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …
Suggested Courses
Course of the Month8 days, 22 hours left to enroll

764 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question