• Status: Solved

Render HTML page and send as e-mail

Hello experts,

From within an agent, I'd like to read a web page containing pictures and Java Script and send it as E-Mail.

"GetDocumentByURL" fails (don't know why - wrong client type?).

Any idea how to do this?

The URL in question is
http://theyr.net/cg/cny/I1be177/F=js*L=en*u*041010*24*uIL*usia_DesMoines

In addition, I'd like to process a number of pages from different sites and append them to the same document. Ideas?

Additional points will be granted, if required.

Thanks
J
p_parthaCommented:
Is the Web Retriever task running? Also, I was not able to check the URL, as that site is blocked in our environment.

To send multiple pages, you can use the NotesNewsletter class.


partha
 
fulscherAuthor Commented:
It appears to be ok:

If I set the browser in the location document to be "Notes with IE", the page opens in Notes correctly if opened manually. If I set the browser to "Notes", the page returned is blank. However, GetDocumentByURL fails in both cases.

J
 
p_parthaCommented:
Just to verify, can you try some other URL and check whether you get a value returned?

partha
 
fulscherAuthor Commented:
You can download a test database from this link:

http://www.fuelscher.ch/files/MetRet.zip

To try, go to view "URLs to load", select one document, run the agent Actions/Get Meteo TEST. A new document opens which is the document just retrieved. Note that HTTPStatus returned is 200 (ok), however, the Body field is empty. This is true for all three URLs in the DB.

partha - the URLs shouldn't be blocked; this is just weather information. Anyway, it won't even work if you try http://www.google.com...

Jan

 
Andrea ErcolinoCommented:
The "web" task must be loaded on the server AND the agent must run on the server (client settings are not relevant)

Create an agent "(Test)" with the following code inside the Initialize event. Make sure your test agent has properties "Agent list selection" and "Target: None".

      ' Run from the web: DocumentContext holds the submitted "Test" form.
      Dim s As New NotesSession
      Dim c As NotesDocument
      Set c = s.DocumentContext
      Dim d As NotesDocument
      ' Retrieve the page whose URL was typed into the Test field.
      Set d = s.CurrentDatabase.GetDocumentByURL( c.GetItemValue( "Test" )( 0 ) )
      If Not( d Is Nothing ) Then
            Print {<div style="font-size: 10pt">}
            Print "<h3>TEST de " & c.GetItemValue( "Test" )( 0 ) & "</h3>"
            Print "<b>Page title</b> = " & d.GetItemValue( "Title" )( 0 ) & "<br>"
            Print "<b>Doc size</b> = " & d.Size & " bytes" & "<br>"
            Print "<b>Status</b> = " & d.GetItemValue( "HTTPStatus" )( 0 ) & "<br>"
            Print "<h4>Headers</h4>"
            ' The HTTPHeaders item is expected to list the names of the header items.
            Dim headers As Variant
            headers = d.HTTPHeaders
            Forall h In headers
                  Print "<b>" & h & "</b> = " & d.GetItemValue( h )( 0 ) & "<br>"
            End Forall
            Print "</div>"
      Else
            Print "error accessing remote URL " & c.GetItemValue( "Test" )( 0 )
      End If

Then create a new "Test" form with a "Test" text field and a "Test" action button (formula), with code "@Command( [ToolsRunMacro]; "(Test)" )".

Then open this form from the browser on your client through a URL like the following:
  http://www.example.com/path/to/notes/database.nsf/Test?OpenForm
Finally, enter a URL in the Test field and click the Test action.
 
Andrea ErcolinoCommented:
Previous sample code gave me this output for your URL

TEST de http://theyr.net/cg/cny/I1be177/F=js*L=en*u*041010*24*uIL*usia_DesMoines
Page title = Weather forecast Sun Oct. 10 2004 24:00 UTC
Doc size = 26054 bytes
Status = 200

Headers
HTTPDate = 09/10/2004 13:49:59
HTTPServer = Apache/2.0.40 (Red Hat Linux)
HTTPContent_Language = en
HTTPContent_Length = 24118
HTTPContent_Type = text/html
HTTPAge = 3
HTTPContent_Script_Type = text/javascript
HTTPExpires = 09/10/2004 14:04:59
 
Andrea ErcolinoCommented:
Also note that the notesDatabase.GetDocumentByURL method automatically adds a document to your database with an "HTMLForm" value in the form field.

If you want to send that page by e-mail, you can simply make a call like the following:
  Call notesDatabase.GetDocumentByUrl( anURL ).Send( False, toSomeone )
This works because Notes also puts the title of the page in a Subject field, and the page itself is in the Body field.

 
fulscherAuthor Commented:
RAPUTA,

your code basically does the same as mine. So, I tried to run my agent on the server and the results are better - some URLs are retrieved more or less successfully. And, of course, Notes renders the documents just awfully but that probably was to be expected.

Please try the following URLs:

http://www.wunderground.com/US/NY/Kennedy_International.html
http://www.wetter.com/v2/

In both cases, the document is only loaded partially, the BODY field contains only part of the document.

Any idea how to resolve this?

Or even better - any idea how to get a better rendering of the result?

J
 
p_parthaCommented:
If you read only doc.body(0), you will get only part of the content. You have to loop over all the values, something like this:

Dim body As Variant
body = doc.GetItemValue( "Body" )
Forall x In body
      Msgbox x
End Forall
 
fulscherAuthor Commented:
partha,

right. The partial documents have only one Body item and the item is only 1177 bytes long for one of the damaged documents (1127 for the other). I rather think there's a problem with creation of the document...

Jan
 
Andrea ErcolinoCommented:
Your suggested URLs work for me (only 1 Body field, BTW):

http://www.wunderground.com/US/NY/Kennedy_International.html --> body.length = 2563 bytes
http://www.wetter.com/v2/ --> body.length = 4087 bytes

So my suggestion is: first try my solution exactly as described. In this way you can discriminate between something wrong with your script (if my solution works also for you) or something wrong with your server (if my solution doesn't work for you).

 
Andrea ErcolinoCommented:
No, I was wrong, sorry.

The first page is much larger than what Notes receives...
 
Andrea ErcolinoCommented:
I don't know for sure what causes the truncated body, but I think the problem is neither in my code nor in the Notes server. In fact, I created a page in my Notes database with the HTML from the "wunderground" URL and then tried to access it through my agent, and it works!

 
HemanthaKumarCommented:
Another option is not to use the web task on the server; it will kill the server in a production environment where lots of other tasks are running. A better approach would be to use the local web navigator and set the location to Notes with IE. This will render your HTML page far better!

Although I would suggest a Java solution for this...

It is easy to write one, format the e-mail in MIME format and send it...

~Hemanth
 
fulscherAuthor Commented:
Hemanth - do you happen to have some sample code?

Raputa - if the problem is not with the code and neither with the server - where is it then?

J
 
p_parthaCommented:
When the page is changed through the DOM or JavaScript, the changes are not reflected in the source, so you will get only partial output.

Also, as Hemanth suggested, the best thing would be to use a Java agent. It's very easy, as there are a lot of methods in the URL class.

See this and then build upon it:

http://www-10.lotus.com/ldd/46dom.nsf/55c38d716d632d9b8525689b005ba1c0/c329f4fabec2612185256a6a000632b8?OpenDocument
 
fulscherAuthor Commented:
all - I really appreciate your help.

It appears that Domino isn't very well suited to this task. For example, I still don't understand why Raputa got more bytes from the sources than I got; looks like Domino doesn't handle HTML very well after all.

To get back to the requirements: I need a function / program which reads one or more HTML pages and renders it/them to an E-Mail. Background: I'd like to have weather forecasts from different sources every day. The thing may be an agent, a server task or an independent program on the server; doesn't really matter.

The thing should run on the server (after all, if it runs on the client, I could as well look up the HTML pages directly). Rendering should be more or less accurate, i.e., the e-mail should be interpretable.

p_partha: The code sample you linked to tests whether a URL is valid. The part I need (reading the web page and inserting it into an e-mail) is not there.

Btw: I'm running Domino 6.5 on the server.

Any ideas?
 
HemanthaKumarCommented:
fulscher,

I don't have ready-made code, but you can check the help documentation (Java section) on how to handle MIME!

The link partha proposed is one way to read an HTML page and store it in a buffer. Use that buffer to put the body content into the mail. You can also use sockets, but HttpURLConnection would be ideal.
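
A minimal sketch of the "read the page into a buffer" step with HttpURLConnection might look like the following. The class and method names (PageFetcher, fetch, readAll, charsetOf) are made up for illustration, and the code is written for a current JDK; the JVM shipped with Domino 6.5 is much older, so the syntax would have to be adapted before it could run inside an agent.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of the Notes/Domino API.
final class PageFetcher {

    /** Reads the whole stream into a string using the given charset. */
    public static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    /** Picks the charset out of a Content-Type header; falls back to ISO-8859-1. */
    public static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    try {
                        return Charset.forName(p.substring(8).replace("\"", "").trim());
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed charset name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    /** Fetches a page over HTTP and returns its body as text. */
    public static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setConnectTimeout(10000);
        conn.setReadTimeout(10000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("HTTP status " + status + " for " + address);
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in, charsetOf(conn.getContentType()));
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Note that this only returns the HTML text. Images, stylesheets and scripts referenced by the page are not downloaded, and any content the page builds with JavaScript at load time will be missing.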
 
Andrea ErcolinoCommented:
I thought the same as p_partha about JavaScript modifying the document and tried a simple page, but it worked properly, so maybe my JavaScript was too simple, or the problem is not there, or not all there...

If your server is Win32, you could use the following function to put a page in a string. It's based on a service of the MSXML library, which is easy to install (and may already be installed).

Function GetPage( url As String ) As String
      ' Fetch a page with MSXML's ServerXMLHTTP (Win32 only) and return its HTML.
      Dim objHttp As Variant
      Set objHttp = CreateObject( "Msxml2.ServerXMLHTTP" )
      Call objHttp.Open( "GET", url, False )      ' False = synchronous request
      Call objHttp.Send()
      If objHttp.status <> 200 Then
            GetPage = "FAILED (status: " & objHttp.status & ")"
      Else
            Dim contentType As String
            contentType = objHttp.getResponseHeader( "Content-Type" )
            ' The header may carry a charset suffix, e.g. "text/html; charset=utf-8".
            If Instr( Lcase( contentType ), "text/html" ) = 1 Then
                  GetPage = objHttp.responseText
            Else
                  GetPage = "Not HTML (type: " & contentType & ")"
            End If
      End If
      Set objHttp = Nothing
End Function
 
fulscherAuthor Commented:
RAPUTA - thanks - I like this. However, I need the images and layout of the page...

Hemanth - I tried HTML-to-MIME earlier and did not find it very easy to create a MIME mail from an HTML page. I'll check again.

Any other options / ideas? Somebody with some functioning sample code? (well, just asking...)

J
 
Andrea ErcolinoCommented:
 
Andrea ErcolinoCommented:
 
fulscherAuthor Commented:
Raputa - sorry, this is over my head. Can you give me a hint on how to continue from these pages?
 
Andrea ErcolinoCommented:
Well, the images can stay where they are, I think... If you let them load from their domain, isn't that OK? And the layout should be the HTML itself.

 
Andrea ErcolinoCommented:
It's very easy, just download the first file and run it on the server to install.
Then start testing the function
 
fulscherAuthor Commented:
Actually, I'd like to have a complete rendering of the page without external links.

Reason: I want to keep the forecasts in my mail DB for comparison and later analysis. So, if the images are linked to, they might not be available any more (actually, many of them will change at least once a day).

Sorry for the complications, but keeping the images was the reason I asked for rendering them to a Notes document...

J
 
Andrea ErcolinoCommented:
It seems now much more difficult... you want to create a browser in Notes !!!
 
fulscherAuthor Commented:
Raputa - I don't want to create a browser in Notes. The question remains still the same.

Quote from the original question:
"From within an agent, I'd like to read a web page containing pictures and Java Script and send it as E-Mail."

So, I want either "GetDocumentByURL" to function correctly (it would do what I need) or find replacement code to "GetDocumentByURL".

I've not yet looked at the Java samples (my Java is quite rusty and I would need a day or two just to get up and running again), but I'm certainly willing to do so IF it solves the problem. However, I'm not sure about this: reading an HTML string from a web site does NOT solve the problem. If I could read an image and insert it into a Notes document, the problem would be partly solved.

Ideas? Suggestions?

J
 
Andrea ErcolinoCommented:
fulscher,

GetDocumentByURL really does not integrate the images into the rich text field the way a paste from the clipboard does; it puts the HTML into the Body field as it is.

So if you want to get at the images, you could iterate through the IMG tags (or the values in the field $ImageList) and try to get the files from the internet cache directory, along with all the other files needed, such as stylesheets and JavaScript files.
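
Iterating through the IMG tags could be sketched roughly as below, in Java since that was suggested earlier in the thread. ImageLinks and extract are made-up names, and the regular expression is a pragmatic assumption rather than a real HTML parser: it only handles quoted src attributes and would also pick up attributes such as data-src.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Notes/Domino API.
final class ImageLinks {

    // Matches <img ... src="..."> or src='...' (case-insensitive, quoted values only).
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the absolute URLs of all images referenced in the HTML, without duplicates. */
    public static List<String> extract(String html, String baseUrl) {
        URI base = URI.create(baseUrl);
        Set<String> found = new LinkedHashSet<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(2).trim();
            if (src.isEmpty()) {
                continue;
            }
            try {
                // Relative paths are resolved against the URL of the page itself.
                found.add(base.resolve(src).toString());
            } catch (IllegalArgumentException e) {
                // Skip src values that are not valid URIs (for example, containing spaces).
            }
        }
        return new ArrayList<>(found);
    }
}
```

Each URL in the returned list could then be downloaded separately and stored next to the HTML, which avoids depending on the internet cache directory.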
 
fulscherAuthor Commented:
all - thank you for your input and comments so far.

I've been playing some more with various methods and I've come up so far with something that appears to work. I restrict myself to images for the time being, since the images are the most important bit of the to-be mail.

- Load the image to a temporary file with Win32 API call URLDownloadToFile
- Convert the image from PNG to GIF using IrfanView using calls to CreateProcess / WaitForSingleObject API functions
- Create a NotesDocument and insert the image as MIMEEntity.

This appears to work so far. I've yet to compose several images into one mail (basically, I'd like to see predictions from different sources in one mail) but I'm more or less optimistic that it will work.
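
For readers wondering what the resulting mail has to look like, here is a rough sketch of a multipart/related MIME body with several inline images. It is not the Notes MIME classes used above; it builds the raw MIME text in plain Java only to show the structure (an HTML part that refers to images as cid:..., followed by one part per image carrying the matching Content-ID). RelatedMailSketch, build and the boundary value are made-up names, and GIF is assumed as the image type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;

// Hypothetical illustration of the MIME layout; not the Notes MIME API.
final class RelatedMailSketch {

    /** Builds a multipart/related body: one HTML part plus one part per inline GIF. */
    public static String build(String html, Map<String, byte[]> gifsByContentId, String boundary) {
        StringBuilder out = new StringBuilder();
        out.append("MIME-Version: 1.0\r\n");
        out.append("Content-Type: multipart/related; boundary=\"")
           .append(boundary).append("\"\r\n\r\n");

        // First part: the HTML, which refers to images as <img src="cid:...">.
        out.append("--").append(boundary).append("\r\n");
        out.append("Content-Type: text/html; charset=UTF-8\r\n");
        out.append("Content-Transfer-Encoding: base64\r\n\r\n");
        out.append(encode(html.getBytes(StandardCharsets.UTF_8))).append("\r\n");

        // One part per image; the Content-ID must match the cid: reference in the HTML.
        for (Map.Entry<String, byte[]> image : gifsByContentId.entrySet()) {
            out.append("--").append(boundary).append("\r\n");
            out.append("Content-Type: image/gif\r\n");
            out.append("Content-Transfer-Encoding: base64\r\n");
            out.append("Content-ID: <").append(image.getKey()).append(">\r\n");
            out.append("Content-Disposition: inline\r\n\r\n");
            out.append(encode(image.getValue())).append("\r\n");
        }

        // Closing delimiter ends the multipart body.
        out.append("--").append(boundary).append("--\r\n");
        return out.toString();
    }

    private static String encode(byte[] data) {
        // The MIME encoder wraps its output at 76 characters per line.
        return Base64.getMimeEncoder().encodeToString(data);
    }
}
```

Images from several sources would simply become additional entries in the map, each with its own Content-ID, with the HTML part containing one cid: reference per image.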

<Flame>It's incredible that something so apparently simple is so complicated. For example, you apparently can't just add an image to a rich text item: either it's an attachment and not shown, or it's an OLE object and takes ages to insert and load. So, one has to use the MIME classes to do it. This is ridiculous. </Flame>

Raputa - I had the impression that GetDocumentByURL includes the images found at the target URL and renders the doc to rich text. Well, it isn't working anyway so we don't need to waste time thinking about what it would do if it would function correctly.

I'm still open to suggestions (and ready to spend points) - does anybody see a simpler way to do this?

Jan
 
fulscherAuthor Commented:
Ok, the problem is more or less solved; by downloading the files and using the Notes MIME classes, I was able to put a mail together which contains all the images I want. It's a start.

Points will be split, since all of you contributed.

J