fulscher
asked on
Render HTML page and send as e-mail
Hello experts,
From within an agent, I'd like to read a web page containing pictures and Java Script and send it as E-Mail.
"GetDocumentByURL" fails (don't know why - wrong client type?).
Any idea how to do this?
The URL in question is
http://theyr.net/cg/cny/I1be177/F=js*L=en*u*041010*24*uIL*usia_DesMoines
In addition, I'd like to process a number of pages from different sites and append them to the same document. Ideas?
Additional points will be granted, if required.
Thanks
J
ASKER
It appears to be ok:
If I set the browser in the location document to be "Notes with IE", the page opens in Notes correctly if opened manually. If I set the browser to "Notes", the page returned is blank. However, GetDocumentByURL fails in both cases.
J
Just to verify, can you try some other URL and check whether you get a value returned?
partha
ASKER
You can download a test database from this link:
http://www.fuelscher.ch/files/MetRet.zip
To try, go to view "URLs to load", select one document, run the agent Actions/Get Meteo TEST. A new document opens which is the document just retrieved. Note that HTTPStatus returned is 200 (ok), however, the Body field is empty. This is true for all three URLs in the DB.
partha - the URLs shouldn't be blocked; this is just weather information. Anyway, it won't even work if you try http://www.google.com...
Jan
The "web" task must be loaded on the server AND the agent must run on the server (client settings are not relevant).
Create an agent "(Test)" with the following code inside the Initialize event. Make sure your test agent has properties "Agent list selection" and "Target: None".
' The submitted form arrives as the agent's context document
Dim s As New NotesSession
Dim c As NotesDocument
Set c = s.DocumentContext
' Retrieve the page named in the "Test" field
Dim d As NotesDocument
Set d = s.CurrentDatabase.GetDocumentByURL( c.Test( 0 ) )
If Not( d Is Nothing ) Then
    Print {<div style="font-size: 10pt">}
    Print "<h3>TEST de " & c.GetItemValue( "Test" )( 0 ) & "</h3>"
    Print "<b>Page title</b> = " & d.GetItemValue( "Title" )( 0 ) & "<br>"
    Print "<b>Doc size</b> = " & d.Size & " bytes" & "<br>"
    Print "<b>Status</b> = " & d.GetItemValue( "HTTPStatus" )( 0 ) & "<br>"
    Print "<h4>Headers</h4>"
    ' HTTPHeaders lists the names of the items that hold the response headers
    Dim headers As Variant
    headers = d.HTTPHeaders
    Forall h In headers
        Print "<b>" & h & "</b> = " & d.GetItemValue( h )( 0 ) & "<br>"
    End Forall
    Print "</div>"
Else
    Print "error accessing remote URL " & c.GetItemValue( "Test" )( 0 )
End If
Then create a new "Test" form with a "Test" text field and a "Test" action button (formula), with code "@Command( [ToolsRunMacro]; "(Test)" )".
Then open this form from the browser on your client through a URL like the following:
http://www.example.com/path/to/notes/database.nsf/Test?OpenForm
Finally, type a URL into the Test field and click the Test action.
The sample code above gave me this output for your URL:
TEST de http://theyr.net/cg/cny/I1be177/F=js*L=en*u*041010*24*uIL*usia_DesMoines
Page title = Weather forecast Sun Oct. 10 2004 24:00 UTC
Doc size = 26054 bytes
Status = 200
Headers
HTTPDate = 09/10/2004 13:49:59
HTTPServer = Apache/2.0.40 (Red Hat Linux)
HTTPContent_Language = en
HTTPContent_Length = 24118
HTTPContent_Type = text/html
HTTPAge = 3
HTTPContent_Script_Type = text/javascript
HTTPExpires = 09/10/2004 14:04:59
Also note that the notesDatabase.GetDocumentByURL method automatically adds a document to your database with an "HTMLForm" value in the Form field.
If you want to send that page by e-mail, you can simply make a call like the following:
Call notesDatabase.GetDocumentByURL( anURL ).Send( False, toSomeone )
This works because Notes also puts the title of the page in a Subject field, and the page itself is in the Body field.
ASKER
RAPUTA,
your code basically does the same as mine. So, I tried to run my agent on the server and the results are better - some URLs are retrieved more or less successfully. And, of course, Notes renders the documents just awfully but that probably was to be expected.
Please try the following URLs:
http://www.wunderground.com/US/NY/Kennedy_International.html
http://www.wetter.com/v2/
In both cases, the document is only loaded partially, the BODY field contains only part of the document.
Any idea how to resolve this?
Or even better - any idea how to get a better rendering of the result?
J
ASKER
partha,
right. The partial documents have only one Body item and the item is only 1177 bytes long for one of the damaged documents (1127 for the other). I rather think there's a problem with creation of the document...
Jan
Your suggested URLs work for me (only 1 Body field, BTW):
http://www.wunderground.com/US/NY/Kennedy_International.html --> body.length = 2563 bytes
http://www.wetter.com/v2/ --> body.length = 4087 bytes
So my suggestion is: first try my solution exactly as described. That way you can tell whether something is wrong with your script (if my solution works for you too) or with your server (if it doesn't).
No, I was wrong, sorry.
The first page is much larger than what Notes receives...
I don't know for sure what causes the truncated body, but I don't think the problem is in my code or in the Notes server. In fact, I created a page in my Notes database with the HTML from the "wunderground" URL and then accessed it through my agent, and it works!
ASKER
Hemanth - do you happen to have some sample code?
Raputa - if the problem is neither with the code nor with the server, where is it then?
J
When the page is changed through the DOM or JavaScript, those changes are not reflected in the source, so you will get only partial output.
Also, as Hemanth suggested, the best thing would be to use a Java agent. It's very easy, as the URL class has a lot of methods.
See this and then build upon it:
http://www-10.lotus.com/ldd/46dom.nsf/55c38d716d632d9b8525689b005ba1c0/c329f4fabec2612185256a6a000632b8?OpenDocument
ASKER
all - I really appreciate your help.
It appears that Domino isn't very well suited to this task. For example, I still don't understand why Raputa got more bytes from the sources than I got; looks like Domino doesn't handle HTML very well after all.
To get back to the requirements: I need a function / program which reads one or more HTML pages and renders it/them to an E-Mail. Background: I'd like to have weather forecasts from different sources every day. The thing may be an agent, a server task or an independent program on the server; doesn't really matter.
The thing should run on the server (after all, if it runs on the client, I could as well look up the HTML pages directly). Rendering should be more or less accurate, i.e., the e-mail should be interpretable.
p_partha: The code sample you linked to tests whether a URL is valid. The part I need (reading the web page and inserting it into an e-mail) is not there.
Btw: I'm running Domino 6.5 on the server.
Any ideas?
fulscher,
I don't have the code ready-made, but you can check the help documentation (Java section) on how to handle MIME.
The link partha proposed shows one way to read an HTML page and store it in a buffer. Use that buffer to pass the body content into the mail. You can also use sockets, but HttpURLConnection would be ideal.
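A minimal sketch of the buffer approach, in plain Java as it might be called from a Java agent. The class name PageFetcher, the timeouts, the User-Agent string, and the ISO-8859-1 default are assumptions made for illustration; none of them come from the linked sample. It is written for a current JDK, so the JVM bundled with an older Domino release may need older idioms (for example, no try-with-resources).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PageFetcher {

    /** Reads a stream to the end and decodes the bytes with the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    /** Fetches the page at the given address and returns its HTML source. */
    static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setConnectTimeout(15000);   // hypothetical values, in milliseconds
        conn.setReadTimeout(30000);
        // Hypothetical client name; some sites answer differently to unknown clients.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (compatible; ForecastMailer)");
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("HTTP status " + status + " for " + address);
            }
            try (InputStream in = conn.getInputStream()) {
                // Assumes ISO-8859-1; a fuller version would read the charset
                // from the Content-Type response header.
                return readAll(in, StandardCharsets.ISO_8859_1);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A call such as `String html = PageFetcher.fetch("http://www.wetter.com/v2/");` returns the raw source only. Anything the page builds with JavaScript after loading is not in that string, and images are still just links.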
ASKER
RAPUTA - thanks - I like this. However, I need the images and layout of the page...
Hemanth - I tried HTML-to-MIME earlier and did not find it very easy to create a MIME mail from an HTML page. I'll check again.
Any other options / ideas? Somebody with some functioning sample code? (well, just asking...)
J
The latest version of the MSXML library is here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=3144b72b-b4f2-46da-b4b6-c5d7485f2b42&DisplayLang=en
ASKER
Raputa - sorry, this is over my head. Can you give me a hint on how to continue from these pages?
Well, the images can stay where they are, I think. If you let them load from their own domain, isn't that OK? And the layout should be the HTML itself.
It's very easy, just download the first file and run it on the server to install.
Then start testing the function
ASKER
Actually, I'd like to have a complete rendering of the page without external links.
Reason: I want to keep the forecasts in my mail DB for comparison and later analysis. So, if the images are only linked to, they might not be available any more (actually, many of them will change at least once a day).
Sorry for the complications, but keeping the images was the reason I asked for rendering them to a Notes document...
J
It seems now much more difficult... you want to create a browser in Notes !!!
ASKER
Raputa - I don't want to create a browser in Notes. The question remains still the same.
Quote from the original question:
"From within an agent, I'd like to read a web page containing pictures and Java Script and send it as E-Mail."
So, I want either "GetDocumentByURL" to function correctly (it would do what I need) or find replacement code to "GetDocumentByURL".
I haven't looked at the Java samples yet (my Java is quite rusty and I would need a day or two just to get up and running again), but I'm certainly willing to do so IF it solves the problem. However, I'm not sure about this - reading an HTML string from a web site does NOT solve the problem. If I could read an image and insert it into a Notes document, the problem would be partly solved.
Ideas? Suggestions?
J
fulscher,
GetDocumentByURL does not actually embed the images in the rich text field the way a paste from the clipboard does; it puts the HTML in the Body field as it is.
So if you want to get to the images, you could iterate through the IMG tags (or the values in the field $ImageList) and try to get the files from the internet cache directory, along with all the other files needed, such as stylesheets and JavaScript files.
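The first half of that suggestion, iterating through the IMG tags, might look like the sketch below. It is plain Java with a regular expression rather than a real HTML parser, so it is only a rough approximation: the class name ImageLinks is hypothetical, unquoted src attributes are ignored, and a src containing spaces or other characters that are illegal in a URI makes `resolve` throw an IllegalArgumentException. It does not touch the internet cache directory; it only produces absolute URLs that could then be downloaded.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImageLinks {

    // Matches the src attribute of an <img> tag, quoted with " or '.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /**
     * Returns the absolute URL of every image referenced by the page,
     * in document order and without duplicates.
     */
    static List<String> extract(String html, String pageUrl) {
        URI base = URI.create(pageUrl);
        List<String> urls = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(2).trim();
            // Relative paths are resolved against the address of the page itself.
            String absolute = base.resolve(src).toString();
            if (!urls.contains(absolute)) {
                urls.add(absolute);
            }
        }
        return urls;
    }
}
```

The same loop could be extended to LINK and SCRIPT tags if the stylesheets and scripts are wanted as well.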
ASKER
all - thank you for your input and comments so far.
I've been playing some more with various methods and I've come up so far with something that appears to work. I restrict myself to images for the time being, since the images are the most important bit of the to-be mail.
- Load the image to a temporary file with Win32 API call URLDownloadToFile
- Convert the image from PNG to GIF with IrfanView, started through the CreateProcess / WaitForSingleObject API functions
- Create a NotesDocument and insert the image as MIMEEntity.
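For readers who want to see what the resulting message looks like on the wire, here is a sketch of the MIME layout that the last step produces: a multipart/related message with one HTML part and one inline part per image. This is plain Java that only builds the text of the message; it is not the Notes MIME API and not the code actually used here. The class name InlineImageMail, the boundary string, and the content IDs are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;

final class InlineImageMail {

    private static final String CRLF = "\r\n";

    /**
     * Builds a multipart/related message: one HTML part followed by one
     * base64-encoded GIF part per image. The HTML refers to an image as
     * <img src="cid:CONTENT-ID">, where CONTENT-ID is the key in the map.
     */
    static String build(String boundary, String html, Map<String, byte[]> gifsByContentId) {
        StringBuilder out = new StringBuilder();
        out.append("MIME-Version: 1.0").append(CRLF);
        out.append("Content-Type: multipart/related; type=\"text/html\"; boundary=\"")
                .append(boundary).append('"').append(CRLF);
        out.append(CRLF);

        // Root part: the HTML that the mail client renders.
        out.append("--").append(boundary).append(CRLF);
        out.append("Content-Type: text/html; charset=UTF-8").append(CRLF);
        out.append("Content-Transfer-Encoding: base64").append(CRLF).append(CRLF);
        out.append(encode(html.getBytes(StandardCharsets.UTF_8))).append(CRLF);

        // One inline part per image, addressed by its Content-ID.
        for (Map.Entry<String, byte[]> image : gifsByContentId.entrySet()) {
            out.append("--").append(boundary).append(CRLF);
            out.append("Content-Type: image/gif").append(CRLF);
            out.append("Content-Transfer-Encoding: base64").append(CRLF);
            out.append("Content-ID: <").append(image.getKey()).append('>').append(CRLF);
            out.append("Content-Disposition: inline").append(CRLF).append(CRLF);
            out.append(encode(image.getValue())).append(CRLF);
        }

        // Closing delimiter ends the multipart body.
        out.append("--").append(boundary).append("--").append(CRLF);
        return out.toString();
    }

    /** Base64 with line breaks every 76 characters, as MIME expects. */
    private static String encode(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }
}
```

Because the images travel inside the message and are referenced by `cid:` rather than by URL, the mail stays readable after the original images have changed on the source site.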
This appears to work so far. I've yet to compose several images into one mail (basically, I'd like to see predictions from different sources in one mail) but I'm more or less optimistic that it will work.
<Flame>It's incredible that something so apparently simple is so complicated. For example, you apparently can't just add an image to a rich text item - either it's an attachment and not shown, or it's an OLE object and takes ages to insert and load. So, one has to use the MIME classes to do it. This is ridiculous. </Flame>
Raputa - I had the impression that GetDocumentByURL includes the images found at the target URL and renders the doc to rich text. Well, it isn't working anyway, so we don't need to waste time thinking about what it would do if it functioned correctly.
I'm still open to suggestions (and ready to spend points) - does anybody see a simpler way to do this?
Jan
ASKER
Ok, the problem is more or less solved; by downloading the files and using the Notes MIME classes, I was able to put a mail together which contains all the images I want. It's a start.
Points will be split, since all of you contributed.
J
To send multiple pages, you can use the NotesNewsletter class.
partha