C#, VS2008, .NET 3.5: HttpWebResponse and dumping images into files

The code below loads an image directly from a URL:

public void LoadImageFromUrl(string url, PictureBox pb)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Image img = Image.FromStream(response.GetResponseStream());
    // Copy into a new Bitmap so the picture stays usable after the response is closed
    pb.Image = new Bitmap(img);
    response.Close();
}

However, there are certain websites where it is not possible to download an image directly by URL. The main page containing the HTTL code that loads the image has to be loaded first. Once I have executed the main page's HTTL, which then loads the image, how do I dump all or selected images into an array (with the image names) so that I can save them to files for later assessment?
For example, the website contains this code:
....
<img src="showImage.file/riscy56.gif" height="100" width="300">
....

Thanks



Asked by: Richard Payne
 
eitama commented:
Use a regular expression to extract all the "<img src=..." tags, and get the URL and the name of the image from each one.
Put them into a List of structs.

The struct can look like this:
struct ImageObj
{
  public String url;
  public String name;
  public Byte[] data;
}

Create your list of structs:
List<ImageObj> myList = new List<ImageObj>();

Then create an object of your struct:

ImageObj myImage = new ImageObj();
myImage.url = getUrl();
myImage.name = getNameFromUrl(myImage.url);
myImage.data = getDataAsByteArray();


Now insert your image object into the list for later use:

myList.Add(myImage);

Put the whole thing into a foreach loop, and iterate over every image in the HTML, or only the ones you want.
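
As a rough sketch of the extraction step described above, the following pulls every "<img src=..." tag out of an HTML string and fills a list of ImageObj structs. The names ImageTagExtractor and Extract are made up for this illustration, and the pattern is deliberately simple: it only handles double-quoted src attributes and is not a full HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public struct ImageObj
{
    public String url;
    public String name;
    public Byte[] data;   // filled in later, once the image has been downloaded
}

// Hypothetical helper, not part of the original thread
public static class ImageTagExtractor
{
    // Matches <img ... src="..."> and captures the double-quoted src value
    private static readonly Regex ImgSrc = new Regex(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*\"([^\"]+)\"",
        RegexOptions.IgnoreCase);

    public static List<ImageObj> Extract(String html)
    {
        List<ImageObj> list = new List<ImageObj>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            ImageObj img = new ImageObj();
            img.url = m.Groups[1].Value;
            // Use the text after the last '/' as the image name
            img.name = img.url.Substring(img.url.LastIndexOf('/') + 1);
            list.Add(img);
        }
        return list;
    }
}

public static class ExtractDemo
{
    public static void Main()
    {
        String html = "<p><img src=\"showImage.file/riscy56.gif\" height=\"100\" width=\"300\"></p>";
        foreach (ImageObj img in ImageTagExtractor.Extract(html))
        {
            // prints "riscy56.gif <- showImage.file/riscy56.gif"
            Console.WriteLine(img.name + " <- " + img.url);
        }
    }
}
```

The data field is left empty here; it would be filled by a separate download step for each url.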

Eitam.
 
Richard Payne (Chief Technology Engineer, Author) commented:
Hi Eitam

How does this relate to HttpWebResponse? Is it possible to put the two together?

Thanks
 
eitama commented:
Hi,

If I understand correctly, you asked how to find the images inside the HTML file when you cannot download the image directly.
In those cases the HttpWebResponse holds the HTML response.
You use it to get the text of the HTML file, for example by reading response.GetResponseStream() with a StreamReader.

From inside that text you extract all the links to the pictures, and put the ones you want into a links[] array.

Then:

foreach (String link in links)
{
  //Send a new request, and get the response for the picture
  //Create an object from the struct holding the image url, name and data
  //Insert the image object into your List
}

Later, whenever you want, take the items from your list and write them to files, or do whatever else you need.
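
One possible way to flesh out that loop is sketched below. It assumes the links are already absolute URLs; the names ImageDownloader, ReadAllBytes, Download and SaveAll, and the "images" output folder, are invented for this example. The manual copy loop is used because Stream.CopyTo is not available on .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

public struct ImageObj
{
    public String url;
    public String name;
    public Byte[] data;
}

// Hypothetical helper, not part of the original thread
public static class ImageDownloader
{
    // Copy a stream into a byte array in 4 KB chunks
    public static Byte[] ReadAllBytes(Stream input)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            Byte[] chunk = new Byte[4096];
            int read;
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
            {
                buffer.Write(chunk, 0, read);
            }
            return buffer.ToArray();
        }
    }

    // Request one image and return it together with its raw bytes
    public static ImageObj Download(String link)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(link);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        {
            ImageObj img = new ImageObj();
            img.url = link;
            // File name is the last segment of the URL path
            img.name = Path.GetFileName(new Uri(link).LocalPath);
            img.data = ReadAllBytes(stream);
            return img;
        }
    }

    // Write every downloaded image into the given folder under its own name
    public static void SaveAll(List<ImageObj> images, String folder)
    {
        Directory.CreateDirectory(folder);
        foreach (ImageObj img in images)
        {
            File.WriteAllBytes(Path.Combine(folder, img.name), img.data);
        }
    }
}

public static class DownloadDemo
{
    public static void Main(String[] args)
    {
        // Absolute image links are taken from the command line (hypothetical usage)
        List<ImageObj> myList = new List<ImageObj>();
        foreach (String link in args)
        {
            myList.Add(ImageDownloader.Download(link));
        }
        ImageDownloader.SaveAll(myList, "images");
        Console.WriteLine("Saved " + myList.Count + " image(s)");
    }
}
```

Keeping the raw bytes rather than an Image object means the file written to disk is exactly what the server sent, with no re-encoding.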

If you still don't understand what I mean, please try and point me to exactly what you don't understand.

Eitam.
 
Richard Payne (Chief Technology Engineer, Author) commented:
Eitam

I am having some difficulty.

This is the link I am using to try to capture a comic image into a file:

http://www.chron.com/apps/comics/showComic.mpl?date=2009/12/25&name=Baldo

The problem is with debugging in VS2008: I am stepping through the code to observe the changes in the properties, but it is not obvious how the image is stored within them (in the Locals window of VS2008).

Riscy

 
eitama commented:
This is the best I can offer.
I have not run this code, because the regex still needs to be written.

Good luck.
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Net;
using System.IO;

namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {
            String url = "http://www.google.com";
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            StreamReader sr = new StreamReader(response.GetResponseStream());
            // Read the whole HTML page into one string
            String data = sr.ReadToEnd();
            response.Close();

            // Placeholder: use a regular expression to get the image link from inside data
            String imageUrl = null;
            // Use HttpWebRequest again now to get the picture itself.
            // Once the picture is loaded from the response stream, you can write it to a file.
            request = (HttpWebRequest)WebRequest.Create(imageUrl);
            response = (HttpWebResponse)request.GetResponse();
            Image img = Image.FromStream(response.GetResponseStream());
            response.Close();
        }
    }
}

 
Richard Payne (Chief Technology Engineer, Author) commented:
I have tried the code and it does not work; here is the snippet. I can see how it is meant to work, but the string used to load the gif file has no reference to the httl.

            String url = "http://www.chron.com/apps/comics/showComic.mpl?date=2009/12/25&name=Baldo";
            HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            StreamReader sr = new StreamReader(response.GetResponseStream());
            // Read the whole page as text (sr.Read() returns a character code, not a string)
            String data = sr.ReadToEnd();
            response.Close();
            String imageUrl = "showComic.mplq_files/Baldo.gif";

            //Use regular expression to get image link from inside data,
            //Use httpwebrequest again now to get the picture itself.
            //After you get the picture into a StreamReader, you can write it to file.

            request = (HttpWebRequest)HttpWebRequest.Create(imageUrl);    // <=== exception!
            response = (HttpWebResponse)request.GetResponse();
            sr = new StreamReader(response.GetResponseStream());

            Image img = Image.FromStream(response.GetResponseStream());
            response.Close();

Even when I use imageUrl = "http://www.chron.com/apps/comics/showComic.mpl _files/Baldo.gif", it throws an exception at the response.

Take a look at the httl code and you can see why.

Any suggestions?
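
A note on the first exception: WebRequest.Create expects an absolute URI, so a relative string such as "showComic.mplq_files/Baldo.gif" is likely to fail before any request is sent. When a src value taken from a page is relative, one way to turn it into something requestable is the Uri(Uri, String) constructor, which resolves it against the page address. The sketch below uses a made-up page address (www.example.com) and a made-up helper name, UrlResolver; it only shows the mechanics and does not claim that the resolved address exists on any server.

```csharp
using System;

// Hypothetical helper, not part of the original thread
public static class UrlResolver
{
    // Resolve an <img src="..."> value against the address of the page it came from
    public static String Resolve(String pageUrl, String src)
    {
        Uri page = new Uri(pageUrl);
        Uri resolved = new Uri(page, src);
        return resolved.AbsoluteUri;
    }
}

public static class ResolveDemo
{
    public static void Main()
    {
        // Assumed page address, for illustration only
        String pageUrl = "http://www.example.com/comics/page.html";

        // A relative src is resolved against the page's folder:
        // prints "http://www.example.com/comics/showImage.file/riscy56.gif"
        Console.WriteLine(UrlResolver.Resolve(pageUrl, "showImage.file/riscy56.gif"));

        // An absolute src passes through unchanged
        Console.WriteLine(UrlResolver.Resolve(pageUrl,
            "http://images.chron.com/apps/comics/images/2009/12/25/Baldo.177.g.gif"));
    }
}
```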
 
eitama commented:
The code I wrote for you was not expected to just "work".
As I said, I didn't run it. It needs a lot more work.

You didn't replace the comments with a regular expression.

I hope someone else can help you.
Good Luck.
 
Richard Payne (Chief Technology Engineer, Author) commented:
Hi Eitama

I think I can see what you mean. I was experimenting with the Text Visualizer tool in Visual Studio 2008, which displays the httl code in the following format:

<img src="http://images.chron.com/apps/comics/images/2009/12/25/Baldo.177.g.gif" width="760" height="238"><br /><br />

This makes it possible to download the image directly. The tricky part is the 3-digit number, which is not predictable, so I have to search the httl text and isolate the tag that makes loading the image possible (as you tried to demonstrate). I will play around with it today and see how it goes.

Many thanks, and I hope you do not regard me as a lazy person. I'm learning a lot.

Is there a useful tool that makes regular expressions easy to test?
 
eitama commented:
Hi Riscy,

No worries, I am sure you are nothing but a hard worker :)

You are on the right track to solving your problem. Catching the unpredictable 3-digit number is best done with a regexp.
1. Here is an online tool for testing regexes: http://www.gskinner.com/RegExr/
2. Here is an example pattern for the image link above:
    http://images\.chron\.com/apps/comics/images/\d\d\d\d/\d*/\d*/Baldo\.(\d*)\.g\.gif
    In some programming languages you can pull information out of a regular expression match.
    As you can see, I put "( )" around the \d* right after "Baldo". This is a subgroup, or subexpression, that you can capture.
3. Look here to see how to get a substring (subgroup) out of a regular expression: http://dotnetperls.com/regex-capture
4. In that link, they use "string v = match.Groups[1].Value;" to get whatever is surrounded by "( )".
5. Once you have the link, you might not even need the random number by itself.

By the way,
you keep saying "httl". I have never heard of it. Are you sure you don't mean html? (:
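
To make steps 2 to 5 concrete, here is a small sketch that runs the pattern from step 2 against the img tag quoted earlier in the thread. The class name ComicLinkFinder is invented for the example; Regex.Match, Match.Value and Match.Groups are the standard System.Text.RegularExpressions members.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread
public static class ComicLinkFinder
{
    // The pattern from step 2; the parentheses capture the unpredictable number
    private static readonly Regex Link = new Regex(
        @"http://images\.chron\.com/apps/comics/images/\d\d\d\d/\d*/\d*/Baldo\.(\d*)\.g\.gif");

    public static Match Find(String html)
    {
        return Link.Match(html);
    }
}

public static class RegexDemo
{
    public static void Main()
    {
        String html = "<img src=\"http://images.chron.com/apps/comics/images/2009/12/25/Baldo.177.g.gif\" width=\"760\" height=\"238\"><br /><br />";
        Match match = ComicLinkFinder.Find(html);
        if (match.Success)
        {
            Console.WriteLine(match.Value);            // the full image link
            Console.WriteLine(match.Groups[1].Value);  // the captured number, "177"
        }
    }
}
```

As step 5 suggests, match.Value alone is enough to request the image; the captured group is only needed if the number itself matters.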

Eitam.
 
Richard Payne (Chief Technology Engineer, Author) commented:
Sorry for the confusion. I am still recovering from a Christmas party hangover and... yes, it is html.
Thanks. I'll get busy now.
 
Richard Payne (Chief Technology Engineer, Author) commented:
Well done, we did it! It works very nicely!

The html shown for the website is not actually the same as the html returned by the GetResponse routine.

I don't know why the html code is modified compared to the direct html output; it would be good to shed some light on that.

The text / web visualizer tool proved to be very useful during development, and for learning new things.

Thanks again, Eitam, for your wisdom and endurance.
 
eitama commented:
I am glad I could help.

About the difference between html output and html code:
the web browser reads HTML code and displays html output. For example, "<b>SomeText</b>" is html code,
and the output will be the bold text "SomeText", so the output is different from the code.

I am not familiar with the application you are talking about, the web visualizer.