Crawler in C#

I am writing an application in C#. There is table that contains lot of urls, what I want is, to take one url at a time and download the content until all of them downloaded.I think I should use thread when it send the request and download it.What to do when there is a url which is to be sent by post method.
   Please give me some rough idea how can I achieve in C#. I have little idea in VC++. Like I can use ISAPI session,connection...etc.
NavinKaushikAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

psdavisCommented:
This is the code I use to run a cgi script and download an image.
Sorry I don't have time to give you more, running to airport in a few mins.

            using( WebClient pWebClient = new WebClient( ))
            {
               pWebClient.BaseAddress = @"http://" + this.Server;
               byte[] byImage = pWebClient.DownloadData( @"cgi-bin/getimage.cgi" );
            }

0
NavinKaushikAuthor Commented:
Thanks for your crucial time :) but I want code or API'S in C# ......
0
purpleblobCommented:
The code demonstrated by psdavis is C# and should fulfill your requirements - here's the same thing but showing the download of a page which might make it more in line with what you want to do with it.

byte[] buffer = null;

using(WebClient client = new WebClient())
{
   buffer = client.DownloadData("http://www.google.com");
}

Now we could convert the buffer to a string using System.Text.ASCIIEncoding.ASCII.GetString(buffer); and then find any URL's within it to also download. Also as psdavis has shown you could assign the base URL to the BaseAddress property and thus DownloadData will expect relative paths - this is obviously useful if you do want to download images etc. from the page that are shown using relative paths.

Hope this helps
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Cloud Class® Course: Ruby Fundamentals

This course will introduce you to Ruby, as well as teach you about classes, methods, variables, data structures, loops, enumerable methods, and finishing touches.

NavinKaushikAuthor Commented:
Thanks purpleblob. I will be much more thankful to you if you help me more. Actually I dont have any practical exp in .NET. Previously there was a desktop application in VC++ , multi-tired using MFC in UI , ATL com in middle layer, application was to crawl the urls present  in table(database), application download all the pages for those urls into a folder.
  First I create the urls Ex.
    If you login to your mail account through a webpage then it will display your inbox. What I do is
    www.abc.com/inbox.asp?login_name=navin+kaushik&pwd=xyz
   Now when I put this url
 byte[] buffer = null;

using(WebClient client = new WebClient())
{
   buffer = client.DownloadData("www.abc.com/inbox.asp?login_name=navin+kaushik&pwd=xyz");
}
 This is one of modules of my application.
 I think my objective is more clear to you now. My question is Should I go for C# Since later I may integrate it with web also.
 How can I make multi-tired application in C# and tell me the corresponding API's in .NET  of ISAPI in VC++ 6.0
  Thanks.
0
purpleblobCommented:
You can use your existing COM controls within C# by selecting Add References from a Visual Studio project and then select the COM tab and either browse for your COM objects or select from the list of registered object. VS will then create a wrapper DLL (like #import in VC++).

I'm afraid I don't think you can write ISAPI filters in C# as C# DLLs doen't expose methods in the same way to the "standard" C DLL's. However this said, if you wanted to use managed C++ you could create ISAPI filters and then call C# objects thus you could write all the main logic in C# and just have the C++ DLL acts as a proxy to your C# code.

As for should you go for C#, I would say at the moment it sounds like C# will not offer you too much that your current system doesn't already have - however if you migrated your existing system to managed C++ you would then be able to work with the .NET libraries and also write C# objects which you could then call from managed C++.

Hope this helps
0
NavinKaushikAuthor Commented:
hmmm lot more confusing!!!!.
   Is there anything which I can't do in C# ( considering application from scratch) and I can do that in VC++.NET.
   I think this will solve my problem.
0
purpleblobCommented:
Basicaly things like ISAPI filters which require standard style DLL's are not possible within C# at this time. Also C# does not create COM objects. However by using VC++.NET in conjunction you could write most/all of the functionality in C# DLL's then create VC++.NET wrappers around them. So this is very much achievable.

With regards your application, I'm not sure where ISAPI fits in with your current system (if at all) so this might not be an issue.

From what I understand of your application, i.e. you have a database of URL's, you get these URL's navigate to them and download whatever is contained at the URL - this can easily be done with C#. If you are also creating the web based application, i.e. at the URL then you could use ASP.NET with C# as your chosen language - so again probably no problems there.

Now, not wishing to confuse the issue fuether, I suppose what I was saying earlier about C# not offering you much that you don't currently have is this - if I were writing such an application from scratch I would definitely write it in C# and .NET using ADO.NET and ASP.NET. If all I was doing was adding a little bit of code to download data from URL's and the rest of the system was already written in C++ I'd probably use VC++.NET just for consistancy.

I hope I haven't confused matters further - it's difficult to say what's best for YOU to do as I do not know the ins and outs of your whole application.

Hope this helps
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
C#

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.