Critical: I'm trying to extract URLs from a text file and then write them to another text file that contains just the URLs.

Using System.IO or something similar, I need to read a text file. This can be done until end of file or with ReadLine, but I need to extract the URL in the file. It's in the form of "rcrast@http://ww3.yahoo.com/main/Prh13b/ps20060325-178782/index.html".

I need to extract this and then write it to another text file, line by line.

The file also contains many different characters that should be excluded from the extraction. They look like this:

À‰e ¿  @F€p Ê  @:Ÿ8 è  À.y^   tœ© ! €åÙí ^   [I Š   ]Á‹   €ßÿ  Ä  ÀULC        À=}  r €¶8» ‘  ¥… Ì  €#&4 ð  Àž9   '˜+ X   ?†\ K  l¡a ` @F1á p @hÛË  €Fm È  @’K‡  €p+·€ @¶J"€í  Ëuó€ €Úù– “  • q   ºì }  €¶çÊ   %ã @ö”« š € , ˆ €dè;€¨ @? T  €û”« ®  €£Î÷ Á   ÁCË ì  @z÷       ý
Asked by brian_leighty
deightonprog Commented:
Will the URL always have the string http:// in it? What will indicate the end of the URL in the file? Can you explain in general terms how the URLs can be recognised? Will the files have multiple URLs in them, or just one?
brian_leighty Author Commented:
I just need to pull a bunch of websites out of an index file that Internet Explorer has made. It's to log all the websites that my users go to. The http:// doesn't matter as long as I get all the web pages.
deightonprog Commented:
I was thinking you could use the http to spot the URLs when parsing the file. Is this a system file, e.g. index.dat? I've had problems manipulating those files in the past; they seem to be special files.
brian_leighty Author Commented:
No, it's fine. What about using "rcrast@" to spot them? But HTTP is fine too.
deightonprog Commented:


        ' Work on a copy of index.dat rather than the original file.
        ' True overwrites any c:\temp.dat left over from an earlier run.
        IO.File.Copy("c:\documents and settings\andyd\cookies\index.dat", "c:\temp.dat", True)

        Dim fs As New IO.StreamReader("c:\temp.dat")
        Dim fw As New IO.StreamWriter("c:\output.txt")

        ' Lower-casing makes the search for the marker case-insensitive,
        ' but it also lower-cases the extracted URLs.
        Dim s As String = fs.ReadToEnd().ToLower()

        Dim i As Integer = s.IndexOf("cookie:")

        Do Until i < 0

            ' Skip past "cookie:user@" so that s starts at the URL.
            s = s.Substring(i)
            i = s.IndexOf("@")
            If i < 0 Then Exit Do
            s = s.Substring(i + 1)

            ' The URL runs up to the next null character,
            ' or to the end of the text if there is none.
            Dim j As Integer = s.IndexOf(ChrW(0))
            If j < 0 Then j = s.Length

            fw.WriteLine(s.Substring(0, j))

            i = s.IndexOf("cookie:")
        Loop

        fs.Close()
        fw.Close()



deightonprog Commented:
If that doesn't work, then in the two places where I search for cookie:, search for rcrast instead.
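
A minimal sketch of that swap, with the marker passed in as a parameter so it does not have to be edited in two places. It assumes, as in the code above, that the marker is followed by "@" and that the URL ends at the next null character. ExtractAfterMarker, MarkerDemo and the sample string are hypothetical names made up for illustration. It matches the marker case-insensitively instead of lower-casing the whole text, so the URLs keep their original case:

```vbnet
Imports System
Imports System.Collections.Generic

Module MarkerDemo

    ' Hypothetical helper: for every occurrence of marker, returns the text
    ' between the following "@" and the next null character.
    Function ExtractAfterMarker(ByVal text As String, ByVal marker As String) As List(Of String)
        Dim results As New List(Of String)
        Dim pos As Integer = text.IndexOf(marker, StringComparison.OrdinalIgnoreCase)

        Do While pos >= 0
            Dim at As Integer = text.IndexOf("@"c, pos)
            If at < 0 Then Exit Do

            ' The URL ends at the next null character, or at the end of the text.
            Dim finish As Integer = text.IndexOf(ChrW(0), at + 1)
            If finish < 0 Then finish = text.Length

            results.Add(text.Substring(at + 1, finish - at - 1))

            ' Continue searching after the URL that was just extracted.
            pos = text.IndexOf(marker, finish, StringComparison.OrdinalIgnoreCase)
        Loop

        Return results
    End Function

    Sub Main()
        ' Made-up sample standing in for the file contents.
        Dim sample As String = "xx" & ChrW(0) & "Cookie:rcrast@http://example.com/a.html" & ChrW(0) & "yy"

        For Each url As String In ExtractAfterMarker(sample, "rcrast")
            Console.WriteLine(url)
        Next
    End Sub

End Module
```

Calling ExtractAfterMarker(s, "cookie:") or ExtractAfterMarker(s, "rcrast") on the text read from the file would then cover both cases.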
brian_leighty Author Commented:
That's perfect, but what about a checkbox or something to extract by the http:// instead, and something to end the URL?
deightonprog Commented:
What do you mean by 'something to end the URL'?
brian_leighty Author Commented:
I don't think I mean anything by it, because it will be the same as with "cookie:".

Something like ".html".
You don't have to worry about that. I need to try "http://".
deightonprog Commented:
        ' Work on a copy of index.dat rather than the original file.
        ' True overwrites any c:\temp.dat left over from an earlier run.
        IO.File.Copy("c:\documents and settings\andyd\cookies\index.dat", "c:\temp.dat", True)

        Dim fs As New IO.StreamReader("c:\temp.dat")
        Dim fw As New IO.StreamWriter("c:\output.txt")

        ' Lower-casing makes the search for "http" case-insensitive,
        ' but it also lower-cases the extracted URLs.
        Dim s As String = fs.ReadToEnd().ToLower()

        Dim i As Integer = s.IndexOf("http")

        Do Until i < 0

            ' Drop everything before "http" so that s starts at the URL.
            s = s.Substring(i)

            ' The URL runs up to the next null character,
            ' or to the end of the text if there is none.
            Dim j As Integer = s.IndexOf(ChrW(0))
            If j < 0 Then j = s.Length

            fw.WriteLine(s.Substring(0, j))

            ' Start at 1 so the URL just written is not matched again.
            i = s.IndexOf("http", 1)
        Loop

        fs.Close()
        fw.Close()
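
On the question of "something to end the URL": the code above stops at the first null character. A different approach, not the one used above, is to let a regular expression end the URL at the first character that is not printable, non-space ASCII. This is only a sketch under that assumption; RegexDemo, the pattern and the sample string are made up for illustration and have not been tried against a real index.dat:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexDemo

    Sub Main()
        ' Made-up sample standing in for the file contents.
        Dim sample As String = "junk" & ChrW(0) & "rcrast@http://example.com/index.html" & ChrW(0) & "more junk"

        ' "http://" or "https://" followed by one or more printable,
        ' non-space ASCII characters (codes &H21 to &H7E).
        Dim pattern As String = "https?://[\x21-\x7E]+"

        For Each m As Match In Regex.Matches(sample, pattern, RegexOptions.IgnoreCase)
            Console.WriteLine(m.Value)
        Next
    End Sub

End Module
```

Because the match also stops at spaces and at any non-ASCII character, it would cut short a URL that really contains one, so the null-character rule may still be the safer choice for this file.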




Visual Basic.NET