Solved

Finding Subparts of Strings in TXT Doc

Posted on 2007-08-02
Last Modified: 2013-11-05
Hi,
I have a TXT document that I want to search for a particular sub-string within other strings. I can currently find the sub-string when it appears on its own (not as part of a larger string), but I would also like to find the larger strings that contain it. For example, if my sub-string is "dri", I would like the output of Console.WriteLine() to be as follows:

drive
driver
driving
drivable
dried
drink

...and so on.  Therefore, the code would have to find the sub-string (in this case "dri") and return the larger string everywhere in my document where the sub-part is found.

Thank you,
Fulano
Question by:Mr_Fulano
15 Comments
 
LVL 15

Expert Comment

by:JackOfPH
ID: 19623026
Sub FindWord(Byval SearchWord as string, Byval PathOfTextFiles as string)

Dim arrWord() As String = IO.File.ReadAllLines(PathOfTextFile)
Dim arrExtractedWord = arrContacts(ctr).Split(" ")

   
For ctr as integer = 0 to arrExtractedWord.length - 1

if instr(0,SearchWord,arrExtractedWord(ctr).tostring)<> 0  then

console.writeline(arrExtractedWord(ctr).tostring)

next

end sub
 
LVL 53

Expert Comment

by:Dhaest
ID: 19623047
       Dim oRead As System.IO.StreamReader
        Dim LineIn As String

        oRead = IO.File.OpenText("C:\temp\subparts.txt")

        While oRead.Peek <> -1
            LineIn = oRead.ReadLine()
            Dim i As Integer
            Dim tempString As String
            i = 0
            tempString = LineIn
            While tempString.IndexOf("dri") > 0
                If tempString.IndexOf(" ", tempString.IndexOf("dri")) > 0 Then
                    Console.WriteLine(tempString.Substring(tempString.IndexOf("dri"), tempString.IndexOf(" ", tempString.IndexOf("dri")) - tempString.IndexOf("dri")))
                    tempString = tempString.Substring(tempString.IndexOf("dri") + 3)
                Else
                    Console.WriteLine(tempString.Substring(tempString.IndexOf("dri")))
                    tempString = tempString.Substring(tempString.IndexOf("dri") + 3)

                End If
            End While

        End While

        oRead.Close()
 
LVL 53

Expert Comment

by:Dhaest
ID: 19623062
To JackOfPh
There are some issues with your code:
1) Dim arrWord() As String = IO.File.ReadAllLines(PathOfTextFile)
   must be Dim arrWord() As String = IO.File.ReadAllLines(PathOfTextFiles)
2) How are arrExtractedWord and ctr declared?
3) InStr is not the usual object-oriented way to do this in .NET.
4) When testing your code with the file below, I get errors!

this is my drive
The driver is driving an old chevy.
drivable
dried
drin
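
Points 3 and 4 above can be illustrated with a small sketch. InStr(Start, String1, String2) looks for String2 inside String1, and its Start argument is 1-based, so a call such as InStr(0, ...) raises an ArgumentException at run time. The module and helper names below are hypothetical, chosen only for this illustration; they compare a corrected InStr call with the String.IndexOf equivalent.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module InStrVersusIndexOf

    ' Hypothetical helper: True when word contains searchWord, using InStr.
    ' InStr(Start, String1, String2) looks for String2 inside String1 and
    ' returns a 1-based position, or 0 when String2 is not found.
    Function ContainsWithInStr(ByVal word As String, ByVal searchWord As String) As Boolean
        Return InStr(1, word, searchWord) <> 0
    End Function

    ' Hypothetical helper: the same test using String.IndexOf, which returns
    ' a 0-based position, or -1 when the value is not found.
    Function ContainsWithIndexOf(ByVal word As String, ByVal searchWord As String) As Boolean
        Return word.IndexOf(searchWord) >= 0
    End Function

    Sub Main()
        Console.WriteLine(ContainsWithInStr("driver", "dri"))
        Console.WriteLine(ContainsWithIndexOf("driver", "dri"))
        Console.WriteLine(ContainsWithIndexOf("chevy", "dri"))
    End Sub

End Module
```

Note the different "not found" values: InStr reports 0, while IndexOf reports -1 because 0 is a valid position there.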
 
LVL 53

Expert Comment

by:Dhaest
ID: 19623085
Adjusted the code of JackOfPh (who had also the result from the first line in the file)

    Sub FindWord(ByVal SearchWord As String, ByVal PathOfTextFiles As String)
        ' Read every line of the file into an array
        Dim arrWord() As String = IO.File.ReadAllLines(PathOfTextFiles)
        ' Loop over the lines (the last valid index is Length - 1)
        For ctrLines As Integer = 0 To arrWord.Length - 1
            ' Split the current line into words
            Dim arrExtractedWord() As String = arrWord(ctrLines).Split(" "c)
            For ctr As Integer = 0 To arrExtractedWord.Length - 1
                ' Print each word that contains the search string
                If arrExtractedWord(ctr).IndexOf(SearchWord) >= 0 Then
                    Console.WriteLine(arrExtractedWord(ctr))
                End If
            Next
        Next
    End Sub

 
LVL 15

Expert Comment

by:JackOfPH
ID: 19623162
Dhaest Thanks,



 
LVL 15

Expert Comment

by:JackOfPH
ID: 19623182
Dim arrWord() As String = IO.File.ReadAllLines(PathOfTextFiles)

For Each line As String In arrWord
    Dim arrExtractedWord() As String = line.Split(" "c)
    For ctr As Integer = 0 To arrExtractedWord.Length - 1
        ' InStr(start, stringToSearch, stringSought): start is 1-based
        If InStr(1, arrExtractedWord(ctr), SearchWord) <> 0 Then
            Console.WriteLine(arrExtractedWord(ctr))
        End If
    Next
Next
 
LVL 18

Expert Comment

by:vbturbo
ID: 19623274
Mr_Fulano

You can also use a regular expression:

Imports System
Imports System.Text.RegularExpressions

Public Class Test

    Public Shared Sub Main()

        ' Define a regular expression for words that begin with "jum":
        ' \b is a word boundary and \w* matches the rest of the word.
        Dim rx As New Regex("\bjum\w*")

        ' Define a test string.
        Dim text As String = "The the quick brown fox  fox jumped over the lazy dog dog."

        ' Find matches.
        Dim matches As MatchCollection = rx.Matches(text)

        ' Report the number of matches found.
        Console.WriteLine("{0} matches found.", matches.Count)

        ' Report on each match.
        For Each match As Match In matches
            Console.WriteLine("{0} found at position {1}", match.Value, match.Index)
        Next

    End Sub

End Class

vbturbo
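
As a possible extension of the regular-expression approach, the search term could be passed in as a variable rather than written into the pattern. The sketch below is only an illustration: the module name, the helper FindWordsStartingWith and the choice of case-insensitive matching are assumptions, not part of the post above. Regex.Escape is used so that any special characters in the search term are treated literally.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexPrefixSearch

    ' Hypothetical helper: returns every word in text that begins with prefix.
    Function FindWordsStartingWith(ByVal text As String, ByVal prefix As String) As List(Of String)
        ' \b anchors at a word boundary, Regex.Escape protects any special
        ' characters in prefix, and \w* consumes the rest of the word.
        Dim rx As New Regex("\b" & Regex.Escape(prefix) & "\w*", RegexOptions.IgnoreCase)
        Dim found As New List(Of String)
        For Each m As Match In rx.Matches(text)
            found.Add(m.Value)
        Next
        Return found
    End Function

    Sub Main()
        For Each word As String In FindWordsStartingWith("The driver is driving an old chevy", "dri")
            Console.WriteLine(word)
        Next
    End Sub

End Module
```

Because the pattern starts with \b, a word such as "hundrive" would not be reported for the term "dri"; dropping the leading \b would match the substring anywhere in a word instead.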
 

Author Comment

by:Mr_Fulano
ID: 19629961
Hi Dhaest,

I really like your code in post [ID:19623047 Author:Dhaest Date:08.03.2007 at 03:03AM EDT]. It does exactly what I need.

I will select your answer, and in fact have increased the points to 500 for all the work you did to assist me. However, could you please explain a little of what the code is actually doing? Also, what is the "3" for? Is it because "dri" has 3 characters?

Also, thank you to all that contributed. I appreciate your help.

Thanks,
Fulano
 

Author Comment

by:Mr_Fulano
ID: 19630143
Hi Dhaest, I may have spoken too soon...I tested your code a bit more and found a problem, which I attribute to the way I explained what I needed to do. Your code is finding the substring - anywhere - in any of the words in my TXT document and returning the substring + whatever follows that.

I forgot to explain that I need it to only find words that - begin - with the substring.  So, as an example, if we use "par" as our substring, I need to find,
part
party
parent
parental ... and so on.

But it would need to skip words like:
subpart
apart
apartment
compartment ...etc.

Right now it's returning things like:
part              which is a subset of subpart
partment      which is a subset of apartment
partment      which is a subset of compartment...etc.

Also, the words in the TXT document are listed one word per line, one after another, in case that makes a difference. It's just a long list of terms.

Thanks,
Fulano
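
The requirement described above (keep "part" and "party", skip "subpart" and "apartment") can be sketched with String.StartsWith. This is a minimal illustration, not code from the thread: the module name, the helper WordsStartingWith and the case-insensitive comparison are all assumptions.

```vbnet
Imports System
Imports System.Collections.Generic

Module PrefixFilter

    ' Hypothetical helper: keeps only the words that begin with prefix.
    Function WordsStartingWith(ByVal words As IEnumerable(Of String), ByVal prefix As String) As List(Of String)
        Dim result As New List(Of String)
        For Each word As String In words
            ' StartsWith is True only when prefix sits at position 0 of the word.
            If word.StartsWith(prefix, StringComparison.OrdinalIgnoreCase) Then
                result.Add(word)
            End If
        Next
        Return result
    End Function

    Sub Main()
        Dim sample() As String = {"part", "subpart", "party", "apartment", "parent", "compartment"}
        For Each word As String In WordsStartingWith(sample, "par")
            Console.WriteLine(word)
        Next
    End Sub

End Module
```

With the sample list, only part, party and parent are printed, because the other words contain "par" somewhere after position 0.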
 

Author Comment

by:Mr_Fulano
ID: 19630186
Hi Dhaest, I figured out the problem...

I added a String variable called subString and replaced the "dri" with subString.

So, "While tempString.IndexOf(subString) > 0, should read "While tempString.IndexOf("dri") =  0"

That way it begins at index 0 of each word. Now it works, but you can check it and make sure I didn't goof up somehow.

I'd still like you to provide a brief synopsis of what the code is doing to help me learn. I don't understand all your steps and would like to learn more about what is actually happening in the code.

Thanks VERY much again,
Fulano

 

Author Comment

by:Mr_Fulano
ID: 19630193
oops...typo
What I really meant was that it should read like this...

So, "While tempString.IndexOf(subString) > 0"

should read

"While tempString.IndexOf(subString) =  0"   << works on words that - begin -  with substring.
: )
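
The difference between "> 0" and "= 0" comes down to the three kinds of value String.IndexOf can return: -1 when the substring is absent, 0 when the string begins with it, and a positive number when it occurs later in the string. A small sketch (the module and the helper Describe are hypothetical names used only here):

```vbnet
Imports System

Module IndexOfOutcomes

    ' Hypothetical helper: describes where subString sits inside word.
    Function Describe(ByVal word As String, ByVal subString As String) As String
        Dim position As Integer = word.IndexOf(subString)
        If position = -1 Then
            Return "not found"
        ElseIf position = 0 Then
            Return "at the start"
        Else
            Return "inside, at index " & position.ToString()
        End If
    End Function

    Sub Main()
        Console.WriteLine(Describe("part", "par"))
        Console.WriteLine(Describe("apartment", "par"))
        Console.WriteLine(Describe("drive", "par"))
    End Sub

End Module
```

So a test of "> 0" selects only words such as "apartment", a test of "= 0" selects only words such as "part", and ">= 0" would select both.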
 
 
LVL 53

Expert Comment

by:Dhaest
ID: 19630213
Hi, I'm glad you figured out that last piece (= 0) by yourself.
Glad I could help you with the greatest part of the solution.

>> Is it because "dri" has 3 characters
If I add 3 to my substring, those 3 characters are gone, so the IndexOf won't find it anymore and will go to the next word in the same sentence
 

Author Comment

by:Mr_Fulano
ID: 19630233
>>If I add 3 to my substring, those 3 characters are gone, so the IndexOf won't find it anymore and will go to the next word in the same sentence.<<

OK, so if my subpart is 5 characters long, I need to change the 3 to a 5. Right?

Could you also briefly explain the code a little?

Thanks,
Fulano
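
One way to avoid editing the number by hand is to use the Length of the search string instead of a literal 3. The sketch below assumes a hypothetical helper, RemainderAfterMatch, that performs the same "skip past the match" step as the code in the thread.

```vbnet
Imports System

Module SkipPastMatch

    ' Hypothetical helper: returns what is left of text after the first
    ' occurrence of subString, or text unchanged when there is no match.
    Function RemainderAfterMatch(ByVal text As String, ByVal subString As String) As String
        Dim position As Integer = text.IndexOf(subString)
        If position < 0 Then
            Return text
        End If
        ' subString.Length replaces the literal 3, so a 5-character search
        ' term skips 5 characters without any change to the code.
        Return text.Substring(position + subString.Length)
    End Function

    Sub Main()
        Console.WriteLine(RemainderAfterMatch("driver is driving", "dri"))
        Console.WriteLine(RemainderAfterMatch("party time", "party"))
    End Sub

End Module
```

For "driver is driving" and "dri" the remainder is "ver is driving", which still contains the second "dri" for the next pass of a loop.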
 
LVL 53

Accepted Solution

by:
Dhaest earned 2000 total points
ID: 19630438
        Dim oRead As System.IO.StreamReader
        Dim LineIn As String
        ' Open the text file
        oRead = IO.File.OpenText("C:\temp\subparts.txt")
        ' While there are lines left in the file
        While oRead.Peek <> -1
            ' Read one line
            LineIn = oRead.ReadLine()
            Dim tempString As String
            tempString = LineIn
            ' While the remaining text begins with "dri" (IndexOf = 0)
            While tempString.IndexOf("dri") = 0
                ' If there is a space after "dri", the word ends at that space
                If tempString.IndexOf(" ", tempString.IndexOf("dri")) > 0 Then
                    ' Show the word: from "dri" up to (not including) the space
                    Console.WriteLine(tempString.Substring(tempString.IndexOf("dri"), tempString.IndexOf(" ", tempString.IndexOf("dri")) - tempString.IndexOf("dri")))
                    ' Skip past "dri" (3 = its length) so it is not found again
                    tempString = tempString.Substring(tempString.IndexOf("dri") + 3)
                Else
                    ' No space follows: the word runs to the end of the line
                    Console.WriteLine(tempString.Substring(tempString.IndexOf("dri")))
                    tempString = tempString.Substring(tempString.IndexOf("dri") + 3)
                End If
            End While
        End While

        oRead.Close()
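
For a file with one word per line, the same idea can be sketched more compactly. This is an alternative under stated assumptions, not the accepted code: the module name and the helper ReadWordsStartingWith are hypothetical, the comparison is case-sensitive, and a Using block is swapped in for the explicit Close so the file is closed even if an exception occurs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module PrefixFileSearch

    ' Hypothetical helper: reads one word per line from reader and returns
    ' the words that begin with prefix.
    Function ReadWordsStartingWith(ByVal reader As TextReader, ByVal prefix As String) As List(Of String)
        Dim matches As New List(Of String)
        Dim line As String = reader.ReadLine()
        ' ReadLine returns Nothing once the end of the input is reached.
        While line IsNot Nothing
            If line.StartsWith(prefix, StringComparison.Ordinal) Then
                matches.Add(line)
            End If
            line = reader.ReadLine()
        End While
        Return matches
    End Function

    Sub Main()
        ' The path is the one used in the thread; adjust it as needed.
        Using reader As StreamReader = File.OpenText("C:\temp\subparts.txt")
            For Each word As String In ReadWordsStartingWith(reader, "dri")
                Console.WriteLine(word)
            Next
        End Using
    End Sub

End Module
```

Taking a TextReader rather than a path means the helper can also be tried against a StringReader without touching the disk.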
 

Author Comment

by:Mr_Fulano
ID: 19631709
Thanks Dhaest. Greatly appreciate the help!

FDT