Solved

collect number of times unique lines occur in a text file

Posted on 2014-11-14
Last Modified: 2014-11-17
Regarding the VB6 code that I accepted in an earlier question:

Private Sub Command1_Click()
    Dim f As Integer
    Dim g As Integer
    Dim strLine As String
    
    f = FreeFile
    Open "C:\Users\Alpesh\Desktop\111114.txt" For Input As #f
        g = FreeFile
        Open "c:\Users\Alpesh\Desktop\052214.txt" For Append As #g
            Do Until EOF(f)
                Line Input #f, strLine
                If InStr(strLine, "BlockedIP") > 0 Then
                    Print #g, strLine
                End If
            Loop
        Close #g
    Close #f
End Sub

Is it possible to count how many times each unique line occurs, instead of writing every matching line to the output? For example, rather than pasting all 502 "BlockedIP" lines, I would like to see a count for each one, as shown below: 502 for the first, 4 for the second, and so on.


502 - 00:01:05 192.168.1.100 [DNSRedir] BlockedIP response sent, keyword blogspot.com: adsense.blogspot.com. -> 192.168.1.100
4 - 00:01:33 192.168.1.102 [DNSRedir] BlockedIP response sent, keyword adnxs.com: ib.adnxs.com. -> 192.168.1.100
052214.txt
Question by:al4629740
 
LVL 46

Expert Comment

by:Martin Liss
ID: 40443321
Using the file you attached and the code below, I get the following, which does not match the counts you said there should be. The code assumes that a record is unique based on the text after "keyword". Is that correct?

7- adnxs.com: ib.adnxs.com. -> 192.168.1.100
1207- blogspot.com: adsense.blogspot.com. -> 192.168.1.100

Private Sub Command1_Click()
Dim f As Integer
Dim strLine As String
Dim lngLines As Long
Dim arrKeys() As String
Dim bFound As Boolean
Dim bFirst As Boolean
Dim intCount As Integer
Dim strParts() As String

bFirst = True
f = FreeFile

Open "C:\temp\052214.txt" For Input As #f
' Row 0 holds the count, row 1 holds the text after "keyword"
ReDim arrKeys(1, 0)
Do Until EOF(f)
    Line Input #f, strLine
    bFound = False
    ' Require "keyword" as well, so that strParts(1) always exists
    If InStr(strLine, "BlockedIP") > 0 And InStr(strLine, "keyword") > 0 Then
        strParts = Split(strLine, "keyword")
        For lngLines = 0 To intCount - 1
            If arrKeys(1, lngLines) = strParts(1) Then
                ' Increment the matching entry (lngLines), not the last one added
                arrKeys(0, lngLines) = CStr(CLng(arrKeys(0, lngLines)) + 1)
                bFound = True
                Exit For
            End If
        Next
        If Not bFound Then
            ' Only the last dimension can grow with ReDim Preserve
            If Not bFirst Then
                ReDim Preserve arrKeys(1, intCount)
            End If
            arrKeys(1, intCount) = strParts(1)
            arrKeys(0, intCount) = "1"
            bFirst = False
            intCount = intCount + 1
        End If
    End If
Loop
Close #f
' Loop over the stored keys; UBound(arrKeys) without a dimension
' argument returns the first dimension's bound (1), not the key count
For lngLines = 0 To intCount - 1
    Debug.Print arrKeys(0, lngLines) & "-" & arrKeys(1, lngLines)
Next
MsgBox "done"
End Sub
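
As an alternative to the array search, the same tally can be sketched with a Scripting.Dictionary, which looks up each key directly instead of looping over every stored entry. This is only a sketch under assumptions: the procedure name CountBlockedKeys and its strPath parameter are hypothetical, and the key is taken to be everything after the first "keyword" on the line.

```vb
' Hypothetical helper, not part of the attached project.
Private Sub CountBlockedKeys(ByVal strPath As String)
    Dim f As Integer
    Dim strLine As String
    Dim strKey As String
    Dim lngPos As Long
    Dim dictCounts As Object
    Dim varKey As Variant

    ' Late binding: no project reference to Microsoft Scripting Runtime needed
    Set dictCounts = CreateObject("Scripting.Dictionary")

    f = FreeFile
    Open strPath For Input As #f
    Do Until EOF(f)
        Line Input #f, strLine
        lngPos = InStr(strLine, "keyword")
        If InStr(strLine, "BlockedIP") > 0 And lngPos > 0 Then
            ' Assumed uniqueness rule: the text after the first "keyword"
            strKey = Mid$(strLine, lngPos + Len("keyword"))
            If dictCounts.Exists(strKey) Then
                dictCounts.Item(strKey) = dictCounts.Item(strKey) + 1
            Else
                dictCounts.Add strKey, 1
            End If
        End If
    Loop
    Close #f

    ' Print one "count-key" line per distinct key
    For Each varKey In dictCounts.Keys
        Debug.Print dictCounts.Item(varKey) & "-" & varKey
    Next
End Sub
```

It could be called from the button handler as CountBlockedKeys "C:\temp\052214.txt". The Dictionary compares keys case-sensitively by default, which matches the = comparison in the array version under the default Option Compare Binary.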

 

Author Comment

by:al4629740
ID: 40443409
It looks like I have the wrong numbers. Let me test it and get back to you.
 
LVL 46

Expert Comment

by:Martin Liss
ID: 40444595
Any results from your testing? Did you see the message I sent you?

Author Comment

by:al4629740
ID: 40445210
Martin,

Where is the output file?  It executes but I can't see where the results are.
 
LVL 46

Expert Comment

by:Martin Liss
ID: 40445214
They are in the Debug (Immediate) window, which you can open by going to the VBE and pressing Ctrl+G.
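
If a results file is preferred over the Immediate window, the print loop could write to disk instead. The following is a minimal sketch, assuming the arrKeys layout used in the code in this thread (row 0 = count, row 1 = key text) and that intCount holds the number of keys stored. The name WriteCounts and the strOutPath parameter are hypothetical.

```vb
' Hypothetical helper: write "count-key" lines to a text file
' instead of sending them to Debug.Print.
Private Sub WriteCounts(arrKeys() As String, ByVal intCount As Integer, _
                        ByVal strOutPath As String)
    Dim g As Integer
    Dim i As Long

    g = FreeFile
    ' For Output overwrites an existing file; use For Append to keep old results
    Open strOutPath For Output As #g
    For i = 0 To intCount - 1
        Print #g, arrKeys(0, i) & "-" & arrKeys(1, i)
    Next
    Close #g
End Sub
```

It would be called after the read loop finishes, for example WriteCounts arrKeys, intCount, "C:\temp\counts.txt", where the output path is only an illustration.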
 

Author Comment

by:al4629740
ID: 40445223
Not every blocked site shows up.
 
LVL 46

Expert Comment

by:Martin Liss
ID: 40445897
There are 1992 lines in the file you posted that contain "BlockedIP". My results in the Immediate window (which you may have to scroll to see in full) show this:
7- adnxs.com: ib.adnxs.com. -> 192.168.1.100
1207- blogspot.com: adsense.blogspot.com. -> 192.168.1.100
20- visualwebsiteoptimizer.com: dev.visualwebsiteoptimizer.com. -> 192.168.1.100
41- ^.*s(3|e)x: expertsexchange.112.2o7.net. -> 192.168.1.100
389- ^.*\.(asp|aspx|htm|html|jsp|php|xml)-: www.xml-sitemaps.com. -> 192.168.1.100
1- adsrvr.org: match.adsrvr.org. -> 192.168.1.100
1- tube: rtd.tubemogul.com. -> 192.168.1.100
1- criteo.com: dis.criteo.com. -> 192.168.1.100
1- w55c.net: geo-lb02.w55c.net. -> 192.168.1.100
1- tube: rtb.tubemogul.com. -> 192.168.1.100
1- w55c.net: i.w55c.net. -> 192.168.1.100
7- xnxx: www.xnxx.com. -> 192.168.1.100
1- criteo.com: rtax.criteo.com. -> 192.168.1.100
4- twitt: twitter.github.io. -> 192.168.1.100
1- dailymotion.com: www.dailymotion.com. -> 192.168.1.100
1- ^(.*\.)?xvideos\.(com|net)$: www.xvideos.com. -> 192.168.1.100
1- pinterest.com: assets.pinterest.com. -> 192.168.1.100
2- taboo: cdn.taboola.com. -> 192.168.1.100
79- disqus.com: collegetimescom.disqus.com. -> 192.168.1.100
3- pinterest.com: www.pinterest.com. -> 192.168.1.100
1- tumblr.com: officegirls.tumblr.com. -> 192.168.1.100
1- tumblr.com: greekpowerlady.tumblr.com. -> 192.168.1.100
1- tumblr.com: www.tumblr.com. -> 192.168.1.100
3- tumblr.com: sandybrown121.tumblr.com. -> 192.168.1.100
3- eroti: eleganteroticdresses.tumblr.com. -> 192.168.1.100
1- tumblr.com: assets.tumblr.com. -> 192.168.1.100
1- tumblr.com: static.tumblr.com. -> 192.168.1.100
1- tumblr.com: 38.media.tumblr.com. -> 192.168.1.100
1- tumblr.com: 33.media.tumblr.com. -> 192.168.1.100
1- tumblr.com: 40.media.tumblr.com. -> 192.168.1.100
1- tumblr.com: 41.media.tumblr.com. -> 192.168.1.100
1- tumblr.com: 36.media.tumblr.com. -> 192.168.1.100
4- tumblr.com: secure.assets.tumblr.com. -> 192.168.1.100
3- lingerie: www.lingeriediva.com. -> 192.168.1.100
2- tumblr.com: platform.tumblr.com. -> 192.168.1.100
1- pinterest.com: passets-lt.pinterest.com. -> 192.168.1.100
3- lingerie: www.spicylingerie.com. -> 192.168.1.100
2- mature: elegantmatures.tumblr.com. -> 192.168.1.100
2- tumblr.com: classicwomen.tumblr.com. -> 192.168.1.100
2- tumblr.com: strictbeauties.tumblr.com. -> 192.168.1.100
1- adcash.com: www.adcash.com. -> 192.168.1.100
3- tumblr.com: api.tumblr.com. -> 192.168.1.100
1- mgid.com: jsc.mgid.com. -> 192.168.1.100
1- directrev.com: xch.directrev.com. -> 192.168.1.100
1- pinterest.com: widgets.pinterest.com. -> 192.168.1.100
2- addthisedge.com: m.addthisedge.com. -> 192.168.1.100
7- tumblr.com: 31.media.tumblr.com. -> 192.168.1.100
9- adsrvr.org: rtb.adsrvr.org. -> 192.168.1.100
7- blogspot.com: 1.bp.blogspot.com. -> 192.168.1.100
1- tumblr.com: heavenlycheesecake.tumblr.com. -> 192.168.1.100
155- tumblr.com: elegantsexy.tumblr.com. -> 192.168.1.100
The sum of the counts displayed for each address is 1992.
 

Author Comment

by:al4629740
ID: 40446617
I ran the code twice, and the Immediate window showed this:

7- adnxs.com: ib.adnxs.com. -> 192.168.1.100
1207- blogspot.com: adsense.blogspot.com. -> 192.168.1.100
7- adnxs.com: ib.adnxs.com. -> 192.168.1.100
1207- blogspot.com: adsense.blogspot.com. -> 192.168.1.100

This is the code:

Dim f As Integer
Dim strLine As String
Dim lngLines As Long
Dim arrKeys() As String
Dim bFound As Boolean
Dim bFirst As Boolean
Dim intCount As Integer
Dim strParts() As String

bFirst = True
f = FreeFile

Open "C:\Users\Me\Desktop\111114.txt" For Input As #f
ReDim arrKeys(1, 0)
Do Until EOF(f)
    Line Input #f, strLine
    bFound = False
    If InStr(strLine, "BlockedIP") > 0 Then
        strParts = Split(strLine, "keyword")
        For lngLines = 0 To intCount - 1
            If arrKeys(1, lngLines) = strParts(1) Then
                arrKeys(0, intCount - 1) = arrKeys(0, intCount - 1) + 1
                bFound = True
                Exit For
            End If
        Next
        If Not bFound Then
            If Not bFirst Then
                ReDim Preserve arrKeys(1, intCount)
            End If
            arrKeys(1, intCount) = strParts(1)
            arrKeys(0, intCount) = 1
            bFirst = False
            intCount = intCount + 1
        End If
    End If
Loop
Close
For lngLines = 0 To UBound(arrKeys)
    Debug.Print arrKeys(0, lngLines) & "-" & arrKeys(1, lngLines)
Next
MsgBox "done"

 

Author Comment

by:al4629740
ID: 40446618
The Immediate window output is also short. There is nowhere to scroll to.
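
One possible explanation for the two-line output, offered as an assumption because the thread does not confirm it: the print loop in the code above runs For lngLines = 0 To UBound(arrKeys), and UBound without a dimension argument returns the upper bound of the first dimension. For an array declared as arrKeys(1, n), that bound is always 1, so only entries 0 and 1 are printed. The sketch below uses a hypothetical array named arrDemo to show the difference.

```vb
' Minimal sketch: UBound on a two-dimensional array.
Private Sub ShowBounds()
    Dim arrDemo() As String
    ReDim arrDemo(1, 50)             ' 2 rows (0 to 1), 51 columns (0 to 50)

    Debug.Print UBound(arrDemo)      ' 1: the dimension defaults to the first
    Debug.Print UBound(arrDemo, 1)   ' 1
    Debug.Print UBound(arrDemo, 2)   ' 50
End Sub
```

If this is the cause, looping to UBound(arrKeys, 2), or to intCount - 1, would print every stored key.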
 
LVL 46

Accepted Solution

by:
Martin Liss earned 500 total points
ID: 40446671
The only thing I can imagine is that you aren't using the file that you posted in your original question. In case something has happened to the original on your PC, why don't you download it again from your post? I'm attaching my whole project. Change the Open "C:\temp\052214.txt" For Input As #f statement in Command1_Click to match your file path, run it, and tell me what happens. I'm mystified because, as I've shown, I get different output than you.
Project1.zip
 

Author Comment

by:al4629740
ID: 40448978
I must have copied your code incorrectly.  

Thank you Martin
 
LVL 46

Expert Comment

by:Martin Liss
ID: 40449102
If you need any tweaks let me know. In any case you're welcome and I'm glad I was able to help.

In my profile you'll find links to some articles I've written that may interest you.
Marty - MVP 2009 to 2014
