NeallyNeal

vb.net count duplicate lines then sort from highest count to lowest

Hello, I really need the following help in VB.Net. I am using VS 2008

1.) Read file c:\List.txt, which has a list of numbers per line.
2.) Need to count how many times a duplicate line item appears.
3.) Need to output each line item with its count to a file called c:\Amount.txt, sorted from highest count to lowest count.

example of c:\list.txt -
80
92
78
64
92
59
64

example of c:\Amount.txt (sorted by count, highest to lowest) -
78 = 1
80 = 1
59 = 1
92 = 2
64 = 2

Could you please provide me with a complete coded solution for this problem in VB.Net?

Thanks!
kaufmed

Using LINQ, you could do:

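Module Module1

  Sub Main()
    ' Group identical lines, count each group, and sort by count, highest first.
    ' Note: this multi-line query relies on implicit line continuation
    ' (Visual Basic 2010 and later); earlier compilers need " _" at the end
    ' of each continued line.
    Dim query = From line In System.IO.File.ReadAllLines("input.txt")
                Group By line Into Group
                Let c = Group.Count()
                Order By c Descending
                Select New With {.Key = line, .Count = c}

    ' Print each distinct line with its count.
    For Each item In query
      Console.WriteLine("{0} = {1}", item.Key, item.Count)
    Next

    Console.ReadKey()

  End Sub

End Module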


SOLUTION
nepaluz

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.

NeallyNeal

ASKER

Hello and thank you for your speedy replies. The code above is producing the errors listed below. Please keep in mind that I am using Visual Studio 2008. Would the dictionary method be better? If so, could you please supply a complete solution using that method, or help me get your supplied code working? Thanks!

Error      1      'Group' is a type and cannot be used as an expression.      
Error      2      Method arguments must be enclosed in parentheses.      
Error      3      Name 'By' is not declared.      
Error      4      Comma, ')', or a valid expression continuation expected.      
Error      5      Name 'c' is not declared.      
Error      6      'Count' is not a member of 'System.Text.RegularExpressions.Group'.      
Error      7      Name 'Order' is not declared.      
Error      8      Method arguments must be enclosed in parentheses.      
Error      9      Name 'By' is not declared.      
Error      10      Comma, ')', or a valid expression continuation expected.      
Error      11      'Select Case' must end with a matching 'End Select'.      
Error      12      'Line' statements are no longer supported. File I/O functionality is available as 'Microsoft.VisualBasic.FileSystem.LineInput' and the graphics functionality is available as 'System.Drawing.Graphics.DrawLine'.      
Error      13      Name 'c' is not declared.      
Error      14      Statements and labels are not valid between 'Select Case' and first 'Case'.      
Error      15      Statements and labels are not valid between 'Select Case' and first 'Case'.      
Error      16      Statements and labels are not valid between 'Select Case' and first 'Case'.      
Error      17      'Key' is not a member of 'String'.      
Error      18      Statements and labels are not valid between 'Select Case' and first 'Case'.

I should have mentioned this earlier, but you will need the following Imports statement:

Imports System.Linq
Implemented import statement and nothing changed.
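
For anyone hitting the same wall: the errors listed above are what you would expect when each line of the query is compiled as a separate statement. Visual Basic 2008 (VB 9) has no implicit line continuation; that arrived with Visual Basic 2010. A minimal sketch of the same query with explicit " _" continuations follows. It assumes the project targets .NET Framework 3.5 with Option Infer On. The range variable is renamed to textLine purely for illustration, the paths are the ones from the question, and blank lines are skipped as an added assumption.

```vbnet
Imports System.IO
Imports System.Linq

Module Module1

  Sub Main()
    ' The trailing " _" keeps the whole query as one statement in VB 9.
    Dim query = From textLine In File.ReadAllLines("C:\List.txt") _
                Where textLine.Trim() <> "" _
                Group By textLine Into Group _
                Let c = Group.Count() _
                Order By c Descending _
                Select New With {.Key = textLine, .Count = c}

    ' Write "value = count" pairs, highest count first.
    Using writer As New StreamWriter("C:\Amount.txt")
      For Each item In query
        writer.WriteLine("{0} = {1}", item.Key, item.Count)
      Next
    End Using
  End Sub

End Module
```

Error 6 also suggests that System.Text.RegularExpressions is imported in that file, which is why Group was resolved as a type once the query was split into separate statements.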

Here's a non-LINQ (dictionary-centric) code suggestion:
Dim TheDict As New SortedDictionary(Of Integer, Integer)
Using Reader As New System.IO.StreamReader("C:\List.txt")
    While Not Reader.EndOfStream
        Dim Line = Reader.ReadLine()
        If TheDict.ContainsKey(CInt(Line.Trim)) Then
            TheDict.Item(CInt(Line)) = +1
        Else
            TheDict.Add(CInt(Line), 1)
        End If
    End While
End Using
Using Writer As New System.IO.StreamWriter("C:\Amount.txt")
    For Each item In query
        Writer.WriteLine(item.Key & " = " & item.Count & vbCrLf)
    Next
End Using


Error      1      Name 'query' is not declared.
Hmmm!
        Dim TheDict As New SortedDictionary(Of Integer, Integer)
        Using Reader As New System.IO.StreamReader("C:\List.txt")
            While Not Reader.EndOfStream
                Dim Line = Reader.ReadLine()
                If TheDict.ContainsKey(CInt(Line.Trim)) Then
                    TheDict.Item(CInt(Line)) = +1
                Else
                    TheDict.Add(CInt(Line), 1)
                End If
            End While
        End Using
        Using Writer As New System.IO.StreamWriter("C:\Amount.txt")
            For Each x In TheDict.Keys
                Writer.WriteLine(x & " = " & TheDict.Item(x) & vbCrLf)
            Next
        End Using


The following results are in the Amount.txt file:
59 = 1

64 = 1

78 = 1

80 = 1

92 = 1

The results that I'm looking for should be:
78 = 1
80 = 1
59 = 1
92 = 2
64 = 2

Thanks.
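
Three things in the dictionary code above appear to explain that output. "= +1" assigns the value 1 instead of incrementing it, so every count stays at 1. A SortedDictionary orders its entries by key, not by count. And WriteLine already ends the line, so appending vbCrLf produces the blank lines. The following is a rough sketch of one way to address all three, not the hidden accepted answer. It assumes VB 2008 with single-line lambdas, uses the paths from the question, and the module and variable names are made up for illustration.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module CountDuplicates

    Sub Main()
        ' Count occurrences of each non-empty line.
        Dim counts As New Dictionary(Of String, Integer)
        Using reader As New StreamReader("C:\List.txt")
            While Not reader.EndOfStream
                Dim key As String = reader.ReadLine().Trim()
                If key.Length = 0 Then Continue While
                If counts.ContainsKey(key) Then
                    counts(key) += 1   ' increment, not "= +1"
                Else
                    counts.Add(key, 1)
                End If
            End While
        End Using

        ' Copy the pairs into a list and sort by count, highest first.
        Dim sorted As New List(Of KeyValuePair(Of String, Integer))(counts)
        sorted.Sort(Function(a, b) b.Value.CompareTo(a.Value))

        ' WriteLine adds the line break itself, so no vbCrLf is needed.
        Using writer As New StreamWriter("C:\Amount.txt")
            For Each pair As KeyValuePair(Of String, Integer) In sorted
                writer.WriteLine("{0} = {1}", pair.Key, pair.Value)
            Next
        End Using
    End Sub

End Module
```

List.Sort is not a stable sort, so lines with equal counts may come out in any order. Note also that the sample Amount.txt in the question lists the counts of 1 before the counts of 2; to reproduce that ascending order instead, compare a.Value to b.Value.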

ASKER CERTIFIED SOLUTION
This solution is only available to members.

SOLUTION
This solution is only available to members.

kaufmed and nepaluz, both revised code samples are good, but I also need the results sorted from highest count to lowest count, as originally requested.

Thank you so much...

SOLUTION
This solution is only available to members.

Everyone, thank you for your help. I decided to split and award the points based on the following:

1.) kaufmed - The most recent solutions sorted and counted my data as requested.
2.) nepaluz - The solution showed me how to write the results from kaufmed's solution to the file called Amount.txt, as I also requested.

Once again, thank each of you so much for your help on this one.
Thanks!