Link to home
Start Free TrialLog in
Avatar of with_out_user
with_out_user

asked on

TOP 2 BY vbSCript

Can you help me to get script to provide for me top 2 or Top 2

so if i have CSV file with the following and i want top 2


aaa   10
bbb   20
ccc   55
ddd  66
eee  55

So result will be for Top 2

ddd  66
ccc   55
eee  55

and for top - 2 will be

aaa   10
bbb   20

other example if ihave the following

aaa
bbb
aaa
aaa
aaa
ccc
bbb
aaa
fff
ddd
bbb
bbb


so the top 2 will be as following

aaa 5 Times
bbb 4 Times

Thank you






Avatar of Bill Prew
Bill Prew

Okay, this would likely be 2 different scripts, since the logic is different, is that what you have in mind?

In the first case, can there ever be duplicate of the first column, or will those always only exist once?

~bp
Avatar of with_out_user

ASKER

in both cas can be duplicate
Here is an example of each of the three ideas you had...
Use the sample csv files as for testing the script.

I used double quotes and commas in the CSV file along with column names.

'On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

set objFSO = CreateObject("Scripting.FileSystemObject") 
Set WSHshell = CreateObject("WScript.Shell") 
strPathtoTextFile = WSHShell.SpecialFolders("Desktop")
'strPathtoTextFile = "C:\csv-files\"
strFileName1 = "sample1.csv"
strFileName2 = "sample2.csv"

'Example 1
'so if i have CSV file with the following and i want top 2
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"ccc","55"
'"ddd","66"
'"eee","55"
'So result will be for Top 2
'ddd 66
'ccc 55
'eee 55
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT TOP 2 * FROM " & strFileName1 & " ORDER BY Col2 DESC", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for Top 2"
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")
    objRecordset.MoveNext
Loop
objConnection.Close


'Example 2
'and for top - 2 will be
'aaa 10
'bbb 20
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT DISTINCT TOP 2 * FROM " & strFileName1 & " ORDER BY Col1 ASC", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for top 2"
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")
    objRecordset.MoveNext
Loop
objConnection.Close

'Example 3
'other example if ihave the following
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"aaa","10"
'"aaa","10"
'"aaa","10"
'"ccc","55"
'"bbb","20"
'"aaa","10"
'"fff","10"
'"ddd","66"
'"bbb","20"
'"bbb","20"
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT TOP 2 COUNT(*) AS Col2, Col1 FROM " & strFileName2 & " GROUP BY Col1", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for top 2"
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2") & " Times"
    objRecordset.MoveNext
Loop
objConnection.Close

Open in new window

sample2.csv
sample1.csv
Thank you but i get attach error

i Have question, is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0

 User generated image
Ah - the error is because the file "sample1.csv" is not on the desktop.  Put the file there and everything should work just fine.  The jet database stuff should be on your machine already - no need to do anything more.  The error is caused because it cannot find the file.
If you want, comment out line 11 and remove the remove the apostrophe from line 12 making sure the path/string points to the folder containing the csv file.
Well, I was going to play around with doing this with dictionaries and itterating through them to find highest values and counts and such, but I have to say, rejoinder's approach seems like a great way to approach this problem to me.  Well done.

~bp
Thank you

but i get same error

This is really odd... I tested this script on two machines and everything worked.  All three files are on the desktop, the vbs and the two csv sample files.

Bill, if you are tracking this ticket, can you try the script out?
I am tracking, and I will test out.  I did test it out placing the CSV files in a local folder and uncommenting:

'strPathtoTextFile = "C:\csv-files\"

and that worked fine for me (Windows 7), but I will try it with the files on the Desktop and see how that goes.

~bp
Okay, sorry for the delay.  I ran the test on two systems here, a Windows 7 system, and an older XP system.  Both test worked fine.  I did add a display of the "strPathtoTextFile" variable right after it is set just to make sure it was getting the proper value, which it is.

here's the output from the Windows 7 test, with the data files (CSV) on the Desktop, and the VBS script in a Temp folder:

[c:\temp\EE26647091]cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Users\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

[c:\temp\EE26647091]

Open in new window

Then I ran another test on a Windows XP machine, and placed the CSV abd VBS files all on the Desktop.  This worked fine as well per the output below:

C:\Documents and Settings\username\Desktop>cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.7
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Documents and Settings\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

C:\Documents and Settings\username\Desktop>

Open in new window

At this point it might be worth you posting up the exact script you are running perhaps a small change is needed.

~bp
Too funny - I just tested it on an XP machine and it worked there as well.

Back to you with_out... let us know how you make out.
Hi All,

Thank you for reply, i get other error now when i run script in win 7.

is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0



Thank you
Error.jpg
How large will these input files be?

~bp
the above error was from the sample1.csv  & sample2.csv
==> How large will these input files be?

I meant when you run the script against real data.  Depending on how large the data to be sorted / grouped, that could affect the approach taken.

~bp
íts large data

the number of line in the file about 1000000 line
That feels like it would be unpractical to pull into a VBS script in memory in the form of either arrays or dictionaries, and then try and sort and count iteratively.  I don't think performance would be satisfactory with a script approach like that.

~bp
even it slow but i need the script importantly if it's possible
If you want a better solution, you could use MS Access, pull the data in then do any kind of query you want.  One step above that would be to install SQL Express (http://www.microsoft.com/express/Database/) and have all the SQL tools you would need to tear through millions of records.
ASKER CERTIFIED SOLUTION
Avatar of rejoinder
rejoinder
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Very Thanx MR. rejoinder

It's work well

Thank you
==> rejoinder

Thanks for the interesting Recordset usage in VBS, nicely done, and very informative for me.

~bp