with_out_user
asked on
TOP 2 BY vbSCript
Can you help me to get script to provide for me top 2 or Top 2
so if i have CSV file with the following and i want top 2
aaa 10
bbb 20
ccc 55
ddd 66
eee 55
So result will be for Top 2
ddd 66
ccc 55
eee 55
and for top - 2 will be
aaa 10
bbb 20
other example if ihave the following
aaa
bbb
aaa
aaa
aaa
ccc
bbb
aaa
fff
ddd
bbb
bbb
so the top 2 will be as following
aaa 5 Times
bbb 4 Times
Thank you
so if i have CSV file with the following and i want top 2
aaa 10
bbb 20
ccc 55
ddd 66
eee 55
So result will be for Top 2
ddd 66
ccc 55
eee 55
and for top - 2 will be
aaa 10
bbb 20
other example if ihave the following
aaa
bbb
aaa
aaa
aaa
ccc
bbb
aaa
fff
ddd
bbb
bbb
so the top 2 will be as following
aaa 5 Times
bbb 4 Times
Thank you
ASKER
in both cas can be duplicate
Here is an example of each of the three ideas you had...
Use the sample csv files as for testing the script.
I used double quotes and commas in the CSV file along with column names.
sample1.csv
Use the sample csv files as for testing the script.
I used double quotes and commas in the CSV file along with column names.
'On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")
set objFSO = CreateObject("Scripting.FileSystemObject")
Set WSHshell = CreateObject("WScript.Shell")
strPathtoTextFile = WSHShell.SpecialFolders("Desktop")
'strPathtoTextFile = "C:\csv-files\"
strFileName1 = "sample1.csv"
strFileName2 = "sample2.csv"
'Example 1
'so if i have CSV file with the following and i want top 2
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"ccc","55"
'"ddd","66"
'"eee","55"
'So result will be for Top 2
'ddd 66
'ccc 55
'eee 55
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT TOP 2 * FROM " & strFileName1 & " ORDER BY Col2 DESC", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for Top 2"
Do Until objRecordset.EOF
Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")
objRecordset.MoveNext
Loop
objConnection.Close
'Example 2
'and for top - 2 will be
'aaa 10
'bbb 20
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT DISTINCT TOP 2 * FROM " & strFileName1 & " ORDER BY Col1 ASC", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for top 2"
Do Until objRecordset.EOF
Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")
objRecordset.MoveNext
Loop
objConnection.Close
'Example 3
'other example if ihave the following
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"aaa","10"
'"aaa","10"
'"aaa","10"
'"ccc","55"
'"bbb","20"
'"aaa","10"
'"fff","10"
'"ddd","66"
'"bbb","20"
'"bbb","20"
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT TOP 2 COUNT(*) AS Col2, Col1 FROM " & strFileName2 & " GROUP BY Col1", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for top 2"
Do Until objRecordset.EOF
Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2") & " Times"
objRecordset.MoveNext
Loop
objConnection.Close
sample2.csvsample1.csv
ASKER
Ah - the error is because the file "sample1.csv" is not on the desktop. Put the file there and everything should work just fine. The jet database stuff should be on your machine already - no need to do anything more. The error is caused because it cannot find the file.
If you want, comment out line 11 and remove the remove the apostrophe from line 12 making sure the path/string points to the folder containing the csv file.
If you want, comment out line 11 and remove the remove the apostrophe from line 12 making sure the path/string points to the folder containing the csv file.
Well, I was going to play around with doing this with dictionaries and itterating through them to find highest values and counts and such, but I have to say, rejoinder's approach seems like a great way to approach this problem to me. Well done.
~bp
~bp
ASKER
Thank you
but i get same error
but i get same error
This is really odd... I tested this script on two machines and everything worked. All three files are on the desktop, the vbs and the two csv sample files.
Bill, if you are tracking this ticket, can you try the script out?
Bill, if you are tracking this ticket, can you try the script out?
I am tracking, and I will test out. I did test it out placing the CSV files in a local folder and uncommenting:
'strPathtoTextFile = "C:\csv-files\"
and that worked fine for me (Windows 7), but I will try it with the files on the Desktop and see how that goes.
~bp
'strPathtoTextFile = "C:\csv-files\"
and that worked fine for me (Windows 7), but I will try it with the files on the Desktop and see how that goes.
~bp
Okay, sorry for the delay. I ran the test on two systems here, a Windows 7 system, and an older XP system. Both test worked fine. I did add a display of the "strPathtoTextFile" variable right after it is set just to make sure it was getting the proper value, which it is.
here's the output from the Windows 7 test, with the data files (CSV) on the Desktop, and the VBS script in a Temp folder:
~bp
here's the output from the Windows 7 test, with the data files (CSV) on the Desktop, and the VBS script in a Temp folder:
[c:\temp\EE26647091]cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.
C:\Users\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times
[c:\temp\EE26647091]
Then I ran another test on a Windows XP machine, and placed the CSV abd VBS files all on the Desktop. This worked fine as well per the output below:C:\Documents and Settings\username\Desktop>cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.7
Copyright (C) Microsoft Corporation. All rights reserved.
C:\Documents and Settings\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times
C:\Documents and Settings\username\Desktop>
At this point it might be worth you posting up the exact script you are running perhaps a small change is needed.~bp
Too funny - I just tested it on an XP machine and it worked there as well.
Back to you with_out... let us know how you make out.
Back to you with_out... let us know how you make out.
ASKER
Hi All,
Thank you for reply, i get other error now when i run script in win 7.
is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLE DB.4.0
Thank you
Error.jpg
Thank you for reply, i get other error now when i run script in win 7.
is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLE
Thank you
Error.jpg
How large will these input files be?
~bp
~bp
ASKER
the above error was from the sample1.csv & sample2.csv
==> How large will these input files be?
I meant when you run the script against real data. Depending on how large the data to be sorted / grouped, that could affect the approach taken.
~bp
I meant when you run the script against real data. Depending on how large the data to be sorted / grouped, that could affect the approach taken.
~bp
ASKER
íts large data
the number of line in the file about 1000000 line
the number of line in the file about 1000000 line
That feels like it would be unpractical to pull into a VBS script in memory in the form of either arrays or dictionaries, and then try and sort and count iteratively. I don't think performance would be satisfactory with a script approach like that.
~bp
~bp
ASKER
even it slow but i need the script importantly if it's possible
If you want a better solution, you could use MS Access, pull the data in then do any kind of query you want. One step above that would be to install SQL Express (http://www.microsoft.com/express/Database/) and have all the SQL tools you would need to tear through millions of records.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Very Thanx MR. rejoinder
It's work well
Thank you
It's work well
Thank you
==> rejoinder
Thanks for the interesting Recordset usage in VBS, nicely done, and very informative for me.
~bp
Thanks for the interesting Recordset usage in VBS, nicely done, and very informative for me.
~bp
In the first case, can there ever be duplicate of the first column, or will those always only exist once?
~bp