Solved

TOP 2 BY vbSCript

Posted on 2010-11-30
22
487 Views
Last Modified: 2012-05-10
Can you help me to get script to provide for me top 2 or Top 2

so if i have CSV file with the following and i want top 2


aaa   10
bbb   20
ccc   55
ddd  66
eee  55

So result will be for Top 2

ddd  66
ccc   55
eee  55

and for top - 2 will be

aaa   10
bbb   20

other example if ihave the following

aaa
bbb
aaa
aaa
aaa
ccc
bbb
aaa
fff
ddd
bbb
bbb


so the top 2 will be as following

aaa 5 Times
bbb 4 Times

Thank you






0
Comment
Question by:with_out_user
  • 8
  • 8
  • 6
22 Comments
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34241858
Okay, this would likely be 2 different scripts, since the logic is different, is that what you have in mind?

In the first case, can there ever be duplicate of the first column, or will those always only exist once?

~bp
0
 

Author Comment

by:with_out_user
ID: 34245524
in both cas can be duplicate
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34254073
Here is an example of each of the three ideas you had...
Use the sample csv files as for testing the script.

I used double quotes and commas in the CSV file along with column names.

'On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

set objFSO = CreateObject("Scripting.FileSystemObject") 
Set WSHshell = CreateObject("WScript.Shell") 
strPathtoTextFile = WSHShell.SpecialFolders("Desktop")
'strPathtoTextFile = "C:\csv-files\"
strFileName1 = "sample1.csv"
strFileName2 = "sample2.csv"

'Example 1
'so if i have CSV file with the following and i want top 2
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"ccc","55"
'"ddd","66"
'"eee","55"
'So result will be for Top 2
'ddd 66
'ccc 55
'eee 55
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT TOP 2 * FROM " & strFileName1 & " ORDER BY Col2 DESC", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for Top 2"
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")
    objRecordset.MoveNext
Loop
objConnection.Close


'Example 2
'and for top - 2 will be
'aaa 10
'bbb 20
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT DISTINCT TOP 2 * FROM " & strFileName1 & " ORDER BY Col1 ASC", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for top 2"
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")
    objRecordset.MoveNext
Loop
objConnection.Close

'Example 3
'other example if ihave the following
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"aaa","10"
'"aaa","10"
'"aaa","10"
'"ccc","55"
'"bbb","20"
'"aaa","10"
'"fff","10"
'"ddd","66"
'"bbb","20"
'"bbb","20"
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT TOP 2 COUNT(*) AS Col2, Col1 FROM " & strFileName2 & " GROUP BY Col1", objConnection, adOpenStatic, adLockOptimistic, adCmdText
wscript.echo "Result for top 2"
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2") & " Times"
    objRecordset.MoveNext
Loop
objConnection.Close

Open in new window

sample2.csv
sample1.csv
0
Networking for the Cloud Era

Join Microsoft and Riverbed for a discussion and demonstration of enhancements to SteelConnect:
-One-click orchestration and cloud connectivity in Azure environments
-Tight integration of SD-WAN and WAN optimization capabilities
-Scalability and resiliency equal to a data center

 

Author Comment

by:with_out_user
ID: 34255623
Thank you but i get attach error

i Have question, is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0

 Error
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34255931
Ah - the error is because the file "sample1.csv" is not on the desktop.  Put the file there and everything should work just fine.  The jet database stuff should be on your machine already - no need to do anything more.  The error is caused because it cannot find the file.
If you want, comment out line 11 and remove the remove the apostrophe from line 12 making sure the path/string points to the folder containing the csv file.
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34260504
Well, I was going to play around with doing this with dictionaries and itterating through them to find highest values and counts and such, but I have to say, rejoinder's approach seems like a great way to approach this problem to me.  Well done.

~bp
0
 

Author Comment

by:with_out_user
ID: 34264928
Thank you

but i get same error

0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34268028
This is really odd... I tested this script on two machines and everything worked.  All three files are on the desktop, the vbs and the two csv sample files.

Bill, if you are tracking this ticket, can you try the script out?
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34268059
I am tracking, and I will test out.  I did test it out placing the CSV files in a local folder and uncommenting:

'strPathtoTextFile = "C:\csv-files\"

and that worked fine for me (Windows 7), but I will try it with the files on the Desktop and see how that goes.

~bp
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34269488
Okay, sorry for the delay.  I ran the test on two systems here, a Windows 7 system, and an older XP system.  Both test worked fine.  I did add a display of the "strPathtoTextFile" variable right after it is set just to make sure it was getting the proper value, which it is.

here's the output from the Windows 7 test, with the data files (CSV) on the Desktop, and the VBS script in a Temp folder:

[c:\temp\EE26647091]cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Users\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

[c:\temp\EE26647091]

Open in new window

Then I ran another test on a Windows XP machine, and placed the CSV abd VBS files all on the Desktop.  This worked fine as well per the output below:

C:\Documents and Settings\username\Desktop>cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.7
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Documents and Settings\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

C:\Documents and Settings\username\Desktop>

Open in new window

At this point it might be worth you posting up the exact script you are running perhaps a small change is needed.

~bp
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34269552
Too funny - I just tested it on an XP machine and it worked there as well.

Back to you with_out... let us know how you make out.
0
 

Author Comment

by:with_out_user
ID: 34274597
Hi All,

Thank you for reply, i get other error now when i run script in win 7.

is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0



Thank you
Error.jpg
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34276004
How large will these input files be?

~bp
0
 

Author Comment

by:with_out_user
ID: 34276552
the above error was from the sample1.csv  & sample2.csv
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34277368
==> How large will these input files be?

I meant when you run the script against real data.  Depending on how large the data to be sorted / grouped, that could affect the approach taken.

~bp
0
 

Author Comment

by:with_out_user
ID: 34277497
íts large data

the number of line in the file about 1000000 line
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34278226
That feels like it would be unpractical to pull into a VBS script in memory in the form of either arrays or dictionaries, and then try and sort and count iteratively.  I don't think performance would be satisfactory with a script approach like that.

~bp
0
 

Author Comment

by:with_out_user
ID: 34278487
even it slow but i need the script importantly if it's possible
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34284216
If you want a better solution, you could use MS Access, pull the data in then do any kind of query you want.  One step above that would be to install SQL Express (http://www.microsoft.com/express/Database/) and have all the SQL tools you would need to tear through millions of records.
0
 
LVL 14

Accepted Solution

by:
rejoinder earned 500 total points
ID: 34285944
If the first script was James Bond, this one is Maxwell Smart.  Both will save the day, one just does it so much better.

This script uses the same two files from above - just save the script into the folder as the two sample csv files and run.

Let us know how you make out.

Const ForReading = 1
Const ForWriting =2

Const adVarChar = 200
Const MaxCharacters = 255
Const adFldIsNullable = 32
Const adInteger = 3

Const boolSkipFirstLine = True

Set DataList = CreateObject("ADOR.Recordset")
DataList.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable
DataList.Fields.Append "Column2", adInteger, , adFldIsNullable
DataList.Open

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("sample1.csv", ForReading)

if boolSkipFirstLine then
	objFile.SkipLine
end if

Do Until objFile.AtEndOfStream
	arrColumns = split(replace(objFile.Readline,Chr(34),""),",")
	DataList.AddNew
	DataList("Column1") = arrColumns(0)
	DataList("Column2") = arrColumns(1)
	DataList.Update
loop
objFile.Close

'Example 1
'so if i have CSV file with the following and i want top 2
'"Col1","Col2"
'"aaa","10"
'"bbb","20"
'"ccc","55"
'"ddd","66"
'"eee","55"
'So result will be for Top 2
'ddd 66
'ccc 55
'eee 55
wscript.echo "Result for Top 2"
DataList.Sort = "Column2 DESC, Column1 ASC"
DataList.MoveFirst
intCount = 0
strPreviousItem = ""
Do Until DataList.EOF
	if strPreviousItem <> DataList.Fields.Item("Column2") then
		strPreviousItem = DataList.Fields.Item("Column2")
		intCount = intCount + 1
	end if
	if intCount <= 2 then
		wscript.echo DataList.Fields.Item("Column1") & " " & DataList.Fields.Item("Column2")
	else
		Exit Do
	end if
	DataList.MoveNext
Loop

'Example 2
'and for top - 2 will be
'aaa 10
'bbb 20
wscript.echo "Result for top 2"
DataList.Sort = "Column1 ASC"

DataList.MoveFirst
intCount = 0
Do Until DataList.EOF
	intCount = intCount + 1
	if intCount <= 2 then
		wscript.echo DataList.Fields.Item("Column1") & " " & DataList.Fields.Item("Column2")
	else
		Exit Do
	end if
	DataList.MoveNext
Loop

'Example 3
'Using data such as...
'aaa
'bbb
'aaa
'aaa
'aaa
'ccc
'bbb
'aaa
'fff
'ddd
'bbb
'bbb
'so the top 2 will be as following 
'aaa 5 Times
'bbb 4 Times


wscript.echo "Result for top 2 by count"
Set DataList2 = CreateObject("ADOR.Recordset")
DataList2.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable
DataList2.Fields.Append "Column2", adInteger, , adFldIsNullable
DataList2.Open

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("sample2.csv", ForReading)

if boolSkipFirstLine then
	objFile.SkipLine
end if

Do Until objFile.AtEndOfStream
	arrColumns = split(replace(objFile.Readline,Chr(34),""),",")
	DataList2.AddNew
	DataList2("Column1") = arrColumns(0)
	DataList2("Column2") = arrColumns(1)
	DataList2.Update
loop
objFile.Close

Set countList = CreateObject("ADOR.Recordset")
countList.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable
countList.Fields.Append "Column2", adInteger, , adFldIsNullable
countList.Open

DataList2.Sort = "Column1 ASC"
boolFirstRecord = True
intCount = 0
strPreviousItem = ""
Do Until DataList2.EOF
	if strPreviousItem <> DataList2.Fields.Item("Column1") then
		if NOT boolFirstRecord then
			countList.AddNew
			countList("Column1") = strPreviousItem
			countList("Column2") = intCount
			countList.Update
			strPreviousItem = DataList2.Fields.Item("Column1")
			intCount = 1
		else
			strPreviousItem = DataList2.Fields.Item("Column1")
			intCount = intCount + 1
		end if
	else
		intCount = intCount + 1
	end if
	boolFirstRecord = False
	DataList2.MoveNext
	if DataList2.EOF then
			countList.AddNew
			countList("Column1") = strPreviousItem
			countList("Column2") = intCount
			countList.Update
	end if
Loop

countList.Sort = "Column2 Desc"
countList.MoveFirst
intCount = 0
Do Until countList.EOF
	intCount = intCount + 1
	if intCount <= 2 then
		wscript.echo countList.Fields.Item("Column1") & " " & countList.Fields.Item("Column2") & " Times"
	else
		Exit Do
	end if
	countList.MoveNext
Loop

Open in new window

0
 

Author Closing Comment

by:with_out_user
ID: 34289487
Very Thanx MR. rejoinder

It's work well

Thank you
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 34291734
==> rejoinder

Thanks for the interesting Recordset usage in VBS, nicely done, and very informative for me.

~bp
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

In this article we want to have a look at the directory attributes which are used by Microsoft to store the so called Security Identifiers (SID). These SIDs plays an important role in delegating and granting permissions and in authentication of trus…
Not long ago I saw a question in the VB Script forum that I thought would not take much time. You can read that question (Question ID  (http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_28455246.html)28455246) Here (http…
In a recent question (https://www.experts-exchange.com/questions/29004105/Run-AutoHotkey-script-directly-from-Notepad.html) here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question