Solved

TOP 2 BY vbSCript

Posted on 2010-11-30
22
483 Views
Last Modified: 2012-05-10
Can you help me to get script to provide for me top 2 or Top 2

so if i have CSV file with the following and i want top 2


aaa   10
bbb   20
ccc   55
ddd  66
eee  55

So result will be for Top 2

ddd  66
ccc   55
eee  55

and for top - 2 will be

aaa   10
bbb   20

other example if ihave the following

aaa
bbb
aaa
aaa
aaa
ccc
bbb
aaa
fff
ddd
bbb
bbb


so the top 2 will be as following

aaa 5 Times
bbb 4 Times

Thank you






0
Comment
Question by:with_out_user
  • 8
  • 8
  • 6
22 Comments
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34241858
Okay, this would likely be 2 different scripts, since the logic is different, is that what you have in mind?

In the first case, can there ever be duplicate of the first column, or will those always only exist once?

~bp
0
 

Author Comment

by:with_out_user
ID: 34245524
in both cas can be duplicate
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34254073
Here is an example of each of the three ideas you had...
Use the sample csv files as for testing the script.

I used double quotes and commas in the CSV file along with column names.

'On Error Resume Next

Const adOpenStatic = 3

Const adLockOptimistic = 3

Const adCmdText = &H0001



Set objConnection = CreateObject("ADODB.Connection")

Set objRecordSet = CreateObject("ADODB.Recordset")



set objFSO = CreateObject("Scripting.FileSystemObject") 

Set WSHshell = CreateObject("WScript.Shell") 

strPathtoTextFile = WSHShell.SpecialFolders("Desktop")

'strPathtoTextFile = "C:\csv-files\"

strFileName1 = "sample1.csv"

strFileName2 = "sample2.csv"



'Example 1

'so if i have CSV file with the following and i want top 2

'"Col1","Col2"

'"aaa","10"

'"bbb","20"

'"ccc","55"

'"ddd","66"

'"eee","55"

'So result will be for Top 2

'ddd 66

'ccc 55

'eee 55

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT TOP 2 * FROM " & strFileName1 & " ORDER BY Col2 DESC", objConnection, adOpenStatic, adLockOptimistic, adCmdText

wscript.echo "Result for Top 2"

Do Until objRecordset.EOF

    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")

    objRecordset.MoveNext

Loop

objConnection.Close





'Example 2

'and for top - 2 will be

'aaa 10

'bbb 20

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT DISTINCT TOP 2 * FROM " & strFileName1 & " ORDER BY Col1 ASC", objConnection, adOpenStatic, adLockOptimistic, adCmdText

wscript.echo "Result for top 2"

Do Until objRecordset.EOF

    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")

    objRecordset.MoveNext

Loop

objConnection.Close



'Example 3

'other example if ihave the following

'"Col1","Col2"

'"aaa","10"

'"bbb","20"

'"aaa","10"

'"aaa","10"

'"aaa","10"

'"ccc","55"

'"bbb","20"

'"aaa","10"

'"fff","10"

'"ddd","66"

'"bbb","20"

'"bbb","20"

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT TOP 2 COUNT(*) AS Col2, Col1 FROM " & strFileName2 & " GROUP BY Col1", objConnection, adOpenStatic, adLockOptimistic, adCmdText

wscript.echo "Result for top 2"

Do Until objRecordset.EOF

    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2") & " Times"

    objRecordset.MoveNext

Loop

objConnection.Close

Open in new window

sample2.csv
sample1.csv
0
 

Author Comment

by:with_out_user
ID: 34255623
Thank you but i get attach error

i Have question, is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0

 Error
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34255931
Ah - the error is because the file "sample1.csv" is not on the desktop.  Put the file there and everything should work just fine.  The jet database stuff should be on your machine already - no need to do anything more.  The error is caused because it cannot find the file.
If you want, comment out line 11 and remove the remove the apostrophe from line 12 making sure the path/string points to the folder containing the csv file.
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34260504
Well, I was going to play around with doing this with dictionaries and itterating through them to find highest values and counts and such, but I have to say, rejoinder's approach seems like a great way to approach this problem to me.  Well done.

~bp
0
 

Author Comment

by:with_out_user
ID: 34264928
Thank you

but i get same error

0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34268028
This is really odd... I tested this script on two machines and everything worked.  All three files are on the desktop, the vbs and the two csv sample files.

Bill, if you are tracking this ticket, can you try the script out?
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34268059
I am tracking, and I will test out.  I did test it out placing the CSV files in a local folder and uncommenting:

'strPathtoTextFile = "C:\csv-files\"

and that worked fine for me (Windows 7), but I will try it with the files on the Desktop and see how that goes.

~bp
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34269488
Okay, sorry for the delay.  I ran the test on two systems here, a Windows 7 system, and an older XP system.  Both test worked fine.  I did add a display of the "strPathtoTextFile" variable right after it is set just to make sure it was getting the proper value, which it is.

here's the output from the Windows 7 test, with the data files (CSV) on the Desktop, and the VBS script in a Temp folder:

[c:\temp\EE26647091]cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Users\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

[c:\temp\EE26647091]

Open in new window

Then I ran another test on a Windows XP machine, and placed the CSV abd VBS files all on the Desktop.  This worked fine as well per the output below:

C:\Documents and Settings\username\Desktop>cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.7
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Documents and Settings\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

C:\Documents and Settings\username\Desktop>

Open in new window

At this point it might be worth you posting up the exact script you are running perhaps a small change is needed.

~bp
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34269552
Too funny - I just tested it on an XP machine and it worked there as well.

Back to you with_out... let us know how you make out.
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 

Author Comment

by:with_out_user
ID: 34274597
Hi All,

Thank you for reply, i get other error now when i run script in win 7.

is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0



Thank you
Error.jpg
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34276004
How large will these input files be?

~bp
0
 

Author Comment

by:with_out_user
ID: 34276552
the above error was from the sample1.csv  & sample2.csv
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34277368
==> How large will these input files be?

I meant when you run the script against real data.  Depending on how large the data to be sorted / grouped, that could affect the approach taken.

~bp
0
 

Author Comment

by:with_out_user
ID: 34277497
íts large data

the number of line in the file about 1000000 line
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34278226
That feels like it would be unpractical to pull into a VBS script in memory in the form of either arrays or dictionaries, and then try and sort and count iteratively.  I don't think performance would be satisfactory with a script approach like that.

~bp
0
 

Author Comment

by:with_out_user
ID: 34278487
even it slow but i need the script importantly if it's possible
0
 
LVL 14

Expert Comment

by:rejoinder
ID: 34284216
If you want a better solution, you could use MS Access, pull the data in then do any kind of query you want.  One step above that would be to install SQL Express (http://www.microsoft.com/express/Database/) and have all the SQL tools you would need to tear through millions of records.
0
 
LVL 14

Accepted Solution

by:
rejoinder earned 500 total points
ID: 34285944
If the first script was James Bond, this one is Maxwell Smart.  Both will save the day, one just does it so much better.

This script uses the same two files from above - just save the script into the folder as the two sample csv files and run.

Let us know how you make out.

Const ForReading = 1

Const ForWriting =2



Const adVarChar = 200

Const MaxCharacters = 255

Const adFldIsNullable = 32

Const adInteger = 3



Const boolSkipFirstLine = True



Set DataList = CreateObject("ADOR.Recordset")

DataList.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable

DataList.Fields.Append "Column2", adInteger, , adFldIsNullable

DataList.Open



Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFile = objFSO.OpenTextFile("sample1.csv", ForReading)



if boolSkipFirstLine then

	objFile.SkipLine

end if



Do Until objFile.AtEndOfStream

	arrColumns = split(replace(objFile.Readline,Chr(34),""),",")

	DataList.AddNew

	DataList("Column1") = arrColumns(0)

	DataList("Column2") = arrColumns(1)

	DataList.Update

loop

objFile.Close



'Example 1

'so if i have CSV file with the following and i want top 2

'"Col1","Col2"

'"aaa","10"

'"bbb","20"

'"ccc","55"

'"ddd","66"

'"eee","55"

'So result will be for Top 2

'ddd 66

'ccc 55

'eee 55

wscript.echo "Result for Top 2"

DataList.Sort = "Column2 DESC, Column1 ASC"

DataList.MoveFirst

intCount = 0

strPreviousItem = ""

Do Until DataList.EOF

	if strPreviousItem <> DataList.Fields.Item("Column2") then

		strPreviousItem = DataList.Fields.Item("Column2")

		intCount = intCount + 1

	end if

	if intCount <= 2 then

		wscript.echo DataList.Fields.Item("Column1") & " " & DataList.Fields.Item("Column2")

	else

		Exit Do

	end if

	DataList.MoveNext

Loop



'Example 2

'and for top - 2 will be

'aaa 10

'bbb 20

wscript.echo "Result for top 2"

DataList.Sort = "Column1 ASC"



DataList.MoveFirst

intCount = 0

Do Until DataList.EOF

	intCount = intCount + 1

	if intCount <= 2 then

		wscript.echo DataList.Fields.Item("Column1") & " " & DataList.Fields.Item("Column2")

	else

		Exit Do

	end if

	DataList.MoveNext

Loop



'Example 3

'Using data such as...

'aaa

'bbb

'aaa

'aaa

'aaa

'ccc

'bbb

'aaa

'fff

'ddd

'bbb

'bbb

'so the top 2 will be as following 

'aaa 5 Times

'bbb 4 Times





wscript.echo "Result for top 2 by count"

Set DataList2 = CreateObject("ADOR.Recordset")

DataList2.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable

DataList2.Fields.Append "Column2", adInteger, , adFldIsNullable

DataList2.Open



Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFile = objFSO.OpenTextFile("sample2.csv", ForReading)



if boolSkipFirstLine then

	objFile.SkipLine

end if



Do Until objFile.AtEndOfStream

	arrColumns = split(replace(objFile.Readline,Chr(34),""),",")

	DataList2.AddNew

	DataList2("Column1") = arrColumns(0)

	DataList2("Column2") = arrColumns(1)

	DataList2.Update

loop

objFile.Close



Set countList = CreateObject("ADOR.Recordset")

countList.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable

countList.Fields.Append "Column2", adInteger, , adFldIsNullable

countList.Open



DataList2.Sort = "Column1 ASC"

boolFirstRecord = True

intCount = 0

strPreviousItem = ""

Do Until DataList2.EOF

	if strPreviousItem <> DataList2.Fields.Item("Column1") then

		if NOT boolFirstRecord then

			countList.AddNew

			countList("Column1") = strPreviousItem

			countList("Column2") = intCount

			countList.Update

			strPreviousItem = DataList2.Fields.Item("Column1")

			intCount = 1

		else

			strPreviousItem = DataList2.Fields.Item("Column1")

			intCount = intCount + 1

		end if

	else

		intCount = intCount + 1

	end if

	boolFirstRecord = False

	DataList2.MoveNext

	if DataList2.EOF then

			countList.AddNew

			countList("Column1") = strPreviousItem

			countList("Column2") = intCount

			countList.Update

	end if

Loop



countList.Sort = "Column2 Desc"

countList.MoveFirst

intCount = 0

Do Until countList.EOF

	intCount = intCount + 1

	if intCount <= 2 then

		wscript.echo countList.Fields.Item("Column1") & " " & countList.Fields.Item("Column2") & " Times"

	else

		Exit Do

	end if

	countList.MoveNext

Loop

Open in new window

0
 

Author Closing Comment

by:with_out_user
ID: 34289487
Very Thanx MR. rejoinder

It's work well

Thank you
0
 
LVL 52

Expert Comment

by:Bill Prew
ID: 34291734
==> rejoinder

Thanks for the interesting Recordset usage in VBS, nicely done, and very informative for me.

~bp
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Script to copy or move mouse-selected collection of files plus targets referenced by shortcuts (.lnk) The purpose of this article is to help illuminate the real challenges and options available (where they may exist) for utilizing simple scriptin…
Introduction During my participation as a VBScript contributor at Experts Exchange, one of the most common questions I come across is this: "I have a script that runs against only one computer. How can I make it run against a list of computers in …
Along with being a a promotional video for my three-day Annielytics Dashboard Seminor, this Micro Tutorial is an intro to Google Analytics API data.
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, just open a new email message. In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…

895 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now