Solved

TOP 2 BY vbSCript

Posted on 2010-11-30
22
481 Views
Last Modified: 2012-05-10
Can you help me to get script to provide for me top 2 or Top 2

so if i have CSV file with the following and i want top 2


aaa   10
bbb   20
ccc   55
ddd  66
eee  55

So result will be for Top 2

ddd  66
ccc   55
eee  55

and for top - 2 will be

aaa   10
bbb   20

other example if ihave the following

aaa
bbb
aaa
aaa
aaa
ccc
bbb
aaa
fff
ddd
bbb
bbb


so the top 2 will be as following

aaa 5 Times
bbb 4 Times

Thank you






0
Comment
Question by:with_out_user
  • 8
  • 8
  • 6
22 Comments
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
Okay, this would likely be 2 different scripts, since the logic is different, is that what you have in mind?

In the first case, can there ever be duplicate of the first column, or will those always only exist once?

~bp
0
 

Author Comment

by:with_out_user
Comment Utility
in both cas can be duplicate
0
 
LVL 14

Expert Comment

by:rejoinder
Comment Utility
Here is an example of each of the three ideas you had...
Use the sample csv files as for testing the script.

I used double quotes and commas in the CSV file along with column names.

'On Error Resume Next

Const adOpenStatic = 3

Const adLockOptimistic = 3

Const adCmdText = &H0001



Set objConnection = CreateObject("ADODB.Connection")

Set objRecordSet = CreateObject("ADODB.Recordset")



set objFSO = CreateObject("Scripting.FileSystemObject") 

Set WSHshell = CreateObject("WScript.Shell") 

strPathtoTextFile = WSHShell.SpecialFolders("Desktop")

'strPathtoTextFile = "C:\csv-files\"

strFileName1 = "sample1.csv"

strFileName2 = "sample2.csv"



'Example 1

'so if i have CSV file with the following and i want top 2

'"Col1","Col2"

'"aaa","10"

'"bbb","20"

'"ccc","55"

'"ddd","66"

'"eee","55"

'So result will be for Top 2

'ddd 66

'ccc 55

'eee 55

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT TOP 2 * FROM " & strFileName1 & " ORDER BY Col2 DESC", objConnection, adOpenStatic, adLockOptimistic, adCmdText

wscript.echo "Result for Top 2"

Do Until objRecordset.EOF

    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")

    objRecordset.MoveNext

Loop

objConnection.Close





'Example 2

'and for top - 2 will be

'aaa 10

'bbb 20

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT DISTINCT TOP 2 * FROM " & strFileName1 & " ORDER BY Col1 ASC", objConnection, adOpenStatic, adLockOptimistic, adCmdText

wscript.echo "Result for top 2"

Do Until objRecordset.EOF

    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2")

    objRecordset.MoveNext

Loop

objConnection.Close



'Example 3

'other example if ihave the following

'"Col1","Col2"

'"aaa","10"

'"bbb","20"

'"aaa","10"

'"aaa","10"

'"aaa","10"

'"ccc","55"

'"bbb","20"

'"aaa","10"

'"fff","10"

'"ddd","66"

'"bbb","20"

'"bbb","20"

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & strPathtoTextFile & ";" & "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT TOP 2 COUNT(*) AS Col2, Col1 FROM " & strFileName2 & " GROUP BY Col1", objConnection, adOpenStatic, adLockOptimistic, adCmdText

wscript.echo "Result for top 2"

Do Until objRecordset.EOF

    Wscript.Echo objRecordset.Fields.Item("Col1") & " " & objRecordset.Fields.Item("Col2") & " Times"

    objRecordset.MoveNext

Loop

objConnection.Close

Open in new window

sample2.csv
sample1.csv
0
 

Author Comment

by:with_out_user
Comment Utility
Thank you but i get attach error

i Have question, is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0

 Error
0
 
LVL 14

Expert Comment

by:rejoinder
Comment Utility
Ah - the error is because the file "sample1.csv" is not on the desktop.  Put the file there and everything should work just fine.  The jet database stuff should be on your machine already - no need to do anything more.  The error is caused because it cannot find the file.
If you want, comment out line 11 and remove the remove the apostrophe from line 12 making sure the path/string points to the folder containing the csv file.
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
Well, I was going to play around with doing this with dictionaries and itterating through them to find highest values and counts and such, but I have to say, rejoinder's approach seems like a great way to approach this problem to me.  Well done.

~bp
0
 

Author Comment

by:with_out_user
Comment Utility
Thank you

but i get same error

0
 
LVL 14

Expert Comment

by:rejoinder
Comment Utility
This is really odd... I tested this script on two machines and everything worked.  All three files are on the desktop, the vbs and the two csv sample files.

Bill, if you are tracking this ticket, can you try the script out?
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
I am tracking, and I will test out.  I did test it out placing the CSV files in a local folder and uncommenting:

'strPathtoTextFile = "C:\csv-files\"

and that worked fine for me (Windows 7), but I will try it with the files on the Desktop and see how that goes.

~bp
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
Okay, sorry for the delay.  I ran the test on two systems here, a Windows 7 system, and an older XP system.  Both test worked fine.  I did add a display of the "strPathtoTextFile" variable right after it is set just to make sure it was getting the proper value, which it is.

here's the output from the Windows 7 test, with the data files (CSV) on the Desktop, and the VBS script in a Temp folder:

[c:\temp\EE26647091]cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Users\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

[c:\temp\EE26647091]

Open in new window

Then I ran another test on a Windows XP machine, and placed the CSV abd VBS files all on the Desktop.  This worked fine as well per the output below:

C:\Documents and Settings\username\Desktop>cscript EE26647091.vbs
Microsoft (R) Windows Script Host Version 5.7
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Documents and Settings\username\Desktop
Result for Top 2
ddd 66
eee 55
ccc 55
Result for top 2
aaa 10
bbb 20
Result for top 2
aaa 5 Times
bbb 4 Times

C:\Documents and Settings\username\Desktop>

Open in new window

At this point it might be worth you posting up the exact script you are running perhaps a small change is needed.

~bp
0
 
LVL 14

Expert Comment

by:rejoinder
Comment Utility
Too funny - I just tested it on an XP machine and it worked there as well.

Back to you with_out... let us know how you make out.
0
What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

 

Author Comment

by:with_out_user
Comment Utility
Hi All,

Thank you for reply, i get other error now when i run script in win 7.

is it possible to find the top 2 by vbscript only and with out use Provider=Microsoft.Jet.OLEDB.4.0



Thank you
Error.jpg
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
How large will these input files be?

~bp
0
 

Author Comment

by:with_out_user
Comment Utility
the above error was from the sample1.csv  & sample2.csv
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
==> How large will these input files be?

I meant when you run the script against real data.  Depending on how large the data to be sorted / grouped, that could affect the approach taken.

~bp
0
 

Author Comment

by:with_out_user
Comment Utility
íts large data

the number of line in the file about 1000000 line
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
That feels like it would be unpractical to pull into a VBS script in memory in the form of either arrays or dictionaries, and then try and sort and count iteratively.  I don't think performance would be satisfactory with a script approach like that.

~bp
0
 

Author Comment

by:with_out_user
Comment Utility
even it slow but i need the script importantly if it's possible
0
 
LVL 14

Expert Comment

by:rejoinder
Comment Utility
If you want a better solution, you could use MS Access, pull the data in then do any kind of query you want.  One step above that would be to install SQL Express (http://www.microsoft.com/express/Database/) and have all the SQL tools you would need to tear through millions of records.
0
 
LVL 14

Accepted Solution

by:
rejoinder earned 500 total points
Comment Utility
If the first script was James Bond, this one is Maxwell Smart.  Both will save the day, one just does it so much better.

This script uses the same two files from above - just save the script into the folder as the two sample csv files and run.

Let us know how you make out.

Const ForReading = 1

Const ForWriting =2



Const adVarChar = 200

Const MaxCharacters = 255

Const adFldIsNullable = 32

Const adInteger = 3



Const boolSkipFirstLine = True



Set DataList = CreateObject("ADOR.Recordset")

DataList.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable

DataList.Fields.Append "Column2", adInteger, , adFldIsNullable

DataList.Open



Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFile = objFSO.OpenTextFile("sample1.csv", ForReading)



if boolSkipFirstLine then

	objFile.SkipLine

end if



Do Until objFile.AtEndOfStream

	arrColumns = split(replace(objFile.Readline,Chr(34),""),",")

	DataList.AddNew

	DataList("Column1") = arrColumns(0)

	DataList("Column2") = arrColumns(1)

	DataList.Update

loop

objFile.Close



'Example 1

'so if i have CSV file with the following and i want top 2

'"Col1","Col2"

'"aaa","10"

'"bbb","20"

'"ccc","55"

'"ddd","66"

'"eee","55"

'So result will be for Top 2

'ddd 66

'ccc 55

'eee 55

wscript.echo "Result for Top 2"

DataList.Sort = "Column2 DESC, Column1 ASC"

DataList.MoveFirst

intCount = 0

strPreviousItem = ""

Do Until DataList.EOF

	if strPreviousItem <> DataList.Fields.Item("Column2") then

		strPreviousItem = DataList.Fields.Item("Column2")

		intCount = intCount + 1

	end if

	if intCount <= 2 then

		wscript.echo DataList.Fields.Item("Column1") & " " & DataList.Fields.Item("Column2")

	else

		Exit Do

	end if

	DataList.MoveNext

Loop



'Example 2

'and for top - 2 will be

'aaa 10

'bbb 20

wscript.echo "Result for top 2"

DataList.Sort = "Column1 ASC"



DataList.MoveFirst

intCount = 0

Do Until DataList.EOF

	intCount = intCount + 1

	if intCount <= 2 then

		wscript.echo DataList.Fields.Item("Column1") & " " & DataList.Fields.Item("Column2")

	else

		Exit Do

	end if

	DataList.MoveNext

Loop



'Example 3

'Using data such as...

'aaa

'bbb

'aaa

'aaa

'aaa

'ccc

'bbb

'aaa

'fff

'ddd

'bbb

'bbb

'so the top 2 will be as following 

'aaa 5 Times

'bbb 4 Times





wscript.echo "Result for top 2 by count"

Set DataList2 = CreateObject("ADOR.Recordset")

DataList2.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable

DataList2.Fields.Append "Column2", adInteger, , adFldIsNullable

DataList2.Open



Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFile = objFSO.OpenTextFile("sample2.csv", ForReading)



if boolSkipFirstLine then

	objFile.SkipLine

end if



Do Until objFile.AtEndOfStream

	arrColumns = split(replace(objFile.Readline,Chr(34),""),",")

	DataList2.AddNew

	DataList2("Column1") = arrColumns(0)

	DataList2("Column2") = arrColumns(1)

	DataList2.Update

loop

objFile.Close



Set countList = CreateObject("ADOR.Recordset")

countList.Fields.Append "Column1", adVarChar, MaxCharacters, adFldIsNullable

countList.Fields.Append "Column2", adInteger, , adFldIsNullable

countList.Open



DataList2.Sort = "Column1 ASC"

boolFirstRecord = True

intCount = 0

strPreviousItem = ""

Do Until DataList2.EOF

	if strPreviousItem <> DataList2.Fields.Item("Column1") then

		if NOT boolFirstRecord then

			countList.AddNew

			countList("Column1") = strPreviousItem

			countList("Column2") = intCount

			countList.Update

			strPreviousItem = DataList2.Fields.Item("Column1")

			intCount = 1

		else

			strPreviousItem = DataList2.Fields.Item("Column1")

			intCount = intCount + 1

		end if

	else

		intCount = intCount + 1

	end if

	boolFirstRecord = False

	DataList2.MoveNext

	if DataList2.EOF then

			countList.AddNew

			countList("Column1") = strPreviousItem

			countList("Column2") = intCount

			countList.Update

	end if

Loop



countList.Sort = "Column2 Desc"

countList.MoveFirst

intCount = 0

Do Until countList.EOF

	intCount = intCount + 1

	if intCount <= 2 then

		wscript.echo countList.Fields.Item("Column1") & " " & countList.Fields.Item("Column2") & " Times"

	else

		Exit Do

	end if

	countList.MoveNext

Loop

Open in new window

0
 

Author Closing Comment

by:with_out_user
Comment Utility
Very Thanx MR. rejoinder

It's work well

Thank you
0
 
LVL 51

Expert Comment

by:Bill Prew
Comment Utility
==> rejoinder

Thanks for the interesting Recordset usage in VBS, nicely done, and very informative for me.

~bp
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Well hello again!  Glad to see you've made it this far without giving up.  In this, the fourth installment of my popular series, I'm going to cover functions and subroutines, what they are, and why they are useful.  Just in case you stumbled onto th…
I met Paul Devereux (@pdevereux) today when I responded to his tweet asking “Anybody know how to automate adding files from disk to a folder in #outlook  ?”.  I replied back and told Paul that using automation, in this case scripting, to add files t…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…
This video demonstrates how to create an example email signature rule for a department in a company using CodeTwo Exchange Rules. The signature will be inserted beneath users' latest emails in conversations and will be displayed in users' Sent Items…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now