Solved

Powershell query of multiple CSV files for Product details

Posted on 2013-01-26
6
488 Views
Last Modified: 2013-02-05
Hi

Every week, we get an export of data from our vendor in CSV format that shows these columns:

Product
Items Sold
Items Reserved
Reference number
SAP Code

These get stored in G:\Historical Data in the format <date>.csv, e.g. 01-25-2013.csv

Sometimes, we get a query from our users saying, when was the last time we had a certain Product (e.g. Battery) listed under SAP Code=141 . We need a way to query all these CSV files and state the last time

Product was equal to Battery
AND
SAP Code was equal to 141

Is there a Powershell that can do that? Running Windows 2008 Server.
0
Comment
Question by:cpancamo
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
6 Comments
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 38821932
If the sometimes is often enough, and the data not too big, I would store the content of the files in a database. The query would be trivial. Inspecting all those files seems inefficient...
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 38822046
I'm wondering how they get that data, and why they are not able to query their own DB the export certainly comes from ...
I have to agree that this is usually a DB question, and best to handle it that way. But if you have to:
function FindRecent([String] $Product, [String] $SAPCode)
{
get-childitem 'G:\Historical Data\*.csv' |
  sort { $_.Name[6..9] + $_.Name[0,1] + $_.Name[3,4] } -desc | % {
    $cnt = @(import-csv $_.FullName | where { $_.Product -eq $Product -and $_."SAP Code" -eq $SAPcode }).count
    if ($cnt -gt 0) 
    {
      write-output $_.Name.Replace('.csv','')
      break
     }
  }
}

findrecent 'Battery' '141'

Open in new window

0
 

Author Comment

by:cpancamo
ID: 38822634
Thanks Qlemo.

I'm not following your script though could you explain it to me?
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 70

Accepted Solution

by:
Qlemo earned 500 total points
ID: 38823122
The last line executes the function defined in the remainder of the script, providing the values to search for (product and SAP code).

The function works like this:
3: Get all *.csv files located in "G:\Historical Data" folder (more precisely, get the filesystem entries). The result is a list of filesystem objects containing name, path, size aso.

4: Sort that list for date. We extract the date from the filename, hence it looks more complicated. For proper sorting we need to sort for year, month, day, in descending order. The newest file is coming first now.
The sorted list is then processed one object after another, using foreach-object (alias is %).

5: Generate object collations (arrays) from each CSV file, then filter for the rows we search for, and count how many rows that are.

6: We are only interested if the count is greater than 0, which means we found records.

8-9: If so, we are finished, and only need to send the result to the pipeline (which will eventually print the result on screen, if not catching into a variable).

If you want to fully understand the script, it is best to use something like the PowerShell ISE, and step thru the code line-wise, monitoring variable contents.
0
 

Author Comment

by:cpancamo
ID: 38823934
Thanks Qlemo, makes perfect sense :)

Just one query (and I'm more than happy to open another question for this), if we wanted to have an exported list of ALL the times that Battery was listed as SAP Code=141, is that easy to do? So, we're not saying the most recent, but any time?
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 38823994
Remove the break to get all occurrences in descending order, and remove the -desc from sort (which is the alias for sort-object, btw) if you want them in ascending order.
While your original question might perform reasonably, the extended one will probably not, if there are a lot of files (more than 100). As said above, if you want to do that more often you should allow for something more DB-like.
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Create and license users in Office 365 in bulk based on a CSV file. A step-by-step guide with PowerShell script examples.
The Nano Server Image Builder helps you create a custom Nano Server image and bootable USB media with the aid of a graphical interface. Based on the inputs you provide, it generates images for deployment and creates reusable PowerShell scripts that …
Exchange organizations may use the Journaling Agent of the Transport Service to archive messages going through Exchange. However, if the Transport Service is integrated with some email content management application (such as an antispam), the admini…
In this video we outline the Physical Segments view of NetCrunch network monitor. By following this brief how-to video, you will be able to learn how NetCrunch visualizes your network, how granular is the information collected, as well as where to f…

729 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question