Solved

PowerShell query of multiple CSV files for Product details

Posted on 2013-01-26
Last Modified: 2013-02-05
Hi

Every week, we get an export of data from our vendor in CSV format that shows these columns:

Product
Items Sold
Items Reserved
Reference number
SAP Code

These get stored in G:\Historical Data in the format <date>.csv, e.g. 01-25-2013.csv

Sometimes our users ask when a certain Product (e.g. Battery) was last listed under SAP Code=141. We need a way to query all these CSV files and report the last time that:

Product was equal to Battery
AND
SAP Code was equal to 141

Is there a PowerShell script that can do that? We are running Windows Server 2008.
Question by:cpancamo
6 Comments
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 38821932
If "sometimes" is often enough, and the data is not too big, I would store the contents of the files in a database. The query would then be trivial. Inspecting all those files for every request seems inefficient...
 
LVL 70

Expert Comment

by:Qlemo
ID: 38822046
I'm wondering how they get that data, and why they can't query the DB that the export certainly comes from ...
I have to agree that this is usually a DB question, and is best handled that way. But if you have to:
function FindRecent([String] $Product, [String] $SAPCode)
{
get-childitem 'G:\Historical Data\*.csv' |
  sort { -join ($_.Name[6..9] + $_.Name[0,1] + $_.Name[3,4]) } -desc | % {   # sort key is yyyyMMdd, newest first
    $cnt = @(import-csv $_.FullName | where { $_.Product -eq $Product -and $_."SAP Code" -eq $SAPCode }).count
    if ($cnt -gt 0)
    {
      write-output $_.Name.Replace('.csv','')   # file name without extension, i.e. the date
      break   # stops the whole pipeline; outside a loop it also ends any calling script
    }
  }
}

FindRecent 'Battery' '141'

 

Author Comment

by:cpancamo
ID: 38822634
Thanks Qlemo.

I'm not following your script, though. Could you explain it to me?
LVL 70

Accepted Solution

by:
Qlemo earned 500 total points
ID: 38823122
The last line executes the function defined in the rest of the script, passing the values to search for (product and SAP code).

The function works like this (the numbers refer to lines of the script):
3: Get all *.csv files in the "G:\Historical Data" folder (more precisely, get the filesystem entries). The result is a list of filesystem objects containing name, path, size, and so on.

4: Sort that list by date. We have to extract the date from the file name, which is why this line looks complicated. For proper sorting we need to sort by year, then month, then day, in descending order, so that the newest file comes first.
The sorted list is then processed one object at a time, using foreach-object (alias %).

5: Import each CSV file as a collection (array) of objects, filter for the rows we are searching for, and count them.

6: We are only interested if the count is greater than 0, which means we found matching records.

8-9: If so, we are finished, and only need to send the result to the pipeline (which will eventually print it on screen, unless it is captured in a variable).
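To make the sort key from script line 4 more concrete, here is a small sketch using the sample file name from the question. The variable names are illustrative only, and the -join is used here simply to show the key as one readable string.

```powershell
# Sample file name in the <date>.csv format from the question (MM-dd-yyyy).
$name = '01-25-2013.csv'

# Indexing a string with a list of positions returns the characters at those
# positions; "+" concatenates the resulting arrays.
$keyChars = $name[6..9] + $name[0,1] + $name[3,4]   # year, month, day

# Join the characters into a single string to inspect the key.
$key = -join $keyChars
$key   # 20130125
```

Because the year comes first, comparing these keys as text gives the same order as comparing the dates themselves.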

If you want to fully understand the script, it is best to use something like the PowerShell ISE and step through the code line by line, monitoring the variable contents.
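For comparison, the same search can probably also be written with a foreach loop and return, which avoids using break inside ForEach-Object (there, break is not inside a loop, so it would also end any script that calls the function). This is only a sketch: the function name Find-RecentListing and the -Folder parameter are made up for illustration, and it assumes every file name follows the MM-dd-yyyy.csv pattern.

```powershell
function Find-RecentListing {
    param(
        [string] $Product,
        [string] $SAPCode,
        [string] $Folder = 'G:\Historical Data'   # folder from the question; -Folder is a hypothetical parameter
    )

    # Pair each file with the date parsed from its name (assumed MM-dd-yyyy.csv).
    $files = @(Get-ChildItem -Path (Join-Path $Folder '*.csv') |
        Select-Object FullName, @{ Name = 'Date'; Expression = {
            $stem = [System.IO.Path]::GetFileNameWithoutExtension($_.Name)
            [datetime]::ParseExact($stem, 'MM-dd-yyyy', [System.Globalization.CultureInfo]::InvariantCulture)
        } } |
        Sort-Object Date -Descending)

    # Newest file first; return as soon as one file contains a matching row.
    foreach ($file in $files) {
        $rows = @(Import-Csv $file.FullName |
            Where-Object { $_.Product -eq $Product -and $_.'SAP Code' -eq $SAPCode })
        if ($rows.Count -gt 0) {
            return $file.Date.ToString('MM-dd-yyyy')
        }
    }
}

# Example call:
# Find-RecentListing -Product 'Battery' -SAPCode '141'
```

Sorting on a parsed date rather than on characters picked from the name means the order no longer depends on exact character positions, at the cost of failing for files whose names are not dates.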
 

Author Comment

by:cpancamo
ID: 38823934
Thanks Qlemo, makes perfect sense :)

Just one query (and I'm more than happy to open another question for this), if we wanted to have an exported list of ALL the times that Battery was listed as SAP Code=141, is that easy to do? So, we're not saying the most recent, but any time?
 
LVL 70

Expert Comment

by:Qlemo
ID: 38823994
Remove the break to get all occurrences in descending order, and also remove the -desc from sort (which is the alias for sort-object, btw) if you want them in ascending order.
While the script for your original question should perform reasonably, the extended one probably will not if there are a lot of files (more than 100), because it has to read every file. As said above, if you want to do that more often, you should consider something more DB-like.
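As a rough sketch of the "all occurrences" case with an exported list, a separate function (rather than an edit of the original one) could collect one result row per matching file. The names Find-AllListings, -Folder, and the output columns are invented for illustration, and the file names are again assumed to follow MM-dd-yyyy.csv.

```powershell
function Find-AllListings {
    param(
        [string] $Product,
        [string] $SAPCode,
        [string] $Folder = 'G:\Historical Data'   # folder from the question; -Folder is a hypothetical parameter
    )

    Get-ChildItem -Path (Join-Path $Folder '*.csv') | ForEach-Object {
        $file = $_
        $stem = [System.IO.Path]::GetFileNameWithoutExtension($file.Name)
        $date = [datetime]::ParseExact($stem, 'MM-dd-yyyy', [System.Globalization.CultureInfo]::InvariantCulture)

        # Count the rows in this file that match both conditions.
        $rows = @(Import-Csv $file.FullName |
            Where-Object { $_.Product -eq $Product -and $_.'SAP Code' -eq $SAPCode })

        # Emit one summary object per file that has at least one match.
        if ($rows.Count -gt 0) {
            New-Object PSObject -Property @{ Date = $date; File = $file.Name; Rows = $rows.Count }
        }
    } | Sort-Object Date | Select-Object Date, File, Rows   # oldest first
}

# Example call (the output path is hypothetical):
# Find-AllListings -Product 'Battery' -SAPCode '141' |
#     Export-Csv 'C:\Temp\battery-141.csv' -NoTypeInformation
```

This still reads every CSV file on each run, so the performance caveat above applies unchanged.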

Question has a verified solution.
