Remove rows from csv file based on date column

I have a csv file with columns below.  The second column is the only field that is an actual date.  What we would like to do is remove all rows from the csv file where the date in that column is < 3 days from today whenever run.  Meaning, delete all records where the date is 3 days or older then today.  Don't have any special preference, but figured some type of bat file utilizing DOS commands, or VB script, powershell, etc.  We've used things like 'FINDSTR' etc in the past for certain strings, but not a date so not sure where to start.  Any help would greatly be appreciated.  
Columns:

PackageReference1,ShipmentInformationCollectiondate,ShipmentInformationActualWeight,ShipmentInformationLeadTrackingNumber,ShipmentInformationServiceType,ThirdPartyUPSAccountNumber
"15231622","11/6/2013","1.00","581089305388","3","444444444"
"15242251","11/6/2013","1.00","581089305399","92","555555555"
"15282587","11/16/2013","1.00","581089305403","92","555555555"
"15278493","11/18/2013","1.00","581089305414","92","444444444"
LVL 1
ibgadminAsked:
Who is Participating?
 
Dan CraciunConnect With a Mentor IT ConsultantCommented:
Further modification. No longer matters what the first line contains, it will keep it.

Param([Int32]$history = 3,                 # number of days to keep
      [string]$log = "Y:\path\to\log.csv") # path to log file
$dte = Get-Date             # current date
$searchstring = ""          # 
$period = @(0) * $history   # array to store formatted dates

for ( $i = 0; $i -lt $history; $i++ ) {
  $period[$i] = $dte.Month, $dte.Day, $dte.Year -join "\/"
  $searchstring += "\|" + $period[$i]
  $dte = $dte.AddDays(-1)
}
$searchstring = $searchstring.Substring(2) # remove first \| 

&"X:\path\to\sed\sed.exe" -i -e1p -e `""/"$searchstring"/!d"`" $log

Open in new window


Replace path\to\sed, save it as whatever.ps1, then run it as
whatever.ps1 -history 5 -log "Y:\path\to\log.csv"
0
 
Dan CraciunIT ConsultantCommented:
Why not simply import the csv into Excel and filter the date?

You have attached a test Excel file, using your sample.
test.xlsx
0
 
ibgadminAuthor Commented:
I'm looking for something to run fully automated via code within a bat file, etc.  The file is pretty much like a log file I receive from a vendor so I don't extract or have control over writing the file nor do we run Excel on the on windows server where I need it to run unattended.
0
Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

 
ibgadminAuthor Commented:
Also, the csv file needs to remain intact as a csv file once records are removed.  Again think of it more of as log file that you want to remove older records from but keep everything else as is.
0
 
Dan CraciunIT ConsultantCommented:
Looks like the case for grep.
If you're on Linux you probably already have it installed, if on Windows get it from here: http://gnuwin32.sourceforge.net/packages/grep.htm

The command would be something like this:
grep '11/18/2013\|11/19/2013\|11/20/2013' log-file.csv

That basically says: parse log-file and return only the lines that contain "11/18/2013", "11/19/2013" or "11/20/2013".

Then you can pipe/redirect the output to the file of your choice, and you can programatically set the dates using the script language you have on that server.

LE: forgot to escape the "|"
0
 
ibgadminAuthor Commented:
I was kind of looking at grep, but am no expert with grep and I need to keep everything but the last 3 days in the csv log file dynamically.  So the grep script would need to be something where it's like TODAY-3 only, so if today beeing 11/20/2013, it would need to remove any record LE to 11/17/2013 on the fly.  I won't have an opportunity to fill in the dates as you recommend.  Again, I need it to be fully un-attended/automated through a bat script, etc.
0
 
Dan CraciunIT ConsultantCommented:
OK, I was too lazy to write the whole script. Powershell.

Here it is:
$dte = Get-Date
$today = $dte.Month, $dte.Day, $dte.Year -join "\/"
$dte = $dte.AddDays(-1)
$yesterday = $dte.Month, $dte.Day, $dte.Year -join "\/"
$dte = $dte.AddDays(-1)
$otherday = $dte.Month, $dte.Day, $dte.Year -join "\/"

&"X:\path\to\sed\sed.exe" -i `""/"$today"\|"$yesterday"\|"$otherday"/!d"`" "Y:\path\to\log\log.csv"

Open in new window

Tested and working with your sample data.

This uses sed -i with the d command, to delete lines from the logfile.
You can get sed from here: http://gnuwin32.sourceforge.net/packages/sed.htm

Replace X:\path\to\sed\sed.exe and Y:\path\to\log\log.csv with the actual paths.
0
 
ibgadminAuthor Commented:
That worked but unfortunately it removed the Header row with the column headings.  I need to keep that intact, so is there a way to NOT remove the header row?
0
 
ibgadminAuthor Commented:
Also, if I understand how this is working, how would one say change this to keep 7 days vs. 3?  I guess I was expecting some kind of variable that contained how many days to exclude where that could be changed on the fly.  That would be preferred and would make it usable for other files with different rules if that's possible.  Sorry if I wasn't clear about that...
0
 
Dan CraciunIT ConsultantCommented:
Yup.

&"X:\path\to\sed\sed.exe" -i `""/PackageReference\|"$today"\|"$yesterday"\|"$otherday"/!d"`" "Y:\path\to\log\log.csv"

Open in new window


As long as there is a column named PackageReference, it will keep the header.
0
 
ibgadminAuthor Commented:
That worked perfectly!  Now as long as it can be changed a bit to accommodate the question to keeping 7 days, etc and making it usable over and over for other files to parse and modify, that would be great!
0
 
Dan CraciunIT ConsultantCommented:
$dte = Get-Date             #current date
$history = 3                # number of days to keep
$searchstring = "PackageReference"   # must be an actual value from the header
$period = @(0) * $history   #array to store formatted dates
for ( $i = 0; $i -lt $history; $i++ ) {
  $period[$i] = $dte.Month, $dte.Day, $dte.Year -join "\/"
  $searchstring += "\|" + $period[$i]
  $dte = $dte.AddDays(-1)
}

&"X:\path\to\sed\sed.exe" -i `""/"$searchstring"/!d"`" "X:\path\to\log\log.csv"

Open in new window

You'll need to modify $history and $searchstring to accomodate your needs.

LE: refactored code
0
 
ibgadminAuthor Commented:
Thanks so much Dan for your help - this works perfectly.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.