Solved

Extract date from string in PHP

Posted on 2009-04-11
Last Modified: 2013-12-12
I am trying to extract dates in PHP from a large number of existing files.

The trick is this:  the dates are many different formats, and they're embedded within a file name in different ways.  Here are a few of the real-world cases I'm trying to fit:

// Case 1:  "Morning Meeting 4-29-08.xls" : Typical format, 1-digit month
// Case 2:  "Photon_DPR_04-25-2008_Atlantic.xls" : 2-digit month
// Case 3:  "Canyon DPR April 22 08.xlsx" : Month is spelled out
// Case 4:  "CDI_Time_Ticket_April 02_08.xls" : Month is spelled out, underscore used to delimit day/year
// Case 5:  "CCS_Atlantic_Log_Sheet_5-12-08.xls" : Underscore and 1-digit month value
// Case 6:  "CCS_Atlantic_DPR_05-14-08.xls" : Underscore and 2-digit month value
// Case 7:  "Integra_Daily_Report_4-21-2008.pdf" : Underscore and 1-digit month value, with 4-digit year
// Case 8:  "05132008 Wachs DPR Atlantic.xls" : No delimiters, date at front of string
// Case 9:  "Kest-15-03292008_Injury_Report.xls" : No delimiters, date mid-string, "-" characters do not delimit dates!
// Case 10:  "Sample-032908_Report.xls" : 2-digit year, no delimiters, date mid-string, "-" characters do not delimit dates!

My current approach is this:

Starting from the left and right ends of the file name, I search forwards and backwards until a digit is encountered, then pass the characters in between to a simple strtotime call, as in the code section below.

The above approach fails in certain cases, though, such as:
"Sample File 04-04-2008 version 2008" :  In this case, it fails as digits are not expected to the right of the date.
"Kest-15-03292008_Injury_Report.xls":  This fails as the digits are not expected to the left of the date string.
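
The two failures above can be reproduced with a minimal sketch of the approach as described, assuming it is implemented as a greedy regular expression that runs from the first digit to the last digit of the name. The function name extractDigitSpan is hypothetical and is not part of the original code.

```php
<?php
// Hypothetical helper: returns everything from the first digit to the last
// digit of the file name (inclusive), or null if no such span exists.
function extractDigitSpan(string $name): ?string
{
    // \d.*\d is greedy, so it spans from the first digit to the last one.
    return preg_match('/\d.*\d/', $name, $m) === 1 ? $m[0] : null;
}

$names = [
    'Morning Meeting 4-29-08.xls',
    'Sample File 04-04-2008 version 2008',
    'Kest-15-03292008_Injury_Report.xls',
];

foreach ($names as $name) {
    // The last two names yield spans that include digits outside the date.
    echo $name, ' => ', var_export(extractDigitSpan($name), true), "\n";
}
```

Only the first name produces a clean date string; the other two return "04-04-2008 version 2008" and "15-03292008", which is why a pattern that describes the date itself is more robust than trimming to the outermost digits.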

Any advice on a robust way to parse dates for these kinds of unstructured file names?

Many thanks,

-Kevin
echo ("Embedded date: 4-29-08: ".date("Y-m-d", strtotime("4-29-08"))."\n");

Question by:Kevin_Cain
5 Comments
 
LVL 18

Expert Comment

by:Hube02
ID: 24124072
I would start with a regular expression to extract the date and then parse the date from there. Try the attached code. The regular expression extracts all of the possibilities given in your examples. I combined your examples into a single string and then grabbed all of the dates out of it.

For a single file name, you could use the following:

preg_match($regex, $string, $matches);

and your date would be stored in $matches[2].

<?php
	
$string = '// Case 1:  "Morning Meeting 4-29-08.xls" : Typical format, 1-digit month
// Case 2:  "Photon_DPR_04-25-2008_Atlantic.xls" : 2-digit month
// Case 3:  "Canyon DPR April 22 08.xlsx" : Month is spelled out
// Case 4:  "CDI_Time_Ticket_April 02_08.xls" : Month is spelled out, underscore used to delimit day/year
// Case 5:  "CCS_Atlantic_Log_Sheet_5-12-08.xls" : Underscore and 1-digit month value
// Case 6:  "CCS_Atlantic_DPR_05-14-08.xls" : Underscore and 2-digit month value
// Case 7:  "Integra_Daily_Report_4-21-2008.pdf" : Underscore and 1-digit month value, with 4-digit year
// Case 8:  "05132008 Wachs DPR Atlantic.xls" : No delimiters, date at front of string
// Case 9:  "Kest-15-03292008_Injury_Report.xls" : No delimiters, date mid-string, "-" characters do not delimit dates!
// Case 10:  "Sample-032908_Report.xls" : 2-digit year, no delimiters, date mid-string, "-" characters do not delimit dates!';
 
$regex = '/(:\d+-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';
preg_match_all($regex, $string, $dates);
print_r($dates);
	
?>

 
LVL 18

Expert Comment

by:Hube02
ID: 24124073
I take that back, I'm having trouble with your Case 9.
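
A quick check of the first regex against Case 9 suggests where it goes wrong: the optional (:\d+-)? prefix requires a colon, so it never consumes "-15-", and the date pattern then starts matching at "15". This is a sketch of that diagnosis, not part of the original comment.

```php
<?php
// First regex from the comment above, copied unchanged.
$regex = '/(:\d+-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';

$name = 'Kest-15-03292008_Injury_Report.xls';
preg_match($regex, $name, $matches);

// The match begins at "15" instead of at the real date "03292008".
echo $matches[2], "\n"; // prints 15-032920
```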
 
LVL 18

Accepted Solution

by:
Hube02 earned 500 total points
ID: 24124079
Got it. As long as your dates conform only to the cases given above, you can use this regex:

$regex = '/(-\d{2}-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';
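
One way to use this regex is to run preg_match on each file name, take $matches[2], and then split that string into month, day, and year. The sketch below assumes month-day-year order and that two-digit years belong to the 2000s, both inferred from the sample names rather than stated in the thread. The names DATE_REGEX, extractDate, and normalizeDate are hypothetical.

```php
<?php
// Accepted regex, copied unchanged from the answer above.
const DATE_REGEX = '/(-\d{2}-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';

// Hypothetical helper: returns the raw date text from a file name, or null.
function extractDate(string $name): ?string
{
    return preg_match(DATE_REGEX, $name, $m) === 1 ? $m[2] : null;
}

// Hypothetical helper: converts the raw text to "Y-m-d", or null if invalid.
// Assumes month-day-year order and that 2-digit years mean 20xx.
function normalizeDate(string $raw): ?string
{
    $parts = '/^(\d{1,2}|[a-z]+)[- _]*(\d{1,2})[- _]*(\d{2,4})$/i';
    if (preg_match($parts, $raw, $m) !== 1 || strlen($m[3]) === 3) {
        return null; // unparseable, or an ambiguous 3-digit year
    }
    $months = ['january', 'february', 'march', 'april', 'may', 'june', 'july',
               'august', 'september', 'october', 'november', 'december'];
    if (ctype_digit($m[1])) {
        $month = (int) $m[1];
    } else {
        $index = array_search(strtolower($m[1]), $months, true);
        if ($index === false) {
            return null; // word is not a full month name
        }
        $month = $index + 1;
    }
    $day  = (int) $m[2];
    $year = (int) $m[3];
    if (strlen($m[3]) === 2) {
        $year += 2000;
    }
    // checkdate rejects impossible dates such as month 13 or day 32.
    if (!checkdate($month, $day, $year)) {
        return null;
    }
    return sprintf('%04d-%02d-%02d', $year, $month, $day);
}

$names = [
    'Morning Meeting 4-29-08.xls',
    'Canyon DPR April 22 08.xlsx',
    'CDI_Time_Ticket_April 02_08.xls',
    '05132008 Wachs DPR Atlantic.xls',
    'Kest-15-03292008_Injury_Report.xls',
    'Sample-032908_Report.xls',
];

foreach ($names as $name) {
    $raw  = extractDate($name);
    $date = $raw === null ? null : normalizeDate($raw);
    echo $name, ' => ', $date ?? 'no valid date', "\n";
}
```

Splitting the match with a second pattern, rather than passing it to strtotime, keeps the month-day-year assumption explicit and lets checkdate reject strings that merely look like dates.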

 

Author Closing Comment

by:Kevin_Cain
ID: 31569249
Very fast response, and a well composed answer.
 
LVL 18

Expert Comment

by:Hube02
ID: 24124094
Thanks for the question.