Solved

Extract date from string in PHP

Posted on 2009-04-11
Last Modified: 2013-12-12
I am trying to extract dates in PHP from a large number of existing files.

The trick is this:  the dates are many different formats, and they're embedded within a file name in different ways.  Here are a few of the real-world cases I'm trying to fit:

// Case 1:  "Morning Meeting 4-29-08.xls" : Typical format, 1-digit month
// Case 2:  "Photon_DPR_04-25-2008_Atlantic.xls" : 2-digit month
// Case 3:  "Canyon DPR April 22 08.xlsx" : Month is spelled out
// Case 4:  "CDI_Time_Ticket_April 02_08.xls" : Month is spelled out, underscore used to delimit day/year
// Case 5:  "CCS_Atlantic_Log_Sheet_5-12-08.xls" : Underscore and 1-digit month value
// Case 6:  "CCS_Atlantic_DPR_05-14-08.xls" : Underscore and 2-digit month value
// Case 7:  "Integra_Daily_Report_4-21-2008.pdf" : Underscore and 1-digit month value, with 4-digit year
// Case 8:  "05132008 Wachs DPR Atlantic.xls" : No delimiters, date at front of string
// Case 9:  "Kest-15-03292008_Injury_Report.xls" : No delimiters, date mid-string, "-" characters do not delimit dates!
// Case 10:  "Sample-032908_Report.xls" : 2-digit year, no delimiters, date mid-string, "-" characters do not delimit dates!

My current approach is this:

Scan inward from both ends of the file name until a digit is found at each end, then pass the characters in between to a simple strtotime call, as in the code section below.

This approach fails in certain cases, though, such as:
"Sample File 04-04-2008 version 2008":  This fails because digits are not expected to the right of the date.
"Kest-15-03292008_Injury_Report.xls":  This fails because digits are not expected to the left of the date string.

Any advice on a robust way to parse dates from these kinds of unstructured file names?

Many thanks,

-Kevin
echo ("Embedded date: 4-29-08: ".date("Y-m-d", strtotime("4-29-08"))."\n");

Question by:Kevin_Cain
LVL 18

Expert Comment

by:Hube02
ID: 24124072
I would start with a regular expression to extract the date and then parse the date from there. Try the attached code. The regular expression extracts all of the possibilities given in your examples. I put your examples into a single string and then pull all of the dates out of it.

For a single file name, you could use the following:

preg_match($regex, $string, $matches);

and your date would be stored in $matches[2].

<?php
	
$string = '// Case 1:  "Morning Meeting 4-29-08.xls" : Typical format, 1-digit month
// Case 2:  "Photon_DPR_04-25-2008_Atlantic.xls" : 2-digit month
// Case 3:  "Canyon DPR April 22 08.xlsx" : Month is spelled out
// Case 4:  "CDI_Time_Ticket_April 02_08.xls" : Month is spelled out, underscore used to delimit day/year
// Case 5:  "CCS_Atlantic_Log_Sheet_5-12-08.xls" : Underscore and 1-digit month value
// Case 6:  "CCS_Atlantic_DPR_05-14-08.xls" : Underscore and 2-digit month value
// Case 7:  "Integra_Daily_Report_4-21-2008.pdf" : Underscore and 1-digit month value, with 4-digit year
// Case 8:  "05132008 Wachs DPR Atlantic.xls" : No delimiters, date at front of string
// Case 9:  "Kest-15-03292008_Injury_Report.xls" : No delimiters, date mid-string, "-" characters do not delimit dates!
// Case 10:  "Sample-032908_Report.xls" : 2-digit year, no delimiters, date mid-string, "-" characters do not delimit dates!';
 
$regex = '/(:\d+-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';
preg_match_all($regex, $string, $dates);
print_r($dates);
	
?>
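
For a single file name, the preg_match call mentioned above might look like the sketch below. It reuses the regex exactly as posted in this comment and the Case 1 file name from the question; the variable names $fileName and $embeddedDate are only illustrative.

```php
<?php
// Regex exactly as posted in this comment.
$regex = '/(:\d+-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';

// A single file name (Case 1 from the question).
$fileName = 'Morning Meeting 4-29-08.xls';

$embeddedDate = null;
if (preg_match($regex, $fileName, $matches)) {
    // Index 0 is the whole match, index 1 the optional prefix group,
    // index 2 the date text itself.
    $embeddedDate = $matches[2];
}

var_dump($embeddedDate); // string(7) "4-29-08"
```

preg_match stops at the first match, so this only fits file names that carry one date; preg_match_all, as in the code above, collects every match.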

LVL 18

Expert Comment

by:Hube02
ID: 24124073
I take that back, I'm having trouble with your Case 9.
 
LVL 18

Accepted Solution

by:
Hube02 earned 125 total points
ID: 24124079
Got it. As long as your dates conform to only the cases given above, you can use this regex:

$regex = '/(-\d{2}-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';
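
Once the date text is captured, it still has to be turned into a real date. Here is a rough sketch of one way to do that, not a definitive implementation. It makes several assumptions that are not in the thread: the helper name extractFileDate is made up, dates are always month-day-year, 2-digit years mean 20xx, and PHP 7.1 or later is available.

```php
<?php
// Accepted regex from this thread; capture group 2 holds the date text.
const DATE_REGEX = '/(-\d{2}-)?((\d{1,2}|january|february|march|april|may|june|july|august|september|october|november|december)[- _]*\d{1,2}[- _]*\d{2,4})/i';

const MONTH_NAMES = [
    'january', 'february', 'march', 'april', 'may', 'june', 'july',
    'august', 'september', 'october', 'november', 'december',
];

/**
 * Hypothetical helper: returns the embedded date as "Y-m-d", or null
 * when no plausible date is found.
 * Assumes month-day-year order and that 2-digit years mean 20xx.
 */
function extractFileDate(string $fileName): ?string
{
    if (!preg_match(DATE_REGEX, $fileName, $found)) {
        return null;
    }

    // Split the matched text into month, day and year pieces.
    $split = '/^(\d{1,2}|[a-z]+)[- _]*(\d{1,2})[- _]*(\d{2,4})$/i';
    if (!preg_match($split, $found[2], $parts)) {
        return null;
    }

    // The month is either a number or a spelled-out name.
    if (ctype_digit($parts[1])) {
        $month = (int) $parts[1];
    } else {
        $index = array_search(strtolower($parts[1]), MONTH_NAMES, true);
        if ($index === false) {
            return null;
        }
        $month = $index + 1;
    }

    $day = (int) $parts[2];

    // Reject 3-digit years; expand 2-digit years to 20xx (assumption).
    $yearLength = strlen($parts[3]);
    if ($yearLength === 3) {
        return null;
    }
    $year = (int) $parts[3];
    if ($yearLength === 2) {
        $year += 2000;
    }

    // checkdate() filters out impossible dates such as month 13.
    if (!checkdate($month, $day, $year)) {
        return null;
    }

    return sprintf('%04d-%02d-%02d', $year, $month, $day);
}

echo extractFileDate('Morning Meeting 4-29-08.xls'), "\n";        // 2008-04-29
echo extractFileDate('Kest-15-03292008_Injury_Report.xls'), "\n"; // 2008-03-29
echo extractFileDate('CDI_Time_Ticket_April 02_08.xls'), "\n";    // 2008-04-02
```

Splitting the pieces by hand and validating with checkdate avoids relying on how strtotime interprets dash-separated or undelimited strings. The month-day-year assumption would need revisiting if any files use day-month order.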

 

Author Closing Comment

by:Kevin_Cain
ID: 31569249
Very fast response, and a well composed answer.
 
LVL 18

Expert Comment

by:Hube02
ID: 24124094
Thanks for the question.