Solved

Parsing Time String with Timezone info

Posted on 2007-03-23
Last Modified: 2012-05-05
I'm trying to parse Apache log files with Python and I'm having difficulty parsing the time zone information.

import time
t = "23/Mar/2007:02:01:14 -0500"
pdate = time.strptime(t, "%d/%b/%Y:%H:%M:%S %Z")

The above code fails.  How can I take the above string and parse it to get a unix timestamp in GMT?  I looked at mx.DateTime but the parser documentation didn't seem to say anything about timezone information.  Does mx.DateTime handle this correctly or do I need something different?
Question by:phasevar

Expert Comment

by:pepr
ID: 18782202
The problem is the %Z timezone directive. Try the following:

import time
# t = "23/Mar/2007:02:01:14 -0500"
t = "23/Mar/2007:02:01:14"
pdate = time.strptime(t, "%d/%b/%Y:%H:%M:%S")
print(time.strftime("%d/%b/%Y:%H:%M:%S %Z", pdate))

I commented out your original string and dropped the offset from both the value and the strptime() format string. With that change, pdate is parsed correctly (in my case). Formatting it back with strftime() and your original format string shows what %Z expects on input.

In my case it looks really strange (you know MS Windows and its view of how a timezone should be displayed):

23/Mar/2007:02:01:14 Střední Evropa (běžný čas)

It is in Czech and means "Central Europe (standard time)".

When I put that string back into your t, set the encoding of the Python source file correctly, and use your original code, it works. The question now is how to accept the -0500.
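A quick way to see the difference, as a rough sketch: %Z matches zone names (strptime() always accepts UTC and GMT, plus the local zone names), while a numeric offset like -0500 is rejected. try_parse is a helper name made up for this illustration, and print() is written in the Python 3 style.

```python
import time

FMT = "%d/%b/%Y:%H:%M:%S %Z"

def try_parse(text):
    """Return the parsed struct_time, or None if strptime() rejects the text.

    Hypothetical helper, used only for this demonstration.
    """
    try:
        return time.strptime(text, FMT)
    except ValueError:
        return None

numeric = try_parse("23/Mar/2007:02:01:14 -0500")  # numeric offset
named = try_parse("23/Mar/2007:02:01:14 UTC")      # zone name

print(numeric is None)     # True: %Z does not understand -0500
print(named is not None)   # True: a zone name is accepted
```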

Expert Comment

by:pepr
ID: 18782484
You can split off the timezone offset manually and convert it to an int:

t = "23/Mar/2007:02:01:14 -0500"
tlocStr, zoneStr = t.split(" ")
zoneOffset = int(zoneStr[:3])  # "-05" -> -5; the minutes part is ignored
print(tlocStr)
print(zoneStr)
print(zoneOffset)

You can parse the local time from tlocStr using the simplified format string ("%d/%b/%Y:%H:%M:%S") and then subtract the zone offset to get UTC. If the log always contains the same zone offset, you may want to ignore it.
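A rough sketch of that manual approach which also keeps the minutes part of the offset (so that +0530 works). parse_offset and to_utc are names invented here, not part of any library. Since the logged local time is UTC plus the offset, the offset is subtracted to get back to UTC.

```python
from datetime import datetime, timedelta

def parse_offset(zone):
    """Turn '+HHMM' or '-HHMM' into a timedelta (hypothetical helper)."""
    sign = -1 if zone[0] == "-" else 1
    hours, minutes = int(zone[1:3]), int(zone[3:5])
    return sign * timedelta(hours=hours, minutes=minutes)

def to_utc(stamp):
    """Convert an Apache-style timestamp to a naive datetime in UTC."""
    local_str, zone = stamp.split(" ")
    local = datetime.strptime(local_str, "%d/%b/%Y:%H:%M:%S")
    # local = UTC + offset, so UTC = local - offset
    return local - parse_offset(zone)

print(to_utc("23/Mar/2007:02:01:14 -0500"))  # 2007-03-23 07:01:14
print(to_utc("23/Mar/2007:02:01:14 +0530"))  # 2007-03-22 20:31:14
```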

Expert Comment

by:pepr
ID: 18782523
Have a look at RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html) for what the -0500 means exactly. Put simply, the local time is 5 hours behind UTC (a.k.a. Greenwich Mean Time), so UTC is the local time plus 5 hours.

Author Comment

by:phasevar
ID: 18782562
Yeah, the TZ is the issue.   Without parsing the TZ info I'll have incorrect dates.

Accepted Solution

by:
pepr earned 500 total points
ID: 18783206
Try the standard datetime module to apply the timezone offset in hours:

==========================================================
import datetime

t = "23/Mar/2007:02:01:14 -0500"
tloc, zone = t.split(" ")
zoneOffset = int(zone[:3])  # whole hours only: "-05" -> -5
print(tloc)
print(zoneOffset)

dt = datetime.datetime.strptime(tloc, "%d/%b/%Y:%H:%M:%S")
delta = datetime.timedelta(hours=zoneOffset)

print(dt.strftime("%d/%b/%Y:%H:%M:%S"))
# local = UTC + offset, so subtract the offset to get UTC
dt = dt - delta
print(dt.strftime("%d/%b/%Y:%H:%M:%S"))
==========================================================

This produces
==========================================================
23/Mar/2007:02:01:14
-5
23/Mar/2007:02:01:14
23/Mar/2007:07:01:14
==========================================================
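The original question asked for a Unix timestamp in GMT. As a hedged sketch for newer interpreters: since Python 3.2, datetime.strptime() understands the %z directive for numeric offsets such as -0500, and calendar.timegm() converts a UTC time tuple to seconds since the epoch. apache_time_to_unix is a hypothetical helper name, and the month name is assumed to be parsed under an English/C locale.

```python
import calendar
from datetime import datetime

def apache_time_to_unix(text):
    """Parse e.g. '23/Mar/2007:02:01:14 -0500' into a Unix timestamp.

    Hypothetical helper; requires Python 3.2+ for %z in strptime().
    """
    # %z consumes the +HHMM / -HHMM offset and yields an aware datetime.
    aware = datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")
    # utctimetuple() normalizes to UTC; timegm() treats the tuple as UTC
    # (unlike time.mktime(), which assumes local time).
    return calendar.timegm(aware.utctimetuple())

ts = apache_time_to_unix("23/Mar/2007:02:01:14 -0500")
print(ts)  # 1174633274, i.e. 2007-03-23 07:01:14 UTC
```

With this approach the offset's minutes are handled too, and no manual splitting of the string is needed.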
