tel2
asked on
Converting seconds to hh:mm:ss
I'm trying to convert raw phone call data collected from a PABX, into a format suitable for loading into a database. As part of this, I need to be able to convert a 4 digit field which represents the call duration, in 6 second periods, into hours:minutes:seconds. Eg: "0011" should be converted to "0:1:6" (or alternatively "00:01:06", ie: 11 x 6 = 66, which = 0 hours, 1 minutes & 6 seconds).
This field is the 3rd "|" delimited field in a record, and there may be thousands of records (lines) in the input file. For every 3rd field of each record, the conversion should be made.
Here's a sample input file:
01/01/1999|01:20|0011|12|6 78...
01/01/1999|03:20|0621||4.. .
01/01/1999|05:41|0001|1|77 ...
And here's the expected output file:
01/01/1999|01:20|0:1:6|12| 678...
01/01/1999|03:20|1:2:6||4. ..
01/01/1999|05:41|0:0:6|1|7 7...
Your mission, is to write a script (preferably using Perl, or some standard (fast) UNIX utility (preferably not C) which will do this very quickly (ie: NOT with a shell loop)), which takes the 1st file as standard input, and writes the 2nd as standard output. You should preferably take the field as being between the 2nd and 3rd "|" delimiters, or if you can't do that, take bytes 18-21.
For extra points, subtract this newly generated call duration from the call end date/time (fields 1 and 2), to generate call start date/time fields, which should replace the original fields 1 and 2.
Thanks. Any questions?
This field is the 3rd "|" delimited field in a record, and there may be thousands of records (lines) in the input file. For every 3rd field of each record, the conversion should be made.
Here's a sample input file:
01/01/1999|01:20|0011|12|6
01/01/1999|03:20|0621||4..
01/01/1999|05:41|0001|1|77
And here's the expected output file:
01/01/1999|01:20|0:1:6|12|
01/01/1999|03:20|1:2:6||4.
01/01/1999|05:41|0:0:6|1|7
Your mission, is to write a script (preferably using Perl, or some standard (fast) UNIX utility (preferably not C) which will do this very quickly (ie: NOT with a shell loop)), which takes the 1st file as standard input, and writes the 2nd as standard output. You should preferably take the field as being between the 2nd and 3rd "|" delimiters, or if you can't do that, take bytes 18-21.
For extra points, subtract this newly generated call duration from the call end date/time (fields 1 and 2), to generate call start date/time fields, which should replace the original fields 1 and 2.
Thanks. Any questions?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Instead of using awk, the nawk is used. It is more powerful than awk. I had built an application with 10k lines of codes to create Oracle tables, load data into Oracle tables, and build a web interface. What else you want to do with nawk?
If you do not cross the year, the following code should do what you have asked for. If you do have cases of crossing years, please let me know.
How many points the following codes worth? :-)
GeoTiger
#!/bin/nawk -f
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2
tot = $3 * 6 ;
hr = int(tot / 3600); rst = tot - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
str = str "|" hr ":" mm ":" ss;
str = str "|" getCallTime($1,$2,tot);
for (i=4; i<=NF; i++) { str = str "|" $i }
print str;
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d %02d%02d", a[3], a[1], a[2],
b[1], b[2], 0)
yr = int(ymd) # get year
t = getSecondsInYear(ymd) # year got dropped
r = t - gap;
return yr cvtSecondsToTime(r)
}
# <a name="cvtSecondsToTime"></ a>
function cvtSecondsToTime(sec, rst,Y,M,D,hh,mm,ss,tm,ds,f ,a,r) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d00 00", Y), ds, 10)
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - hh * f["ms"]
ss = rst
if (Y==0) {
r = substr(ymd, 5,4) "." sprintf("%02d%02d%02d", hh, mm, ss)
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></ a>
function getSecondsInYear(tm, f,a,sec) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
sec = getDaysInYear(tm) * f["ds"]
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
n = getDaysInYear(sprintf("%04 d%02d%02d" , Y, M, D)) + gap + 0
while (n < 0) {
if (Y%4==0) { n += 366; --Y }
else { n += 365; --Y }
}
while (n > 365) {
if (Y%4 == 0) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
The output is
$ ./tst03 tst03.txt
01/01/1999|01:20|0:1:6|199 90101.0118 1074|12|67 8...
01/01/1999|03:20|1:2:6|199 90101.0217 954||4...
01/01/1999|05:41|0:0:6|199 90101.0540 2154|1|77. ..
If you do not cross the year, the following code should do what you have asked for. If you do have cases of crossing years, please let me know.
How many points the following codes worth? :-)
GeoTiger
#!/bin/nawk -f
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2
tot = $3 * 6 ;
hr = int(tot / 3600); rst = tot - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
str = str "|" hr ":" mm ":" ss;
str = str "|" getCallTime($1,$2,tot);
for (i=4; i<=NF; i++) { str = str "|" $i }
print str;
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d
b[1], b[2], 0)
yr = int(ymd) # get year
t = getSecondsInYear(ymd) # year got dropped
r = t - gap;
return yr cvtSecondsToTime(r)
}
# <a name="cvtSecondsToTime"></
function cvtSecondsToTime(sec, rst,Y,M,D,hh,mm,ss,tm,ds,f
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d00
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - hh * f["ms"]
ss = rst
if (Y==0) {
r = substr(ymd, 5,4) "." sprintf("%02d%02d%02d", hh, mm, ss)
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></
function getSecondsInYear(tm, f,a,sec) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
sec = getDaysInYear(tm) * f["ds"]
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
n = getDaysInYear(sprintf("%04
while (n < 0) {
if (Y%4==0) { n += 366; --Y }
else { n += 365; --Y }
}
while (n > 365) {
if (Y%4 == 0) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
The output is
$ ./tst03 tst03.txt
01/01/1999|01:20|0:1:6|199
01/01/1999|03:20|1:2:6|199
01/01/1999|05:41|0:0:6|199
ASKER
geotiger,
WOW! That's an amazing amount of work you've done for me, and I appreciate it. I wasn't expecting such a prompt and lengthy answer!
I'm sorry I took so long to get back to you on this.
I still haven't used your first answer yet (though I have tested it), and I may never get the chance to, but I'll definitely keep it in case I need it.
The 2nd answer is less likely to be used, but it would also be good to have.
Here are my comments:
1. I don't recognise your date/time format. Looks like: YYYYMMDD.HHMM???? or something. What is it exactly?
2. In the first example, I would have expected 1:20:00 - 0:01:06 = 1:18:54, but your answer is ...01181074. Please explain.
3. The answer should be in "DD/MM/YYYY|HH:MM:SS" format, because it is meant to "replace the original fields 1 & 2". In the above example, the output should be:
01/01/1999|01:18:54|0:1:6| 12|678...
4. Yes, it would need to handle year boundaries.
If you're still interested, and you can do all the above, I'll give you 100 points. Otherwise, I'll give you 30 points for what you've done so far.
Which do you prefer?
I know it's not many points for all that work, but it was meant to be just an option add-on extra.
Anyway, keep up the good work, and it's good to have you on the EE team!
WOW! That's an amazing amount of work you've done for me, and I appreciate it. I wasn't expecting such a prompt and lengthy answer!
I'm sorry I took so long to get back to you on this.
I still haven't used your first answer yet (though I have tested it), and I may never get the chance to, but I'll definitely keep it in case I need it.
The 2nd answer is less likely to be used, but it would also be good to have.
Here are my comments:
1. I don't recognise your date/time format. Looks like: YYYYMMDD.HHMM???? or something. What is it exactly?
2. In the first example, I would have expected 1:20:00 - 0:01:06 = 1:18:54, but your answer is ...01181074. Please explain.
3. The answer should be in "DD/MM/YYYY|HH:MM:SS" format, because it is meant to "replace the original fields 1 & 2". In the above example, the output should be:
01/01/1999|01:18:54|0:1:6|
4. Yes, it would need to handle year boundaries.
If you're still interested, and you can do all the above, I'll give you 100 points. Otherwise, I'll give you 30 points for what you've done so far.
Which do you prefer?
I know it's not many points for all that work, but it was meant to be just an option add-on extra.
Anyway, keep up the good work, and it's good to have you on the EE team!
Ok, you asked for it. :-)
1. I had a typo in the previous program, that is why you got more digits in the date output. The correct format should be YYYYMMDD.hhmmss.
2. The answer to 1 should correct this one.
3. Running the program as ./tst03.awk your_data_file, you should get the format you requested.
4. It handles leap years and year boundaries. See the last record I added. It gets the correct date (2/29) and time.
$ more tst03.txt
01/01/1999|01:20|0011|12|6 78...
01/01/1999|03:20|0621||4.. .
01/01/1999|05:41|0001|1|77 ...
01/01/1999|00:41|0015|1|77 ...
03/01/1996|00:01|0135|1|77 ...
$ more ./tst03.awk
#!/bin/nawk -f
# File name tst03.awk
# Purpose: 1. convert seconds into hh:mm:ss format
# 2. get dt_tm from dt_tm minus gap in seconds
#
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2 # $1 - year in MM/DD/YYYY
# $2 - call end time in hh:mm
tot = $3 * 6 ; # $3 - count of each 6 second
tm1 = cvtSec2Tm(tot)
tm2 = getCallTime($1,$2,tot)
str = str "|" $3 "|" tot
str = str "|" tm1
str = str "|" tm2
tel = cvtTimeFormat(tm2) "|" tm1
for (i=4; i<=NF; i++) {
str = str "|" $i; tel = tel "|" $i
}
if (mytest==1) { print str} else { print tel }
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d %02d%02d", a[3], a[1], a[2],
b[1], b[2], 0)
yr = substr(ymd,1,4) # get year
t = getSecondsInYear(ymd) # year got dropped
r = t - gap;
if (r < 0) { # not enough seconds in the year
if (isLeapYear(yr)) {
t = t + 366 * 86400 # 1 day = 86400 seconds
} else {
t = t + 365 * 86400
}
--yr; # reduce one year
r = t - gap
}
return yr cvtSecondsToTime(r)
}
function cvtTimeFormat(tm, Y, M, D, hh, mm, ss) {
# convert YYYYMMDD.hhmmss to DD/MM/YYYY|hh:mm:ss
Y = substr(tm, 1, 4)
M = substr(tm, 5, 2)
D = substr(tm, 7, 2)
hh = substr(tm, 10,2)
mm = substr(tm, 12,2)
ss = substr(tm, 14,2)
return sprintf("%02d/%02d/%04d|%0 2d:%02d:%0 2d",D,M,Y, hh,mm,ss)
}
function cvtSec2Tm(sec, hh, mm, ss, rst) {
hr = int(sec / 3600); rst = sec - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
return hr ":" mm ":" ss
}
# <a name="cvtSecondsToTime"></ a>
function cvtSecondsToTime(sec, rst,Y,M,D,hh,mm,ss,tm,ds,f ,a,r) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d00 00", Y), ds, 10)
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - mm * f["ms"]
ss = rst
if (Y==0) {
r = substr(f["ymd"], 5,4) "." sprintf("%02d%02d%02d",hh, mm,ss)
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></ a>
function getSecondsInYear(tm, f,a,sec) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
sec = getDaysInYear(tm) * f["ds"]
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
n = getDaysInYear(sprintf("%04 d%02d%02d" , Y, M, D)) + gap + 0
while (n < 0) {
if (Y%4==0) { n += 366; --Y }
else { n += 365; --Y }
}
while (n > 365) {
if (Y%4 == 0) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
# <a name="isLeapYear"></a>
function isLeapYear(year) {
# Input variables:
# year - year in the format of YYYY
# Local variables: None
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return: 1 or 0
#
if ((year % 400) == 0) return 1
else if ((year % 100) == 0) return 1
else if ((year % 4) == 0) return 1
else return 0
}
$ ./tst03.awk tst03.txt
01/01/1999|01:18:54|0:1:6| 12|678...
01/01/1999|02:17:54|1:2:6| |4...
01/01/1999|05:40:54|0:0:6| 1|77...
01/01/1999|00:39:30|0:1:30 |1|77...
29/02/1996|23:47:30|0:13:3 0|1|77...
$ ./tst03.awk mytest=1 tst03.txt
01/01/1999|01:20|0011|66|0 :1:6|19990 101.011854 |12|678...
01/01/1999|03:20|0621|3726 |1:2:6|199 90101.0217 54||4...
01/01/1999|05:41|0001|6|0: 0:6|199901 01.054054| 1|77...
01/01/1999|00:41|0015|90|0 :1:30|1999 0101.00393 0|1|77...
03/01/1996|00:01|0135|810| 0:13:30|19 960229.234 730|1|77.. .
$
1. I had a typo in the previous program, that is why you got more digits in the date output. The correct format should be YYYYMMDD.hhmmss.
2. The answer to 1 should correct this one.
3. Running the program as ./tst03.awk your_data_file, you should get the format you requested.
4. It handles leap years and year boundaries. See the last record I added. It gets the correct date (2/29) and time.
$ more tst03.txt
01/01/1999|01:20|0011|12|6
01/01/1999|03:20|0621||4..
01/01/1999|05:41|0001|1|77
01/01/1999|00:41|0015|1|77
03/01/1996|00:01|0135|1|77
$ more ./tst03.awk
#!/bin/nawk -f
# File name tst03.awk
# Purpose: 1. convert seconds into hh:mm:ss format
# 2. get dt_tm from dt_tm minus gap in seconds
#
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2 # $1 - year in MM/DD/YYYY
# $2 - call end time in hh:mm
tot = $3 * 6 ; # $3 - count of each 6 second
tm1 = cvtSec2Tm(tot)
tm2 = getCallTime($1,$2,tot)
str = str "|" $3 "|" tot
str = str "|" tm1
str = str "|" tm2
tel = cvtTimeFormat(tm2) "|" tm1
for (i=4; i<=NF; i++) {
str = str "|" $i; tel = tel "|" $i
}
if (mytest==1) { print str} else { print tel }
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d
b[1], b[2], 0)
yr = substr(ymd,1,4) # get year
t = getSecondsInYear(ymd) # year got dropped
r = t - gap;
if (r < 0) { # not enough seconds in the year
if (isLeapYear(yr)) {
t = t + 366 * 86400 # 1 day = 86400 seconds
} else {
t = t + 365 * 86400
}
--yr; # reduce one year
r = t - gap
}
return yr cvtSecondsToTime(r)
}
function cvtTimeFormat(tm, Y, M, D, hh, mm, ss) {
# convert YYYYMMDD.hhmmss to DD/MM/YYYY|hh:mm:ss
Y = substr(tm, 1, 4)
M = substr(tm, 5, 2)
D = substr(tm, 7, 2)
hh = substr(tm, 10,2)
mm = substr(tm, 12,2)
ss = substr(tm, 14,2)
return sprintf("%02d/%02d/%04d|%0
}
function cvtSec2Tm(sec, hh, mm, ss, rst) {
hr = int(sec / 3600); rst = sec - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
return hr ":" mm ":" ss
}
# <a name="cvtSecondsToTime"></
function cvtSecondsToTime(sec, rst,Y,M,D,hh,mm,ss,tm,ds,f
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d00
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - mm * f["ms"]
ss = rst
if (Y==0) {
r = substr(f["ymd"], 5,4) "." sprintf("%02d%02d%02d",hh,
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></
function getSecondsInYear(tm, f,a,sec) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
sec = getDaysInYear(tm) * f["ds"]
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
n = getDaysInYear(sprintf("%04
while (n < 0) {
if (Y%4==0) { n += 366; --Y }
else { n += 365; --Y }
}
while (n > 365) {
if (Y%4 == 0) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
# <a name="isLeapYear"></a>
function isLeapYear(year) {
# Input variables:
# year - year in the format of YYYY
# Local variables: None
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return: 1 or 0
#
if ((year % 400) == 0) return 1
else if ((year % 100) == 0) return 1
else if ((year % 4) == 0) return 1
else return 0
}
$ ./tst03.awk tst03.txt
01/01/1999|01:18:54|0:1:6|
01/01/1999|02:17:54|1:2:6|
01/01/1999|05:40:54|0:0:6|
01/01/1999|00:39:30|0:1:30
29/02/1996|23:47:30|0:13:3
$ ./tst03.awk mytest=1 tst03.txt
01/01/1999|01:20|0011|66|0
01/01/1999|03:20|0621|3726
01/01/1999|05:41|0001|6|0:
01/01/1999|00:41|0015|90|0
03/01/1996|00:01|0135|810|
$
ASKER
Amazing response geotiger! Looks really good on the test data I provided, but I don't understand why it's not working on some real data of mine (see below). Note that the 2 spaces at the start of my previous test data were just to indent the records for "clarity", but I don't think this is the issue, since it still failed when I inserted spaces before my real data. Real data also has digits etc in place of the "..." at the end of each line. Here it is:
# cat real1
21/11/1999|17:28|0014||042 382258|433 63||
21/11/1999|17:57|0001||078 967783|433 63||
21/11/1999|17:58|0003||078 966767|433 63||
# ./tst03.awk real1
01/20/1999|11:01:17|0:1:24 ||04238225 8|43363||
01/20/1999|11:01:17|0:0:6| |078967783 |43363||
01/20/1999|11:01:17|0:0:18 ||07896676 7|43363||
Eg: 21/11/1999 17:28:00 - 00:01:24 = 21/11/1999 17:26:36 (not 11/20/1999 11:01:17). Note that both the date and time are incorrect.
Any ideas, geotiger?
# cat real1
21/11/1999|17:28|0014||042
21/11/1999|17:57|0001||078
21/11/1999|17:58|0003||078
# ./tst03.awk real1
01/20/1999|11:01:17|0:1:24
01/20/1999|11:01:17|0:0:6|
01/20/1999|11:01:17|0:0:18
Eg: 21/11/1999 17:28:00 - 00:01:24 = 21/11/1999 17:26:36 (not 11/20/1999 11:01:17). Note that both the date and time are incorrect.
Any ideas, geotiger?
ASKER
geotiger,
In my previous message, lines 2-3 of the data under "# cat real1" have spaces in front of them. Please ignore these spaces. Somehow I must have pasted them wrongly. None of my data actually starts with spaces.
Thanks.
In my previous message, lines 2-3 of the data under "# cat real1" have spaces in front of them. Please ignore these spaces. Somehow I must have pasted them wrongly. None of my data actually starts with spaces.
Thanks.
Sorry, I thought your input date in the first column was MM/DD/YYYY. It is very easy to fix, just use the following line to replace the corresponding line in getCallTime function:
ymd = sprintf("%04d%02d%02d.%02d %02d%02d", a[3], a[2], a[1], b[1], b[2], 0)
You should not worry about the leading spaces. After splitting the field, the program treats the split fields as numerical. Try the about change and let me know.
ymd = sprintf("%04d%02d%02d.%02d
You should not worry about the leading spaces. After splitting the field, the program treats the split fields as numerical. Try the about change and let me know.
ASKER
geotiger,
I made that change, and here's some sample input:
21/11/1999|17:28|0014||042 382258|433 63||
21/11/1999|17:57|0001||078 967783|433 63||
01/01/1999|00:00|0006|NJ|9 8274|43363 ||
which results in this output:
20/11/1999|17:26:36|0:1:24 ||04238225 8|43363||
20/11/1999|17:56:54|0:0:6| |078967783 |43363||
00/01/1999|23:59:24|0:0:36 |NJ|98274| 43363||
As you can see, the 1st 2 dates are 1 day early, and the 3rd is an invalid date. I look forward to hearing from you. Thanks.
I made that change, and here's some sample input:
21/11/1999|17:28|0014||042
21/11/1999|17:57|0001||078
01/01/1999|00:00|0006|NJ|9
which results in this output:
20/11/1999|17:26:36|0:1:24
20/11/1999|17:56:54|0:0:6|
00/01/1999|23:59:24|0:0:36
As you can see, the 1st 2 dates are 1 day early, and the 3rd is an invalid date. I look forward to hearing from you. Thanks.
Change one place is not enough. Use the program I posted in this message. Look the example, the dates are coming out correctly.
$ more tst03.awk
#!/bin/nawk -f
# File name tst03.awk
# Purpose: 1. convert seconds into hh:mm:ss format
# 2. get dt_tm from dt_tm minus gap in seconds
#
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2 # $1 - year in MM/DD/YYYY
# $2 - call end time in hh:mm
tot = $3 * 6 ; # $3 - count of each 6 second
tm1 = cvtSec2Tm(tot)
tm2 = getCallTime($1,$2,tot)
str = str "|" $3 "|" tot
str = str "|" tm1
str = str "|" tm2
tel = cvtTimeFormat(tm2) "|" tm1
for (i=4; i<=NF; i++) {
str = str "|" $i; tel = tel "|" $i
}
if (mytest==1) { print str} else { print tel }
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d %02d%02d", a[3], a[2], a[1],
b[1], b[2], 0)
yr = substr(ymd,1,4) # get year
t = getSecondsInYear(ymd) # year got dropped
# print ymd ":" t ":" yr ":" gap
r = t - gap;
if (r < 0) { # not enough seconds in the year
if (isLeapYear(yr)) {
t = t + 366 * 86400 # 1 day = 86400 seconds
} else {
t = t + 365 * 86400
}
--yr; # reduce one year
r = t - gap
}
# print t " - " gap " = " r
return yr cvtSecondsToTime(yr, r)
}
function cvtTimeFormat(tm, Y, M, D, hh, mm, ss) {
# convert YYYYMMDD.hhmmss to DD/MM/YYYY|hh:mm:ss
Y = substr(tm, 1, 4)
M = substr(tm, 5, 2)
D = substr(tm, 7, 2)
hh = substr(tm, 10,2)
mm = substr(tm, 12,2)
ss = substr(tm, 14,2)
return sprintf("%02d/%02d/%04d|%0 2d:%02d:%0 2d",D,M,Y, hh,mm,ss)
}
function cvtSec2Tm(sec, hh, mm, ss, rst) {
hr = int(sec / 3600); rst = sec - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
return hr ":" mm ":" ss
}
# <a name="cvtSecondsToTime"></ a>
function cvtSecondsToTime(yr,sec, rst,Y,M,D,hh,mm,ss,tm,ds,f ,a,r) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d01 01", yr), ds, 10)
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - mm * f["ms"]
ss = rst
if (Y==0) {
r = substr(f["ymd"], 5,4) "." sprintf("%02d%02d%02d",hh, mm,ss)
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></ a>
function getSecondsInYear(tm, f,a,sec,t) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
t = getDaysInYear(tm)
if (t>0) { sec = (t-1) * f["ds"] } # so we get accurate seconds
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
# if (M==0) { M=1 }
# if (D==0) { D=1 }
n = getDaysInYear(sprintf("%04 d%02d%02d" , Y, M, D)) + gap + 0
while (n < 0) {
if (Y%4==0) { n += 366; --Y }
else { n += 365; --Y }
}
while (n > 365) {
if (Y%4 == 0) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
# <a name="isLeapYear"></a>
function isLeapYear(year) {
# Input variables:
# year - year in the format of YYYY
# Local variables: None
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return: 1 or 0
#
if ((year % 400) == 0) return 1
else if ((year % 100) == 0) return 1
else if ((year % 4) == 0) return 1
else return 0
}
$ more tst03.txt
01/01/1999|01:20|0011|12|6 78...
01/01/1999|03:20|0621||4.. .
01/01/1999|05:41|0001|1|77 ...
01/01/1999|00:41|0015|1|77 ...
01/01/1999|00:01|1015|1|77 ...
01/03/1996|00:01|0135|1|77 ...
$ ./tst03.awk tst03.txt
01/01/1999|01:18:54|0:1:6| 12|678...
01/01/1999|02:17:54|1:2:6| |4...
01/01/1999|05:40:54|0:0:6| 1|77...
01/01/1999|00:39:30|0:1:30 |1|77...
31/12/1998|22:19:30|1:41:3 0|1|77...
29/02/1996|23:47:30|0:13:3 0|1|77...
$ more tst03.awk
#!/bin/nawk -f
# File name tst03.awk
# Purpose: 1. convert seconds into hh:mm:ss format
# 2. get dt_tm from dt_tm minus gap in seconds
#
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2 # $1 - year in MM/DD/YYYY
# $2 - call end time in hh:mm
tot = $3 * 6 ; # $3 - count of each 6 second
tm1 = cvtSec2Tm(tot)
tm2 = getCallTime($1,$2,tot)
str = str "|" $3 "|" tot
str = str "|" tm1
str = str "|" tm2
tel = cvtTimeFormat(tm2) "|" tm1
for (i=4; i<=NF; i++) {
str = str "|" $i; tel = tel "|" $i
}
if (mytest==1) { print str} else { print tel }
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d
b[1], b[2], 0)
yr = substr(ymd,1,4) # get year
t = getSecondsInYear(ymd) # year got dropped
# print ymd ":" t ":" yr ":" gap
r = t - gap;
if (r < 0) { # not enough seconds in the year
if (isLeapYear(yr)) {
t = t + 366 * 86400 # 1 day = 86400 seconds
} else {
t = t + 365 * 86400
}
--yr; # reduce one year
r = t - gap
}
# print t " - " gap " = " r
return yr cvtSecondsToTime(yr, r)
}
function cvtTimeFormat(tm, Y, M, D, hh, mm, ss) {
# convert YYYYMMDD.hhmmss to DD/MM/YYYY|hh:mm:ss
Y = substr(tm, 1, 4)
M = substr(tm, 5, 2)
D = substr(tm, 7, 2)
hh = substr(tm, 10,2)
mm = substr(tm, 12,2)
ss = substr(tm, 14,2)
return sprintf("%02d/%02d/%04d|%0
}
function cvtSec2Tm(sec, hh, mm, ss, rst) {
hr = int(sec / 3600); rst = sec - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
return hr ":" mm ":" ss
}
# <a name="cvtSecondsToTime"></
function cvtSecondsToTime(yr,sec, rst,Y,M,D,hh,mm,ss,tm,ds,f
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d01
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - mm * f["ms"]
ss = rst
if (Y==0) {
r = substr(f["ymd"], 5,4) "." sprintf("%02d%02d%02d",hh,
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></
function getSecondsInYear(tm, f,a,sec,t) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
t = getDaysInYear(tm)
if (t>0) { sec = (t-1) * f["ds"] } # so we get accurate seconds
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
# if (M==0) { M=1 }
# if (D==0) { D=1 }
n = getDaysInYear(sprintf("%04
while (n < 0) {
if (Y%4==0) { n += 366; --Y }
else { n += 365; --Y }
}
while (n > 365) {
if (Y%4 == 0) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
# <a name="isLeapYear"></a>
function isLeapYear(year) {
# Input variables:
# year - year in the format of YYYY
# Local variables: None
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return: 1 or 0
#
if ((year % 400) == 0) return 1
else if ((year % 100) == 0) return 1
else if ((year % 4) == 0) return 1
else return 0
}
$ more tst03.txt
01/01/1999|01:20|0011|12|6
01/01/1999|03:20|0621||4..
01/01/1999|05:41|0001|1|77
01/01/1999|00:41|0015|1|77
01/01/1999|00:01|1015|1|77
01/03/1996|00:01|0135|1|77
$ ./tst03.awk tst03.txt
01/01/1999|01:18:54|0:1:6|
01/01/1999|02:17:54|1:2:6|
01/01/1999|05:40:54|0:0:6|
01/01/1999|00:39:30|0:1:30
31/12/1998|22:19:30|1:41:3
29/02/1996|23:47:30|0:13:3
ASKER
Again, apologies for the delay in getting back to you earlier geotiger, and thanks again for your efforts in fixing the last bug!
Unfortunately, after trying a bit more test data, I found another bug. The following input:
01/01/1999|00:00|0001|1|77 ...
01/01/2000|00:00|0001|1|77 ...
01/01/2001|00:00|0001|1|77 ...
Produces this output:
31/12/1998|23:59:54|0:0:6| 1|77...
99/19/1999|10:01:23|0:0:6| 1|77...
30/12/2000|23:59:54|0:0:6| 1|77...
As you can see, the first record is correct, but the next 2 are wrong (last should be 31/12/2000). Any ideas? If you want to give up that's OK, but if you can get it working perfectly, the 100 points are still yours.
This script is now taking 80 secs for 2500 records. Still not too bad.
I might be asking EE for a Perl solution sometime. I think Perl can be quite compact and fast. It will be interesting to compare Perl & awk. Do you know any Perl?
Unfortunately, after trying a bit more test data, I found another bug. The following input:
01/01/1999|00:00|0001|1|77
01/01/2000|00:00|0001|1|77
01/01/2001|00:00|0001|1|77
Produces this output:
31/12/1998|23:59:54|0:0:6|
99/19/1999|10:01:23|0:0:6|
30/12/2000|23:59:54|0:0:6|
As you can see, the first record is correct, but the next 2 are wrong (last should be 31/12/2000). Any ideas? If you want to give up that's OK, but if you can get it working perfectly, the 100 points are still yours.
This script is now taking 80 secs for 2500 records. Still not too bad.
I might be asking EE for a Perl solution sometime. I think Perl can be quite compact and fast. It will be interesting to compare Perl & awk. Do you know any Perl?
It is not the issue of languages. It is an issue of date and time calculation. Please post a new thread in "Unix programming area" and I will try to fix it. I think the problem is coming from leap year.
ASKER
I know the language is not the issue. It's just a logic error.
I'm interested in a Perl solution because I'd be interested to compare code size and performance with what you've written in awk. I plan to repost the question sometime. Thanks again.
I'm interested in a Perl solution because I'd be interested to compare code size and performance with what you've written in awk. I plan to repost the question sometime. Thanks again.
I do not know whether you are still interested in awk solution. For the past a few days, I re-examine the code and finally got it to work. The debugginh is REALLY time consuming!!! The last three records are added from your message.
Hope the following codes deserve the 100 points.
$ cat tst03.awk
#!/bin/nawk -f
# File name tst03.awk
# Purpose: 1. convert seconds into hh:mm:ss format
# 2. get dt_tm from dt_tm minus gap in seconds
#
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2 # $1 - year in MM/DD/YYYY
# $2 - call end time in hh:mm
tot = $3 * 6 ; # $3 - count of each 6 second
tm1 = cvtSec2Tm(tot)
tm2 = getCallTime($1,$2,tot)
str = str "|" $3 "|" tot
str = str "|" tm1
str = str "|" tm2
tel = cvtTimeFormat(tm2) "|" tm1
for (i=4; i<=NF; i++) {
str = str "|" $i; tel = tel "|" $i
}
if (mytest==1) { print str} else { print tel }
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d %02d%02d", a[3], a[2], a[1],
b[1], b[2], 0)
yr = substr(ymd,1,4) # get year
t = getSecondsInYear(ymd) # year got dropped
# print "\n" ymd ":" t ":" yr ":" gap
r = t - gap;
if (r < 0) { # not enough seconds in the year
if (isLeapYear(yr-1)) {
t = t + 366 * 86400 # 1 day = 86400 seconds
} else {
t = t + 365 * 86400
}
--yr; # reduce one year
r = t - gap
}
# print t " - " gap " = " r " yr=" yr
return yr cvtSecondsToTime(yr, r)
}
function cvtTimeFormat(tm, Y, M, D, hh, mm, ss) {
# convert YYYYMMDD.hhmmss to DD/MM/YYYY|hh:mm:ss
Y = substr(tm, 1, 4)
M = substr(tm, 5, 2)
D = substr(tm, 7, 2)
hh = substr(tm, 10,2)
mm = substr(tm, 12,2)
ss = substr(tm, 14,2)
return sprintf("%02d/%02d/%04d|%0 2d:%02d:%0 2d",D,M,Y, hh,mm,ss)
}
function cvtSec2Tm(sec, hh, mm, ss, rst) {
hr = int(sec / 3600); rst = sec - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
return hr ":" mm ":" ss
}
# <a name="cvtSecondsToTime"></ a>
function cvtSecondsToTime(yr,sec, rst,Y,M,D,hh,mm,ss,tm,ds,f ,a,r) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
if (isLeapYear(yr)) {
f["ys"] = 31622400 # 1 leap year = 31622400 seconds
} else {
f["ys"] = 31536000 # 1 year = 31536000 seconds
}
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d01 01", yr), ds, 10)
# print "ymd=" f["ymd"] " ds=" ds
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - mm * f["ms"]
ss = rst
if (Y==0) {
r = substr(f["ymd"], 5,4) "." sprintf("%02d%02d%02d",hh, mm,ss)
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></ a>
function getSecondsInYear(tm, f,a,sec,t) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms s")
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
t = getDaysInYear(tm)
if (t>0) { sec = (t-1) * f["ds"] } # so we get accurate seconds
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z,ds) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
# if (M==0) { M=1 }
# if (D==0) { D=1 }
n = getDaysInYear(sprintf("%04 d%02d%02d" , Y, M, D)) + gap + 0
# print "n=" n
while (n < 0) {
if (isLeapYear(Y)==1) { n += 366; --Y }
else { n += 365; --Y }
}
if (isLeapYear(Y) == 1) { ds = 366; } else { ds = 365; }
while (n > ds) {
if (isLeapYear(Y) == 1) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
# <a name="isLeapYear"></a>
function isLeapYear(year) {
# Input variables:
# year - year in the format of YYYY
# Local variables: None
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return: 1 or 0
#
if ((year % 400) == 0) return 1
else if ((year % 100) == 0) return 1
else if ((year % 4) == 0) return 1
else return 0
}
$ ./tst03.awk tst03.txt
01/01/1999|01:18:54|0:1:6| 12|678...
01/01/1999|02:17:54|1:2:6| |4...
01/01/1999|05:40:54|0:0:6| 1|77...
01/01/1999|00:39:30|0:1:30 |1|77...
31/12/1998|22:19:30|1:41:3 0|1|77...
29/02/1996|23:47:30|0:13:3 0|1|77...
31/12/1998|23:59:54|0:0:6| 1|77...
31/12/1999|23:59:54|0:0:6| 1|77...
31/12/2000|23:59:54|0:0:6| 1|77...
$
Hope the following codes deserve the 100 points.
$ cat tst03.awk
#!/bin/nawk -f
# File name tst03.awk
# Purpose: 1. convert seconds into hh:mm:ss format
# 2. get dt_tm from dt_tm minus gap in seconds
#
#
BEGIN { FS="|"; OFS="|";
}
{ str = $1 "|" $2 # $1 - year in MM/DD/YYYY
# $2 - call end time in hh:mm
tot = $3 * 6 ; # $3 - count of each 6 second
tm1 = cvtSec2Tm(tot)
tm2 = getCallTime($1,$2,tot)
str = str "|" $3 "|" tot
str = str "|" tm1
str = str "|" tm2
tel = cvtTimeFormat(tm2) "|" tm1
for (i=4; i<=NF; i++) {
str = str "|" $i; tel = tel "|" $i
}
if (mytest==1) { print str} else { print tel }
}
END {}
function getCallTime(dt,tm,gap, ymd,yr,a,b,t,r) {
# Input variables:
# dt - in the format of MM/DD/YYYY
# tm - in the format of hh:mm
# gap - in seconds
#
a[0] = split(dt, a, "/")
b[0] = split(tm, b, ":")
ymd = sprintf("%04d%02d%02d.%02d
b[1], b[2], 0)
yr = substr(ymd,1,4) # get year
t = getSecondsInYear(ymd) # year got dropped
# print "\n" ymd ":" t ":" yr ":" gap
r = t - gap;
if (r < 0) { # not enough seconds in the year
if (isLeapYear(yr-1)) {
t = t + 366 * 86400 # 1 day = 86400 seconds
} else {
t = t + 365 * 86400
}
--yr; # reduce one year
r = t - gap
}
# print t " - " gap " = " r " yr=" yr
return yr cvtSecondsToTime(yr, r)
}
function cvtTimeFormat(tm, Y, M, D, hh, mm, ss) {
# convert YYYYMMDD.hhmmss to DD/MM/YYYY|hh:mm:ss
Y = substr(tm, 1, 4)
M = substr(tm, 5, 2)
D = substr(tm, 7, 2)
hh = substr(tm, 10,2)
mm = substr(tm, 12,2)
ss = substr(tm, 14,2)
return sprintf("%02d/%02d/%04d|%0
}
function cvtSec2Tm(sec, hh, mm, ss, rst) {
hr = int(sec / 3600); rst = sec - hr * 3600;
mm = int (rst / 60);
ss = rst - mm * 60;
return hr ":" mm ":" ss
}
# <a name="cvtSecondsToTime"></
function cvtSecondsToTime(yr,sec, rst,Y,M,D,hh,mm,ss,tm,ds,f
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
if (isLeapYear(yr)) {
f["ys"] = 31622400 # 1 leap year = 31622400 seconds
} else {
f["ys"] = 31536000 # 1 year = 31536000 seconds
}
Y = int(sec / f["ys"]); rst = sec - Y * f["ys"]
ds = int(rst / f["ds"]); rst = rst - ds * f["ds"]
f["ymd"] = getNewDate(sprintf("%04d01
# print "ymd=" f["ymd"] " ds=" ds
hh = int(rst / f["hs"]); rst = rst - hh * f["hs"]
mm = int(rst / f["ms"]); rst = rst - mm * f["ms"]
ss = rst
if (Y==0) {
r = substr(f["ymd"], 5,4) "." sprintf("%02d%02d%02d",hh,
} else {
r = f["ymd"] "." sprintf("%02d%02d%02d", hh, mm, ss)
}
return r
}
# <a name="getSecondsInYear"></
function getSecondsInYear(tm, f,a,sec,t) {
# Input variables:
# tm - time in the format of YYYYMMDD.hhmmss
# Local variables:
# f - array of time factors
# a - time array for tm
# sec - total seconds
# Global variables used: None
# Output global variables: None
# Calls-To:
# decompTime("YYYYMMDD.hhmms
# getDaysInYear(tm)
# Return: seconds
#
# set constant variables
f["ms"] = 60 # 1 minute = 60 seconds
f["hs"] = 3600 # 1 hour = 3600 seconds
f["ds"] = 86400 # 1 day = 86400 seconds
f["ys"] = 31536000 # 1 year = 31536000 seconds
# ymd is in format of YYYYMMDD or YYYYMMDD.hhmmss
tm=tm ""; split(decompTime(tm), a, ":")
t = getDaysInYear(tm)
if (t>0) { sec = (t-1) * f["ds"] } # so we get accurate seconds
sec = sec + a[4] * f["hs"] + a[5] * f["ms"] + a[6]
return sec
}
# <a name="decompTime"></a>
function decompTime(tm, a) {
# Input variable:
# tm - time in the format of YYYYMMDD.hhmmss or YYYYMMDD
# Local variable:
# a - a string variable
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return:
# a - the string of YYYY:MM:DD:hh:mm:ss
#
a = substr(tm, 1, 4) # current year: 0 ~ N
a = a ":" substr(tm, 5, 2) # current month: 1~12
a = a ":" substr(tm, 7, 2) # current date: 1~31
a = a ":" substr(tm, 10, 2) # current hour: 0~23
a = a ":" substr(tm, 12, 2) # current minute: 0~59
a = a ":" substr(tm, 14, 2) # current second: 0~59
return a
}
# <a name="getDaysInYear"></a>
function getDaysInYear(ymd, year,mon,days,mds,i) {
# Input variables:
# ymd - input date in format of YYYYMMDD or YYYYMMDD.hhmmss
# Local variables:
# year - a string of YYYY
# mon - a string of MM
# days - a string of DD
# mds - array of days in months
# i - loop index
# Global variables used: None
# Output global variables: None
# Calls-to: None
# Return: days
#
year = substr(ymd, 1, 4) + 0 # current year: 0 ~ N
mon = substr(ymd, 5, 2) + 0 # current month: 1~12
days = substr(ymd, 7, 2) + 0 # current date: 1~31
if (year%4==0)
split("31 29 31 30 31 30 31 31 30 31 30 31", mds, " ")
else
split("31 28 31 30 31 30 31 31 30 31 30 31", mds, " ")
for (i=1; i<=(mon - 1); ++i)
days += mds[i]
return days
}
# <a name="obtain_ymd"></a>
function getNewDate(ymd,gap,it, cy,a,b,Y,M,D,i,n,r,z,ds) {
# Input variables:
# ymd - date in the format of YY/MM/DD or MM/DD/YY or DD/MM/YY
# gap - number of days from ymd
# it - input date format: default is YY/MM/DD or YY MM DD
# 1=MM/DD/YY 2=DD/MM/YY 10=YYYYMMDD
# Local variables:
# cy - Y2K cutoff year
# a,b - date and time array
# Y - year in YYYY
# M - month in MM
# D - date in DD
# n - total days in the year ( ymd + gap)
# i - loop index
# z - days in each month
# Global varialbes used: None
# Output global variables: None
# Return: date in the format of YYYY/MM/DD
#
cy = 31
split(ymd, a, " "); split(a[1], b, "/")
if (it==1) { # in_type is MM/DD/YY
M = b[1]; D = b[2]; Y = b[3]
} else if (it==2) { # in_type is DD/MM/YY
D = b[1]; M = b[2]; Y = b[3]
} else if (it==10) { # in_type is YYYYMMDD.hhmmss
Y = substr(ymd, 1, 4)+0
M = substr(ymd, 5, 2)+0
D = substr(ymd, 7, 2)+0
} else {
Y = b[1]; M = b[2]; D = b[3]
}
if (Y>cy && Y<100 ) Y += 1900
else if (Y<=cy) Y += 2000
# if (M==0) { M=1 }
# if (D==0) { D=1 }
n = getDaysInYear(sprintf("%04
# print "n=" n
while (n < 0) {
if (isLeapYear(Y)==1) { n += 366; --Y }
else { n += 365; --Y }
}
if (isLeapYear(Y) == 1) { ds = 366; } else { ds = 365; }
while (n > ds) {
if (isLeapYear(Y) == 1) { n -= 366; ++Y }
else { n -= 365; ++Y }
}
z[0] = split("31 28 31 30 31 30 31 31 30 31 30 31", z, " ")
if( Y%4 == 0 ) { z[2]=29 }
else z[2]=28
for (i=1; i<=12; ++i) {
if (n<=z[i]) {M=i; D=n; break}
n -= z[i]
}
if (it==10) {
return sprintf("%04d%02d%02d", Y, M, D)
} else {
return sprintf("%04d/%02d/%02d", Y, M, D)
}
}
# <a name="isLeapYear"></a>
function isLeapYear(year) {
# Input variables:
# year - year in the format of YYYY
# Local variables: None
# Global variables used: None
# Output global variables: None
# Calls-To: None
# Return: 1 or 0
#
if ((year % 400) == 0) return 1
else if ((year % 100) == 0) return 1
else if ((year % 4) == 0) return 1
else return 0
}
$ ./tst03.awk tst03.txt
01/01/1999|01:18:54|0:1:6|
01/01/1999|02:17:54|1:2:6|
01/01/1999|05:40:54|0:0:6|
01/01/1999|00:39:30|0:1:30
31/12/1998|22:19:30|1:41:3
29/02/1996|23:47:30|0:13:3
31/12/1998|23:59:54|0:0:6|
31/12/1999|23:59:54|0:0:6|
31/12/2000|23:59:54|0:0:6|
$
Tel2, does it work for you and deserve 100 points?
ASKER
geotiger,
Yes it does, and yes it is!
I hadn't forgotten you, geo. I tested it yesterday, and I was going to reply today. Your points are in Q_10492298
Good work.
Yes it does, and yes it is!
I hadn't forgotten you, geo. I tested it yesterday, and I was going to reply today. Your points are in Q_10492298
Good work.
ASKER
Well done geotiger! Keep up the good work.
Any ideas how the 2nd part could be done? Is awk capable of doing such calculations? Can be quite complex when going back over the midnight barrier.