Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 218
  • Last Modified:

Convert date format and numbers in column on Linux

Hi,
I have file on Linux with 50000+ lines like below:
Wed Nov 19 04:42:33 2014|1969929|877973|439.549|
Wed Nov 19 04:43:05 2014|1291231|969149|138.522|
Wed Nov 19 04:43:05 2014|969151|969149|5.884|

I need to do the following on Linux:
    First column convert to format "%Y%m%d%H%M%S"
    Second and third column divide by 1048576
    Forth column roundup

20141119044233|1.87|0.84|439


Thanks in advance
0
IKeystone
Asked:
IKeystone
  • 13
  • 5
  • 4
  • +1
2 Solutions
 
ozoCommented:
perl -aF'\W' -ne '$"="";printf "$F[6]${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%.d\n",(map$_/1048576,@F[7,8]),$F[9]' file
0
 
tel2Commented:
Well said, ozo.

IKeystone, when you say "Forth column roundup", what do you mean in the context of your example, which has "439.549" as input and "439" as output?  Looks more like "truncate forth column to 0 decimal places", to me.
0
 
tel2Commented:
Hi again IKeystone,

If you are wanting to round the forth column to the nearest integer, here are some slight modifications of ozo's code which seem to do it:

perl -aF'\W' -ne '$"="";printf "$F[6]${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%f\n",(map$_/1048576,@F[7,8]),$F[9]+$F[10]/1000' file

Here's the output:
20141119044233|1.88|0.84|440
20141119044305|1.23|0.92|139
20141119044305|0.92|0.92|6
If that's not what you want, please claify your requirements.
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
ozoCommented:
perl -aF'[^\w.]' -ne '$"="";printf "$F[6]${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%.f\n",(map$_/1048576,@F[7,8]),$F[9]'
0
 
tel2Commented:
Well said again, ozo.

IKeystone, note that my '%f' was meant to read '%.f'.  But my code only works if the forth column has 3 decimal places (e.g. 5.5 would be considered the same as 5.005).  ozo's new solution doesn't have that limitation, though...and it's shorter...as usual.
0
 
IKeystoneAuthor Commented:
Working fine. Can you change you script to have double digit month. Jan=01, etc ...
0
 
tel2Commented:
Good to hear it, IKeystone.

I expect ozo will give you that fix shortly...if I don't beat him to it.

And would you want double digit days, too?  No extra charge if you do.  Just say the word.

Meanwhile...
What do you mean by "Forth column roundup", IKestone?  For example, do you want "5.499" to round down to "5" or up to "6"?  What about "5.001"?
0
 
IKeystoneAuthor Commented:
Mean 4.589=5, 5.3467=5, 5.001=5, 116.5=117 (I think). Roundup to nearest integer
0
 
tel2Commented:
Or more simply, "round to nearest integer", I guess.
0
 
tel2Commented:
What was the answer to the question in my last post about "double digit days", IKeystone?

Why is it that I have to ask everything more than once?  I'm trying to prevent you from running into problems and having to come back and ask again.  Please make it easy for us to help you, especially when we are trying to go the extra mile for you.
0
 
tel2Commented:
While I'm waiting for a response to my last post, try this:

perl -aF'[^\w.]' -ne '$"="";printf "%d%02d%02d%02d%02d%02d|%.2f|%.2f|%.f\n",$F[6], ${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]},@F[2..5],(map$_/1048576,@F[7,8]),$F[9]' file
0
 
ozoCommented:
perl -aF'[^\w.]' -ne '$"="";printf "$F[6]${{Jan=>"01",Feb=>"02",Mar=>"03",Apr=>"04",May=>"05",Jun=>"06",Jul=>"07",Aug=>"08",Sep=>"09",Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%.f\n",(map$_/1048576,@F[7,8]),$F[9]' file
0
 
simon3270Commented:
Alternatively, just using standard commands ("date" to convert the date, "bc" to handle the division, and printf to do the rounding (bc just truncates)):
OIFS=$IFS
IFS='|'
while read dat; do
  set $dat
  d1=$(date -d "$1" '+%Y%m%d%H%M%S')
  d2=$(echo "scale=5;$2/1048576.0" | bc)
  d3=$(echo "scale=5;$3/1048576.0" | bc)
  printf "%s|%0.2f|%0.2f|%0.0f\n" "$d1" "$d2" "$d3" "$4"
done < file
IFS=$OIFS

Open in new window

(Note that printf rounds down when the fraction is exactly 0.5)
0
 
tel2Commented:
Works for me, Simon.

> "(Note that printf rounds down when the fraction is exactly 0.5)"
Good point.  I never knew this.  Strange!  0.5 rounds down, and 1.5 etc rounds up!
$ printf "%1.0f\n" 0.5
0
$ printf "%1.0f\n" 1.5
2

What's that about?

Same thing happens via printf in Perl, so I maybe it uses the same code.

tel2
0
 
ozoCommented:
The standard when rounding in a base that is a multiple of 2 but not a multiple of 4 (such as base 10 or base 2) is to round values that are half way between an even number and an odd number to the even number.
(see Knuth, Graham, et al.)
0
 
tel2Commented:
Thanks ozo.  And that's what seems to be happening below.

$ printf "%1.0f\n" 2.5
2
$ printf "%1.0f\n" 3.5
4
$ printf "%1.0f\n" 4.5
4
$ printf "%1.0f\n" 5.5
6
0
 
IKeystoneAuthor Commented:
Latest changes from ozo doesn't work.

syntax error at -e line 1, at EOF
Missing right curly or square bracket at -e line 1, within string
0
 
tel2Commented:
Have you tried my amendments to ozo's code, IKeystone?  See post 40458769, above.  What is wrong with that code?

And I know I've already asked this twice, but I in the absense of an ansewer, could you please answer my question in post 40458760.

Thanks.
tel2
0
 
IKeystoneAuthor Commented:
Sorry for misunderstanding. 40458769 working fine. And yes, I need double digit days and month
0
 
tel2Commented:
Thanks for the clarification, IKeystone.  My version of ozo's code handles double digit months and days.

Does that cover your requirements, or have we missed anything?
0
 
ozoCommented:
perl -aF'[^\w.]' -ne 'BEGIN{$"="";@m{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)}="01".."12"} printf "$F[6]$m{$F[1]}@F[2..5]|%.2f|%.2f|%.f\n",(map$_/1048576,@F[7,8]),$F[9]' file
0
 
tel2Commented:
Thanks for the points, IKeystone.  Generous (to me) considering that ozo provided the basic code, but I guess some of my points are for nagging you to give us full spec's, which is all part of the work.  8)
Personally I think simon deserved some credit for providing a perfectly working solution too, although I expect the Perl solutions would be much faster (if speed is a consideration).

ozo, that's very concise/elegant, as usual.  Did you notice that it doesn't always output 2 digits for days though?  That is part of the final spec's.  For example, this input:
    Wed Jan 2 04:43:05 2014|969151|969149|5.4|
gives results in this output:
    2014012044305|0.92|0.92|5
but should result in this output:
    20140102044305|0.92|0.92|5

tel2
0
 
simon3270Commented:
@tel2 - Maybe that's an argument for using the shell-based version. The  formatting is a little more visible, though I do think that  the Perl version is more concise! :-)
0
 
tel2Commented:
True, Simon.  Concise, but cryptic.  Though it could be written more readably, starting by not writing it as a one-liner.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 13
  • 5
  • 4
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now