Solved

Convert date format and numbers in column on Linux

Posted on 2014-11-19
24
190 Views
Last Modified: 2014-11-26
Hi,
I have file on Linux with 50000+ lines like below:
Wed Nov 19 04:42:33 2014|1969929|877973|439.549|
Wed Nov 19 04:43:05 2014|1291231|969149|138.522|
Wed Nov 19 04:43:05 2014|969151|969149|5.884|

I need to do the following on Linux:
    First column convert to format "%Y%m%d%H%M%S"
    Second and third column divide by 1048576
    Forth column roundup

20141119044233|1.87|0.84|439


Thanks in advance
0
Comment
Question by:IKeystone
  • 13
  • 5
  • 4
  • +1
24 Comments
 
LVL 84

Expert Comment

by:ozo
Comment Utility
perl -aF'\W' -ne '$"="";printf "$F[6]${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%.d\n",(map$_/1048576,@F[7,8]),$F[9]' file
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Well said, ozo.

IKeystone, when you say "Forth column roundup", what do you mean in the context of your example, which has "439.549" as input and "439" as output?  Looks more like "truncate forth column to 0 decimal places", to me.
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Hi again IKeystone,

If you are wanting to round the forth column to the nearest integer, here are some slight modifications of ozo's code which seem to do it:

perl -aF'\W' -ne '$"="";printf "$F[6]${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%f\n",(map$_/1048576,@F[7,8]),$F[9]+$F[10]/1000' file

Here's the output:
20141119044233|1.88|0.84|440
20141119044305|1.23|0.92|139
20141119044305|0.92|0.92|6
If that's not what you want, please claify your requirements.
0
 
LVL 84

Expert Comment

by:ozo
Comment Utility
perl -aF'[^\w.]' -ne '$"="";printf "$F[6]${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%.f\n",(map$_/1048576,@F[7,8]),$F[9]'
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Well said again, ozo.

IKeystone, note that my '%f' was meant to read '%.f'.  But my code only works if the forth column has 3 decimal places (e.g. 5.5 would be considered the same as 5.005).  ozo's new solution doesn't have that limitation, though...and it's shorter...as usual.
0
 

Author Comment

by:IKeystone
Comment Utility
Working fine. Can you change you script to have double digit month. Jan=01, etc ...
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Good to hear it, IKeystone.

I expect ozo will give you that fix shortly...if I don't beat him to it.

And would you want double digit days, too?  No extra charge if you do.  Just say the word.

Meanwhile...
What do you mean by "Forth column roundup", IKestone?  For example, do you want "5.499" to round down to "5" or up to "6"?  What about "5.001"?
0
 

Author Comment

by:IKeystone
Comment Utility
Mean 4.589=5, 5.3467=5, 5.001=5, 116.5=117 (I think). Roundup to nearest integer
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Or more simply, "round to nearest integer", I guess.
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
What was the answer to the question in my last post about "double digit days", IKeystone?

Why is it that I have to ask everything more than once?  I'm trying to prevent you from running into problems and having to come back and ask again.  Please make it easy for us to help you, especially when we are trying to go the extra mile for you.
0
 
LVL 11

Accepted Solution

by:
tel2 earned 250 total points
Comment Utility
While I'm waiting for a response to my last post, try this:

perl -aF'[^\w.]' -ne '$"="";printf "%d%02d%02d%02d%02d%02d|%.2f|%.2f|%.f\n",$F[6], ${{Jan=>1,Feb=>2,Mar=>3,Apr=>4,May=>5,Jun=>6,Jul=>7,Aug=>8,Sep=>9,Oct=>10,Nov=>11,Dec=>12}}{$F[1]},@F[2..5],(map$_/1048576,@F[7,8]),$F[9]' file
0
 
LVL 84

Expert Comment

by:ozo
Comment Utility
perl -aF'[^\w.]' -ne '$"="";printf "$F[6]${{Jan=>"01",Feb=>"02",Mar=>"03",Apr=>"04",May=>"05",Jun=>"06",Jul=>"07",Aug=>"08",Sep=>"09",Oct=>10,Nov=>11,Dec=>12}}{$F[1]}@F[2..5]|%.2f|%.2f|%.f\n",(map$_/1048576,@F[7,8]),$F[9]' file
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 
LVL 19

Expert Comment

by:simon3270
Comment Utility
Alternatively, just using standard commands ("date" to convert the date, "bc" to handle the division, and printf to do the rounding (bc just truncates)):
OIFS=$IFS
IFS='|'
while read dat; do
  set $dat
  d1=$(date -d "$1" '+%Y%m%d%H%M%S')
  d2=$(echo "scale=5;$2/1048576.0" | bc)
  d3=$(echo "scale=5;$3/1048576.0" | bc)
  printf "%s|%0.2f|%0.2f|%0.0f\n" "$d1" "$d2" "$d3" "$4"
done < file
IFS=$OIFS

Open in new window

(Note that printf rounds down when the fraction is exactly 0.5)
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Works for me, Simon.

> "(Note that printf rounds down when the fraction is exactly 0.5)"
Good point.  I never knew this.  Strange!  0.5 rounds down, and 1.5 etc rounds up!
$ printf "%1.0f\n" 0.5
0
$ printf "%1.0f\n" 1.5
2

What's that about?

Same thing happens via printf in Perl, so I maybe it uses the same code.

tel2
0
 
LVL 84

Expert Comment

by:ozo
Comment Utility
The standard when rounding in a base that is a multiple of 2 but not a multiple of 4 (such as base 10 or base 2) is to round values that are half way between an even number and an odd number to the even number.
(see Knuth, Graham, et al.)
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Thanks ozo.  And that's what seems to be happening below.

$ printf "%1.0f\n" 2.5
2
$ printf "%1.0f\n" 3.5
4
$ printf "%1.0f\n" 4.5
4
$ printf "%1.0f\n" 5.5
6
0
 

Author Comment

by:IKeystone
Comment Utility
Latest changes from ozo doesn't work.

syntax error at -e line 1, at EOF
Missing right curly or square bracket at -e line 1, within string
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Have you tried my amendments to ozo's code, IKeystone?  See post 40458769, above.  What is wrong with that code?

And I know I've already asked this twice, but I in the absense of an ansewer, could you please answer my question in post 40458760.

Thanks.
tel2
0
 

Author Comment

by:IKeystone
Comment Utility
Sorry for misunderstanding. 40458769 working fine. And yes, I need double digit days and month
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Thanks for the clarification, IKeystone.  My version of ozo's code handles double digit months and days.

Does that cover your requirements, or have we missed anything?
0
 
LVL 84

Assisted Solution

by:ozo
ozo earned 250 total points
Comment Utility
perl -aF'[^\w.]' -ne 'BEGIN{$"="";@m{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)}="01".."12"} printf "$F[6]$m{$F[1]}@F[2..5]|%.2f|%.2f|%.f\n",(map$_/1048576,@F[7,8]),$F[9]' file
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
Thanks for the points, IKeystone.  Generous (to me) considering that ozo provided the basic code, but I guess some of my points are for nagging you to give us full spec's, which is all part of the work.  8)
Personally I think simon deserved some credit for providing a perfectly working solution too, although I expect the Perl solutions would be much faster (if speed is a consideration).

ozo, that's very concise/elegant, as usual.  Did you notice that it doesn't always output 2 digits for days though?  That is part of the final spec's.  For example, this input:
    Wed Jan 2 04:43:05 2014|969151|969149|5.4|
gives results in this output:
    2014012044305|0.92|0.92|5
but should result in this output:
    20140102044305|0.92|0.92|5

tel2
0
 
LVL 19

Expert Comment

by:simon3270
Comment Utility
@tel2 - Maybe that's an argument for using the shell-based version. The  formatting is a little more visible, though I do think that  the Perl version is more concise! :-)
0
 
LVL 11

Expert Comment

by:tel2
Comment Utility
True, Simon.  Concise, but cryptic.  Though it could be written more readably, starting by not writing it as a one-liner.
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

Java performance on Solaris - Managing CPUs There are various resource controls in operating system which directly/indirectly influence the performance of application. one of the most important resource controls is "CPU".   In a multithreaded…
The purpose of this article is to demonstrate how we can use conditional statements using Python.
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now