
Linux: Count lines in file that begin with

hankknight asked
Last Modified: 2013-11-18
My file is in this format:
Nov 10 04:03:00 Consectetur adipiscing elit. Mauris luctus, nulla eu pellentesque interdum, arcu quam elementum magna
Nov 10 04:03:07 Sollicitudin scelerisque magna lacus sit amet magna. Nunc iaculis arcu a egestas rutrum. 
Nov 10 04:04:01 Nulla quis feugiat dolor, vitae ultrices quam. In cursus eu est id luctus. Cras at velit eleifend
Nov 10 04:04:01 Tincidunt felis sed, elementum quam. Donec sit amet nisi vulputate, lobortis arcu ut, mattis tellus. 


It is located here:
/home/xyz/my.log

How can I count the lines whose leading date and time stamp falls within the past hour?

CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
Again, a nice task.

input="/home/xyz/my.log"
start=$(date -d "1 hour ago" "+%s"); i=0
while read -r line; do
  logdate=$(date -d "$(echo "$line" | awk '{print $1, $2, $3}')" "+%s")
   [[ $logdate -ge $start ]] && ((i+=1))
done < "$input"
echo "Lines in $input not older than 1 hour:" $i

Author

Commented:
Thank you both.

I like the simplicity of Mazdajai's solution; however, there is a problem. Instead of giving the results for the past 60 minutes, it returns results for the current clock hour.

For example, at 11:59 your solution returns 5000 results and at 12:00 it returns 0 results.

I want a solution that returns results based on the past hour, not the value of the current hour.
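
To make the difference between "the current hour" and "the past 60 minutes" concrete, here is a minimal sketch. The reference time (2013-11-10 12:00:30 UTC) is an assumption picked for illustration, and the `-d` option assumes GNU date.

```shell
#!/bin/bash
# Hypothetical reference time, fixed so that the result is reproducible.
export TZ=UTC LC_ALL=C
ref_epoch=$(date -d "2013-11-10 12:00:30" +%s)

# What an hour-prefix grep effectively matches: lines of the current clock hour.
hour_prefix=$(date -d "@$ref_epoch" "+%b %e %H")

# What "the past 60 minutes" means: every line at or after this cutoff.
cutoff=$(date -d "@$((ref_epoch - 3600))" "+%b %e %H:%M:%S")

echo "hour-prefix grep matches lines starting with: $hour_prefix"
echo "rolling one-hour window starts at:            $cutoff"
```

At 12:00:30 the prefix is "Nov 10 12", which covers only the last 30 seconds, while the rolling window reaches back to "Nov 10 11:00:30".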

woolmilkporc, your idea gives me a syntax error.
CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
Please, what is the exact error message?

I can guess a lot, but not everything.

Author

Commented:
syntax error near unexpected token `
line 5: `   [[ $logdate -ge $start ]] && ((i+=1))


CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
I don't have a backtick (`) in front of "[[ $logdate ...".

Where does it come from?

Author

Commented:
woolmilkporc, the problem must be on my end.  All the other code you have provided for other questions works perfectly.

That said, I still prefer Mazdajai's approach here:
grep -c "`date | awk '{print $2,$3,$4}' |awk -F: '{print $1}'`" /home/xyz/my.log


How can that code be adjusted to include all records from the past 60 minutes?  Maybe the date needs to be converted into a traditional timestamp on every line before processing it?
CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
>> Maybe the date needs to be converted into a traditional timestamp <<

That's what I did. Looking forward to seeing Mazdajai's solution.

Here is my version again, this time in "code" format, which might be better for copy and paste. Again, there is no backtick anywhere:

input="/home/xyz/my.log"
start=$(date -d "1 hour ago" "+%s"); i=0
while read -r line; do
  logdate=$(date -d "$(echo "$line" | awk '{print $1, $2, $3}')" "+%s")
   [[ $logdate -ge $start ]] && ((i+=1))
done < "$input"
echo "Lines in $input not older than 1 hour:" $i


Author

Commented:
This returns (inaccurate) results almost instantly, even for very large files (10+ GB):
grep -c "`date | awk '{print $2,$3,$4}' |awk -F: '{print $1}'`" /home/xyz/my.log


The code you posted can take more than 5 minutes.  Can the performance of your solution be enhanced?  

IDEA 1: Log entries are sequential.  Break the loop as soon as a non-matching line is found.
IDEA 2: Before starting the loop, check every 100th result and cut the file as soon as an older match is found
IDEA 3: Use Mazdajai's code to get all results from this hour and from last hour then use your approach to get the results for this hour only
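
A variation on IDEA 3, sketched under some assumptions: instead of one hour prefix, build one anchored prefix per minute of the past hour and let a single grep pass do the counting. `count_recent_grep` is a hypothetical helper, and its optional second argument (epoch seconds to use as "now") is only there to make the example testable. It assumes GNU date and grep, and timestamps in the "Mon DD HH:MM:SS" layout shown in the question, with a space-padded day. The result has minute granularity, so it can include up to 59 seconds more than a strict 60-minute window.

```shell
#!/bin/bash
# Sketch: count lines from roughly the last 60 minutes with one grep pass.
# The function body runs in a subshell so its variables stay local.
count_recent_grep() (
  file=$1
  now=${2:-$(date +%s)}
  patterns=$(mktemp) || exit 1
  m=0
  while [ "$m" -le 60 ]; do
    # One anchored pattern per minute, e.g. "^Nov 10 04:03:"
    LC_ALL=C date -d "@$((now - m * 60))" "+^%b %e %H:%M:" >> "$patterns"
    m=$((m + 1))
  done
  # grep -c prints the number of lines matching any of the patterns.
  grep -c -f "$patterns" -- "$file"
  rm -f "$patterns"
)
```

Called as `count_recent_grep /home/xyz/my.log`, this needs 61 calls to date regardless of file size; the rest of the work is a single scan by grep.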
CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
What exactly do you mean by "Log entries are sequential"? Do you mean that new records are added at the top of the file? I strongly doubt this.

If (as usual) new entries are added at the end of the file, you can try (following your IDEA 1):
input="my.log"
start=$(date -d "1 hour ago" "+%s"); i=0
tac $input | while read line; do
  logdate=$(date -d "$(echo $line | awk '{print $1, $2, $3}')" "+%s")
  if [[ $logdate -ge $start ]]; then ((i+=1))
   else break
  fi
done
echo "Lines in $input younger than 1 hour:" $i


CERTIFIED EXPERT

Commented:
Try:

awk '
        BEGIN {
                cmd="date -d \"1 hour hour ago\" +%H:%M:%S"
                cmd | getline onehourago
                close(cmd)
        }
        {
                cmd="date -d \""$3"\" +%H:%M:%S"
                cmd | getline d
                close(cmd)
                if (d > onehourago) print
        }
' /home/xyz/my.log|wc -l 


Author

Commented:
Both your ideas return 0 even when there are entries.  

New entries are always added to the bottom.  So I guess we would need to start at the bottom of the file and work our way up and then break when the time condition is not met.
CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
That's exactly what I tried to do with "tac". Are you sure that your file's contents are OK?
CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
Or is there an empty line at the end? Anyway, please try
input="/home/xyz/my.log"
start=$(date -d "1 hour ago" "+%s"); i=0
echo 0 > /tmp/counter.$$
tac "$input" | while read -r line; do
  if ! echo "$line" | grep -Eq "^[a-zA-Z]" ; then continue; fi
  logdate=$(date -d "$(echo "$line" | awk '{print $1, $2, $3}')" "+%s")
  if [[ $logdate -ge $start ]]; then ((i+=1))
     echo $i > /tmp/counter.$$
   else break
  fi
done
echo "Lines in $input younger than 1 hour:" $(</tmp/counter.$$)
rm /tmp/counter.$$


Please note that I changed the handling of the counter variable (it now uses a counter file) to work around a long-standing bash quirk: a loop on the receiving end of a pipe runs in a subshell, so its variables are lost when the loop ends.
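
The counter problem can be reproduced in isolation. The following is a minimal sketch with made-up input (three lines "a", "b", "c"), assuming bash with default settings:

```shell
#!/bin/bash
# Counter incremented on the right-hand side of a pipe: the loop runs in a
# subshell, so the parent shell's variable is unchanged afterwards.
piped=0
printf '%s\n' a b c | while read -r line; do
  piped=$((piped + 1))
done

# Same loop fed by a redirection: it runs in the current shell.
tmp=$(mktemp)
printf '%s\n' a b c > "$tmp"
redirected=0
while read -r line; do
  redirected=$((redirected + 1))
done < "$tmp"
rm -f "$tmp"

echo "piped=$piped redirected=$redirected"
```

With bash defaults this prints "piped=0 redirected=3", which would explain a count of 0 from a loop fed by "tac ... |" even when matching lines exist.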
CERTIFIED EXPERT
Most Valuable Expert 2013
Top Expert 2013

Commented:
This avoids using an external counter file:
input="/home/xyz/my.log"
start=$(date -d "1 hour ago" "+%s"); i=0
while read line; do
  if ! echo $line | grep -Eq "^[a-zA-Z]" ; then continue; fi 
  logdate=$(date -d "$(echo $line | awk '{print $1, $2, $3}')" "+%s")
  if [[ $logdate -ge $start ]]; then ((i+=1))
   else break
  fi
done <<< "$(tac "$input")"
echo "Lines in $input younger than 1 hour:" $i


Note "<<<" is not a typo!

Author

Commented:
Thank you all. Here is the code I will use. It uses parts from all your comments; however, it executes faster than any single solution provided.
#!/bin/bash
t1=$( date "+%s.%N" )
NOW=$( date +"%s" )
COUNT=0
while read -r MONTH DAY TIME DATA
do
    # Skip blank lines and lines whose timestamp cannot be parsed
    [ -n "$TIME" ] || continue
    THEN=$( date -d "$MONTH $DAY $TIME" +"%s" 2>/dev/null ) || continue
    if [ $(( NOW - THEN )) -gt 3600 ]
    then
        break
    else
        COUNT=$((COUNT+1))
    fi
done <<< "$(tac "$1")"
t2=$( date "+%s.%N" )
DIFF=$(echo "scale=3; ($t2 - $t1)/1" | bc)
echo "Execution time: $DIFF seconds"
echo "Lines younger than 1 hour: $COUNT"
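
A possible refinement of the script above, offered as a sketch rather than a drop-in replacement. It streams the output of tac through a pipe instead of loading it into a here-string, prints the counter from inside the pipeline so the subshell does not lose it, and calls date only when the timestamp differs from the previous line's. `count_recent` is a hypothetical helper, and the optional second argument (epoch seconds to use as "now") is an addition for testing. It assumes GNU date and tac, and that every timestamp belongs to the current year, since the log lines carry no year.

```shell
#!/bin/bash
# Sketch: count lines not older than one hour, reading the file from the end.
# The function body runs in a subshell so its variables stay local.
count_recent() (
  file=$1
  now=${2:-$(date +%s)}
  count=0
  last_stamp=""
  epoch=""
  tac -- "$file" | {
    while read -r month day clock rest; do
      [ -n "$clock" ] || continue            # skip blank or short lines
      stamp="$month $day $clock"
      # Reuse the previous conversion when consecutive lines share a timestamp.
      if [ "$stamp" != "$last_stamp" ]; then
        epoch=$(date -d "$stamp" +%s 2>/dev/null) || epoch=""
        last_stamp=$stamp
      fi
      [ -n "$epoch" ] || continue            # skip unparseable timestamps
      [ $((now - epoch)) -le 3600 ] || break # older than one hour: stop reading
      count=$((count + 1))
    done
    # Print from inside the pipeline, where the counter is still visible.
    echo "$count"
  }
)
```

Usage would be `count_recent /home/xyz/my.log`. How much the timestamp reuse saves depends on how many consecutive lines share the same second.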

