Solved

Bash script / while loop extremely slow read file

Posted on 2014-03-13
Last Modified: 2014-03-25
I have a while loop that reads an FTP log file into arrays so that I can search through them and piece together a session flow. Unfortunately, the loop takes forever to get through the file. It is a very large file, but there must be a faster way of doing this.

# read file into array for original search results
while read FTP_SEARCH
do
ogl_date[count]=`echo $FTP_SEARCH | awk '{print $1, $2}'`
ogl_time[count]=`echo $FTP_SEARCH | awk '{print $3}'`
ogl_server[count]=`echo $FTP_SEARCH | awk '{print $4}'`
ogl_id[count]=`echo $FTP_SEARCH | awk '{print $5}'`
ogl_type[count]=`echo $FTP_SEARCH | awk -F '[' '{print $1}' | awk '{print $5}'`
ogl_pid[count]=`echo $FTP_SEARCH | awk -F'[' '{print $2}' | awk -F']' '{print $1}'`
ogl_commands[count]=`echo $FTP_SEARCH | awk '{
    for(i = 6; i <= NF; i++) 
        print $i;
    }'`

let "count += 1"

done < /tmp/ftp_search.14-12-02

Question by:dloszewski
8 Comments
 

Author Comment

by:dloszewski
ID: 39926825
Some sample lines from ftp_search:

Dec  1 23:59:03 sslmftp1 ftpd[4152]: USER xxxxxx  
Dec  1 23:59:03 sslmftp1 ftpd[4152]: PASS password  
Dec  1 23:59:03 sslmftp1 ftpd[4152]: FTP LOGIN FROM 172.19.x.xx [172.19.x.xx], xxxxxx  
Dec  1 23:59:03 sslmftp1 ftpd[4152]: PWD  
Dec  1 23:59:03 sslmftp1 ftpd[4152]: CWD /test/data/872507/  
Dec  1 23:59:03 sslmftp1 ftpd[4152]: TYPE Image
 
LVL 84

Expert Comment

by:ozo
ID: 39926948
What are you doing with ogl_date, ogl_time, ogl_server, ogl_id, ogl_type, ogl_pid, and ogl_commands?
What do the lines in /tmp/ftp_search.14-12-02 look like?
This should be a little faster, but knowing more about the format of each line, or what you want to do with the arrays, would probably allow further improvements.

while read FTP_1 FTP_2 FTP_3 FTP_4 FTP_5 FTP_6
do
ogl_date[count]="$FTP_1 $FTP_2"
ogl_time[count]=$FTP_3
ogl_server[count]=$FTP_4
ogl_id[count]=$FTP_5
ogl_type[count]=`echo $FTP_1 $FTP_2 $FTP_3 $FTP_4 $FTP_5 $FTP_6 | awk -F '[' '{print $1}' | awk '{print $5}'`
ogl_pid[count]=`echo $FTP_1 $FTP_2 $FTP_3 $FTP_4 $FTP_5 $FTP_6 | awk -F'[' '{print $2}' | awk -F']' '{print $1}'`
ogl_commands[count]=$FTP_6
let count+=1
done < /tmp/ftp_search.14-12-02
 
LVL 29

Expert Comment

by:MikeOM_DBA
ID: 39926960
> . . . puts it into an array so I'll be able to search through the array and match up/search for a flow . . .
There may be other alternatives, but you need to provide the requirements / expected results for the above.
 
LVL 84

Expert Comment

by:ozo
ID: 39926973
Given the format shown in comment ID 39926825, this should be equivalent:

# read file into array for original search results
while read FTP_1 FTP_2 FTP_3 FTP_4 FTP_5 FTP_6
do
ogl_date[count]="$FTP_1 $FTP_2"
ogl_time[count]=$FTP_3
ogl_server[count]=$FTP_4
ogl_id[count]=$FTP_5
ogl_type[count]=${FTP_5%[*}
FTP_5=${FTP_5%]*}
ogl_pid[count]=${FTP_5#*[}
ogl_commands[count]=$FTP_6
let count+=1
done < /tmp/ftp_search.14-12-02
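
As a rough check of the parameter expansions above, here is a minimal sketch that parses lines from standard input and prints the fields instead of storing them in arrays. The function name parse_ftp_lines and the variable names are made up for illustration, and the brackets are backslash-escaped purely to make the patterns explicit. It assumes every line has the syslog-style layout shown in the sample.

```shell
#!/usr/bin/env bash
# parse_ftp_lines is a hypothetical helper, not part of the scripts in this thread.
# It reads syslog-style ftpd lines on stdin and prints the parsed fields,
# without starting any external process per line.
parse_ftp_lines() {
    # read splits on whitespace; the last variable receives the rest of the line.
    while read -r mon day clock server id rest; do
        svc=${id%%\[*}    # drop everything from the first "[":    ftpd[4152]: -> ftpd
        pid=${id#*\[}     # drop everything through the first "[": ftpd[4152]: -> 4152]:
        pid=${pid%%\]*}   # drop everything from the first "]":    4152]: -> 4152
        printf 'date=%s %s time=%s server=%s type=%s pid=%s cmd=%s\n' \
            "$mon" "$day" "$clock" "$server" "$svc" "$pid" "$rest"
    done
}

# Usage with one of the sample lines posted earlier in the thread:
printf '%s\n' 'Dec  1 23:59:03 sslmftp1 ftpd[4152]: CWD /test/data/872507/' | parse_ftp_lines
```

For that sample line this prints date=Dec 1 time=23:59:03 server=sslmftp1 type=ftpd pid=4152 cmd=CWD /test/data/872507/. The speedup over the original loop comes from not running echo and awk several times per line.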
 

Author Comment

by:dloszewski
ID: 39926991
Basically, I have an FTP log file with the above data, and I want to show the entire flow by searching for a username or IP. So I figured I'd read the data into an array, search for the criteria, and then match that process ID with the others so I would get the entire flow.
 
LVL 29

Accepted Solution

by:MikeOM_DBA (earned 500 total points)
ID: 39927758
Perhaps if you load the data into a database (Access, MySQL, Oracle, ...) it would be quicker, and then you can analyze it using SQL queries.

Loaded into M$ Access
 
LVL 61

Expert Comment

by:gheist
ID: 39941055
Popular web statistics packages have recipes for handling FTP xferlogs from popular FTP servers.
 
LVL 84

Expert Comment

by:ozo
ID: 39941079
> read data into array, search for criteria, and then match that process id
Depending on how you are doing this, I would think it could be faster to
search for criteria, match that process id, and then read data into array
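
A minimal sketch of that order of operations, assuming the bracketed number in the fifth whitespace-separated field is the process ID and that one process ID corresponds to one session. The function name show_flow is hypothetical. It makes two passes over the file with a single awk process: the first pass collects the process IDs of lines containing the search string, and the second prints every line carrying one of those IDs.

```shell
#!/usr/bin/env bash
# show_flow PATTERN FILE   (hypothetical helper, for illustration only)
# PATTERN is matched as a fixed string, e.g. a username or an IP address.
show_flow() {
    awk -v pat="$1" '
        # Return the text between the first "[" and the first "]" of a field,
        # or an empty string when the field has no bracketed part.
        function pid_of(field,   a, b) {
            a = index(field, "[")
            b = index(field, "]")
            return (a > 0 && b > a) ? substr(field, a + 1, b - a - 1) : ""
        }
        # Pass 1 (first copy of the file): remember PIDs of matching lines.
        NR == FNR {
            if (index($0, pat) && (p = pid_of($5)) != "") wanted[p] = 1
            next
        }
        # Pass 2 (second copy): print each line whose PID was remembered.
        (p = pid_of($5)) != "" && (p in wanted)
    ' "$2" "$2"
}

# Example call, using the log file name from the question:
# show_flow 'USER xxxxxx' /tmp/ftp_search.14-12-02
```

Because process IDs can be reused over time, a long log may mix unrelated sessions that happen to share an ID; restricting the input to a time window would be one way to reduce that.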