jayatallen
asked on
How to grep URLs from Apache access logs and rerun them using wget, for testing purposes
Hi folks,
I want to run a few URLs against my application server for testing purposes. These are URLs from the access logs, and we want to test whether they are causing application slowness.
Is there any way I can grab the hits from the access log and invoke those URLs again using some kind of program or script?
Below is a snippet from the access log:
155.180.105.36 - - [09/Jul/2011:11:46:50 -0400] "GET /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29811
155.180.105.36 - - [09/Jul/2011:11:46:52 -0400] "GET /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
155.180.105.36 - - [09/Jul/2011:11:46:52 -0400] "GET /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
155.180.105.36 - - [09/Jul/2011:11:46:52 -0400] "GET /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
155.180.105.36 - - [09/Jul/2011:11:46:52 -0400] "GET /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
I was thinking of using awk and grep to extract the URL, but the result is missing the hostname.
So, is there any way to prepend wget and the hostname to each URL, then put the results in a shell script and run it?
Something like:
wget hostname /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
wget hostname /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
wget hostname /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
wget hostname /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
wget hostname /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
wget hostname /cm/Satellite?pagename=CYQ/Href&urlname=cyqorgan/am/wateress/aboutus HTTP/1.1" 200 29732
ASKER
Thank you for your reply.
I copied a few lines from the access log into a file named test and then ran the given command:
bash-3.00$ awk -F"\"" {printf("wget hostname %s %s\n",$2,$3)}' test
bash: syntax error near unexpected token `('
bash-3.00$
Please help.
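The syntax error most likely comes from the missing opening single quote before the { in the awk program: without it, bash tries to parse the parentheses itself. Below is a minimal corrected sketch, not necessarily the command from the original reply. The function name print_wget_lines is hypothetical, added only so the command can be reused, and hostname is still the placeholder from the question.

```shell
# Print one "wget hostname <request> <status and size>" line per log entry.
# Splitting on the double quote makes $2 the request ("GET /path HTTP/1.1")
# and $3 the status code and byte count that follow the closing quote.
# Reads the files named as arguments, or standard input if none are given.
print_wget_lines() {
    awk -F'"' '{ printf("wget hostname %s %s\n", $2, $3) }' "$@"
}
```

With the file from the post this would be called as: print_wget_lines test. Each output line still contains GET, the protocol version, and the status fields, so it is not yet a runnable wget command.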
ASKER CERTIFIED SOLUTION
ASKER
Tried it... very close.
After running the given command, one line of the output is pasted below:
wget hostname GET /cm/Satellite?blobcol=urldata&blobheader=image%2Fjpeg&blobkey=id&blobtable=MungoBlobs&blobwhere=11586
Is there any way we can change it to something like:
wget hostname/cm/Satellite?blobcol=urldata&blobheader=image%2Fjpeg&blobkey=id&blobtable=MungoBlobs&blobwhere=11586
I mean, how can I remove GET and the space between hostname and /cm?
Thank you very much.
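One way to drop GET and the space is to split the quoted request field again on spaces and keep only its second word, the path. This is a sketch under assumptions, not the hidden solution from this thread: the function name print_wget_urls is hypothetical, hostname is the placeholder from the question, and only GET requests are kept. The URL is wrapped in single quotes because an unquoted & would make the shell run the command in the background and cut the URL short.

```shell
# Turn access-log lines into "wget 'hostname<path>'" commands.
# $2 (split on double quotes) is the request, e.g. "GET /path HTTP/1.1";
# splitting it on spaces gives req[1]=method, req[2]=path, req[3]=protocol.
# \047 is the octal escape for a single quote inside the awk string.
print_wget_urls() {
    awk -F'"' '{
        split($2, req, " ")
        if (req[1] == "GET" && req[2] != "")
            printf("wget \047hostname%s\047\n", req[2])
    }' "$@"
}
```

The output could be redirected into a script, for example: print_wget_urls test > replay.sh, and then run with sh replay.sh once hostname has been replaced with the real server name.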
SOLUTION
Good, if you got the answer, then please close this question :)
SOLUTION
ASKER
The answer provided by Guru helped. I just copied it.
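Since the goal was to see whether these URLs are slow, the replay can also be done directly with a rough timing per request instead of generating a script first. The following is only a sketch under assumptions: the function name replay_log, its two arguments (base URL and log file), and the FETCH override variable are all hypothetical, date +%s is assumed to be available (it is not in every older date implementation), and timings are in whole seconds.

```shell
# Replay the GET requests from an access log against a server and print
# "<elapsed seconds> <wget exit code> <path>" (tab separated) for each one.
# Usage: replay_log http://yourserver access_log
# FETCH may be set to another command to use instead of wget.
replay_log() {
    base_url=$1
    logfile=$2
    # Extract the path (second word of the quoted request) of each GET.
    awk -F'"' '{ split($2, req, " "); if (req[1] == "GET") print req[2] }' "$logfile" |
    while IFS= read -r url_path; do
        start=$(date +%s)
        # -q silences wget; -O /dev/null discards the downloaded body.
        ${FETCH:-wget -q -O /dev/null} "$base_url$url_path"
        rc=$?
        end=$(date +%s)
        printf '%s\t%s\t%s\n' "$((end - start))" "$rc" "$url_path"
    done
}
```

Sorting the output numerically in reverse on the first column (sort -rn) would list the slowest requests first. Requests run one at a time here, so this measures individual response time rather than behaviour under concurrent load.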