bash script

How do I write a bash script that scans all the files in a folder and its sub-folders and writes each full file path to a log file?

Any file that refers to an "http://xxxxx" string but does not include the parameter "yyy" in that string should be listed.

Thanks
AXISHK asked:

omarfarid commented:
If you are looking for a script that lists the files in a directory and its sub-directories, run

find /path/to/dir >> /tmp/fileslist

This can be turned into a script that takes the directory name as its first argument:

find "$1" >> /tmp/fileslist

This can be run as:

scriptname /path/to/dir
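
A slightly fuller version of the same idea, as a minimal sketch: it assumes you only want regular files and absolute paths in the log. The function name list_files and its two arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# list_files DIR LOG
# Append the absolute path of every regular file under DIR to LOG.
# (list_files, DIR and LOG are hypothetical names, not part of the thread.)
list_files() {
    local dir=$1 log=$2
    local abs
    # Resolve DIR to an absolute path so that find prints full paths.
    abs=$(cd "$dir" && pwd) || return 1
    # -type f skips directories, so only files end up in the log.
    find "$abs" -type f >> "$log"
}
```

It could be called as: list_files /path/to/dir /tmp/fileslist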

The other part of the question is not clear. Can you give an example?
AXISHK (author) commented:
Level1/Level2/file1: http://www.mydomain.com/about.html

Leve/1/file2: http://www.otherdomain.com/about.html

Level1/file3:http://www.otherdomain.com/about.html

"findfile mydomain" will log
1. Leve/1/file2: http://www.otherdomain.com/about.html
2. Level1/file3:http://www.otherdomain.com/about.html
woolmilkporc commented:
cd to the directory just above "Level1", then run

D="otherdomain"
find Level1 -type f | xargs grep -oH "http://.*$D[^ ]*"

Inside the script "findfile", use

D="$1"
instead of
D="otherdomain"

This displays the matching part of each line, starting at "http://" and ending at the first space after "$D" or at the end of the line.
If you want the match to stop at a different character, please let me know.

For example, to end the match at a colon, an ampersand or a space (the character itself is not displayed), use:

find Level1 -type f | xargs grep -oH "http://.*$D[^:& ]*"
woolmilkporc commented:
Sorry, after re-reading your question and particularly your subsequent comment, I see that my answer should have looked like this:

cd to the directory just above "Level1", then run

D="mydomain"
find Level1 -type f | xargs grep -oH "http://[^ ]*" | grep -v "$D"

Inside the script "findfile", use

D="$1"
instead of
D="mydomain"

This displays the matching part of each line, starting at "http://" and ending at the first space or at the end of the line.
It excludes all results containing "mydomain" (or whatever is passed as the first parameter to the script "findfile").

If you want the match to stop at a different character, please let me know.
For example, to end the match at a colon, an ampersand or a space (the character itself is not displayed), use:

find Level1 -type f | xargs grep -oH "http://[^:& ]*" | grep -v "$D"
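
Put together as a script, the pipeline above might look roughly like this. It is only a sketch: the function name find_foreign_urls and the optional directory argument are made up, and it swaps xargs for find's "-exec ... {} +" so that file names containing spaces are passed through intact. It also uses grep -F for the exclusion so that dots in the domain are taken literally.

```shell
#!/usr/bin/env bash
# find_foreign_urls DOMAIN [DIR]
# Print "file:url" for every http:// URL found under DIR (default: .)
# whose output line does not contain DOMAIN.
# (The function name and the DIR argument are hypothetical.)
find_foreign_urls() {
    local domain=$1 dir=${2:-.}
    # -o prints only the matched URL, -H prefixes it with the file name.
    find "$dir" -type f -exec grep -oH 'http://[^ ]*' {} + |
        # -v drops lines containing DOMAIN, -F treats DOMAIN as a fixed string.
        grep -vF -- "$domain"
}
```

Run from the directory just above "Level1", it could be called as: find_foreign_urls mydomain Level1
Note that the exclusion is applied to the whole "file:url" line, so a file whose path contains the domain name is dropped as well.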
omarfarid commented:
Try this:

find "$1" -type f | xargs grep -v "$2" >> /tmp/fileslist

This can be run as:

findfile /path/to/dir mydomain

If you don't want to pass the directory name on the command line, then:

find . -type f | xargs grep -v "$1" >> /tmp/fileslist

This can be run as:

cd /path/to/dir

findfile mydomain
AXISHK (author) commented:
xargs grep -oH "http://.*$1[^ ]*"


Do the "." and the two "*" have a special meaning in the above expression?

One more question: is it possible to add one more condition?

Can I list all files that either don't include http://mydomain.com or contain the word "iframe"?
omarfarid commented:
The -e option lets you specify multiple patterns to search for.

The -o option prints only the part of each line that matches the pattern.

The -v option selects the lines that do not match the pattern.

Please see the man page:

http://linux.die.net/man/1/grep
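
To see the three options side by side, here is a small sketch you can run against any text file. The function name demo_grep_options and the patterns in it are only examples.

```shell
#!/usr/bin/env bash
# demo_grep_options FILE
# Show what -o, -v and -e do when applied to FILE.
# (demo_grep_options is a hypothetical helper for illustration.)
demo_grep_options() {
    local file=$1
    echo "with -o:"
    # Prints only the matched text, not the whole line.
    grep -o 'http://[^ ]*' "$file"
    echo "with -v mydomain:"
    # Prints the lines that do NOT contain "mydomain".
    grep -v 'mydomain' "$file"
    echo "with -e mydomain -e iframe:"
    # Prints the lines that match at least one of the two patterns.
    grep -e 'mydomain' -e 'iframe' "$file"
}
```

Call it as: demo_grep_options somefile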
woolmilkporc commented:
Do you mean that the files shouldn't contain "iframe" anywhere, or that "iframe" should not appear in the "http" line?

If it should not appear in the "http" line:

D="mydomain"
W="iframe"

find Level1 -type f | xargs grep -oH "http://[^ ]*" | grep -Ev "$D|$W"
or
find Level1 -type f | xargs grep -oH "http://[^ ]*" | grep -v -e "$D" -e "$W"

If it should not appear anywhere in the file, please let me know!
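
For the whole-file reading of the question as it was literally asked (list files that either do not mention the domain at all, or contain the word), a sketch could look like the following. It assumes a grep that supports -L (GNU and BSD grep do, but it is not in POSIX), and the function name list_suspect_files and its arguments are made up.

```shell
#!/usr/bin/env bash
# list_suspect_files DOMAIN WORD [DIR]
# List files under DIR (default: .) that either contain no line with DOMAIN
# or contain at least one line with WORD.
# (The function name and its arguments are hypothetical.)
list_suspect_files() {
    local domain=$1 word=$2 dir=${3:-.}
    {
        # -L lists files in which no line matches.
        find "$dir" -type f -exec grep -LF -- "$domain" {} +
        # -l lists files in which at least one line matches.
        find "$dir" -type f -exec grep -lF -- "$word" {} +
    } | sort -u   # a file that meets both conditions is listed once
}
```

Run from the directory just above "Level1", it could be called as: list_suspect_files mydomain iframe Level1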

>> "http://.*$1[^ ]*"
      Any special mean for . and two * in the above expression ?
<<

Yes.

"." means "any single character"; the following "*" means "zero or more occurrences of the preceding item".
"[^ ]" means "any character other than a space"; again, the following "*" means "zero or more occurrences".

So we cause grep to search for and display any string
starting with "http://", followed by zero or more arbitrary characters,
followed by what's in the first command-line parameter passed to the script ($1),
followed by zero or more non-space characters.
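
One consequence of the breakdown above is that ".*" also matches spaces, so on a line with several URLs the match can run across more than one of them. The sketch below compares the two expressions; the helper names match_greedy and match_nonblank and the sample line are made up for illustration.

```shell
#!/usr/bin/env bash
# Both helpers read text on stdin; $1 is the domain to look for.
# (match_greedy and match_nonblank are hypothetical names.)

# ".*" may cross spaces, so the match can start at an earlier URL on the line.
match_greedy()   { grep -o "http://.*$1[^ ]*"; }

# "[^ ]*" cannot cross a space, so the match stays inside a single URL.
match_nonblank() { grep -o "http://[^ ]*$1[^ ]*"; }
```

For example:

echo 'a http://www.x.com b http://www.mydomain.com/about.html c' | match_greedy mydomain
prints: http://www.x.com b http://www.mydomain.com/about.html

echo 'a http://www.x.com b http://www.mydomain.com/about.html c' | match_nonblank mydomain
prints: http://www.mydomain.com/about.html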

AXISHK (author) commented:
Thanks