Solved

3.5 million files to delete with a specific substring

Posted on 2010-08-13
Medium Priority
387 Views
Last Modified: 2012-05-10
Hi,

I have 3.5 million files to delete, all containing the substring ".r13125.ovh.net,S="

How can I delete them?

Thank you
Question by:matthew016
13 Comments
 
LVL 7

Accepted Solution

by:
jhp333 earned 536 total points
ID: 33428002
Do you mean the string in the filenames?

To list the files:
find . -name '*.r13125.ovh.net,S=*'

To delete them:
find . -name '*.r13125.ovh.net,S=*' -delete

where . is the starting directory. You can use "/" if you want to start from root.
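A minimal sketch of the list-then-delete sequence above, run against a throwaway directory so nothing real is touched. The sample file names and the variable names are made up for illustration, and it assumes a find that supports -delete.

```shell
# Sandbox demo: everything happens inside a temporary directory.
demo=$(mktemp -d)
touch "$demo/1281700000.1.r13125.ovh.net,S=1024" \
      "$demo/1281700001.2.r13125.ovh.net,S=2048" \
      "$demo/keep-me.txt"

pattern='*.r13125.ovh.net,S=*'

# Step 1: count what would be removed (regular files only).
matches=$(find "$demo" -type f -name "$pattern" | wc -l | tr -d ' ')
echo "would delete: $matches"

# Step 2: delete the matches (assumes this find supports -delete).
find "$demo" -type f -name "$pattern" -delete

remaining=$(find "$demo" -type f | wc -l | tr -d ' ')
echo "remaining: $remaining"
```

Counting first is cheap insurance: if the number looks wrong, the pattern can be adjusted before anything is removed.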
 
LVL 9

Author Comment

by:matthew016
ID: 33428041
r13125 ~ # find . -name '*.r13125.ovh.net,S=*' -delete
find: prédicat invalide `-delete'
r13125 ~ #

(In English: invalid predicate)
 
LVL 7

Expert Comment

by:jhp333
ID: 33428054
It seems your find utility is an old version that does not support the -delete option, which is relatively new. In that case, use:

find . -name '*.r13125.ovh.net,S=*' -exec rm {} \;

 
LVL 9

Author Comment

by:matthew016
ID: 33428149
The number of files went down, but only by about a thousand.
Then I got this error message:

find: Ne peut faire un clonage (fork).: Ne peut allouer de la mémoire

Approximate translation from French to English: find: cannot fork: cannot allocate memory

I tried running the command in a loop a thousand times, but after the first error message the file count no longer goes down.
 
LVL 9

Author Comment

by:matthew016
ID: 33428184
I created the file /home/LIST
with the command:  find . | fgrep 'r13125.ovh.net,S=' > /home/LIST

So I have the list of files to delete (but only the file names, without the full path; the full path is /home/vpopmail/domains/r13125.ovh.net/postmaster/Maildir/new).
I heard it is possible to delete all the files listed in the LIST file with xargs. How can I achieve this exactly?
 
LVL 7

Expert Comment

by:jhp333
ID: 33428340
It seems you have one or more circular links.
Find the link and delete it manually.
 
LVL 3

Assisted Solution

by:pitt7
pitt7 earned 532 total points
ID: 33429248
You can use the -H option to prevent following symbolic links (if they are circular).
When using find -exec on many files, you should use the following syntax:

find . -name '*.r13125.ovh.net,S=*' -exec rm {} +

Closing the -exec command with + means that rm is called with as many file names as arguments as the maximum command-line length allows.
With \; a new rm process is started for every file, which is much slower when there are many files.

To delete files from a file with xargs:
xargs -a /home/LIST rm
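A hedged sketch of the list-file approach, in a sandbox. Since the list holds names relative to the mail directory, rm has to run from that directory. The -a option is a GNU xargs extension, so this variant feeds the list on stdin instead and converts newlines to NUL bytes so that names with spaces or quotes pass through unchanged. The directory and list paths below are hypothetical stand-ins for the real ones.

```shell
demo=$(mktemp -d)
maildir="$demo/new"   # hypothetical stand-in for .../Maildir/new
list="$demo/LIST"     # hypothetical stand-in for /home/LIST
mkdir "$maildir"
touch "$maildir/a.r13125.ovh.net,S=1" "$maildir/b.r13125.ovh.net,S=2" "$maildir/other"

# Build the list the way the author did: names relative to the mail directory.
(cd "$maildir" && find . -type f | grep -F 'r13125.ovh.net,S=') > "$list"

# Run rm from the same directory so the relative names resolve.
# tr turns each newline into a NUL byte so xargs -0 takes every line verbatim.
(cd "$maildir" && tr '\n' '\0' < "$list" | xargs -0 rm -f --)

left=$(ls "$maildir")
echo "left: $left"
```

This only works if no file name in the list contains a newline, which is the same limitation the list file itself has.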
 
LVL 7

Expert Comment

by:jhp333
ID: 33431952
@pitt7

"-P     Never follow symbolic links." is the default. I guess his circular links are by hard links.
 
LVL 80

Assisted Solution

by:arnold
arnold earned 532 total points
ID: 33432636
The large number of files will generate errors when using the {} \; grouping.

Use jhp333's find, but process one file at a time:
find . -name '*.r13125.ovh.net,S=*' | while IFS= read -r a; do
# quote the name so spaces or glob characters in it are not reinterpreted
echo "Deleting $a"
/bin/rm -rf -- "$a"
done
 
LVL 3

Expert Comment

by:pitt7
ID: 33432724
@jhp333:
Yes, that's right. Using -H is useless here.

@arnold's answer:
I strongly suggest using a solution that doesn't spawn one rm process per file. The author says there are 3.5 million files, so one process per file will take much longer.
If
find . -name '*.r13125.ovh.net,S=*' -exec rm {} +
throws an error too, use
find . -name '*.r13125.ovh.net,S=*' | xargs rm

xargs reads arguments from stdin and passes them to rm. It does not call rm for every single file; instead it builds command lines of the maximum possible length, which results in far fewer rm processes.

(Side note:
If your filenames can contain whitespace, quotes, or \n newlines, "find | xargs rm" will fail. In that case use:
find . -name '*.r13125.ovh.net,S=*' -print0 | xargs -0 rm
With this command the arguments are not delimited by a newline but a \0 character which can't be used in filenames.)
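To illustrate the side note, here is a small sandbox sketch with three deliberately awkward names (a space, a single quote, an embedded newline). The names are invented for the demo; the point is only that the -print0 / -0 pair removes all of them.

```shell
demo=$(mktemp -d)

# Build a name with an embedded newline (command substitution keeps inner newlines).
nl_name=$(printf 'two\nlines.r13125.ovh.net,S=3')

touch "$demo/with space.r13125.ovh.net,S=1" \
      "$demo/it's.r13125.ovh.net,S=2" \
      "$demo/$nl_name"

before=$(find "$demo" -type f -name '*.r13125.ovh.net,S=*' -print0 | tr -cd '\0' | wc -c | tr -d ' ')
echo "awkward files created: $before"

# NUL-delimited hand-off: each name reaches rm exactly as it is on disk.
find "$demo" -type f -name '*.r13125.ovh.net,S=*' -print0 | xargs -0 rm -f --

awkward_left=$(find "$demo" -type f | wc -l | tr -d ' ')
echo "files left: $awkward_left"
```

The file count before deletion is taken by counting NUL bytes rather than lines, because a line count would be thrown off by the newline inside one of the names.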
 
LVL 80

Expert Comment

by:arnold
ID: 33433889
The problem with both {} \; and xargs is that they will try to pass 3.5 million entries to rm, which will generate the "too many items" error for {} \;. I suspect the same thing will happen with xargs.

Instead of rm, unlink can be used.

At any one time one rm process will be running per file.

Another option is to use -mtime as a filter, i.e.

find . -name '*.r13125.ovh.net,S=*' -mtime +360 -exec rm {} \;
This deletes files that match the pattern and are more than 360 days (about a year) old. This may reduce the number of files to be deleted per batch.

You can use a for loop to go from 360 to 90 at 5,10,20,30 day steps.
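The age-stepping loop described here could look roughly like the sketch below. It runs in a sandbox, the step values are just examples, and it assumes GNU touch (for the relative dates passed to -d) purely to fabricate old files for the demo.

```shell
demo=$(mktemp -d)

# Fabricate files of different ages (GNU touch -d is an assumption of this demo).
touch -d '400 days ago' "$demo/old.r13125.ovh.net,S=1"
touch -d '200 days ago' "$demo/mid.r13125.ovh.net,S=2"
touch -d '10 days ago'  "$demo/new.r13125.ovh.net,S=3"

# Walk the age threshold down so each pass only handles a slice of the files.
for days in 360 270 180 90; do
  find "$demo" -type f -name '*.r13125.ovh.net,S=*' -mtime +"$days" -exec rm -f {} +
  echo "after +$days days: $(find "$demo" -type f | wc -l | tr -d ' ') left"
done

mtime_left=$(ls "$demo")
echo "survivor: $mtime_left"
```

Files newer than the last threshold are left alone, so a final pass without -mtime would still be needed to remove everything.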

IMHO, since this many files have accumulated, the example I posted with the while loop, deleting one file at a time, is a good approach.

If you want to get more complex, i.e. delete 10, 20, or 30 files at a time, you could build the string that will be passed to rm every X files.
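Building the argument string by hand is not strictly necessary: xargs has an -n option that caps the number of names per command. The sketch below is a sandbox illustration with an arbitrary batch size of 30 and 95 made-up files; it first does a dry run that only counts the batches, then deletes.

```shell
demo=$(mktemp -d)
i=1
while [ "$i" -le 95 ]; do
  : > "$demo/msg$i.r13125.ovh.net,S=$i"
  i=$((i + 1))
done

batch_size=30   # illustrative value; any positive number works

# Dry run: the helper shell prints one line per batch instead of deleting,
# so the line count is the number of rm calls that would be made.
batches=$(find "$demo" -type f -name '*.r13125.ovh.net,S=*' -print0 |
  xargs -0 -n "$batch_size" sh -c 'echo "$#"' sh | wc -l | tr -d ' ')
echo "batches: $batches"

# Real run: rm receives at most $batch_size names per call.
find "$demo" -type f -name '*.r13125.ovh.net,S=*' -print0 |
  xargs -0 -n "$batch_size" rm -f --

batch_left=$(find "$demo" -type f | wc -l | tr -d ' ')
echo "left: $batch_left"
```

With 95 files and a batch size of 30 this makes four rm calls (30, 30, 30 and 5 names).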

 
LVL 3

Expert Comment

by:pitt7
ID: 33434328
The error he gets is from find, not rm.

find will not pass 3.5 million entries to rm, because it can't: the length of a program's argument list is limited. This is why there are tools like xargs, and why find -exec can be terminated with +.

Just run:
find / -exec echo {} \;
[you will see one file per line, which means echo is called for every single file]
and
find / -exec echo {} + | cut -c 1-80
[you will see many files on each line, but more than one line, because echo is called multiple times; of course only if find finds enough files]
to see the difference (the "| cut -c 1-80" truncates the very long lines generated).
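The same comparison can be made with numbers instead of eyeballing output. This sandbox sketch creates 200 made-up files and counts how many helper processes find starts with each terminator; the helper just prints one line per invocation.

```shell
demo=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
  : > "$demo/msg$i.r13125.ovh.net,S=$i"
  i=$((i + 1))
done

# Each run of the helper shell prints one line, so the line count equals
# the number of processes find started.
per_file=$(find "$demo" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
batched=$(find "$demo" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "one process per file: $per_file"
echo "batched with +:       $batched"
```

The per-file count always equals the number of files, while the batched count depends on how many names fit into one command line on the system at hand.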
 
LVL 3

Assisted Solution

by:stetor
stetor earned 400 total points
ID: 33464406
Hi

I think this is a one-time task, so I don't think the run time is a problem...

From the shell prompt, type the following in the evening before going home:
while
  IFS= read -r fname
do
  rm "/home/vpopmail/domains/r13125.ovh.net/postmaster/Maildir/new/$fname"
done < /home/LIST

and the next morning the directory will be clean ;-)



Question has a verified solution.
