matthew016
asked on
3.5 million files to delete with a specific substring
Hi,
I have 3.5 million files to delete whose names contain the substring ".r13125.ovh.net,S="
How can I delete them?
Thank you
It seems your find utility is an old one that does not support the -delete option, which is relatively new. In that case, use:
find . -name '*.r13125.ovh.net,S=*' -exec rm {} \;
ASKER
The file count went down, but only by about a thousand.
Then this error message appeared:
find: Ne peut faire un clonage (fork).: Ne peut allouer de la mémoire
Approximate translation from French to English: find: cannot clone (fork).: cannot allocate memory
I tried running the command in a loop a thousand times, but after the first error message the file count no longer goes down.
ASKER
I created a file, /home/LIST,
with the command: find . | fgrep 'r13125.ovh.net,S=' > /home/LIST
So I have the list of files to delete (only the file names, without the full path; the full path is /home/vpopmail/domains/r13125.ovh.net/postmaster/Maildir/new).
I heard it is possible to delete all the files listed in the LIST file with xargs. How exactly can I achieve this?
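For what it's worth, here is a minimal sketch of the "delete everything named in a list file" idea. It runs in a throwaway sandbox directory, not the real Maildir; the sample file names are made up for illustration. It assumes the list holds paths relative to the directory where find was run, and that no name contains whitespace or quotes (plain xargs splits on those).

```sh
# Sandbox demo: a throwaway directory and list file instead of real mail.
dir=$(mktemp -d)
list=$(mktemp)
touch "$dir/1.r13125.ovh.net,S=10" "$dir/2.r13125.ovh.net,S=20" "$dir/keep.me"

(
    # The list holds paths relative to the current directory, so stay in it.
    cd "$dir" || exit 1
    # Same listing step as in the post (grep -F is the modern spelling of fgrep).
    find . -type f | grep -F 'r13125.ovh.net,S=' > "$list"
    # xargs packs many names into each rm call; "--" ends option parsing.
    xargs rm -f -- < "$list"
)

remaining=$(find "$dir" -type f | wc -l | tr -d ' ')
printf 'files left: %s\n' "$remaining"
rm -rf "$dir" "$list"
```

On the real system the equivalent would be to cd into the Maildir directory first and then feed /home/LIST to xargs in the same way.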
It seems you have one or more circular links.
Find the link and delete it manually.
@pitt7
"-P Never follow symbolic links." is the default. I guess his circular links are made with hard links.
@jhp333:
Yes, that's right. Using -H is useless here.
@arnold's answer:
I strongly suggest using a solution that doesn't spawn one rm process for each file. The author says there are 3.5 million files; with one process per file this will take much longer.
If
find . -name '*.r13125.ovh.net,S=*' -exec rm {} +
throws an error too, use
find . -name '*.r13125.ovh.net,S=*' | xargs rm
xargs reads arguments from stdin and passes them to rm. It does not call rm for every single file, though; it builds command lines of the maximum possible length. This results in far fewer rm processes.
(Side note:
If your file names can contain whitespace, quotes, or \n newlines, "find | xargs rm" will fail. In that case use:
find . -name '*.r13125.ovh.net,S=*' -print0 | xargs -0 rm
With this command the arguments are delimited not by a newline but by a \0 character, which can't appear in file names.)
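To illustrate the side note, here is a small sandbox sketch of the NUL-delimited variant. The file names are invented for the demo (one contains a space, one an embedded newline), and it assumes a find and xargs that support -print0 and -0, as the GNU and BSD versions do.

```sh
dir=$(mktemp -d)
# One ordinary name, one with a space, one with an embedded newline,
# plus an unrelated file that must survive.
touch "$dir/a.r13125.ovh.net,S=1" "$dir/b c.r13125.ovh.net,S=2" "$dir/unrelated"
touch "$dir/d
e.r13125.ovh.net,S=3"

# Names are separated by \0, so spaces and newlines inside them are harmless.
find "$dir" -type f -name '*.r13125.ovh.net,S=*' -print0 | xargs -0 rm -f --

left=$(find "$dir" -type f | wc -l | tr -d ' ')
printf 'files left: %s\n' "$left"
rm -rf "$dir"
```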
The problem with both {} \; and xargs is that they will try to pass 3.5 million entries to rm, which will generate the "too many items" error for {} \;. I suspect the same thing will happen with xargs.
Instead of rm, unlink can be used.
At any one time, a single rm process will be running, one per file.
Another option is to use -mtime as a filter, e.g.
find . -name '*.r13125.ovh.net,S=*' -mtime +360 -exec rm {} \;
This deletes files that match the pattern and are more than about a year (360 days) old, which may reduce the number of files to be deleted per batch.
You can use a for loop to go from 360 down to 90 days in steps of 5, 10, 20, or 30 days.
IMHO, since this many files have accumulated, the example I posted with the while loop, deleting one file at a time, is a good approach.
If you want to get more complex, i.e. delete 10, 20, or 30 files at a time, you could build the string that is passed to rm every X files.
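A rough sketch of the age-stepping idea combined with fixed-size batches, run in a sandbox directory with made-up file names. The 30-day step and the batch size of 20 are arbitrary values chosen for illustration, and xargs -n stands in for "building the string by hand"; it assumes find and xargs support -print0 and -0.

```sh
dir=$(mktemp -d)
# Two "old" matching files (timestamp forced to the year 2000), one fresh
# matching file, and one unrelated file.
touch -t 200001010000 "$dir/old1.r13125.ovh.net,S=1" "$dir/old2.r13125.ovh.net,S=2"
touch "$dir/new.r13125.ovh.net,S=3" "$dir/unrelated"

days=360    # start with the oldest files
batch=20    # hypothetical batch size: at most 20 names per rm call
while [ "$days" -ge 90 ]; do
    # Each pass removes matching files older than $days days, 20 at a time.
    find "$dir" -type f -name '*.r13125.ovh.net,S=*' -mtime +"$days" -print0 |
        xargs -0 -n "$batch" rm -f --
    days=$((days - 30))
done

left=$(find "$dir" -type f | wc -l | tr -d ' ')
printf 'files left: %s\n' "$left"
rm -rf "$dir"
```

Only the files older than 90 days are removed here; the fresh matching file and the unrelated one stay.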
The error he gets is from find, not rm.
find will not pass 3.5 million entries to rm, because it can't: the argument length of a program is limited. This is why there are tools like xargs, and the option of terminating find -exec with +.
Just run:
find / -exec echo {} \;
[you will see one file per line, which means echo is called for every single file]
and
find / -exec echo {} + | cut -c 1-80
[you will see many files per line, but more than one line: echo is called multiple times, provided find finds enough files]
to see the difference (the "| cut -c 1-80" truncates the very long lines generated).
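The same comparison can be tried on a small scale without walking the whole filesystem. This is only a sketch in a sandbox directory with 50 dummy files (the names are invented); it counts how many output lines, and therefore how many echo invocations, each form produces.

```sh
dir=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
    touch "$dir/file$i"
    i=$((i + 1))
done

# Terminated with \; : echo runs once per file, so one output line per file.
per_file=$(find "$dir" -type f -exec echo {} \; | wc -l | tr -d ' ')
# Terminated with + : find packs as many names as fit into each echo call.
batched=$(find "$dir" -type f -exec echo {} + | wc -l | tr -d ' ')

printf 'one process per file: %s lines, batched: %s lines\n' "$per_file" "$batched"
rm -rf "$dir"
```

With only 50 short names everything fits into a single command line, so the batched form calls echo just once.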
ASKER
find: prédicat invalide `-delete'
r13125 ~ #
(In English: invalid predicate)
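Given that error, one possible way to cope is to probe whether the local find understands -delete and fall back to -exec rm otherwise. This is a sketch in a sandbox directory with made-up file names; the probe relies on -maxdepth, which GNU and BSD find support but which is an assumption for other implementations.

```sh
dir=$(mktemp -d)
touch "$dir/x.r13125.ovh.net,S=1" "$dir/keep"

# Probe: -maxdepth 0 limits find to $dir itself, and the -name test cannot
# match it, so nothing is deleted; only the exit status matters.
if find "$dir" -maxdepth 0 -name 'no-such-name' -delete 2>/dev/null; then
    find "$dir" -type f -name '*.r13125.ovh.net,S=*' -delete
else
    # Older find without -delete: fall back to one rm per file.
    find "$dir" -type f -name '*.r13125.ovh.net,S=*' -exec rm -f {} \;
fi

left=$(find "$dir" -type f | wc -l | tr -d ' ')
printf 'files left: %s\n' "$left"
rm -rf "$dir"
```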