Hi Experts. Looking for ideas.
I want to write a program to find duplicate files on my hard drive. I've done a fair amount of research and see two approaches: finding by filename and finding by content (file size and hashing). I'm looking for ideas on the best way to achieve this.
Is it better to load all the filenames and sizes into an array and then find the duplicates in that array? If so, wouldn't an array with 300,000 elements (which is what I have on my hard drive) be a problem? Or is it better to go file by file, which looks like it would take forever?
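To make the first option concrete, here is a rough sketch of what I have in mind. It assumes Python, and the names `file_digest` and `find_duplicates` are just my own placeholders. It collects every path into an in-memory dictionary keyed by file size, then hashes only the files that share a size with at least one other file. SHA-256 and the 1 MiB chunk size are arbitrary choices on my part, not something I've tuned.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose size and hash both match."""
    # Pass 1: group every regular file by size (cheap, metadata only).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # skip symlinks so a file is not counted twice
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it

    # Pass 2: hash only files that share their size with another file.
    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue
        duplicates.extend(g for g in by_hash.values() if len(g) > 1)
    return duplicates
```

Calling `find_duplicates("/some/folder")` would return a list of groups, each group being a list of paths with identical content as far as the hash can tell. What I can't judge is whether holding all 300,000 entries in memory like this is reasonable.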
I'm mainly looking for ideas on the best approach in principle.