• Status: Solved

How to modify the $_.Group in this script to display a full file path

I want to find duplicate files on my local computer, based on file content. This script on GitHub seems to do the trick. My PowerShell skills are poor, but I grok enough to see what it's doing and how.

I also want to print out the full path of both files instead of just the two file names. Unfortunately that's beyond my limited PS skills, even after reading this article on group.

Anyone know how to modify the $_.Group to display the full paths of the two files?

(I'd be happy to post what I've tried, but it's embarrassingly NON-functional... )
Asked by: _agx_
3 Solutions
 
footech Commented:
You can't really control what Group-Object will display in the "Group" property. That is determined by the properties of the objects grouped together and by the internals of the Group-Object cmdlet.

There are quite a few ways you might modify the script output, but it really depends on what you want the output to be.  Some choices for output could be better for some needs and worse for others.
#requires -version 3
[CmdletBinding()]
param (
    [string]
    $Path
)

function Get-MD5 {
    param (
        [Parameter(Mandatory)]
        [string] 
        $Path
    )
    # This Get-MD5 function sourced from:
    # http://blogs.msdn.com/powershell/archive/2006/04/25/583225.aspx
    $HashAlgorithm = New-Object -TypeName System.Security.Cryptography.MD5CryptoServiceProvider
    $Stream = [System.IO.File]::OpenRead($Path)
    try {
        $HashByteArray = $HashAlgorithm.ComputeHash($Stream)
    } finally {
        $Stream.Dispose()
    }

    return [System.BitConverter]::ToString($HashByteArray).ToLowerInvariant() -replace '-',''
}

if (-not $Path) {
    if ((Get-Location).Provider.Name -ne 'FileSystem') {
        Write-Error 'Specify a file system path explicitly, or change the current location to a file system path.'
        return
    }
    $Path = (Get-Location).ProviderPath
}

Get-ChildItem -Path $Path -Recurse -File |
    Where-Object { $_.Length -gt 0 } |
    Group-Object -Property Length |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object {
        "::::::"
        $_.Group |
            ForEach-Object {
                $_ |
                    Add-Member -MemberType NoteProperty -Name ContentHash -Value (Get-MD5 -Path $_.FullName)
            }

        $_.Group |
            Group-Object -Property ContentHash |
            Where-Object { $_.Count -gt 1 } | 
            ForEach-Object { $_.Group.FullName }
    }


Lines 40, 49, and 50 (the "::::::" separator and the final Where-Object and ForEach-Object stages) are where I made some changes.
 
_agx_ (Author) Commented:
Thanks, I've been struggling through examples for about an hour to do something even close to that ;-)  

Basically I just want to print a list of dupe counts and the full paths to the files. I can reformat it later, but something like one of these:

Count = 2                                <=== Total count
c:\temp\myfile.ext                 <=== List of duplicate files with full path
c:\other\myfile.ext

Count = 3
c:\temp\myfileA.ext
c:\other\myfileA.ext
c:\other\otherFile.ext

==== OR =====

Count  | Hash | Fullpaths 
2 | xxxxxxxxxxxxxxxxx |  c:\temp\myfile.ext, c:\other\myfile.ext
3 | xxxxxxxxxxxxxxxxx |  c:\temp\myfileA.ext, c:\other\myfileA.ext, c:\other\otherFile.ext


That will be determined by the properties of the objects grouped together, and whatever internals to the Group-Object cmdlet.

Yeah, makes sense. From what I was reading, "format-table" could be used to alter the display of the object. But ... I couldn't figure out if it's possible to alter the grouped "object" in such a way that it could also store other properties, like a list of files. Does that kind of grouping *only* produce a COUNT, or could you add other properties to it, like an array of file names?
 
oBdA Commented:
The function below will return GroupInfo objects for all duplicates. You can call it with -Verbose to see results as well as process them further.
To save resources, it will only calculate the hash for objects of the same size.
Function Find-DuplicateItem {
[CmdletBinding()]
Param(
	[string]$Path,
	[string]$Filter,
	[switch]$Recurse
)
	$sha1 = New-Object -TypeName System.Security.Cryptography.SHA1CryptoServiceProvider
	$splat = @{}
	'Path', 'Filter', 'Recurse' | ForEach-Object {If ($PSBoundParameters.ContainsKey($_)) {$splat[$_] = $PSBoundParameters[$_]}}
	Get-ChildItem @splat -File |
		Group-Object -Property Length |
		Where-Object {$_.Count -gt 1} |
		ForEach-Object {
			$_.Group |
				Select-Object -Property FullName, @{n='Hash'; e={[System.BitConverter]::ToString($sha1.ComputeHash([System.IO.File]::ReadAllBytes($_.FullName)))}} |
				Group-Object -Property Hash |
				Where-Object {$_.Count -gt 1} | ForEach-Object {
					Write-Verbose "===== Duplicates with hash '$($_.Name)' ====="
					$_.Group | Select-Object -ExpandProperty FullName | Write-Verbose
					$_
				}
		}
}
# Examples:
$dupes = Find-DuplicateItem -Path C:\Temp -Recurse -Verbose
# The grouped file objects (FullName, Hash) contained in the GroupInfo results
$dupes | Select-Object -ExpandProperty Group
# Formatted console output
$dupes.Group | Format-Table -GroupBy Hash
# Save to csv
$dupes | Select-Object -ExpandProperty Group | Export-Csv -NoTypeInformation -Path C:\Temp\dupes.csv
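
For the list-style output requested earlier (a count followed by the full paths), here is a minimal sketch. It assumes PowerShell 4.0 or later so that the built-in Get-FileHash cmdlet is available, and it uses a throwaway temp folder with made-up file names purely for demonstration. The count is taken per hash group rather than per size group, since files of the same size can still differ in content.

```powershell
# Build a throwaway folder: two identical files and one same-size file with different content.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $root | Out-Null
Set-Content -Path (Join-Path $root 'a.txt') -Value 'same content'
Set-Content -Path (Join-Path $root 'b.txt') -Value 'same content'
Set-Content -Path (Join-Path $root 'c.txt') -Value 'different!!!'

$lines = Get-ChildItem -Path $root -Recurse -File |
    Group-Object -Property Length |            # cheap pre-filter: only same-size files can match
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object {
        $_.Group |
            Get-FileHash -Algorithm SHA256 |   # emits objects with Algorithm, Hash, Path
            Group-Object -Property Hash |
            Where-Object { $_.Count -gt 1 } |
            ForEach-Object {
                "Count = $($_.Count)"          # number of files sharing this hash
                $_.Group.Path                  # full path of each duplicate
                ''                             # blank line between groups
            }
    }

$lines

# Clean up the demo folder.
Remove-Item -Path $root -Recurse -Force
```

All three demo files have the same length, so the size group holds three files, but only the two with identical content end up in the printed hash group.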

 
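footech Commented:
You can insert the following after line 40 (the "::::::" separator line) to display the count. Note that at that point $_ is the group of same-size files, so this is the number of files sharing that size, not the number of confirmed duplicates.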
"Count = $($_.Count)"

The Group property (output from Group-Object) is a collection of all the objects grouped together.  It would be difficult to modify all those objects contained in that property.  But you can always create a calculated property with any info you need from information you have available to you.
The GroupInfo objects that Group-Object outputs only have the properties Count, Group, Name, and Values by default.
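
As a minimal sketch of that calculated-property approach, the following turns each group into a row with the count, the hash, and a joined list of paths, which is close to the table layout asked for earlier. The sample objects and the FullPaths property name are made up for illustration; in practice the input would be the hashed file objects from one of the scripts above.

```powershell
# Hypothetical sample data standing in for files that were already hashed.
$files = @(
    [pscustomobject]@{ FullName = 'C:\temp\myfile.ext';  Hash = 'aaa' }
    [pscustomobject]@{ FullName = 'C:\other\myfile.ext'; Hash = 'aaa' }
    [pscustomobject]@{ FullName = 'C:\other\unique.ext'; Hash = 'bbb' }
)

$report = $files |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 } |          # keep only hashes shared by 2+ files
    Select-Object -Property Count,
        @{ Name = 'Hash';      Expression = { $_.Name } },                       # group key
        @{ Name = 'FullPaths'; Expression = { $_.Group.FullName -join ', ' } }   # paths from the grouped objects

$report | Format-Table -AutoSize
```

The group itself is left untouched; Select-Object builds a new object per group, so any value that can be derived from Count, Name, or Group can be added as another column.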
 
_agx_ (Author) Commented:
Thanks all!
@footech - That helps me better understand Group.
@oBdA - Works and great example to study.