Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


PowerShell - MD5 hash all files and then compare with the original CSV

After creating MD5 hashes for all of my media files with the command below,

Get-ChildItem -Path 'D:\Media 1' -Recurse -File |
       Get-FileHash -Algorithm MD5 |
       Export-Csv -Path 'D:\MediaHashes1.csv' -UseCulture -NoTypeInformation

I want PowerShell to compare the newly created MediaHashes1.csv with the original D:\Originalhashes.csv and create a new CSV file that tells me whether they are identical or not. Thanks.

question from:https://stackoverflow.com/questions/65951457/md5-hash-all-files-and-then-compare-with-the-original-csv


1 Reply


Use Import-Csv to read the old and the new CSV into separate variables, then call Compare-Object to compare them. Compare on both Hash and Path so that moved files are detected as well.

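# Add -UseCulture to both Import-Csv calls if the files were exported with -UseCulture.
$oldHashes = Import-Csv oldhashes.csv
$newHashes = Import-Csv newhashes.csv
# -NoTypeInformation keeps Windows PowerShell from writing a "#TYPE" header line.
Compare-Object $oldHashes $newHashes -Property Hash, Path | Export-Csv difference.csv -NoTypeInformation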

As a performance optimization, you could store the new hashes in a variable while you are generating them. That way you don't have to read them back from the file.
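A minimal sketch of that optimization, not a definitive implementation. The temporary folder, the two sample files, and the variable names $demoDir, $oldCsv and $newCsv are hypothetical stand-ins for the real media folder and CSV paths; the simulated edit of b.txt is only there so the comparison has something to report.

```powershell
# Hypothetical demo folder and CSV paths; replace with your real media folder and files.
$tempRoot = [System.IO.Path]::GetTempPath()
$demoDir  = Join-Path $tempRoot 'hash-demo'
$oldCsv   = Join-Path $tempRoot 'hash-demo-old.csv'
$newCsv   = Join-Path $tempRoot 'hash-demo-new.csv'

New-Item -ItemType Directory -Path $demoDir -Force | Out-Null
Set-Content -Path (Join-Path $demoDir 'a.txt') -Value 'alpha'
Set-Content -Path (Join-Path $demoDir 'b.txt') -Value 'beta'

# Baseline: the "original" hash list.
Get-ChildItem -Path $demoDir -Recurse -File |
    Get-FileHash -Algorithm MD5 |
    Export-Csv -Path $oldCsv -NoTypeInformation

# Simulate a modified file.
Set-Content -Path (Join-Path $demoDir 'b.txt') -Value 'changed'

# Keep the new hashes in a variable, then write the same objects to disk.
$newHashes = Get-ChildItem -Path $demoDir -Recurse -File |
    Get-FileHash -Algorithm MD5
$newHashes | Export-Csv -Path $newCsv -NoTypeInformation

# Only the old list has to be read from a file.
$oldHashes = Import-Csv -Path $oldCsv
$diff = Compare-Object $oldHashes $newHashes -Property Hash, Path
$diff
```

Compare-Object only looks at the Hash and Path property values here, so it does not matter that one side comes from Import-Csv and the other directly from Get-FileHash.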

The output file 'difference.csv' will be empty if the two CSVs are equal.

Otherwise the output file contains all differences. The column SideIndicator indicates whether a given combination of Hash and Path exists only in the oldhashes.csv (SideIndicator <=) or only in the newhashes.csv (SideIndicator =>).
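Since the question asks for a notification, here is a small sketch of how the result could be turned into a message. The sample rows are hypothetical and stand in for the two imported CSV files; the $message wording and the variable names $onlyOld and $onlyNew are assumptions for illustration.

```powershell
# Hypothetical sample rows standing in for the two imported CSV files.
$oldHashes = @(
    [pscustomobject]@{ Path = 'C:\test\bar.txt'; Hash = '73FEFFA4B7F6BB68E44CF984C85F6E88' }
    [pscustomobject]@{ Path = 'C:\test\foo.txt'; Hash = '37B51D194A7513E45B56F6524F2D51F2' }
)
$newHashes = @(
    [pscustomobject]@{ Path = 'C:\test\baz.txt'; Hash = 'D41D8CD98F00B204E9800998ECF8427E' }
    [pscustomobject]@{ Path = 'C:\test\foo.txt'; Hash = '2AF65102DB00B80835E1578278064AD1' }
)

# @() guarantees an array, even when Compare-Object returns nothing.
$diff = @(Compare-Object $oldHashes $newHashes -Property Hash, Path)

# Split the differences by the side they come from.
$onlyOld = @($diff | Where-Object SideIndicator -eq '<=')
$onlyNew = @($diff | Where-Object SideIndicator -eq '=>')

$message = if ($diff.Count -eq 0) {
    'The hash lists are identical.'
} else {
    "The hash lists differ: $($onlyOld.Count) entries only in the old list, $($onlyNew.Count) only in the new list."
}
$message
```

Wrapping the call in @() avoids a separate null check for the case where both lists match.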


Further improvements

The code above outputs a path twice when a file has been modified. Example output in table format for readability:

Path            Hash                             SideIndicator
----            ----                             -------------
C:\test\bar.txt 73FEFFA4B7F6BB68E44CF984C85F6E88 <=
C:\test\baz.txt D41D8CD98F00B204E9800998ECF8427E =>
C:\test\foo.txt 2AF65102DB00B80835E1578278064AD1 =>
C:\test\foo.txt 37B51D194A7513E45B56F6524F2D51F2 <=

The file "foo.txt" is listed twice because it exists in old and new state but has modified content. To remove duplicate paths, we can use Group-Object to group on the Path property:

# Compare and store differences in variable $diff
$diff = Compare-Object $oldHashes $newHashes -Property Path, Hash | Sort-Object Path

# Group on path to create only a single output item for each modified file
$groupedDiff = $diff | Group-Object Path | ForEach-Object {

    # The pscustomobject is the output of the ForEach-Object script block
    [pscustomobject] @{
        Path = $_.Name

        # If group has only one element, the file exists either in old or in new state.
        # Otherwise we have a modified file, which will be indicated by
        # special SideIndicator value '<>'.
        SideIndicator = if( $_.Count -eq 1 ) { $_.Group.SideIndicator } else { '<>' }
    }
}

$groupedDiff | Export-Csv difference.csv -NoTypeInformation

Contents of $groupedDiff in table format:

Path            SideIndicator
----            -------------
C:\test\bar.txt <=
C:\test\baz.txt =>
C:\test\foo.txt <>

Now we get only a single line for "foo.txt", where <> clearly shows that the file has been modified. I didn't output the hash values because I don't see much use for them in this output format. If you need them, add a property to the [pscustomobject] and assign $_.Group.Hash.
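If the hashes are wanted after all, one possible way to add them is sketched below. The property names OldHash and NewHash and the helper variables $oldEntry and $newEntry are assumptions, not part of the answer above, and the sample rows are hypothetical stand-ins for the imported CSV data. The sketch assumes strict mode is off, so reading .Hash from a missing entry simply yields $null.

```powershell
# Hypothetical sample rows standing in for the two imported CSV files.
$oldHashes = @(
    [pscustomobject]@{ Path = 'C:\test\bar.txt'; Hash = '73FEFFA4B7F6BB68E44CF984C85F6E88' }
    [pscustomobject]@{ Path = 'C:\test\foo.txt'; Hash = '37B51D194A7513E45B56F6524F2D51F2' }
)
$newHashes = @(
    [pscustomobject]@{ Path = 'C:\test\baz.txt'; Hash = 'D41D8CD98F00B204E9800998ECF8427E' }
    [pscustomobject]@{ Path = 'C:\test\foo.txt'; Hash = '2AF65102DB00B80835E1578278064AD1' }
)

$diff = Compare-Object $oldHashes $newHashes -Property Path, Hash

$groupedDiff = $diff | Group-Object Path | ForEach-Object {
    # Pick the entry from each side; either one may be missing.
    $oldEntry = $_.Group | Where-Object SideIndicator -eq '<='
    $newEntry = $_.Group | Where-Object SideIndicator -eq '=>'

    [pscustomobject] @{
        Path          = $_.Name
        SideIndicator = if( $_.Count -eq 1 ) { $_.Group.SideIndicator } else { '<>' }
        OldHash       = $oldEntry.Hash   # $null when the file exists only in the new list
        NewHash       = $newEntry.Hash   # $null when the file exists only in the old list
    }
}

$groupedDiff
```

Keeping the old and new hash in separate columns makes a modified file readable on a single line, instead of packing both values into one property.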

