Comparing files with MD5

As part of the project I am working on right now, I needed to optimize how files are stored. The problem was simple: if two files have the same content, they should not be stored more than once.

There were two ways this could be done. One was to use the MD5 algorithm provided by .NET. The code below demonstrates calculating the hash value of a stream with MD5CryptoServiceProvider, which is found in the System.Security.Cryptography namespace.

   

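    using System.Diagnostics;
    using System.IO;
    using System.Security.Cryptography;
    using System.Text;

    // Returns the MD5 hash of the stream's contents as a lowercase hex string.
    private static string GetHashString(FileStream fileStream)
    {
        // StringBuilder avoids allocating a new string for every byte appended.
        StringBuilder hashString = new StringBuilder();
        using (MD5CryptoServiceProvider md5er = new MD5CryptoServiceProvider())
        {
            byte[] hash = md5er.ComputeHash(fileStream);
            foreach (byte hex in hash)
            {
                // "x2" formats each byte as two lowercase hex digits.
                hashString.Append(hex.ToString("x2"));
            }
        }
        // Sanity check: an MD5 hash is 16 bytes, so the string is 32 characters.
        Debug.Assert(hashString.Length < 200);
        return hashString.ToString();
    }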
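The same hashing step works for any Stream, not only a FileStream. The following is a minimal sketch under that assumption; the class name `Md5StreamDemo` and the method name `HashToHex` are made up for illustration and are not part of the original project. It uses `MD5.Create()` rather than constructing `MD5CryptoServiceProvider` directly.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; names are not from the original post.
public static class Md5StreamDemo
{
    // Hashes everything from the stream's current position to its end
    // and returns the 16-byte MD5 digest as 32 lowercase hex characters.
    public static string HashToHex(Stream stream)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(stream);
            StringBuilder hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }
}
```

For example, passing a `MemoryStream` that holds the ASCII bytes of "abc" should return 900150983cd24fb0d6963f7d28e17f72, which is one of the test vectors listed in RFC 1321.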


You can then use this method to compare the hashes of two streams (typically file streams). Here is some sample code that does that:

    // Hash the first file; the using block closes the stream when done.
    string hashValue_1;
    using (FileStream file1 = new FileStream("Test.txt", FileMode.Open, FileAccess.Read))
    {
        hashValue_1 = GetHashString(file1);
    }
    Console.Out.WriteLine("hashValue_1 = {0}", hashValue_1);

    // Hash the second file the same way.
    string hashValue_2;
    using (FileStream file2 = new FileStream("Copy of Test.txt", FileMode.Open, FileAccess.Read))
    {
        hashValue_2 = GetHashString(file2);
    }
    Console.Out.WriteLine("hashValue_2 = {0}", hashValue_2);

    // Matching hash strings indicate that the two files have the same contents.
    Console.Out.WriteLine("hashValue_2.Equals(hashValue_1) = {0}",
                          hashValue_2.Equals(hashValue_1));
    Console.Read();


In this example, "Test.txt" and "Copy of Test.txt" have the same contents, so both hash lines in the output show the same value and the final comparison prints True.
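Coming back to the original storage problem, the hash string can serve as a lookup key so that a file is copied only when its content has not been seen before. This is a rough sketch of that idea, not the project's actual code: the class `DedupStore`, its `Add` method, and the store directory layout are all assumptions made for illustration. It also assumes that equal MD5 hashes mean equal content, which ignores the possibility of collisions, and it keeps its index in memory only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical store for illustration: keeps one copy per distinct content hash.
public class DedupStore
{
    private readonly string storeDirectory;

    // Maps an MD5 hex string to the path of the stored copy (in memory only).
    private readonly Dictionary<string, string> pathsByHash =
        new Dictionary<string, string>();

    public DedupStore(string storeDirectory)
    {
        this.storeDirectory = storeDirectory;
        Directory.CreateDirectory(storeDirectory);
    }

    // Returns the path of the stored copy. The file is copied into the
    // store only if no file with the same hash has been added before.
    public string Add(string sourcePath)
    {
        string hash = HashFile(sourcePath);
        string existing;
        if (pathsByHash.TryGetValue(hash, out existing))
        {
            return existing; // same content already stored, nothing to copy
        }

        // Assumption: the stored copy is simply named after its hash.
        string target = Path.Combine(storeDirectory, hash);
        File.Copy(sourcePath, target, true);
        pathsByHash[hash] = target;
        return target;
    }

    private static string HashFile(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] hash = md5.ComputeHash(stream);
            // BitConverter yields "AB-CD-..."; strip the dashes and lowercase
            // the result so it matches the "x2" formatting used earlier.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Calling `Add` twice with two files that have identical contents should return the same stored path both times, leaving a single copy in the store directory. If collisions are a concern, a byte-by-byte comparison of the two files could be added before treating them as duplicates.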





Hope this helps someone.




 

 
