VBA Code to Find Duplicate Files


Yesterday I was working on my computer and cleaning the drives to make some more space.  I was surprised to see so many files saved at multiple places which was consuming huge space of my computer drives.

One of the biggest tasks was to identify such duplicate files. Initially I tried to manually find those files by opening and comparing different folders, files and file size but later decided to create a VBA code to help myself.

The code really helped me; hence I thought to share it with all of you.

Here is the code to find duplicate files:

'This function lists files in a folder and sub-folders
'and mark the files as duplicates where name and size are same
'Note: The code does not delete any file
Public Sub FindDuplicateFiles()

    'Variable Declaration
    Dim objFSO As Object
    Dim objFile As Object
    Dim objFolder As Object
    Dim vFile As Variant
    Dim strPath As String
    Dim iCurRow As Integer
    
    'Remove filter (if already applied)
    Sheet1.AutoFilterMode = False
    
    'Clear old data
    Sheet1.Range("B9:E1000").ClearContents
    
    'Set the path of the folder
    strPath = Sheet1.Range("B4").Value
    
    'Validate if the given folder path is valid
    If Dir(strPath, vbDirectory) = "" Then
        MsgBox "Invalid Folder path", vbInformation
        Exit Sub
    End If
    
    'Initialize file system objects
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.getFolder(strPath)
    
    'Set Directory to folder path
    ChDir objFolder.Path
    vFile = Dir(objFolder.Path & "\*.*") 'Change or add formats to get specific file types
    
    iCurRow = 9
    Do While vFile <> "" 'LOOP until all files in folder strPath have been looped through
        Sheet1.Cells(iCurRow, 2).Value = vFile
        Sheet1.Cells(iCurRow, 3).Value = objFolder.Path
        Set objFile = objFSO.Getfile(objFolder.Path & "\" & vFile) 'Set the object to file
        Sheet1.Cells(iCurRow, 4).Value = Round(objFile.Size / 1024, 0) 'Divide the size by 1024 to convert to KB
        vFile = Dir
        iCurRow = iCurRow + 1
    Loop
    
    'Call the function to list files in sub-folders
    Call ListFilesInSubFolder(objFolder)
    
    'Add formula to find duplicate file based on file name and file size
    Sheet1.Range("E9").Value = "=IF(COUNTIFS(B:B,B9,D:D,D9)>1,""Duplicate"","""")"
    
    'Find the row number where formula needs to be copied
    iCurRow = Sheet1.Range("B" & Rows.Count).End(xlUp).Row
    
    'Copy the formula to all records
    Sheet1.Range("E9").Copy Sheet1.Range(Sheet1.Range("E9"), Sheet1.Range("E" & iCurRow))
    
    'Calculate sheet
    Sheet1.Calculate
    
    'Sort the data based on File Name and Size
    Sheet1.Sort.SortFields.Clear  'First Clear old sort field (if any)
    'Add sort field on column B (File Name)
    Sheet1.Sort.SortFields.Add Key:=Sheet1.Range("B8:B" & iCurRow), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    'Add sort field on column D (Size (KB))
    Sheet1.Sort.SortFields.Add Key:=Sheet1.Range("D8:D" & iCurRow), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    With Sheet1.Sort
        .SetRange Range("B8:E" & iCurRow)
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
    
    'Add filter in column E (Is Duplicate?)
    Sheet1.Range("B8:E" & iCurRow).AutoFilter Field:=4, Criteria1:="Duplicate"
    
    MsgBox "Done"
    
End Sub


'This function lists files in the sub-folder
Public Sub ListFilesInSubFolder(objFolder As Object)
        
    Dim vFile As Variant
    Dim iCurRow As Integer
    Dim objSubFolder As Object
    Dim objsubfld As Object
    Dim objFSO As Object
    Dim objFile As Object
    
    'Initialize file system object
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    
    'Find the row number where data needs to be entered
    iCurRow = Sheet1.Range("B" & Rows.Count).End(xlUp).Row + 1
    
    'Run the loop for each sub-folder
    For Each objSubFolder In objFolder.SubFolders
    
        'Set Directory to folder path
        ChDir objSubFolder.Path
        vFile = Dir(objSubFolder.Path & "\*.*")
        Do While vFile <> "" 'LOOP until all files in folder strPath have been looped through
            Sheet1.Cells(iCurRow, 2).Value = vFile
            Sheet1.Cells(iCurRow, 3).Value = objSubFolder.Path
            Set objFile = objFSO.Getfile(objSubFolder.Path & "\" & vFile)  'Set the object to file
            Sheet1.Cells(iCurRow, 4).Value = Round(objFile.Size / 1024, 0) 'Divide the size by 1024 to convert to KB
            vFile = Dir
            iCurRow = iCurRow + 1
        Loop
    Next
    'If the sub-folder contains more sub-folders then call the same function
    For Each objsubfld In objFolder.SubFolders
        Call ListFilesInSubFolder(objsubfld)
    Next
    
End Sub

There are two functions in the above code

  1. FindDuplicateFiles: This is the main function which gets file details and apply formula and filters.
  2. ListFilesInSubFolder: This is the secondary function which is called from the main function to get details from sub-folders

How can I use this VBA code?

The easiest way is to download the excel file which is at the bottom on this post and use the file to find duplicate files in your system. If you want to use this code in your VBA tool, then follow below steps:

  1. Open the Excel file in which you want to copy this code
  2. Press Alt+F11
  3. Insert a Module (Insert>Module) from menu bar
  4. Paste the code in the moduleVBA Code to Find Duplicate Files

How can I run this code?

  1. Go to ‘View’ tab on the ribbon
  2. Click on ‘Macros’ button
  3. Select ‘FindDuplicateFiles’ from the list of macros
  4. Click on ‘Run’ button
    VBA Code to Find Duplicate Files

What will happen when I run this code?

  1. The code is reading folder path from cell B4 of Sheet1, ensure that you supply correct folder path in the cell.
  2. Macro will read file details (Name, Path and Size) of each file the folder and sub-folders
  3. After that a formula will be applied in column E (Is Duplicate?) to find duplicate files based on file name and size
  4. At the last a filter will be applied in column E (Is Duplicate?) as ‘Duplicate’
  5. Macro will show confirmation to user once it gets complete

Leave a Reply

avatar
VBA Tips you Must Know – Part 1
VBA TRICKS
Excel VBA Tips you Must Know – Part 1

Here we are coming with one more exciting post which can help you to solve very basic but very important problems while writing VBA codes.

Random Rows Selector Tool
FREE VBA UTILITIES
Free Excel Based Random Data Sampling Tool For 2020

Random Rows Selector is an MS Excel based tool which can be used to pick random or stratified samples from a set of records available in the Excel. The tool is fully dynamic, it can support any data format in Excel.

VBA Code to Count Cells by Color
VBA TRICKS
VBA Code to Count Cells by Color

Have you ever felt that Microsoft should have added a formula in Excel which can count the cells based on specific color? I have seen many code requests to share a VBA code that can count the cells by its color. To help our subscribers and developers, we are sharing 2 codes that be used to count the cells with specific color and returns the count of the matching color cells.

error

Like the Article? Please spread the word :)

Facebook
Facebook
YouTube
YouTube
Pinterest
Pinterest
LinkedIn