VBA Code to Find Duplicate Files


Yesterday I was working on my computer and cleaning the drives to make some more space.  I was surprised to see so many files saved at multiple places which was consuming huge space of my computer drives.

One of the biggest tasks was to identify such duplicate files. Initially I tried to manually find those files by opening and comparing different folders, files and file size but later decided to create a VBA code to help myself.

The code really helped me; hence I thought to share it with all of you.

Here is the code to find duplicate files:

'This function lists files in a folder and sub-folders
'and mark the files as duplicates where name and size are same
'Note: The code does not delete any file
Public Sub FindDuplicateFiles()

    'Variable Declaration
    Dim objFSO As Object
    Dim objFile As Object
    Dim objFolder As Object
    Dim vFile As Variant
    Dim strPath As String
    Dim iCurRow As Integer
    
    'Remove filter (if already applied)
    Sheet1.AutoFilterMode = False
    
    'Clear old data
    Sheet1.Range("B9:E1000").ClearContents
    
    'Set the path of the folder
    strPath = Sheet1.Range("B4").Value
    
    'Validate if the given folder path is valid
    If Dir(strPath, vbDirectory) = "" Then
        MsgBox "Invalid Folder path", vbInformation
        Exit Sub
    End If
    
    'Initialize file system objects
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.getFolder(strPath)
    
    'Set Directory to folder path
    ChDir objFolder.Path
    vFile = Dir(objFolder.Path & "\*.*") 'Change or add formats to get specific file types
    
    iCurRow = 9
    Do While vFile <> "" 'LOOP until all files in folder strPath have been looped through
        Sheet1.Cells(iCurRow, 2).Value = vFile
        Sheet1.Cells(iCurRow, 3).Value = objFolder.Path
        Set objFile = objFSO.Getfile(objFolder.Path & "\" & vFile) 'Set the object to file
        Sheet1.Cells(iCurRow, 4).Value = Round(objFile.Size / 1024, 0) 'Divide the size by 1024 to convert to KB
        vFile = Dir
        iCurRow = iCurRow + 1
    Loop
    
    'Call the function to list files in sub-folders
    Call ListFilesInSubFolder(objFolder)
    
    'Add formula to find duplicate file based on file name and file size
    Sheet1.Range("E9").Value = "=IF(COUNTIFS(B:B,B9,D:D,D9)>1,""Duplicate"","""")"
    
    'Find the row number where formula needs to be copied
    iCurRow = Sheet1.Range("B" & Rows.Count).End(xlUp).Row
    
    'Copy the formula to all records
    Sheet1.Range("E9").Copy Sheet1.Range(Sheet1.Range("E9"), Sheet1.Range("E" & iCurRow))
    
    'Calculate sheet
    Sheet1.Calculate
    
    'Sort the data based on File Name and Size
    Sheet1.Sort.SortFields.Clear  'First Clear old sort field (if any)
    'Add sort field on column B (File Name)
    Sheet1.Sort.SortFields.Add Key:=Sheet1.Range("B8:B" & iCurRow), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    'Add sort field on column D (Size (KB))
    Sheet1.Sort.SortFields.Add Key:=Sheet1.Range("D8:D" & iCurRow), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    With Sheet1.Sort
        .SetRange Range("B8:E" & iCurRow)
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
    
    'Add filter in column E (Is Duplicate?)
    Sheet1.Range("B8:E" & iCurRow).AutoFilter Field:=4, Criteria1:="Duplicate"
    
    MsgBox "Done"
    
End Sub


'This function lists files in the sub-folder
Public Sub ListFilesInSubFolder(objFolder As Object)
        
    Dim vFile As Variant
    Dim iCurRow As Integer
    Dim objSubFolder As Object
    Dim objsubfld As Object
    Dim objFSO As Object
    Dim objFile As Object
    
    'Initialize file system object
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    
    'Find the row number where data needs to be entered
    iCurRow = Sheet1.Range("B" & Rows.Count).End(xlUp).Row + 1
    
    'Run the loop for each sub-folder
    For Each objSubFolder In objFolder.SubFolders
    
        'Set Directory to folder path
        ChDir objSubFolder.Path
        vFile = Dir(objSubFolder.Path & "\*.*")
        Do While vFile <> "" 'LOOP until all files in folder strPath have been looped through
            Sheet1.Cells(iCurRow, 2).Value = vFile
            Sheet1.Cells(iCurRow, 3).Value = objSubFolder.Path
            Set objFile = objFSO.Getfile(objSubFolder.Path & "\" & vFile)  'Set the object to file
            Sheet1.Cells(iCurRow, 4).Value = Round(objFile.Size / 1024, 0) 'Divide the size by 1024 to convert to KB
            vFile = Dir
            iCurRow = iCurRow + 1
        Loop
    Next
    'If the sub-folder contains more sub-folders then call the same function
    For Each objsubfld In objFolder.SubFolders
        Call ListFilesInSubFolder(objsubfld)
    Next
    
End Sub

There are two functions in the above code

  1. FindDuplicateFiles: This is the main function which gets file details and apply formula and filters.
  2. ListFilesInSubFolder: This is the secondary function which is called from the main function to get details from sub-folders

How can I use this VBA code?

The easiest way is to download the excel file which is at the bottom on this post and use the file to find duplicate files in your system. If you want to use this code in your VBA tool, then follow below steps:

  1. Open the Excel file in which you want to copy this code
  2. Press Alt+F11
  3. Insert a Module (Insert>Module) from menu bar
  4. Paste the code in the moduleVBA Code to Find Duplicate Files

How can I run this code?

  1. Go to ‘View’ tab on the ribbon
  2. Click on ‘Macros’ button
  3. Select ‘FindDuplicateFiles’ from the list of macros
  4. Click on ‘Run’ button
    VBA Code to Find Duplicate Files

What will happen when I run this code?

  1. The code is reading folder path from cell B4 of Sheet1, ensure that you supply correct folder path in the cell.
  2. Macro will read file details (Name, Path and Size) of each file the folder and sub-folders
  3. After that a formula will be applied in column E (Is Duplicate?) to find duplicate files based on file name and size
  4. At the last a filter will be applied in column E (Is Duplicate?) as ‘Duplicate’
  5. Macro will show confirmation to user once it gets complete
Duplicate File Remover (Format.xlsm)

Leave a Reply

avatar
VBA Code to Delete File
EXCEL VBA TRICKS
VBA Code to Delete File

In this article we are going to show you how you can delete file or files using a single line of VBA code. 1. Delete a specific file from the folder, 2. Delete specific type of files from the folder, 3. Delete all files from the folder

VBA Code to Add Border to Excel Range
EXCEL VBA TRICKS
VBA Code to Add Border to Excel Range

Through formatting the cells, you can make your Excel data more beautiful and easier to understand. If you record a macro to add borders to Excel range, you will notice that it generates 30 plus lines of code for just a small work. Here we are sharing one line of code which does the same thing.

Protect Excel Sheet for Manual Input
EXCEL VBA TRICKS
VBA Code to Protect Excel Sheet for Manual Input but Allow Programming Inputs

Did you come across any requirement where you want the user to interact with a sheet only through VBA Form? Here is a simple code which can help you.