Duplicate Data Identifier Tool

Duplicate Data Identifier

Duplicate data Identifier Tool

Complete Excel VBA Course

Duplicate Data Identifier is an MS Access based tool which helps to identify duplicates from any Excel based data. The tool supports up to 10 conditions and 25 types of matching conditions to find the exact duplicate. You can also define formatting conditions to first format the data before checking for duplicates

Duplicate Data Identifier Tool Features

  • Support two datasets (Current and Historic) where each record of current dataset is compared with all records in historic dataset

Duplicate Data Identifier

Benefits of Duplicate Data Identifier tool

  • Easily identify duplicates and save money and efforts
  • Increase accuracy and reliability of your data
  • Highly configurable and generates fast results

Allows two ways to import data in the tool
Manual Copy and Paste

Complete Excel VBA Course

Import from Excel file

  • It has got user friendly options to define condition for duplicates
  • The tool supports 25 types of match conditions
  • You can also define formatting conditions to first format the data before checking for duplicates

Why to waste your efforts, when we have ready professional tool designed for you. To download click below:-

Benefits

  • Easily identify duplicates and save money and efforts
  • Increase accuracy and reliability of your data
  • Highly configurable and generates fast results

System Requirements

  • Installed version of MS Access 2016 or above version
  • Installed version of MS Excel 2016 or above version
  • Windows 7 or above operating system

Duplicate Identifier Data Tool Limitations

  • As the number of records or matching conditions increases, tool may take more time to analyze data

How to use Duplicate Data Identifier tool?

Follow the below video which details out how to use this tool.
  • Open the tool in MS Access 2007 or above version
  • You may see a warning message on top because the file contains VBA Codes, click on Enable Content
    Duplicate Data Identifier
  • Double click on ‘Home’ form to open the tool
    Duplicate Data Identifier
  • You will see a blank form opened like below
    Duplicate Data Identifier
  • To use this tool for analysis, you need two datasets in Excel files.

    Data 1 (Current Data): This is the data in which you want to identify duplicates

    Data 2 (Historic Data): This is the data from which you want to compare current data to identify duplicates

    Points to Note:

    1. Both Current and Historic dataset should be in same format like sequence of columns
    2. You can import only 10 columns from the dataset in the tool
    3. Tool can read only 255 characters of each cell

    See below a sample dataset:
    Duplicate Data Identifier

  • To import data in the tool, click on ‘Manage Data’ button
    Duplicate Data Identifier
  • There are two ways you can import data in the tool

    Option 1 (Manual Copy Paste): You can simply copy your data (without headers) from any Excel file and paste in the tool

    Step 1: Copy data from Excel file
    Duplicate Data Identifier
    Step 2: Select the appropriate tab (Current or Historic) to paste the data and click on the top left section of the datasheet
    Duplicate Data Identifier
    Step 3: Press Ctrl+V to paste the data and click on ‘Yes’ button to confirm the action
    Duplicate Data Identifier
    Option 2 (Import from Excel): You can use import functionality in the tool to browse an Excel file and import data
    Step 1: Click on ‘Import from Excel File’ button
    Duplicate Data Identifier
    Step 2: Read the instructions on the form and select the dataset to be imported
    Duplicate Data Identifier
    Step 3: Browse the Excel file you want to import and Click on ‘Import Data’ button
    Duplicate Data Identifier

  • Once the data is imported, you need to define datatype of each column. By default, each column is considered as text, you need to explicitly change the datatype. It is an important step because you can define few conditions on specific datatype only. To define the datatype, select the right option for each column
    Duplicate Data Identifier
  • Now it’s time to configure the tool to identify duplicates. Since there are different ways a duplicate invoice can be processed; hence this tool comes with fully configurable conditions to catch the duplicate.
    Duplicate Data Identifier
  • This tool comes with 25 types of matches, have a look at below table which can help you to decide the right match type to be selected.
    Duplicate Data Identifier
  • Let’s start with configuring the tool to identify duplicates, first we will define Invoice Number condition as ‘Character Match [>70%]’. You can also choose other character match options depends on how much variation you are expecting in the data. As you decrease the character match percentage, you are expected to get more duplicates
    Duplicate Data Identifier
  • In some cases where you want to remove special characters such as !@#$%^&*() before comparing the data, you can use Formatting option
    Duplicate Data Identifier
  • For Text datatype, you can use ‘Remove Special Characters’ formatting option. For Number datatype, you can use ‘Remove decimal values’ and ‘Convert number to absolute’ options
    Duplicate Data Identifier
    Duplicate Data Identifier
  • Now we will define condition for Invoice Date as below. You can also choose other options as appropriate
    Duplicate Data Identifier
  • We will define condition for Vendor Name as ‘Left Match [>60%]’
    Duplicate Data Identifier
  • Next is Amount condition, for this we will define the condition as ‘Amount [+-1]’
    Duplicate Data Identifier
  • The last condition we will define for Customer Name as ‘Exact Match’
    Duplicate Data Identifier
  • Done, let’s click on ‘Analyze Data’ button and see the result.
    Duplicate Data Identifier
    Note that if you want to stop the analysis in between then you can click on the same button again. Also, you can see the progress on the bottom progress bar and percentage label
    Duplicate Data Identifier
  • Once Analysis are completed, you will see a confirmation message box along with number of duplicates found. Click on ‘OK’ button to proceed.
    Duplicate Data Identifier
  • To view the results in Excel file, click on ‘Export Report to Excel’ button
    Duplicate Data Identifier
  • Report will be divided in two sections:
    Section 1 – Current Data: These are the records from Current Data which are found as duplicate when comparing with Historic Data. You can identify them from Column A (Record Type) as ‘Current Data’. Also, these records will be marked in Orange color for easy identification
    Section 2 – Historic Data: These are the matching records from Historic Data based on which duplicates have been identified in Current Data. You can identify them from Column A (Record Type) as ‘Historic Data’.
  • Let’s have a look at a report with few more records

In the above screenshot, you can notice that there are 5 duplicates found in Current Data and there are 7 matching records from Historic Data. Each match has been given a Match Number which you can see in column B (MatchNumber). So, if you want to have a look at matching records of first duplicate then you can apply filter in Column B as 1

Similarly, to look at matching records of fifth duplicate then apply filter in Column B as 5

  • Great news, now you are ready to use the tool and save your business from duplicate payments.

Recommended Links

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *