Introduction | Example | Tutorial | Applications
Introduction - VBA RemoveDuplicates
Use the VBA RemoveDuplicates method to remove duplicate rows from columns in Excel. The RemoveDuplicates method is a clever way to simplify a range of data.
I just got back from a wonderful trip to Ireland, so I’m going to use a little Irish data to demonstrate how to remove duplicate information from a range.
If you prefer to work with arrays instead of ranges, check out my tutorial where I teach you how to remove duplicates from an array.
Example - VBA RemoveDuplicates
Remove Rows with Duplicate Values Appearing in a Single Column
This example deletes an entire row from a range if the value in the 3rd column is a duplicate. Keep following this tutorial for an explanation of how you can adapt the code to your own project.
Sub RemoveDuplicateRows() 'Demonstrates how to use the VBA RemoveDuplicates method to remove 'the duplicate rows from a particular column in a range of data. Dim MyRange As Range Dim LastRow As Long LastRow = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row Set MyRange = ActiveSheet.Range("A1:D" & LastRow) MyRange.RemoveDuplicates Columns:=3, Header:=xlYes End Sub
Make powerful spreadsheets with our free VBA Developer's Guide It’s easy to copy and paste a macro like this, but it’s harder make one on your own. To help you make macros like this, we built a free VBA Developer Guide and a bunch of pre-built macros so you can master file I/O, arrays, strings and more.
It’s easy to copy and paste a macro like this, but it’s harder make one on your own. To help you make macros like this, we built a free VBA Developer Guide and a bunch of pre-built macros so you can master file I/O, arrays, strings and more.
Tutorial - VBA RemoveDuplicates
Let’s say you have a list of the most populated cities in Ireland. The list looks something like this, with the city name, county and population:
Instead of listing the most populated cities in the entire country of Ireland, you want to find the most populated cities in each county.
Normally, you would go through all the data in the 3rd column and delete all the rows with a duplicate county after the first instance appears. For example, you would want to delete rows 8 and 18 (rank 7 and 17) since County Dublin already appears in row 2 (rank 1). Likewise, you’d want to delete row 19 (rank 18) since County Kildare already appears in row 16 (rank 15).
This is exactly what the RemoveDuplicates example macro above does. It starts at the top of the 3rd column, the County Column, and removes all subsequent duplicates in that column. What you’re left with is a shorter list that contains the largest city in each county - all the less populated cities were deleted since the county names in the less populated cities were duplicates. This works because the data was already sorted by population.
Here’s the list of Irish cities with the duplicates removed using the RemoveDuplicates method:
Notice the list is shorter. Yes, some of the cities cross multiple counties so the example isn’t the best, but you get the idea.
More about RemoveDuplicates VBA
The RemoveDuplicates function must be applied to a particular range. In my example macro, I defined a variable
Your range doesn’t have to be defined as its own variable. The following example would have worked just as well:
Sub RemoveDuplicateRows_2() 'Demonstrates how to use the VBA RemoveDuplicates method to remove 'the duplicate rows from a particular column in a range of data. 'This example hardwires the range. Range("A1:D101").RemoveDuplicates Columns:=3, Header:=xlYes End Sub
A word of warning: once you run a macro on a range with the RemoveDuplicates argument, the duplicate data is gone for good. You won’t be able to undo it and bring it back without recreating the data yourself.
The RemoveDuplicates method accepts 2 arguments:
- Header (optional)
The first argument, Columns, is required and it’s where you put the columns that contain the duplicate information you want to clean up. Notice I said columns, plural.
If you had multiple columns you wanted to check, you would enter the first argument using the Array function, like this:
Range("A1:E101").RemoveDuplicates Columns:=Array(3, 5), Header:=xlYes
In this example, I check the third and fifth columns. If both the columns contain duplicate information, the row of data will be deleted from the range.
The second argument, Header, is an optional argument with a default value of xlNo. This is how you tell the RemoveDuplicates method if your data contains a header row or not. Acceptable values are xlYes, xlNo, and xlGuess. These are pretty self-explanatory, don’t you agree?
In this tutorial, I gave you an example of how to take a list of the most populous cities in Ireland and consolidate it down into a list of the most populous cities in each county of Ireland.
I’ve seen people use RemoveDuplicates VBA to validate lists imported from multiple sources, as well. Let’s pretend you have a list of names and email addresses from two different CSV files. You want to combine them into one, but you want to make sure each entry only appears once.
To do that, you would copy both lists into Excel - one on top of each other - and use the RemoveDuplicates method to check both columns and remove any duplicate entries.