Every so often I find myself wanting to remove duplicates from a table, and it ends up being a hassle. Here are two simple ways to do so:
The first method uses my Ruby gem, rearmed. It is a collection of helpful methods and monkey patches that can be safely applied on an opt-in basis. The method in rearmed is called find_duplicates and can be used like so:
# Duplicates based on all attributes excluding id & timestamps
Model.find_duplicates
# Duplicates based on specific columns
Model.find_duplicates(:name, :description)
# Remove Duplicates
Model.find_duplicates(:name, delete: {keep: :first})
However, if you want to do this without adding a gem, you can use the following methods in your model to find and remove the duplicates. Note, though, that you must decide which records to delete and which to keep.
class Model < ApplicationRecord
  # one row per combination of column values that occurs more than once
  def self.get_duplicates(*columns)
    select("#{columns.join(', ')}, COUNT(*)").group(*columns).having("COUNT(*) > 1")
  end

  def self.dedupe(*columns)
    # load all records, oldest first, and group them on the keys which should be common
    order(:created_at).group_by { |x| columns.map { |col| x.public_send(col) } }.each_value do |duplicates|
      duplicates.shift # keep the first one, or use pop to keep the last one instead
      # any records left in the group are duplicates, so delete all of them
      duplicates.each(&:destroy)
    end
  end
end

columns = [:name, :description]
Model.get_duplicates(*columns) # inspect the duplicated values
Model.dedupe(*columns) # remove the duplicates
# or
Model.dedupe(:name, :description)
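To make the "which to keep" decision easier to reason about, here is a minimal plain-Ruby sketch of the grouping step that runs without Rails. The Item struct, the duplicates_to_remove helper and its keep: option are hypothetical names used only for illustration; they are not part of rearmed or ActiveRecord.

```ruby
# Item is a stand-in for a model row (hypothetical, for illustration only).
Item = Struct.new(:id, :name, :description, :created_at)

# Returns the records that would be deleted, leaving one record per group.
# keep: :first keeps the oldest record, keep: :last keeps the newest.
def duplicates_to_remove(records, columns, keep: :first)
  records
    .sort_by(&:created_at)
    .group_by { |r| columns.map { |col| r.public_send(col) } }
    .values
    .flat_map { |group| keep == :last ? group[0...-1] : group.drop(1) }
end

items = [
  Item.new(1, "foo", "a", 1),
  Item.new(2, "foo", "a", 2),
  Item.new(3, "bar", "b", 3),
  Item.new(4, "foo", "a", 4)
]

duplicates_to_remove(items, [:name, :description]).map(&:id)              # => [2, 4]
duplicates_to_remove(items, [:name, :description], keep: :last).map(&:id) # => [1, 2]
```

Returning the list first, rather than destroying records right away, lets you check what would be removed before anything is deleted.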