Dedupe Your Models In Rails

Posted By Weston Ganger

Every now and then I need to remove duplicate records from a table, and it always ends up being a hassle. Here are two simple ways to do so:

The first method uses my Ruby gem rearmed, a collection of helpful methods and monkey patches that can be safely applied on an opt-in basis. The method in rearmed is called find_duplicates. The second method is to add a pair of class methods to the model yourself, like so:

class Model < ApplicationRecord

  # Returns one row per combination of column values that occurs more than once, with its count
  def self.get_duplicates(*columns)
    select(*columns, "COUNT(*) AS duplicate_count").group(*columns).having("COUNT(*) > 1")
  end

  def self.dedupe(*columns)
    # load all records, oldest first, and group them on the keys which should be common
    all.order(:created_at).group_by{|x| columns.map{|col| x.public_send(col)}}.each_value do |duplicates|
      duplicates.shift # keep the first (oldest) one, or use pop to keep the last one instead

      # any records left in the group are duplicates, so destroy all of them
      duplicates.each{|x| x.destroy}
    end
  end

end


columns = [:name, :description]
Model.get_duplicates(*columns) # lists the duplicated combinations of values
Model.dedupe(*columns) # destroys every duplicate except the oldest record in each group
# or
Model.dedupe(:name, :description)
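
The grouping step inside dedupe can be tried out without a database. The following is only a minimal sketch in plain Ruby, not part of the model above: Record and partition_duplicates are hypothetical names made up for illustration, and a Struct stands in for the ActiveRecord model.

```ruby
# Hypothetical stand-in for an ActiveRecord model: an id plus two attributes.
Record = Struct.new(:id, :name, :description)

# Splits records into [kept, duplicates], keeping the first record seen
# for each combination of values in the given columns.
def partition_duplicates(records, *columns)
  kept = []
  duplicates = []

  records.group_by { |r| columns.map { |col| r.public_send(col) } }.each_value do |group|
    kept << group.first              # the record that survives
    duplicates.concat(group.drop(1)) # everything after it is a duplicate
  end

  [kept, duplicates]
end

records = [
  Record.new(1, "foo", "a"),
  Record.new(2, "foo", "a"),
  Record.new(3, "bar", "b"),
  Record.new(4, "foo", "a"),
  Record.new(5, "foo", "c"),
]

kept, duplicates = partition_duplicates(records, :name, :description)
puts kept.map(&:id).inspect       # prints [1, 3, 5]
puts duplicates.map(&:id).inspect # prints [2, 4]
```

Note that group_by yields a hash of key => records, so iterating with each_value (rather than each) is what hands the block just the array of matching records.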


Article Topic: Software Development - Ruby / Rails

Date: December 27, 2016