Select all duplicates of a column in Rails
Let's say we have a table users
and we would like to find all places where first_name
is the same:
we can always find all users where user_name
is h4ppyr0gu3
but what about all possible duplicates:
User.select(min(id) as id, first_name).group(:first_name).having('count(*) > 1')
or
User.where.not(first_name: nil).group(:first_name).having('count(*) > 1')
This returns an array of User
objects with the id
and first_name
columns.
Now that we have all duplicated names we can map over them and get the ids:
duplicated_user_names =
User.select(min(id) as id, first_name)
.group(:first_name).having('count(*) > 1')
.map(&:user_name)
and then using the previous query we can get the ids:
duplicated_user_ids.each do |x|
User.where(user_name: x).more_processing
end
We could just write the query as:
User.group(:user_name).having('count(*) > 1')
but this raises PG::GroupingError
Edit
an improvement would be to use array_agg
method to retrieve ids
duplicated_user_ids =
User.group(:first_name)
.having('count(id) > 1')
.pluck('array_agg(id)')
Last Updated: 28/07/2023
Published: 11/07/2023