Select all duplicates of a column in Rails

Let's say we have a table users and we would like to find all places where first_name is the same: we can always find all users where user_name is h4ppyr0gu3 but what about all possible duplicates:

User.select(min(id) as id, first_name).group(:first_name).having('count(*) > 1')

or

User.where.not(first_name: nil).group(:first_name).having('count(*) > 1')

This returns an array of User objects with the id and first_name columns. Now that we have all duplicated names we can map over them and get the ids:

duplicated_user_names = 
    User.select(min(id) as id, first_name)
        .group(:first_name).having('count(*) > 1')
        .map(&:user_name)

and then using the previous query we can get the ids:

duplicated_user_ids.each do |x|
    User.where(user_name: x).more_processing
end

We could just write the query as:

User.group(:user_name).having('count(*) > 1')

but this raises PG::GroupingError

Edit

an improvement would be to use array_agg method to retrieve ids

duplicated_user_ids = 
    User.group(:first_name)
        .having('count(id) > 1')
        .pluck('array_agg(id)')

Last Updated: 28/07/2023

Published: 11/07/2023