22 Aug
Active Record upsert - Tricks Issue 16
This week, I will show you how to make data imports blazing fast with Active Record.
I recently built a subscriber import functionality into the newsletter tool I am working on. The requirements were the following:
- a CSV file is uploaded
- then on the next screen the columns of the file can be mapped to the importable attributes
- a subscriber is created unless it already exists on the list
This second part could have a few naive implementation with a query to check if a matching record already exists, or by doing a find_or_create_by with each row of the import, but luckily Active Record supports upsert and even have an upsert_all method.
What upsert does in SQL, is it either updates a record if found or creates a new one, and since it happens on the database level, it is way more performant than doing it in Ruby.
The Active Record method takes the attributes as the first parameter and a list of optional parameters:
- on_duplicate: a SQL update sentence that will be used on conflict. By default it is update
- update_only: a list of column names to update on conflict. nil by default which means it updates all columns provided in the query.
- returning: an array of attributes to return for all successfully returned records. By default it returns the primary key. This works in PostgreSQL only.
- unique_by: by default, rows are considered to be unique based on every unique index on the table, but in PostgreSQL and SQLite, you can speficy the attributes or an index name you want to use. If you provide the attribute or attributes, you have to have a unique index on them.
- record_timestamps: whether to record the timestamps on the table or not. This uses the model’s record_timestamps by default.
Now let’s see an actual example. Let’s imagine we are receiving a list of subscribers in the params and we want to upsert them:
The above code will be pretty performant.
That’s it for this week!