Dumping and restoring data into PosgteSQL schemas

Context

I have recently implemented a multitenancy setup using Apartment and PostgreSQL Schemas. It was a legacy application with low number of high value tenants with separate databases.

Problem

Previously, the data of each tenant was stored in separate database which shared the same database structure as the main one. After changing application to support switching PostgreSQL schemas using Apartment, we needed to move the data from the legacy databases into newly created schemas.

The schemas inside the main_db was already created, so we cared only about the data.

We can do this with pg_dump with –data-only flag, and then running the dump SQL in the main database – after making slight changes.

Solution

Let’s assume we have a DB named tenants_legacy_db and we would like to move data inside main_db's legacy_dbs_equalent_tenant_schema.

To do so, we need to:

  1. Dump the DB on legacy_db

  2. Search and replace public. namespace with the schema name you would like to import into main_db

  3. Run the SQL using psql in the main DB.

So, let’s get to it.

Dumping the data

pg_dump -a -b -d main_db -U username -n public -T "ar_internal_metadata|schema_migrations" -f dump.sql

-a dump only the data, not the schema
-b include large objects in dump such as blobs
-d database to dump
-U connect as specified database user
-f output file or directory name
-n schema name
-T exclude tables regex

So, what happened here?

Because schema_migrations and ar_internal_metadata were already filled up when I created schemas using Apartment::Tenant.create('tenantname'), I did not need to dump those tables.

schema_migrations keeps the history of already ran migrations in the schema.

ar_internal_metadata keeps the environment information of the current DB Rails connects to.

This assumes that the schema name you want to dump from tenants_legacy_db is public. If not, change it according to your needs.

Search and replace the public. namespace with yourschemaname.

Now we have the dump, and as it includes the table names with public. schema as its namespace, we have to replace the occurrences with the schema name we want to import into in main_db.

We can use sed command to accomplish this task. You can also use any editor of choice.

sed -i 's/public\./schemanameinmaindb\./g' dump.sql

Importing SQL into main DB

We have now replaced the public. namespace in the dump and we can go ahead and import this SQL into the main DB. You can run:

psql -U username -d main_db_name -f dump.sql

Data should be restored into the schema name of your choice inside the main_db.

Be careful when running scripts you get from the web. Make sure to test it thoroughly on test databases before you do anything on production.

See you around!