We recently split our humongous database into two databases, one for web application to consume and one for backend to process with the raw data. We used Foreign Data Wrapper(FDW) extension in Postgres. This web app is built using Ruby on Rails. We made changes to the appropriate models to accommodate the FDW changes.

We deployed to “production” ….Bang! We had the web app hung and the web server was “crashing” on all the 3 servers.

We grabbed the slow performing query and did a “explain analyse”

and came to knew the “foreign” scan on the foreign database was taking forever and it couldn’t complete.

We test running the same query on the foreign DB host (locally) and it just took 670 ms and here’s the explain analyse: https://explain.depesz.com/s/Ek1M

When running the same query from another host (foreign DB/host) and it took almost 22000 ms. Here’s the explain analyse: https://explain.depesz.com/s/gMwd

By now, it was almost 2AM in the early morning and since it was “production” we can’t afford to have the web app down for a long while. So we turned up with the right solution i.e. we figured out that  use_remote_estimate was not used along with the FDW extension.

We went ahead and recreated the extension along with the “use_remote_estimate” option

and again we ran the analyse command

on the foreign database tables. After running the ANALYSE on the tables in the foreign database, the site was up and running and it was all good

and we cheered up ourselves and had a good and peaceful sleep ?