Comment by MajimasEyepatch

12 days ago

Docker Compose is a super easy way to run Postgres, Redis, etc. alongside your tests, and most CI platforms can either use a Compose file directly or offer a similar mechanism for running service containers. Example: https://docs.github.com/en/actions/using-containerized-servi...
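A minimal Compose file for that kind of setup might look something like this (the image tags, credentials, and database name are placeholders, not anything specific to the docs linked above):

    services:
      postgres:
        image: postgres:16
        environment:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: app_test
        ports:
          - "5432:5432"
        healthcheck:
          test: ["CMD", "pg_isready", "-U", "test"]
          interval: 1s
          retries: 30
      redis:
        image: redis:7
        ports:
          - "6379:6379"

Run docker compose up -d --wait before the suite and docker compose down -v after; the healthcheck keeps tests from starting before Postgres is actually accepting connections.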

Typically you'd keep the database container itself alive and run the schema migrations once at startup. Your test runner would then apply fixtures for each test class, which should set up and tear down any data the tests need or create while running. Restarting the database server between tests can be very slow.
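As a rough sketch of that pattern in Python (pytest + psycopg2; the DSN and the inline DDL standing in for a real migration tool are my assumptions, not anything prescribed):

    import psycopg2
    import pytest

    DSN = "dbname=app_test user=test password=test host=localhost"

    @pytest.fixture(scope="session", autouse=True)
    def migrated_database():
        # Run schema setup once per test session, not once per test.
        # Inline DDL here is a stand-in for a real migration tool
        # (Alembic, Flyway, etc.).
        conn = psycopg2.connect(DSN)
        with conn, conn.cursor() as cur:
            cur.execute("CREATE TABLE IF NOT EXISTS users "
                        "(id serial PRIMARY KEY, name text)")
            cur.execute("CREATE TABLE IF NOT EXISTS orders "
                        "(id serial PRIMARY KEY, "
                        "user_id int REFERENCES users (id), total numeric)")
        conn.close()

    @pytest.fixture
    def db(migrated_database):
        # Each test gets a connection whose work is rolled back at the
        # end, so tests stay isolated without restarting the server.
        conn = psycopg2.connect(DSN)
        try:
            yield conn
            conn.rollback()
        finally:
            conn.close()

One caveat: the rollback trick only works if the code under test doesn't commit; otherwise you need explicit cleanup or per-test truncation.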

The test data is a harder problem to solve. For unit tests, you should probably be creating specific test data for each "unit" and cleaning up between tests using whatever fixture mechanism your test runner supports. However, this can get really nasty if there are a lot of dependencies between your tables. (That in and of itself may be a sign that something is wrong, but sometimes you can't avoid it or can't prioritize changing it.)
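Building on the sketch above, a per-test fixture might assemble exactly the rows one test needs (the users/orders schema is the invented one from the previous example):

    @pytest.fixture
    def user_with_order(db):
        # Create the minimal graph of rows this "unit" depends on.
        with db.cursor() as cur:
            cur.execute("INSERT INTO users (name) VALUES ('alice') "
                        "RETURNING id")
            user_id = cur.fetchone()[0]
            cur.execute("INSERT INTO orders (user_id, total) "
                        "VALUES (%s, 10) RETURNING id", (user_id,))
            order_id = cur.fetchone()[0]
        yield user_id, order_id
        # With rollback-per-test, no explicit teardown is needed; without
        # it, you'd DELETE here in reverse dependency order.

Every extra foreign key means another INSERT in fixtures like this, which is exactly where the nastiness shows up.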

You can attempt to anonymize production data, but obviously that can go very wrong. You can also try to create some data by using the app in a dev environment and then use a dump of that database in your tests. However, that's going to be very fragile, and if you really need hundreds of tables to be populated to run one test, you've got some big problems to fix.

Property-based testing is an interesting alternative: you generate random data subject to some constraints and run your tests repeatedly until you've covered a representative subset of the range of possible values. But this can be complicated to set up, and if individual tests aren't fast, the whole suite can take a very long time to run.
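A minimal sketch with Hypothesis, a Python property-based testing library (apply_discount is an invented function, just to show the shape):

    from hypothesis import given, strategies as st

    def apply_discount(total, pct):
        # Function under test: discounted totals should never go negative.
        return max(total - total * pct / 100, 0)

    @given(
        total=st.decimals(min_value=0, max_value=10**6, places=2),
        pct=st.integers(min_value=0, max_value=100),
    )
    def test_discount_never_negative(total, pct):
        # Hypothesis generates many (total, pct) pairs within the
        # constraints instead of relying on a few hand-picked cases.
        assert apply_discount(total, pct) >= 0

When a property fails, Hypothesis shrinks the input to a minimal counterexample, which is a big part of the appeal.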

I think at the end of the day, the best thing you can do is decouple the components of your application as much as possible so you can test each one without needing giant, complicated test data.