
Comment by mehagar

12 days ago

The catch-22 with refactoring to be able to write unit tests is that refactoring introduces risk as you are changing code, and you need tests to help reduce that risk. But you can't easily write tests without refactoring. This has been a very difficult problem for the team I'm currently on.

The only strategy I'm aware of is described in `Working Effectively with Legacy Code`, where you start by writing throwaway unit or e2e tests that give you "cover" for refactoring. These tests depend on the implementation, or may use mocking, just to get started. Then you refactor and write better unit tests, and finally you get rid of the throwaway tests.
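For what it's worth, a throwaway "cover" test can be as small as the sketch below. It pins down what the code does today rather than what it should do. `legacy_price_cents` and the recorded values are hypothetical stand-ins for whatever tangled function you are about to touch; in practice you would record the values by running the real code once.

```python
import unittest


def legacy_price_cents(quantity, unit_cents):
    # Hypothetical stand-in for legacy code we do not dare change yet.
    total = quantity * unit_cents
    if quantity > 10:
        total -= total // 10  # undocumented bulk discount found in the old code
    return total


# Outputs captured from the current implementation (assumed values for this sketch).
# They describe present behaviour, bugs included.
RECORDED = {
    (1, 999): 999,
    (10, 250): 2500,
    (11, 250): 2475,
}


class CharacterizationTest(unittest.TestCase):
    def test_behaviour_unchanged(self):
        for args, expected in RECORDED.items():
            with self.subTest(args=args):
                self.assertEqual(legacy_price_cents(*args), expected)
```

Run it with `python -m unittest` before and after each refactoring step. Once proper unit tests exist for the refactored code, a test like this can be deleted or kept, depending on how much it costs to maintain.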

Why get rid of working e2e tests? IMO they are more useful than unit tests at finding the kinds of problems that stop a release/deployment.

You can attack from both directions: use e2e tests to make sure that certain processes work in fairly ordinary situations, then look for little things that you can unit test without huge refactoring. When you've pushed these as far as you can, section off some area and start refactoring it. Do your best to limit each refactoring to a single aspect or area so that you are never biting off more than you can chew. Don't expect everything to become wonderful in one PR.

Your e2e tests will catch some errors, and when you look at which kinds come up most often you can see how to improve your tests to catch them earlier and save yourself time. In Python I often made silly mistakes, such as syntax errors in try/except blocks. If I ran a linter first, I caught many of those errors very quickly.
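As a rough illustration of that kind of cheap first gate, the sketch below uses only the built-in `compile()` to find files that will not even parse. It is not a substitute for a real linter, and `find_syntax_errors` and the sample sources are made up for the example.

```python
def find_syntax_errors(sources):
    """Return one message per source that fails to parse.

    `sources` maps a file name to its text. Nothing is executed;
    compile() only parses and compiles to bytecode.
    """
    problems = []
    for name, text in sources.items():
        try:
            compile(text, name, "exec")
        except SyntaxError as exc:
            problems.append(f"{name}:{exc.lineno}: {exc.msg}")
    return problems


good = "try:\n    x = 1\nexcept ValueError:\n    x = 0\n"
bad = "try:\n    x = 1\nexcept ValueError\n    x = 0\n"  # missing colon after except

for message in find_syntax_errors({"good.py": good, "bad.py": bad}):
    print(message)
```

A check like this takes milliseconds, so it can run before anything slow does.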

I was working on a build system so I mocked the build - created a much simpler and shorter build - so I could catch dumb errors fast, before I ran the longer e2e test on the full build.
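A minimal sketch of the idea, assuming a build runner that takes its dependency graph and its per-target action as arguments. `run_build`, `TINY_BUILD` and `smoke_test` are hypothetical names and not taken from any real build system; the point is only that the same driver code runs against a two-target graph in no time.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_build(deps, build_target):
    """Build every target after its dependencies and return the order used.

    `deps` maps each target to the set of targets it depends on.
    """
    order = list(TopologicalSorter(deps).static_order())
    for target in order:
        build_target(target)
    return order


# A deliberately tiny graph standing in for the real, slow build.
TINY_BUILD = {"app": {"lib"}, "lib": set()}


def smoke_test():
    built = []
    # built.append is a fake build action: it only records what was asked for.
    order = run_build(TINY_BUILD, built.append)
    assert built == order == ["lib", "app"]


smoke_test()
```

Dumb mistakes in the driver (typos, wrong argument order, broken ordering logic) show up here in well under a second, and the full e2e run is saved for the problems only the real build can reveal.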

IMO you need to progress to your vision but trying to reach it in one step is very dangerous. Make life better piece by piece.

You can even do PRs where you only add comments to the existing files and classes (not too much detail, just answers to questions like "why is this file/class here?"). This helps to make sure you really understand what the current system is doing before you change it.

I once added type hints everywhere to a legacy Python program - it wasn't as helpful as I'd hoped, but it did prevent some issues while I was refactoring.
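One kind of issue hints tend to surface is a value that may be `None`. The functions below are invented for illustration; the hints only pay off if you also run a static checker such as mypy over the code, since Python itself does not enforce them at runtime.

```python
from __future__ import annotations

from typing import Optional


def find_owner(owners: dict[str, str], path: str) -> Optional[str]:
    # The return hint records that a missing path yields None.
    return owners.get(path)


def owner_label(owners: dict[str, str], path: str) -> str:
    owner = find_owner(owners, path)
    # Calling owner.upper() without this check would be reported by a
    # type checker, because owner may be None.
    if owner is None:
        return "UNOWNED"
    return owner.upper()


print(owner_label({"build.py": "alice"}, "build.py"))
print(owner_label({"build.py": "alice"}, "deploy.py"))
```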