How to test without mocking

9 days ago (amazingcto.com)

If you're writing a CRUD app and mocking your database calls instead of just starting an actual Postgres instance before running the tests, you're probably using mocking wrong.

If you're writing a custom frontend for GitHub using the GitHub API and don't bother writing a decent set of mocks for how you expect the GitHub API to behave, your app will quickly either require full manual QA at best or become untestable at worst. Even for very stable APIs, testing against the live API can hit rate limiting, bans, and other anti-abuse mechanisms that introduce all kinds of instability into your test suite.

Use the right tools to solve your problems.
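
For the Postgres half of that, here is a minimal sketch of what "start a real database before the tests" can look like in Python, assuming Docker plus the testcontainers and psycopg2 packages (the fixtures and the toy table are purely illustrative):

  # conftest.py - rough sketch, not a drop-in recipe
  import psycopg2
  import pytest
  from testcontainers.postgres import PostgresContainer

  @pytest.fixture(scope="session")
  def pg_dsn():
      # One disposable Postgres for the whole test session; torn down afterwards.
      with PostgresContainer("postgres:16") as pg:
          # get_connection_url() returns a SQLAlchemy-style URL; strip the
          # driver suffix so plain psycopg2 can parse it.
          yield pg.get_connection_url().replace("+psycopg2", "")

  @pytest.fixture
  def db(pg_dsn):
      conn = psycopg2.connect(pg_dsn)
      try:
          yield conn
      finally:
          conn.rollback()  # discard whatever the test did
          conn.close()

  def test_insert_and_read(db):
      with db.cursor() as cur:
          cur.execute("CREATE TABLE IF NOT EXISTS users (id serial PRIMARY KEY, name text)")
          cur.execute("INSERT INTO users (name) VALUES (%s) RETURNING id", ("alice",))
          (user_id,) = cur.fetchone()
          cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
          assert cur.fetchone() == ("alice",)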

  • > If you're writing a custom frontend for GitHub using the GitHub API and don't bother writing a decent set of mocks for how you expect the GitHub API to behave, your app will quickly either require full manual QA at best or become untestable at worst. Even for very stable APIs, testing against the live API can hit rate limiting, bans, and other anti-abuse mechanisms that introduce all kinds of instability into your test suite.

    I've been doing E2E testing using 3rd-party APIs for a decade now, and this has yet to be a significant problem. The majority of my APIs had a dedicated sandbox environment to avoid "rate limiting, bans, and other anti-abuse mechanisms". The remainder were simple enough that the provider didn't care about users exploring on the live API, and were usually read-only as well.

    Did I run into the occasional flaky failure, or API stability issues? Sure. But it was very rare and easy to work around. It never devolved into becoming "untestable" or requiring "full manual QA".

    My other teams that relied on mocks suffered from far worse problems: a ton of time spent on manual QA, and bugs that leaked into production because of mock-reality mismatches.

    • There are plenty of libraries out there, like VCR, that can set up a test and then save the response for future test runs. You don't really have to renew the recordings that often either. (A rough sketch of the record/replay pattern is at the end of this comment.)

      That was always the go-to for me when testing against 3rd party services, especially because the tests would then survive the offboarding of the engineer who set them up with their personal credentials.

      If your test suite relies on live Github PATs or user-specific OAuth access tokens, then you can either figure out how to manage some kind of service account with a 'bot' user, or live with things breaking every time someone leaves the org.

      Services that incur a per-request charge, or consume account credits, are another problem. Especially if they don't have sandboxes.
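
      A rough sketch of that record/replay pattern with Python's vcrpy (the cassette path, record mode, and header scrubbing are illustrative; the first run hits the real API, later runs replay the saved response):

        import requests
        import vcr

        # Scrub credentials so recorded cassettes are safe to commit.
        my_vcr = vcr.VCR(
            cassette_library_dir="tests/cassettes",
            filter_headers=["authorization"],
            record_mode="once",  # record on the first run, replay afterwards
        )

        @my_vcr.use_cassette("github_repo.yaml")
        def test_fetch_repo_metadata():
            resp = requests.get("https://api.github.com/repos/fastai/ghapi")
            assert resp.status_code == 200
            assert resp.json()["full_name"] == "fastai/ghapi"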

    • Outside of the payments industry I haven't encountered many sandbox APIs that don't have rate limits. What are some good ones you've seen?

    • I have a custom frontend for GitHub using the GitHub API (https://github.com/fastai/ghapi/) and don't use mocks - I test using the real API. I've had very occasional failures, but not enough to ever cause any real issues.

      I don't find mocks for this kind of thing very helpful, because what you're really testing for is how the API changes over time -- you need real API calls to see this.

    • Yeah, even if there's no sandbox mode, a separate sandbox account will usually do. Sometimes this catches misuse that would've caused rate-limiting in prod. And if a service makes this hard, maybe you shouldn't use it in prod either.

  • When you write tests with mocks you almost always at some point end up with tests that test your mocks lol, and tests that test that you wrote the tests you think you wrote -- not the software itself.

    I’ve never been thrilled by tests that rely on mocking — it usually means you need to re-express your module interface boundary.

    Mocks for me fall into the class of software I affectionately call “load-bearing paint.” It’s basically universally the wrong tool for any given job but that really doesn’t stop people. Putting in a data class or similar model object and a delegate is usually sufficient and a much better tool.

    • > It’s basically universally the wrong tool for any given job but that really doesn’t stop people.

      I find mocks useful for testing conditions that are on the periphery and would be a decent amount of trouble to set up. For instance, if I have a REST controller that has a catch all for exceptions that maps everything to a 500 response, I want a test that will cause the DAO layer to throw an exception and test that the rest of the "real stack" will do the translation correctly. A mock is the easiest way to accomplish that.
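
      A stripped-down sketch of that kind of test in Python (the handler, DAO, and error mapping here are invented for illustration):

        from unittest import mock

        class UserDao:
            def fetch(self, user_id):  # the real implementation hits the database
                raise NotImplementedError

        def get_user_handler(dao, user_id):
            # Catch-all mapping: any unexpected failure becomes a 500 response.
            try:
                return 200, dao.fetch(user_id)
            except Exception:
                return 500, {"error": "internal error"}

        def test_dao_failure_maps_to_500():
            dao = mock.Mock(spec=UserDao)
            dao.fetch.side_effect = RuntimeError("connection reset")  # the periphery fails
            status, body = get_user_handler(dao, user_id=42)
            assert status == 500
            assert body == {"error": "internal error"}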

    • I agree that if you need to write mocks, it's likely that your interfaces are poorly defined. This is one of the claimed benefits of test driven development - writing the tests first forces you to design the code in a way that cleanly separates modules so they can be tested.

  • Re. postgres, this is actually something I have always struggled with, so would love to learn how others do it.

    I’ve only ever worked in very small teams, where we didn’t really have the resources to maintain nice developer experiences and testing infrastructure. Even just maintaining representative testing data to seed a test DB as schemas (rapidly) evolve has been hard.

    So how do you

    - operate this? Do you spin up a new postgres DB for each unit test?

    - maintain this, eg have good, representative testing data lying around?

    • Docker Compose is a super easy way to run Postgres, Redis, etc. alongside your tests, and most CI platforms can either use a Compose file directly or have a similar way of running service containers alongside your tests. Example: https://docs.github.com/en/actions/using-containerized-servi...

      Typically you'd keep the database container itself alive, and you would run the schema migrations once at startup. Then your test runner would apply fixtures for each test class, which should set up and tear down any data they need to run or that they create while running. Restarting the database server between each test can be very slow.

      The test data is a harder problem to solve. For unit tests, you should probably be creating specific test data for each "unit" and cleaning up in between each test using whatever "fixture" mechanism your test runner supports. However, this can get really nasty if there's a lot of dependencies between your tables. (That in and of itself may be a sign of something wrong, but sometimes you can't avoid it or prioritize changing it.)

      You can attempt to anonymize production data, but obviously that can go very wrong. You can also try to create some data by using the app in a dev environment and then use a dump of that database in your tests. However, that's going to be very fragile, and if you really need hundreds of tables to be populated to run one test, you've got some big problems to fix.

      Property-based testing is an interesting alternative, where you basically generate random data subject to some constraints and run your tests repeatedly until you've covered a representative subset of the range of possible values. But this can be complicated to set up, and if the individual tests aren't fast, the whole suite can take a very long time to run.

      I think at the end of the day, the best thing you can do is decouple the components of your application as much as possible so you can test each one without needing giant, complicated test data.
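
      In pytest terms, that shape might look roughly like this (psycopg2 assumed; the env var, table, and inline "migration" are stand-ins for real migration tooling):

        import os
        import psycopg2
        import pytest

        DATABASE_URL = os.environ["TEST_DATABASE_URL"]  # the compose-managed Postgres

        @pytest.fixture(scope="session", autouse=True)
        def migrated_database():
            # Run migrations once per session against the long-lived container.
            conn = psycopg2.connect(DATABASE_URL)
            conn.autocommit = True
            with conn.cursor() as cur:
                cur.execute("CREATE TABLE IF NOT EXISTS orders (id serial PRIMARY KEY, total numeric)")
            conn.close()

        @pytest.fixture
        def db(migrated_database):
            # Each test gets a connection plus guaranteed cleanup of the data it created.
            conn = psycopg2.connect(DATABASE_URL)
            try:
                yield conn
            finally:
                with conn.cursor() as cur:
                    cur.execute("TRUNCATE orders")
                conn.commit()
                conn.close()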

    • We use TestContainers for this, and it's superb. It's a full instance of the DB, started for each unit test, running inside a docker container. TC does smart things to make sure it doesn't slow the suite too much.

      We have the same strategy for testing against Kafka, etc.

      Where we care about data, we seed the DB with data for a specific group of tests. Otherwise, we just nuke the DB between each test.

      Prior to doing this, we'd use an in-memory DB for tests and a real DB at runtime, using JPA / Hibernate to make things transferable. But this was leaky, and some things would pass in tests and then fail at runtime (or vice versa).

      TestContainers has been so much better: we're running against a real version of the database, so there's a much smaller chance of test and runtime behaviour diverging.

    • TestContainers, or just assume there is a postgres running locally.

      > - maintain this, eg have good, representative testing data lying around?

      This can be tricky, but usually my advice is to never even try to write seed data into the database unless it's very static. It just gets annoying to maintain and will often break. Try to work out a clean way to set up state in your tests using code, and do not rely on magic auto-increment ids. One of the more effective patterns I've found is to e.g. have every test create a fresh customer, and then the test does its work on that customer. Avoid tests assuming that the first object you create will get id == 1; it makes them very annoying to maintain.
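
      A sketch of that "fresh customer per test" idea, assuming a db connection fixture along the lines of the other replies (the schema and helpers are hypothetical):

        import uuid

        def create_customer(db, name=None):
            # Every test creates its own customer; never assume id == 1.
            name = name or f"customer-{uuid.uuid4().hex[:8]}"
            with db.cursor() as cur:
                cur.execute("INSERT INTO customers (name) VALUES (%s) RETURNING id", (name,))
                (customer_id,) = cur.fetchone()
            return customer_id

        def test_invoice_total(db):
            customer_id = create_customer(db)           # fresh state owned by this test
            create_invoice(db, customer_id, total=100)  # hypothetical helper
            assert invoice_total(db, customer_id) == 100  # hypothetical query helper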

    • > operate this? Do you spin up a new postgres DB for each unit test?

      Generally I've seen a new database (schema, in other DBs?) in Postgres that is just for testing, e.g. "development_test" vs "development". The big thing is to wrap each of your tests in a transaction which gets rolled back after each test.

      > maintain this, eg have good, representative testing data lying around

      This is much harder. Maintaining good seed data - data that covers all the edge cases - is a large amount of work. It's generally easier to leave it up to each test to set up data specific to its test case, generalizing that setup where possible (e.g. if you're testing login endpoints, have all your login test cases inherit from some login-specific data setup, and tweak as needed from there). You will end up with duplicated test setup logic. It's not that bad, and often you don't really want to DRY this data anyway.

      That being said, if you have the time and resources to maintain seed data it's absolutely a better way to go about it. It's also beneficial outside of tests.
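
      A minimal sketch of the wrap-each-test-in-a-transaction approach (psycopg2; the connection string is assumed to point at the dedicated test database):

        import psycopg2
        import pytest

        @pytest.fixture
        def db():
            conn = psycopg2.connect("postgresql://localhost/development_test")
            try:
                yield conn       # the test does its reads and writes here
            finally:
                conn.rollback()  # everything the test did is discarded
                conn.close()

      As a sibling comment points out, this breaks down once the code under test needs to manage its own transactions.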

    • Create one DB for the whole test suite, and then re-instantiate tables/schemas on every unit test.

    • I have tried various approaches and here's what worked best, assuming that there is some natural way to partition most of the data (e.g. per account):

      1. Init the DB with some "default" data - configuration, lookup tables, etc

      2. Each test in the test suite owns its data. It creates a new account and inserts new records only for that account. It can for example create users on this account, new entities, etc. It can run multiple transactions, can do rollbacks if needed. It is important to only touch the account(s) created by the test and to avoid touching the initial configuration. There's no need to clean up the data after the test finishes. These tests can run concurrently.

      3. Create a separate integration test suite which runs sequentially and can do anything with the database. Running sequentially means that these tests can do anything - e.g. test cross-account functionality, global config changes or data migrations. In practice there aren't that many of those, most tests can be scoped to an account. These tests have to clean up after themselves so the next one starts in a good state.

      Other approaches had tons of issues. For example, if each test is wrapped in a transaction which is later rolled back, then testing is very limited - tests cannot use transactions on their own. Savepoints have a similar issue.

    • At several places I've worked, we would snapshot the production DB and use that for testing. You cannot get more "real-world" than that. We would also record real requests and replay them (optionally at increased speed) for load testing.

      Obviously, there are some caveats, e.g.:

      * While this approach works perfectly for some tests (load testing, performance testing, …), it does not work for others (e.g. unit testing).

      * You have to be careful about PII, and sanitize your data.

    • I run a replicated copy of the production database on top of ZFS and snapshot it before starting tests. PostgreSQL takes a few seconds to start on the snapshot and then you're off to the races with real production data. When the test suite finishes, the snapshot is discarded. This also ensures that migrations apply correctly to the production DB before they're run against actual prod.

    • I feel that trying to maintain "representative testing data" is generally not a good idea; set up the data you want/need in the test instead.

      Just run PostgreSQL on your local machine, connect to that, and set up a new schema for every test (fairly cheap-ish) inside a test database.

        def test_1():
            setupdb()
            obj1 = createObj1()
            obj2 = createObj2()
            have = doStuff(obj1, obj2)
            assert have == want

        def test_2():
            setupdb()
            obj = createObj1()
            have = doOtherStuff(obj)
            assert have == want
      

      Creating reasonably scoped reasonably contained "unit-y tests" like this means you will actually be able to understand what is going on. Too often have I seen people set up huge wads of "mock data" and then run all their tests on this. Then Test1 does something Test2 doesn't expect and you're screwed. Or worse: Test42 does something that screws Test185. Good luck with that. Or you introduce a regression somewhere and now you've got tons of data to understand.

    • The ideal experience is that you anonymize prod and sync it locally. Whether it's for testing or debugging, it's the only way to get representative data.

      When you write mock data, you almost always write "happy path" data that usually just works. But prod data is messy and chaotic which is really hard to replicate manually.

      This is actually exactly what we do at Neosync (https://github.com/nucleuscloud/neosync). We help you anonymize your prod data and then sync it across environments. You can also generate synthetic data as well. We take care of all of the orchestration. And Neosync is open source.

      (for transparency: I'm one of the co-founders)

    • Great answers below (test containers for example).

      However, it’s not always possible.

      For example:

      - you use Oracle DB (takes minutes to start, licensing, hoping the containers run fine on ARM, etc.) - sometimes an in-memory fake is just much faster, and can be an official DB on its own for people to try the product

      - your storage might only be available through a library from a third-party provider that is not available locally

    • I've been on teams where we've done this (very successfully in my opinion!) by creating helper code that automates creating a separate Postgres schema for each test, running all migrations, then running your test function before tearing it all down again. This all runs on CI/CD and developer machines, no credentials to any actual environments.

      A major benefit of doing separate schemas for each test is that you can run them in parallel. In my experience, unless you have a metric ton of migrations to run for each test, the fact that your database tests can now run in parallel makes up (by a lot!) for the time you have to spend running the migrations for each test.

      EDIT: usually we also make utilities to generate entities with random values, so that it's easy to make a test that e.g. tests that when you search for 5 entities among a set of 50, you only get the 5 that you know happen to match the search criteria.
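
      A rough sketch of that helper (psycopg2 assumed; run_migrations is a stand-in for whatever migration tool you use):

        import uuid
        import psycopg2
        import pytest

        @pytest.fixture
        def db():
            schema = f"test_{uuid.uuid4().hex}"
            conn = psycopg2.connect("postgresql://localhost/app_test")
            conn.autocommit = True
            with conn.cursor() as cur:
                cur.execute(f'CREATE SCHEMA "{schema}"')
                cur.execute(f'SET search_path TO "{schema}"')
                run_migrations(cur)  # stand-in: apply all migrations into this schema
            try:
                yield conn
            finally:
                with conn.cursor() as cur:
                    cur.execute(f'DROP SCHEMA "{schema}" CASCADE')
                conn.close()

      Because every test owns its own schema, the suite can run in parallel without tests stepping on each other's data.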

    • My integration tests expect the db to run. If I need fixture data, those are sql and read in at the start of the suite. Each test uses its own temp db/tables and/or clears potentially old data before running.

    • Firstly, I'm seeing all these answers that say spin up a new server, and I have to wonder "WTF?"

      No need to spin up a new server, not in a container, not in a new directory, not at all. It's pointless busywork with too many extra points of failure.

      Nothing is stopping you using an existing server and creating a new DB, which takes about 1/100th the time that starting up a new server (whether in Docker or otherwise) takes.

      Secondly, I don't actually do unit-testing on the database layer - there's little point in it. I test workflows against the database, not units!

      What I do is create multiple 'packages' of tests, each with multiple tests. A single 'package' creates a temp db, runs its tests sequentially and then drops the temp db. Each package will setup itself with SQL statements.

      This lets the tests perform tests of actual workflows, instead of testing in isolation that an object can be (de)serialised. IOW, I can test that the sequence of `addUser(); setProfilePassword(); signIn(); viewInfo(); signOut();` work as expected, and that `removeUser(); signIn();` fail with the correct error.

  • > So how do you
    > ...
    > - maintain this, eg have good, representative testing data lying around?

      This one can be very easy, depending on the kind of data you're working with. Many places will simply dump a part (or the whole thing, if it's not too big) of the production DB into dev and pre-prod environments.

      Now if there is sensitive, non-encrypted data that even the devs cannot see, then it can get tricky (but then arguably they cannot see the logs in the clear either, etc.).

      But yeah: a recent dump of the prod DB is good, representative data.

      I've worked at places where pre-prod had a daily dump of the prod DB. Simple.

  • Also, don't create any interfaces you wouldn't mock. I've seen too many people waste months creating some kind of "database wrapper."

  • > If you're writing a CRUD app and mocking your database calls instead of just starting an actual Postgres instance before running the tests,

    Actually that's wrong too. The production database will be different than the "testing Postgres instance", leading to bugs.

    It turns out that whatever testing solution you use, if it's not the actual production instance and you're not using real production data, there will be bugs. Even then there's still bugs.

    This is the simple truth: you can't catch all the bugs. Just put in Good Enough testing for what you're doing and what you need, and get on with life. Otherwise you will spend 99% of your time just on testing.

    • This doesn't mean the solution of testing with a separate Postgres instance is wrong.

    • > The production database will be different than the "testing Postgres instance", leading to bugs.

      It has never happened to me, to be honest. This reads like an argument for "if you can't do it perfectly, just do it badly", but that's nonsense. Running tests against a local Postgres instance with the same major.minor version and same extensions as your prod instance WILL work.

      And testing your storage layer against the database is probably the most reliable safety net you can add to an app.

  • One of the nice things about the .NET ORM EntityFramework is that you can swap a mocked in-memory database for your prod DB with dependency injection, so without modifying your code at all and theoretically without affecting the behavior of the ORM. Which is to say, you're right, it's about using the right tools. Those tools of course vary by ecosystem and so in some cases mocking the database is in fact the correct decision.

    • Probably the single most obnoxious production defect I ever found related to a database would never have made it into production if we had been using a real database instead of a test double. It happened because the test double failed to replicate a key detail in the database's transaction isolation rules.

      After figuring it out, I swapped us over to running all the tests that hit the database against the real database, in a testcontainer, with a RAM disk to minimize query latency. It was about a day's worth of work, and it turned up a few other bugs that hadn't bitten us in production yet - bugs that had also sailed past our test suite because the test double failed to accurately replicate the behavior in question.

      Total time to run CI went up by about 10 seconds. (For local development you could chop that way down by not starting a fresh server instance for every test run.) Given how many person-hours we spent on diagnosing, resolving, and cleaning up after just that first defect, I estimated the nominally slower non-mocked tests are still a net time saver if amortized over anything less than about 50,000 CI runs, and even then we should probably only count the ones where an engineer is actually blocking on waiting for the tests to complete.

      That said, there was a time when I thought test doubles for databases was the most practical option because testing against real databases while maintaining test isolation was an unholy PITA. But that time was 5 or 6 years ago, before I had really learned how to use Docker properly.

  • > Some APIs are very stable, and testing against the API itself can hit rate limiting, bans, and other anti-abuse mechanisms that introduce all kinds of instability to your test suite.

    Those rate limits, bans, and other anti-abuse mechanisms are things that would be good to uncover and account for during tests. Better for the test suite to detect those potential failures than the production deployment :)

  • And if you have to mock, at least try to have somebody else write the mock. Testing your understanding of GitHub's API against your understanding of GitHub's API isn't useful. Testing your interpretation of the API behavior against somebody else's interpretation provides a lot more value, even if it isn't nearly as good as testing against the actual API.

  • Clearly mocking the DB is a footgun, and it's not that hard to set up E2E tests. Use TestContainers or Docker on a random port, and run your API on a random port.

    Every test seeds all the data it needs to run (user, org, token); it requires an initial setup, but then you just reuse it everywhere, and voila. No side effects, no mocks to maintain, and it also tests your auth and permissions, almost 1:1 with prod.

    • > No side effects, no mocks to maintain, and it also tests your auth and permissions, almost 1:1 with prod.

      Can also be used to test version updates of your DB.

Tests are a tool for you, the developer. They have good effects for other people, but developers are the people that directly interact with them. When something fails, it's a developer that has to figure out what change they wrote introduced a regression. They're just tools, not some magic incantation that protects you from bugs.

I think the author might be conflating good tests with good enough tests. If IOService is handled by a different team, I expect them to assure IOService behaves how it should, probably using tests. The reason we're mocking IOService is because it's a variable that I can remove, that makes the errors I get from a test run MUCH easier to read. We're just looking at the logic in one module/class/method/function. It's less conceptually good to mock things in tests, since I'm not testing the entire app that we actually ship, but full no-mocks E2E tests are harder to write and interpret when something goes wrong. I think that makes them a less useful tool.

The thing I do agree on is the point about mocks that only model the happy path. I'd say if something can throw an exception, you should at least include that in a mock (as a stubbed method that always throws). But making the reimplementation of your dependencies mandatory, or relying on them in tests, is going to mean you write fewer tests and get worse failure messages.

Like everything, it depends eh?

  • This 100%. I'm not sure how the author managed to create consistent failure cases using real service dependencies, but in my code I find mocks to be the easiest way to test error scenarios.

  • With I/O in general, I've observed that socket, protocol, and serialization logic are often tightly coupled.

    If they're decoupled, there's no need to mock protocol or serialization.

    There's some cliché, "don't call me, I'll call you", as advice on how to flip the call stack. Sorry, no example handy (on mobile). But the gist is to avoid nested calls, flattening the code paths. Less like a Russian doll, more like Lego instructions.

    In defense of mocks, IoC frameworks like Spring pretty much necessitate doing the wrong thing.

  • > E2E tests are harder to write and interpret when something goes wrong.

    If the test is hard to debug when it goes wrong, then I assume the system is hard to debug when something goes wrong. Investing in making that debugging easier unlocks more productivity. Of course, it depends on how often bugs show up, how often the system changes, the risks of system failure to the business, etc. - the cost of a debuggable system may not be worth the productivity boost. In my case, it usually is worth it.

    • I think it's always going to be easier to debug one thing versus everything, regardless of how a system is built. If you're not mocking anything, then anything could have gone wrong anywhere.

      But also, if you're able to fix things effectively from E2E test results due to a focus on debug-ability, then that's great! I think it's just the framing of the article I have trouble with. It's not an all or nothing thing. It's whatever effectively helps the devs involved understand and fix regressions. I haven't seen a case where going all in on E2E tests has made that easier, but I haven't worked everywhere!

I don't think mocking is an anti-pattern. Using only unit tests and then mocking everything probably is.

Mocks have a perfectly viable place in testing. They help establish boundaries and avoid side effects that are not pertinent to the logic being tested.

I would reference the testing pyramid when thinking about where to be spending time in unit tests vs. integration tests vs. end to end tests. What introduces risk is if we're mocking behaviors that aren't being tested further up the pyramid.

  • I think the better advice that goes with the spirit of the article is, prioritize integration testing over unit testing if you're constrained on time.

  • I like the testing pyramid specifically because it captures the tradeoffs between the different kinds of tests. Mocks can come in handy, but like anything else can be abused. We need a "Mock this, not that" kind of guide.

I used to love mocks, once upon a time. Nowadays, though, I internally sigh when I see them.

I've come to the opinion that test doubles of any kind should be used as a last resort. They're a very useful tool for hacking testability into legacy code that's not particularly testable. But in a newer codebase they should be treated as a code smell. Code that needs mocks to be tested tends to be code that is overly stateful (read: temporally coupled), or that doesn't obey the Law of Demeter, or that does a poor job of pushing I/O to the edge where it belongs. And those are all design elements that make code brittle in ways that mocking can't actually fix; it can only sweep it under the carpet.

  • At some point, you need to interact with something that looks like I/O or an external service. Not handling failures from them is a source of a lot of bugs.

    Even if pushed to the periphery, how do you test the wrapper you built to hide these failures from the rest of your code base? If you don’t hide these failures in some wrapper, how do you test that your system handles them properly?

  • I think the answer, like most things, is "it depends". Specifically, it depends on the complexity of the thing you're mocking. Mocking a database is a bad idea because there's a ton of complexity inherent in Postgres that your mocks are masking, so a test that mocks a database isn't actually giving you much confidence that your thing works. But if your interface is a "FooStore" (even one that is backed by a database), you can probably mock that just fine, so long as your concrete implementation has "unit tests" with the database in the loop.

    Additionally, mocking/faking is often the only way to simulate error conditions. If you are testing a client that calls to a remote service, you will have to handle I/O errors or unexpected responses, and that requires mocking or faking the remote service (or rather, the client side transport stack).

    But yeah, I definitely think mocks should be used judiciously, and I _really_ think monkeypatch-based mocking is a travesty (one of the best parts about testing is that it pushes you toward writing maintainable, composable code, and monkey patching removes that incentive--it's also just a lot harder to do correctly).
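
    For example, a sketch of that split using the FooStore naming from above (the interface and handler are illustrative; the concrete Postgres-backed implementation would get its own database-in-the-loop tests):

      from unittest import mock

      class FooStore:
          """The narrow seam the rest of the app depends on."""
          def get_foo(self, foo_id): ...
          def save_foo(self, foo): ...

      def rename_foo(store: FooStore, foo_id, new_name):
          foo = store.get_foo(foo_id)
          if foo is None:
              return "not-found"
          foo["name"] = new_name
          store.save_foo(foo)
          return "ok"

      def test_rename_missing_foo():
          store = mock.Mock(spec=FooStore)
          store.get_foo.return_value = None  # the awkward case, trivially simulated
          assert rename_foo(store, 7, "bar") == "not-found"
          store.save_foo.assert_not_called()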

  • Fully agree with you - mocking is a shoehorn to fit unit tests into stateful monoliths with messed-up dependencies that cross several stacks and are reused in multiple modules.

    With better separation of concerns and separation of compute from IO, one should not need mocks.

    Only unit tests + integration/E2E tests.

  • Seconded. I shudder when I think back to the "test driven development" days when I wrote so much throwaway test code. Later, you try to refactor the app and it's another 50% of effort to update all the tests. The solution is to avoid it in the way you described.

Bad article.

Some of the advice is good, like decoupling I/O and logic where that makes sense. But the general idea of mocking being an anti-pattern is overreach.

This kind of thinking is overly rigid/idealistic:

> And with Postgres you can easily copy a test database with a random name from a template for each test. So there is your easy setup.

> You need to test reality. Instead of mocking, invest in end-to-end (E2E) testing.

"Easily" is like "just." The ease or difficulty is relative to skill, time, team size, infrastructure, and so on.

As for testing reality, sure. But there's also a place for unit tests and partial integration tests.

In some situations, mocking makes sense. In others, full E2E testing is better. Sometimes both might make sense in the same project. Use the right tool for the job.

  • I've worked with a lot of pro-mocking engineers, and the time they spent on mocks easily outstripped the time a good build engineer would have spent creating a fast reusable test framework using real databases/dummy services/etc. The mocks won not because they were better or more efficient, but because of lack of deeper engineering skill and cargo culting.

    • > the time they spent on mocks easily outstripped the time a good build engineer would have spent creating a fast reusable test framework

      This goes back to team size and skills. Not all teams have build engineers. And not all mocks are so complicated that they take up that much time.

      Again, it depends on the scope and the resources. The article goes too far by calling mocking an anti-pattern. It simply isn't.

    • That engineer may have spent a couple of dozen hours on their mock. But the engineers who spent time on a test framework that uses real databases will soak up thousands of developer hours in CI time over the next decade.

For some reason this article gives me flashbacks to the new CTO who comes in and declares 'micro-services' or 'containers' as the perfect solution for some problem that no one has actually run into. The article's author has had their pain points, but it doesn't mean all mocking is bad everywhere in every use case.

I wrote some code recently that detects cycle errors in objects with inheritance and I mocked the DB calls.

- Did I test for DB failures? No, but that's not the goal of the tests.

- Could I have refactored the code to not rely on DB calls? Yes, but every refactor risks the introduction of more bugs.

- Could I have launched a temporary DB instance and used that instead? Yes, but there's no obvious reason that would have been easier and cleaner than mocking DB calls.

In python it wasn't hard to implement. It was the first time I'd used the mock library so naturally there was learning overhead but that's unavoidable - any solution would have learning overhead.

> Modelling the happy path is great for refactoring - even a necessity, but doesn’t help with finding bugs.

This is a common misconception (one that I also initially held). Unit tests aren't meant to find bugs, they're meant to protect against regressions, and in doing so, act as a documentation of how a component is supposed to behave in response to different input.

  • > Unit tests aren't meant to find bugs, they're meant to protect against regressions

    That hasn't been the general consensus on unit tests for at least 30 years now. Regression tests are a small subset of tests, typically named for an ID in some bug tracker, and are about validating a fix. The majority of unit tests catch issues before a bug is even opened, and pretty much any random developer you talk to will consider that to be the point.

    • > Regression tests are a small subset of tests, typically named for an ID in some bug tracker, and are about validating a fix.

      This is how I also tend to think of them, but it's not how the phrase is generally used. The general meaning of regression tests it to ensure known correct functionality doesn't break with a future change. There's no actual requirement it be tied to a known bug.

    • They do not "find" bugs in the way that exploratory testing or user operation might (or even in the way that broader integration tests might), that is they don't find bugs that are not in the known problem space. But they are very good at proving a method works correctly and covers the known execution permutations.

      > The majority of unit tests catch issues before a bug is even opened

      The "issue" that is being caught is the bug the parent is talking about, not a "bug" in JIRA or something.

  • There's a few issues with this IMO:

    1. Changes often require changing the functionality of a component, which means many of the current unit tests are bunk and need to be updated. Even changes that are pure refactoring and should retain the same behavior often still require updating or rewriting the tests, which again often means significant refactoring of the existing tests.

    2. Small isolated changes usually require testing everything which in a big org is very time consuming and slows down builds and deploys unnecessarily.

    3. A lot of false confidence is instilled by passing unit tests. The tests passed, we're good! Most of the production bugs I've seen are things you'd never catch in a unit test.

    I really can't imagine a large refactor where we wouldn't end up rewriting all the tests. Integration tests are much better for that imo, "units" should be flexible.

      Yes, changing contracts implies updating tests. It should.

      Refactoring under the same contract should not lead to refactoring of tests - unless, of course, you introduce a new dependency you have to mock? That's just one example.

      If your code changes a lot it has nothing to do with tests being hard to change. It has to do with the code it tests changes too often. Poor contracts perhaps.

      And just like the parent comment. Tests are not about finding or solving bugs, they are about regressions and making sure your contracts are correctly implemented.

  • I think writing tests as a form of documentation is a waste of time. If I'm using a component I don't want to read unit tests to figure out what it should do.

    Unit tests are most often used to cover a few more lines that need coverage. That's the value they provide.

    • A well designed API will generally allow users to understand usage without any additional documentation, sure. However, those who modify the API in the future will want to know every last detail that you knew when you were writing it originally. That must be documented to ensure that they don't get something wrong and break things – and for their general sanity. That is, unless you hate future developers for some reason.

      You could do it in Word instead, I suppose, but if you write it in code then a computer can validate that the documentation you wrote is true. That brings tremendous value.

  • On that topic, static type checking can effectively be seen as a "unit test" that tests as well as documents the expected types for an interface.

Maybe I am missing something, but how else would I test various exception handling paths?

There is a whole world of errors that can occur during IO. What happens if I get a 500 from that web service call? How does my code handle a timeout? What if the file isn't found?

It is often only possible to simulate these scenarios using a mock or similar. These are also code paths you really want to understand.
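
For example, a sketch with Python's unittest.mock (the client and handler are made up for illustration; requests.exceptions.Timeout is the real exception class):

  from unittest import mock
  import requests

  def fetch_report(session, url):
      # The code under test: it must degrade gracefully on IO failures.
      try:
          resp = session.get(url, timeout=5)
          if resp.status_code == 500:
              return {"status": "retry-later"}
          return resp.json()
      except requests.exceptions.Timeout:
          return {"status": "timeout"}

  def test_timeout_is_handled():
      session = mock.Mock()
      session.get.side_effect = requests.exceptions.Timeout
      assert fetch_report(session, "https://example.com/report") == {"status": "timeout"}

  def test_500_is_handled():
      session = mock.Mock()
      session.get.return_value = mock.Mock(status_code=500)
      assert fetch_report(session, "https://example.com/report") == {"status": "retry-later"}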

  • Put a small data interface around your IO, have it return DATA | NOT_FOUND etc.

    Then your tests don't need behavioral mocks or DI, they just need the different shapes of data and you test your own code instead of whatever your IO dependency is or some simulation thereof.
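
    A small sketch of that shape (the result types and functions are illustrative):

      from dataclasses import dataclass
      from typing import Union

      @dataclass
      class Data:
          payload: dict

      @dataclass
      class NotFound:
          pass

      LookupResult = Union[Data, NotFound]

      def describe_user(result: LookupResult) -> str:
          # Pure logic: tested with plain values, no behavioral mock of the IO layer.
          if isinstance(result, NotFound):
              return "unknown user"
          return f"user {result.payload['name']}"

      def test_describe_user():
          assert describe_user(NotFound()) == "unknown user"
          assert describe_user(Data({"name": "alice"})) == "user alice"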

    • Sure. This is a good practice for multiple reasons. However, the code that glues my interface to the underlying I/O is still there and needs testing, right?

      I agree with you in general. But it always feels like there are spots where a mock of some kind is the only way to cover certain things.

  • I don't think there's a disagreement; the author states "Whenever I look at mocks, they mostly have the same problem as all unit tests that I see, they only model the happy path". So by corollary their opinion of the correct usage of mocking would also include modelling brokenness.

Your mock object is a joke; that object is mocking you. For needing it.

~ Rich Hickey

  • Don’t really care about advice from the guy who invented an incredibly non pragmatic programming language. I also honestly had to look up who he was. So his sage advice hasn’t brought him much fame.

The problem is that mocks are normally used to avoid testing services/outside code.

For example, making a wrapper around a message system: if you don't mock, you test both your code and the message system.

However, the overhead of keeping the mocking system up to date is a pain in the balls.

  • Testing both your code and the message system is exactly what you want, since if the message system is broken in a way that upstream didn't catch, you want to learn about it during testing and not production, if possible.

    • I’m still mad about the time I was told to mock a payment gateway in tests even though they had a testing environment and then got a slew of bug reports from people whose company names had punctuation (and thus failed the name validation the payment gateway was secretly running).

    • Keep in mind that there are different kinds of testing. What Beck called unit tests and integration tests.

      Unit tests are really for purposes of documentation. They show future programmers the intent and usage of a function/interface so that others can figure out what you were trying to do. Mocking is fine here, as future programmers are not looking to learn about the message system; they will refer to the message system's own documentation when they need to know something about it.

      Integration tests are for the more classical view on testing. Beck suggested this is done by another team using different tools (e.g. UI control software), but regardless of specifics it is done as a whole system. This is where you would look for such failure points.

    • I would want to separate those tests. You want to know what has failed. It also depends on how many tests you have.

It is so cringe to see bad advice like this being given. Yes, you can write mocks incorrectly. You should not model them after the "happy path" but you should make sure they cover the most use-cases both good and bad. I have been a senior or principal engineer on teams that did both of these approaches and the non-mocking approach is terrible because you end up with separate tests that have colliding data. It's slower using a real database back-end and becomes a mess and leads to issues where your database is heavily coupled to your test code which is the real anti-pattern. Then a year or two later when you want to change databases or database architectures you're screwed because you have to go into a bunch of tests manually and change things. The whole point of the mocks is it makes everything modular.

“Mocks only test the happy path.”

This is a problem with the test authors, not mocks.

“All the bugs are when talking to an actual database.”

Databases have rules that need to be followed, and a lot of those can be tested very quickly with mocks. The combined system can have bugs, so don't only use mocks. Mocks and unit tests are not a substitute for all the other tests you need to do.

How this person can claim to be a CTO I have no idea.

  • He probably meant it takes more effort to create mocks for all the negative cases. In most cases you won't have the time or the bandwidth to do this.

    Try mocking DB triggers, views, access rules etc in mocks and you will know why most teams don't bother mocking but use the real thing instead.

    And about the comment about him being a CTO: well, he is a CTO. And you?

    • Then he should have said that. Is not clear communication a requirement for CTO these days?

      Everything you are describing is about actually testing the database. A database is a complex server, and things like DB triggers and stored procedures should be tested in isolation too. And then you have integration tests on top of that.

      My team just found a bug that wasn’t covered in a unit test. We found it in a long running API test. And so we added a unit test for the specific low level miss, and a quick integration test too.

Anti-Pattern is a word that sounds smart and educated, but is rarely used against something that does not have a legit use case.

This article did not change my opinion on the subject.

The word anti-pattern is confusing in itself. "Anti" is usually a prefix for something that battles or goes against the word that it prefixes.

In my opinion a better term would be a hostile or adverse pattern.

How can one write an article about testing that doesn't even mention the invariants you're trying to validate (by construction or testing)? That's the minimum context for addressing any QA solution.

The GoF pattern book did list patterns, but it primarily argued for a simple language about patterns: context, problem, solution, limitations. It's clear.

The blog-o-sphere recipe of click-bait, straw-man, glib advice designed not to guide practice but to project authority (and promise career advancement) is the exact opposite, because it obfuscates.

The point of writing is to give people tools they can apply in their proximal situation.

Are you really testing if your solutions start by refactoring the code to be more testable? That's more like design if not architecture -- excellent, but well beyond scope (and clearly in the CTO's bailiwick).

And as for mocks: they're typically designed to represent subsystems at integration points (not responses to functions or IO/persistence subsystems). How hard is that to say?

The CTO's way is not to win the argument but to lead organizations by teaching applicable principles, providing guardrails, and motivating people to do the right thing.

Sorry to be exasperated and formulaic, but I think we can do better.

  The problem is when IOService has edge cases. When building the mock, does it address the edge cases? When you want to find bugs by testing, the tests need to test the real world. So to work, the mock for IOService needs to model the edge cases of IOService. Does it? Does the developer know they need to model the edge cases, or the mock is not helping with finding bugs? Do you even know the edge cases of IOService? When IOService is a database service, does your mock work for records that do not exist? Or return more than one record?

It depends. Mocks are used to remove variables from the experiment you are running (the tests) and see if it behaves under very specific conditions. If you want to test how the code behaves when a specific row is returned by the database, but instead the dependency returns something else, then you are not testing that use case anymore. Reproducibility also has its values. But yes, you can definitely make your mocks return errors and fail in a myriad of ways.

Not to say you should mock everything. Of course having proper integration tests is also important, but articles like these will rarely tell you to have a good balance between them, and will instead tell you that something is correct and something else is wrong. You should do what makes sense for that specific case and exercise your abilities to make the right choice, and not blindly follow instructions you read in a blog post.

  • I totally agree, there is a balance between what makes sense to mock, and what needs proper integration tests.

    Additionally, just using integration tests does not guarantee that edge cases are covered, and you can just as easily write integration tests for happy path, without thinking about the rest.

  • 100% agree. Maybe you should be the amazing cto.

    Context always matters.

I prefer dependency injection instead of mocking. Not only is injecting a "mock" service better than monkey patch mocks in pretty much all cases, but it's an actually useful architectural feature beyond testing.
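
A sketch of the difference (constructor injection with a hand-written fake instead of patching module globals; all names are illustrative):

  class Mailer:
      def send(self, to, body):  # the real implementation talks to SMTP
          raise NotImplementedError

  class SignupService:
      def __init__(self, mailer: Mailer):
          self.mailer = mailer   # injected, not imported and monkey patched

      def register(self, email):
          self.mailer.send(email, "welcome!")
          return True

  class FakeMailer(Mailer):
      def __init__(self):
          self.sent = []
      def send(self, to, body):
          self.sent.append((to, body))

  def test_register_sends_welcome_mail():
      mailer = FakeMailer()
      assert SignupService(mailer).register("a@example.com")
      assert mailer.sent == [("a@example.com", "welcome!")]

The same seam lets you swap in the real SMTP-backed implementation at composition time, which is the "useful architectural feature beyond testing".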

  • That's the only way to mock in some languages/testing frameworks. In C++ monkey patching would be quite difficult, but DI is simple. googlemock works this way.

Many comments are about the danger of over mocking, which is right.

But I've also suffered the opposite: having to use a lib that assumes it only runs in production, and always initialises some context no matter what (up to assuming only a specific VM would ever be used, never anywhere else, and especially not locally).

In the wild, I've rarely (if ever) seen code that was too testable. Too complex for no reason? Yes.

The golden rule was to only mock your own code. Make a facade around the framework class using an interface and mock that if needed to decouple your tests. Then write integration tests against your implementation of the interface. The moment you mock other people’s code you have brittle tests.
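
A sketch of that rule (a thin facade over a hypothetical third-party SDK; unit tests mock only the facade, and integration tests exercise the real implementation):

  from typing import Protocol
  from unittest import mock

  class PaymentGateway(Protocol):
      """Our own interface - the only thing unit tests are allowed to mock."""
      def charge(self, customer_id: str, cents: int) -> str: ...

  class StripeGateway:
      """Facade over the vendor SDK; covered by integration tests, never mocked."""
      def __init__(self, sdk):
          self.sdk = sdk  # hypothetical third-party client

      def charge(self, customer_id, cents):
          return self.sdk.create_charge(customer=customer_id, amount=cents).id  # hypothetical SDK call

  def checkout(gateway: PaymentGateway, customer_id: str, cents: int) -> str:
      return gateway.charge(customer_id, cents)

  def test_checkout_charges_customer():
      gateway = mock.Mock(spec=PaymentGateway)
      gateway.charge.return_value = "ch_123"
      assert checkout(gateway, "cus_1", 500) == "ch_123"
      gateway.charge.assert_called_once_with("cus_1", 500)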

Every time I read this kind of argument against mocks, it usually comes down to a misunderstanding of why unit tests with mocking exist in the first place. The underlying assumption is that tests are a quality assurance tool, but I think that is only true for E2E tests (possibly run in production). In outside-in TDD, unit tests are used as a design tool, not a QA one, and mocking is a convenient way to do that quickly without having to implement the next layer; mocks usually don't replace an implemented service that does IO, they implement a noop that triggers an exception (so your E2E tests won't pass until you implement that layer).

The problem is in the name: unit tests should be called implementation specs or in-code documentation.

Each layer of testing has its roles and serves a different purpose.

As a Java (mostly Spring) dev, I use mocks a lot to separate different components from each other if I only want to test one of them. If your code only contains tests that mock other things, you're missing something, as others have pointed out. But just because you have good coverage from integration testing doesn't mean that writing isolated unit tests is bad. I find it much easier to diagnose a problem in a tight unit test than in an integration test that covers half the project.

Some criticism to the article:

The "more unit testing" section reminds me of junior devs asking why they can't test private methods in Java. If I'm testing a unit, I want to test the contract it promises (in this case, a method that does some checks and then sends something). That the behavior is split between multiple methods is an implementation detail, and writing tests around that makes changes harder (now I can't refactor the methods without also having to update the tests, even if the contract doesn't change) and it doesn't even test the contract! (There's nothing that makes sure that the mail is actually sent - we could be testing methods that aren't used by anything but the test code)

For the "easier to test IO" section: just don't. Your tests now depend on some in-memory implementation that will behave differently than the real thing. That's just mocking with extra steps, you still don't know whether your application will work. If you want to do io, do the real io

"Separation of logic and IO": this is in general the right thing to do, but the way it's described is weird. First, it does the same as in the "more unit testing" section with the same problems. Then, the code is refactored until it's barely understandable and the article even admits it with the Greenspan quote. In the end, the production code is worse, just to ... Not test whether there's actually some code doing the IO.

I actually think there are some good ideas in there: separating the logic from the IO (and treating them as separate units) is important, not just for better testability, but also for easier refactoring and (if done with care) to be easier to reason about. In the end, you will need both unit and integration tests (and if your system is large enough, e2e tests). Whether you're using mocks for your unit tests or not, doesn't make much of a difference in the grand picture.

Just don't mock stuff in integration or e2e if you absolutely can't prevent it.

The catch-22 with refactoring to be able to write unit tests is that refactoring introduces risk as you are changing code, and you need tests to help reduce that risk. But you can't easily write tests without refactoring. This has been a very difficult problem for the team I'm currently on.

The only strategy I'm aware of is described in `Working Effectively With Legacy Code`, where you start by writing throwaway unit or E2E tests that give you "cover" for being able to refactor. These tests depend on the implementation or may use mocking just to get started. Then you refactor, and write better unit tests. Then get rid of the throwaway tests.

  • Why get rid of working e2e tests? IMO they are more useful than unit tests at finding the kinds of problems that stop a release/deployment.

    You can attack from both directions: e2e tests make sure that certain processes work in fairly ordinary situations, then look for little things that you can unit test without huge refactoring. When you've pushed these as far as you can, section off some area and start refactoring it. Do your best to limit your refactoring to single aspects or areas so that you are never biting off more than you can chew. Don't expect everything to become wonderful in one PR.

    Your e2e tests will catch some errors, and when you look at what those commonly are, you can see how to best improve your tests to catch them earlier and save yourself time. In Python I often had stupid errors - syntax errors in try/except blocks or other things like that. If I used a linter first, I caught many of those errors very quickly.

    I was working on a build system so I mocked the build - created a much simpler and shorter build - so I could catch dumb errors fast, before I ran the longer e2e test on the full build.

    IMO you need to progress to your vision but trying to reach it in one step is very dangerous. Make life better piece by piece.

    You can even do PRs where you only add comments to the existing files and classes (not too much detail, but answering questions like "why is this file/class here"). This helps to make sure you really understand what the current system is doing before you change it.

    I once added type hints everywhere to a legacy python program - it wasn't as helpful as I'd hoped but it did prevent some issues while I was refactoring.

Mocks can be useful when there is a standard protocol and you want to document and test that your code follows the protocol exactly, doing the same steps, independently of whether some other component also follows the protocol. It tests something different from whether or not two components work together after you change both of them.

It takes time to come up with good protocols that will remain stable and it might not be worth the effort to test it when the protocol design is new and still in flux, and you don’t have alternative implementations anyway. This is often the case for two internal modules in the same system. If you ever want to change the interface, you can change both of them, so an integration test will be a better way to ensure that functionality survives protocol changes.

Database access tends to be a bad thing to mock because the interface is very wide: “you can run any SQL transaction here.” You don’t want to make changing the SQL harder to do. Any equivalent SQL transaction should be allowed if it reads or writes the same data.

Compare with testing serialization: do you want to make sure the format remains stable and you can load old saves, or do you just want a round trip test? It would be premature to test backwards compatibility when you haven’t shipped and don’t have any data you want to preserve yet.

This is a solid article. So many mocks, in the end, just verify that you set up your mock and that 1=1.

One paragraph I think is missing: error handling. You want units to be able to fail so you can validate error handling, which is _very_ hard to do in E2E tests. You can simulate disk-full or DB errors and make sure things fall back or log as expected. This can be done with fakes. Mocks are a specific type of test double that I have very little use for.

Anyone who is overly zealous about anything is always wrong in the end. Including testing.

"Why would people mock everything? Why not stand up a real test db and test on it?" Because the test zealous have explicitly declared that EACH test should be atomic. Yes you can find these people at major tech conferences. Each test should mock its own db, web service, etc. Every single time. And it should do that in no more than a few milliseconds, so that the entire project compiles in no more than 2mins, even for the largest and most complex corporate projects. And these tests should be fully end-to-end, even for complex microservices across complex networking architecture.

Some of you may be rolling on the floor laughing at how naive and time-consuming such a project would be.

We all agree such testing is a noble goal. But you need a team of absolute geniuses who do nothing but write "clever" code all day to get there in any sizeable project.

My organization won't hire or pay those people, no matter what they say about having 100% coverage. We just do the best we can, cheat, and lower the targets as necessary.

  • Let's not forget how long it would take to spin up an enterprise database, even in memory; there are hundreds (or thousands) of tables. There can also be multiple databases, each with its own schema, and each requiring a fair amount of data in some of those tables just to do anything.

Wow, some great examples in here for how to use mocks wrong. I get the impression the author has just never seen tests that use mocks properly, honestly. The various refactorings contained in here are fine, of course, but I see no reason to call the entire use of mocks an anti-pattern. They're a tool, and they need to be used properly. Let's not throw the baby out with the bath water.

Mocking is indeed an anti-pattern ... when dealing with tests that pretend to be unit tests but are not actually unit tests (e.g. needing to be aware of IO edge cases, to quote the article).

But tests that are not actually unit tests masquerading as unit tests (and vice versa) are arguably the bigger problem here. Not mocking per se.

If you inherited a project with no tests at all, mocking is a lifesaver. It allows you to only worry about specific aspects of the application so you can start writing and running tests. I agree though that if not done properly, it can be overused and can make your tests practically worthless.

A radical point of view. And as such it is of course wrong ;).

First of all, there are languages where dry-running your code with all parameters mocked is still a valid test run. Python, JS, and Perl, for instance, make it very easy to ship a stupid error in a routine that crashes every run.

But more importantly, a unit test usually executes inside the same process as the code. That gives you tremendous introspection capabilities and control over the execution flow. Testing for a specific path or scenario is exactly what you should do there.

Finally, what are in-memory filesystems or databases, if not mocks? They, too, won't show all the behaviors of the real thing. Neither will test containers or even full dedicated environments. It's all going to be an approximation.

  • I can foresee the "not what I mean" answers to everything. Oh, sure it's a fake DB but that's not a mock. Oh, yeah, you need to test with something that always makes an error but that's not a mock.

    Eventually, what they mean is that if it sucks, it's what they're talking about, and you should never do that. If it was really useful, it's not a mock.
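
Whatever label you prefer, the in-memory stand-ins mentioned two comments up are easy to sketch. Here is a dict-backed fake of a narrow, hypothetical FileStore interface: it approximates the real thing well enough for logic tests, and deliberately not for permissions, symlinks, or a full disk.

    class InMemoryFileStore:
        """Dict-backed approximation of a narrow file-store interface."""
        def __init__(self):
            self._files = {}

        def write(self, path: str, data: bytes) -> None:
            self._files[path] = data

        def read(self, path: str) -> bytes:
            return self._files[path]  # KeyError stands in for FileNotFoundError

        def exists(self, path: str) -> bool:
            return path in self._files

    def test_store_round_trip():
        store = InMemoryFileStore()
        store.write("config.toml", b"retries = 3")
        assert store.exists("config.toml")
        assert store.read("config.toml") == b"retries = 3"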

When I implemented the test suite for my JS framework [1], I realized that there was a ton of cruft and noise in most test setups. The solution? Just start a mirror of the app [2] and its database(s) on different ports and run the tests against that.

Do away with mocks/stubs in favor of just calling the code you're testing, intentionally using a test-only settings file (e.g., so you can use a dev account for third-party APIs). You can easily write clean-up code in your tests this way and be certain that what you've built works.

[1] https://cheatcode.co/joystick

[2] A mirror of the app/db creates a worry-free test env that can easily be reset without messing up your dev env.
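
A rough sketch of that mirror idea in Python rather than Joystick's actual tooling; the app entry point, flags, and /health route are assumptions for illustration only.

    import subprocess
    import time
    import urllib.request

    import pytest

    TEST_PORT = 4001  # separate from the normal dev server

    @pytest.fixture(scope="session")
    def app_url():
        # Boot a mirror of the app with test-only settings on its own port.
        proc = subprocess.Popen(
            ["python", "app.py", "--port", str(TEST_PORT),
             "--settings", "settings.test.json"]
        )
        time.sleep(2)  # crude readiness wait; polling the port is more robust
        yield f"http://localhost:{TEST_PORT}"
        proc.terminate()
        proc.wait()

    def test_health_endpoint(app_url):
        with urllib.request.urlopen(f"{app_url}/health") as resp:
            assert resp.status == 200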

The normal way of testing, where you pass some parameters to some code and check the result, is what I'm going to call "outside-in" testing.

Then mocking is "inside-out" testing. You check that your code is passing the right params/request to some dependency and reacting correctly to the output/response.

It's really the same thing, and you can flip between them by "inverting".

Sometimes mocking just makes much more sense, and sometimes just passing parameters to a function directly does. The end goal is the same: test some unit of code's behaviour against some specific state/situation.

They have their place but, like all testing, should be layered with other types of tests to "test in depth".
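
A small sketch of the two directions with a hypothetical send_receipt function: outside-in feeds inputs and asserts on the return value, inside-out mocks the dependency and asserts on how it was called.

    from unittest.mock import Mock

    def send_receipt(mailer, email: str, total_cents: int) -> str:
        body = f"Thanks! You paid ${total_cents / 100:.2f}."
        mailer.send(to=email, body=body)
        return body

    def test_outside_in():
        # Only the observable output matters.
        assert send_receipt(Mock(), "a@example.com", 1250) == "Thanks! You paid $12.50."

    def test_inside_out():
        # The interaction with the dependency is the thing under test.
        mailer = Mock()
        send_receipt(mailer, "a@example.com", 1250)
        mailer.send.assert_called_once_with(
            to="a@example.com", body="Thanks! You paid $12.50."
        )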

Mocks aren't an anti-pattern. Anti-patterns are a "common response to a recurring problem, usually ineffective, risking being highly counterproductive". On the contrary, mocks are a common response to a recurring problem, are often effective, and carry no greater risk than a great many alternative testing methodologies. They do solve problems and they are useful. But like literally anything else in the universe: it depends, and they don't solve every problem.

You wanna know how to test without mocking? Use any kind of test. Seriously, just make a test. I don't care what kind of test it is, just have one. When you notice a problem your testing doesn't catch, improve your testing. Rinse, repeat. I don't care what kind of 10x rockstar uber-genius you think you are, you're going to be doing this anyway no matter what super amazing testing strategy you come up with, so just start on it now. Are there some ways of testing that are more effective than others? Yes, but it depends. If testing were simple, easy, straightforward and universal we wouldn't be debating how to do it.

(About 99% of the time I'm disappointed in these clickbait blog posts upvoted on HN. They are shallow and brief (it's a blog post, not a book), yet quite often dismissive of perfectly reasonable alternatives, and in the absence of any other information, misleading. It would be better to just describe the problem and how the author solved it, and leave out the clickbaity sweeping generalizations and proclamations.)

Go ahead. Don't mock that external service you rely on for an API. Now you need multiple keys, one for each developer, or you have to share keys across environments. Does it not offer dev/test/staging/prod keys? Well, now you need to share those keys. Does it only offer prod keys? Now you are stuck sharing those. API request limits? Now you are eating through them just to run tests.

And let's not forget that testing things locally means you are mocking the network, or the lack thereof. "Mocking is an anti-pattern" is a great sentiment if you ignore costs and restrictions in the real world.

  • That is a fairly good reason for trying to use external systems/tools that make testing easy and cheap to do.

    So a good approach would be to have tests that you can run with the mock and then run again against the real system. Anything you catch with the mock saves a run against the costly system, but you still get real testing.

  • Also, the intermittent failures of tests that rely on unstable dependencies.

    • If your dependencies are unstable then that is very important to know! If it means you have to add forms of resilience then that's good for your code perhaps?

I tell my developers that each mock you use costs you $100. Maybe it is worth it, but probably not.

No, I don't really charge them, but it gets the idea across: mocks have costs that you don't always see up front.

  • Maybe you’re writing them incorrectly, then? I’ve written several for core app features, used in 30-ish test cases on a team of 7 engineers, and they’ve worked flawlessly for over two years.

In general I have a fake IO object and a real IO object, and I run the same set of tests against both to make sure their behaviour matches. That way you have verified your fake behaves the same as the real thing.

I then run unit tests against the fake IO object. I don't mock internals, only boundaries. If for whatever reason I want to test against the real DB, I can simply swap the fake out for the real object.

If I already have passing tests for anything function A might do, I can safely assume it will behave the same when called from B, C and D.

  • In some languages A might free a memory allocation e.g. after communicating with some server.

    If B also frees that memory then there is a bug. Presumably this means B's tests are wrong/incomplete. If B was mocking A to avoid the IO, you might not find out.
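
The "same tests against the fake and the real thing" approach described a couple of comments up can be expressed as a parametrized contract test. A sketch with a hypothetical key-value store, where an in-memory SQLite database stands in for the real backend just to keep the example self-contained:

    import sqlite3

    import pytest

    class FakeKV:
        def __init__(self):
            self._data = {}

        def put(self, key, value):
            self._data[key] = value

        def get(self, key):
            return self._data.get(key)

    class SqliteKV:
        def __init__(self):
            self._conn = sqlite3.connect(":memory:")
            self._conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

        def put(self, key, value):
            self._conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

        def get(self, key):
            row = self._conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
            return row[0] if row else None

    # Every test body runs against both implementations, so the fake cannot
    # silently drift away from the real behaviour.
    @pytest.fixture(params=[FakeKV, SqliteKV])
    def kv(request):
        return request.param()

    def test_put_then_get(kv):
        kv.put("a", "1")
        assert kv.get("a") == "1"

    def test_missing_key_is_none(kv):
        assert kv.get("nope") is None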

Most IO nowadays, in my context, is calling some REST API. I prefer to use nock (https://github.com/nock/nock). With that I can create an environment for my test to run in without changing anything about the implementation.

The article does not seem to bring up this approach.
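
nock is Node-only, but the same transport-level interception exists elsewhere. A sketch in Python using the third-party responses library (an analogous tool, not the commenter's setup), where the code under test keeps using its normal HTTP client unmodified:

    import requests
    import responses

    def fetch_user_name(user_id: int) -> str:
        r = requests.get(f"https://api.example.com/users/{user_id}", timeout=5)
        r.raise_for_status()
        return r.json()["name"]

    @responses.activate
    def test_fetch_user_name():
        # Intercepts at the transport layer; no changes to fetch_user_name.
        responses.add(
            responses.GET,
            "https://api.example.com/users/1",
            json={"id": 1, "name": "Ada"},
            status=200,
        )
        assert fetch_user_name(1) == "Ada"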

Mocking is useful for testing small parts of your programs/libraries. For full-scale testing you really need to not emulate, because any emulation will be woefully incomplete, so you're going to have to spin up a virtual network with all the services you need, including DNS.

> When you add UI-driven tests (and you should have some),

I disagree. If you want to send your test suite into the toilet, add a headless browser driver and nondeterministic assertions based on it. Most output that becomes UI can be tested; the rest can be checked by a quick QA.

"What to do instead of Mocking"

should be more like

"Enhance your use of Mocks with better unit tests and integration tests".

The listed complaints sound more like problems with sloppy/lazy coding practices than actual problems with mocks.

> This often happens with an in-memory database, which is compatible to your main database you do use in production, but is faster.

Not sure how this solves the edge-case problems described at the beginning of the article.

I've mocked a lot in my past. For the last two years I've been using fakes explicitly; although that has an overhead, I like it, as there is less maintenance and refactoring of tests.

Frightening that there is someone out there calling themselves a CTO and offering CTO coaching who doesn't understand what unit testing is.

What if we put the people in charge of an interface also in charge of the mock for that interface, which others can then use in their tests?

Some testing frameworks (thinking of my experience with Flutter) insist on mocking.

From memory, the HTTP client 404s every request in testing mode.

I agree with this in principle, but some things can only be mocked, like AWS interactions in a test/CI environment.

I don't get it. If I am taking a dependency on a database or another class and I mock it using its interface, what is the harm in that? Essentially I have tested that, given my dependencies work correctly, my class also works as expected.

  • Almost -- you're testing that, assuming your mock implementation perfectly mirrors what the dependency would do for the inputs you tested with, your functions produce the correct outputs (and hopefully you also verified the side effects).

    The article is stating that almost nobody goes to the trouble of implementing a mock database perfectly; they just do something like make a single call return some hard-coded data. That works up to a point, but it means that if the database ever changes its interface, you have to remember to notice that and update the mock as well.
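
A sketch of that drift (hypothetical names): the mock bakes in yesterday's row shape, so the test keeps passing even if the real query or schema changes underneath it.

    from unittest.mock import Mock

    def newest_user_email(db):
        row = db.query("SELECT id, email FROM users ORDER BY created_at DESC LIMIT 1")
        return row[1]

    def test_newest_user_email_with_hard_coded_mock():
        db = Mock()
        db.query.return_value = (7, "ada@example.com")  # frozen, hand-written row
        assert newest_user_email(db) == "ada@example.com"
        # Nothing here fails if the real `users` table or the query changes.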

Sigh, no no no, no, no it's not.

In fact, mocking is an essential tool for writing _unit_ tests; you know, testing exactly one thing (a 'unit') at a time. In Java, for instance, a 'unit' would be a single static method or a single class. Other languages will have different definitions of these terms, but the essential point is "the smallest reasonable grouping of code that can be executed, preferably deterministically".

The problem is people conflate the various levels of integration tests. You actually should* have both: full unit test coverage plus an integration test to prove all of the pieces work together successfully. Small unit tests with mocks will point you _very quickly_ to exactly where a problem is in a codebase by surfacing the effects of contract changes. Large integration tests prove your product meets requirements, and also that individual components (often written by different teams) work together. They are two different things with two different goals.

* Important caveat on the word 'should': testing de-risks a build. However, if your business product is itself a risk (let's say you're betting a startup on NFTs going wild), then your testing should reflect how much risk you're willing to spend money to reduce. Unit testing in general speeds up development cycles, but takes time to develop. A good software engineering leader recognizes the risks on both the business side and the development side and finds a balance. As a product matures, so should the thoroughness of its testing.

If you take the article's advice to move everything that's not IO into pure, testable code (which is good), what's left is code that does IO. What are you even testing when you call such a procedure? At that point, it's mostly calls into other people's code. Maybe that's a good place to draw the line on testing things?

Argument by exaggeration:

For car crash tests, we should always use actual humans. A test dummy might have a lot of sensors and be constructed to behave like a human in a crash, but you'll never get the full crash details with a doll.

Notice the problem here? This argument does not consider the costs and risks associated with each approach.

For testing, IO is very expensive. It leads to huge CI setups and test suites that take multiple hours to run. There is no way around this except using some kind of test double.

I've been doing this stuff (software) for a very long time, and if it hadn't been invented by others, I'd never have thought of mocking. It's that stupid of an idea. When I first came across it used in anger in a large project, it took me a while to get my head around what was going on. When the penny dropped, I remember a feeling of doom, like I had realized I was in The Matrix. I don't work there any more, and I don't work with mock-people any more.

  • I don't like mocking either, but there are occasionally situations where I've found it useful. Sometimes there is a complex system (whether of your own design or not) that isn't amenable to integration/e2e testing, and the interesting parts can't easily be unit tested due to external or tightly coupled dependencies.

    Of course you can always pick it apart and refactor so it can be unit tested, but sometimes the effort required makes mocking look pretty appealing.

  • With containerization it’s very quick to spin up test dependencies as part of your CI/CD as well. Why mock calls to a datastore when it’s super easy to spin up an ephemeral postgresql instance to test on?

    • > Why mock calls to a datastore when it’s super easy to spin up an ephemeral postgresql instance to test on?

      It's actually super hard to get Postgres to fail, which is what you will be most interested in testing. Granted, you would probably use stubbing for that instead.

    • Because you have 40,000 tests, and an in-memory object means they can run in seconds, while the real thing takes minutes.
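
Picking up the stubbing suggestion a couple of comments up: a stub connection that always fails is an easy way to exercise the "database is down" path that a healthy local Postgres will almost never produce. A sketch with hypothetical names, using sqlite's OperationalError to stand in for the driver's equivalent:

    import sqlite3

    class FailingConnection:
        """Stub that simulates a database outage."""
        def execute(self, sql, params=()):
            raise sqlite3.OperationalError("server closed the connection unexpectedly")

    def record_signup(conn, email: str) -> bool:
        try:
            conn.execute("INSERT INTO signups (email) VALUES (?)", (email,))
            return True
        except sqlite3.OperationalError:
            return False  # caller shows a retry message instead of a stack trace

    def test_record_signup_reports_db_outage():
        assert record_signup(FailingConnection(), "a@example.com") is False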