Comment by bunderbunder

10 hours ago

I share this ideal, but also have to gripe that "descriptive test name" is where this falls apart, every single time.

Getting all your teammates to quit giving all their tests names like "testTheThing" is darn near impossible. It's socially painful to be the one constantly nagging people about names, but it really does take constant nagging to keep the quality high. As soon as the nagging stops, someone invariably starts cutting corners on the test names, and after that everyone who isn't a pedantic weenie about these things will start to follow suit.

Which is honestly the sensible, well-adjusted decision. I'm the pedantic weenie on my team, and even I have to agree that I'd rather my team have a frustrating test suite than frustrating social dynamics.

Personally - and this absolutely echoes the article's last point - I've been increasingly moving toward Donald Knuth's literate style of programming. It helps me organize my thoughts even better than TDD does, and it has earned me far more compliments about the readability of my code than a squeaky-clean test suite ever has. So much so that I'm beginning to hope that if you can build enough team mass around working that way, it might even develop into a stable equilibrium as people start to see how it really does make the job more enjoyable.

Hah, I swing the other way! If module foo has a function bar, then my test lives in module test_foo and is called test_bar.

Nine times out of ten this is the only test, which is mostly there to ensure the code gets exercised in a sensible way and returns a thing, and ideally to document and enforce the contract of the function.

What I absolutely agree with you on is that being able to describe this contract alongside the function itself is far preferable. It’s not quite literate programming, but tools like Python’s doctest offer a close approximation to interleaving discourse with machine-readable implementation:

  def double(n: int) -> int:
    """Increase by 100%

    >>> double(7)
    14
    """
    return 2 * n
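
And these examples double as real tests: the standard library will execute every >>> line and compare the output. A minimal sketch of wiring that up, assuming double lives in the module being checked:

```python
import doctest

def double(n: int) -> int:
    """Increase by 100%

    >>> double(7)
    14
    """
    return 2 * n

if __name__ == "__main__":
    # testmod() runs every >>> example found in this module's docstrings;
    # a clean run here returns TestResults(failed=0, attempted=1)
    print(doctest.testmod())
```

Running `python -m doctest yourmodule.py` does the same thing without the boilerplate.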

Thanks for the hint about Knuth's literate programming! I hadn't heard about it before but it immediately looks great. (For those of us who hadn't heard about it before either, here is a link: https://en.wikipedia.org/wiki/Literate_programming)

About your other point: I have experienced exactly the same. It just seems impossible to instill in most developers the belief that readable tests lead to faster bug fixing. And by the way, it makes tests more maintainable as well, just like readable code makes code more maintainable anywhere else.

> It's socially painful to be the one constantly nagging people about names, but it really does take constant nagging to keep the quality high.

What do test names have to do with quality? If you want to use the name as some sort of key, just have a comment/annotation/parameter that succinctly defines it, along with any other metadata you want to add, in readable English. Many testing frameworks support this. There's exactly zero benefit toTryToFitTheTestDescriptionIntoItsName.

  • Some languages / test tools don’t enforce testNamesLikeThisThatLookStupidForTestDescriptions and let you use proper strings, so you can state meaningful requirements as readable text, like “extracts task ID from legacy staging URLs”.

    It looks, feels, and reads much better.
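
    For what it's worth, Python's unittest supports exactly this out of the box: the runner picks up the first line of a test's docstring as its description, so the readable requirement lives next to a plain method name. A sketch (extract_task_id is a made-up helper for illustration):

```python
import re
import unittest

def extract_task_id(url: str) -> str:
    """Hypothetical helper: pull the trailing numeric ID off a URL."""
    match = re.search(r"(\d+)/?$", url)
    if match is None:
        raise ValueError(f"no task ID in {url!r}")
    return match.group(1)

class LegacyUrlTest(unittest.TestCase):
    def test_extract(self):
        """Extracts task ID from legacy staging URLs."""
        self.assertEqual(extract_task_id("https://staging.example.com/task-42"), "42")
```

    Run with `python -m unittest -v` and the docstring line shows up alongside the method name in the output.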

    • With Jest (amongst others), you can nest the statements. I find it really useful for describing what the tests are doing:

          describe('The foo service', () => {

            describe('When called with an array of strings', () => {

              describe('And the bar API is down', () => {

                it('pushes the values to a DLQ', () => {
                  // test here
                })

                it('logs the error somewhere', () => {
                  // test here
                })

                it('returns a proper error message', () => {
                  // test here
                })
              })
            })
          })

      You could throw all those assertions into one test, but they’re probably cheap enough that performance won’t really take a hit. Even if there is a slight impact, I find the reduced cognitive load of not having to decipher the purpose of 'callbackSpyMock' to be a worthwhile trade-off.
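
      For the Python folks in the thread, roughly the same shape works with nested test classes; in my experience pytest will collect Test-prefixed classes nested inside one another, though that's worth verifying against your runner. A sketch, with the service and DLQ helper names invented for illustration:

```python
def push_to_dlq(values):
    """Hypothetical fallback: pretend to queue the values, return the count."""
    return len(values)

class TestFooService:
    class TestCalledWithAnArrayOfStrings:
        class TestAndTheBarApiIsDown:
            def test_pushes_the_values_to_a_dlq(self):
                # stand-in assertion; a real test would stub the bar API
                assert push_to_dlq(["a", "b"]) == 2

            def test_logs_the_error_somewhere(self):
                # placeholder mirroring the second Jest case above
                assert True
```

      The class names read like the describe strings, and a failing test reports the whole nested path.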

  • It's funny: you ask what test names have to do with quality, and then you mention a really bad test name, 'toTryToFitTheTestDescriptionIntoItsName', and (correctly) state that it has zero benefit.

    Just like normal code, test methods should indicate what they are doing. This will help your colleague when they're trying to fix a failing test while you're not around. There are other ways of doing that, of course, which can be fine as well, such as describing the test case with some kind of metadata that the test framework supports.

    But the problem OP is talking about is that many developers simply don't see the point of putting much effort into making tests readable. They won't give tests a readable name, and they won't give them a readable description in metadata either.

  • That's not the point of the article. The code should be readable, no exceptions. The only place x, y, z belong is coordinates; a bare i should give way to index_what; same goes for parameters: they should also say what unit they're in (not scale, but scale_float). The only exception I see is typed languages, and even then I'm occasionally asked about some detail of an obscure parameter we set up a year ago. I understand it can sound goofy, but the extra effort is made for other people working on the project, or for your future self. There is no way I can remember keys, or where I left the meaning of those, and there is no excuse not to just write it down.

    Readability accounts for a lot of code's quality. Working code that is not maintainable will be refactored; non-working code that is maintainable will be fixed.

  • Kotlin has an interesting approach to solving this. You can name functions using backticks, and in those backticks you can put basically anything.

    So it's common to see unit tests like

      @Test
      fun `this tests something very complicated`() {
        ...
      }

    • You can get a similar effect in Java with JUnit 5's @DisplayName annotation, though it's not the same syntax; plain Java method names can't contain spaces or backticks.

  • What kinds of things would you say are best as annotation vs in the test method name? Would you mind giving a few examples?

    Also, are you a fan of nesting test classes? Any opinions? Eg:

      class FibrillatorTest {

        class HighVoltages {

          void tooMuchWillNoOp() {}
          void maxVoltage() {}
        }
      }

  • It matters to this article because it's claiming that the name is coupled functionally to what the code tests -- that the test will fail if the name is wrong.

    I don't know of any test tools that work like that, though.

    • That's not what the article claims at all.

      It claims that, in order for tests to serve as documentation, they must follow a set of best practices, one of which is descriptive test names. It says nothing about failing tests when the name of the test doesn't match the actual test case.

      Note I'm not saying whether I consider this to be good advice; I'm merely clarifying what the article states.

  • > What do test names have to do with quality?

    The quality of the tests.

    If we go by the article, specifically their readability and quality as documentation.

    It says nothing about the quality of the resulting software (though, presumably, this will also be indirectly affected).

Obviously this is slightly implementation dependent but if your tests are accompanied by programmatic documentation (that is output together with the test), doesn't that eliminate the need for a descriptive test name in the first place?

If anything, in this scenario, I wouldn't even bother printing the test names, and would just give them generated identifier names instead. Otherwise, isn't it a bit like expecting git hashes to be meaningful when there's a commit message right there?

I’d rather leave a good comment than good test names. I mean, do both, but a good comment is better imo. Comments are all I really care about anymore. Just leave context, clues, and a general idea of what it’s trying to accomplish.

  • Four test failures in different systems, each named well, will more quickly and accurately point me to my introduced bug than comments in those systems.

    Identifiers matter.

Have you considered a linter rule for test names? Both Checkstyle and ESLint have worked great for our team.
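
A rule doesn't even have to come from the framework; a tiny custom check in CI gets most of the benefit. A toy sketch in Python (the "lazy name" pattern is an assumption; tune it to whatever your team keeps writing):

```python
import re

# flags placeholder-ish names like testTheThing, test_stuff, test1
LAZY_NAME = re.compile(r"^test_?(the_?thing|stuff|it|misc)?\d*$", re.IGNORECASE)

def lazy_test_names(names):
    """Return the subset of test names that look like lazy placeholders."""
    return [name for name in names if LAZY_NAME.fullmatch(name)]
```

Wire it into a pre-commit hook or a CI step and the nagging becomes automated.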

It’s practically a sociology experiment at this point: half of the time when I suggest people force themselves to use a thesaurus whether they think they need it or not, I get upvoted. And half the time I get downvoted until I get hidden.

People grab the first word they think of. And subconsciously they know if they obsess about the name it’ll have an opportunity cost - dropping one or more of the implementation details they’re juggling in their short term memory.

But if “slow” is the first word you think of, that’s not very good. And if you look at the synonyms and antonyms, you can solidify your understanding of the purpose of the function in your head. Maybe you meant thorough, or conservative. And maybe you meant to do one but actually did another. So now you can not just choose a name but revisit the intent.

Plus you’re not polluting the namespace by recycling a jargon word that means something else in another part of the code, complicating refactoring and self discovery later on.

> ...increasingly moving toward Donald Knuth's literate style of programming.

I've been wishing for a long time that the industry would move toward this, but it is tough to get developers to write more than performative documentation that checks an agile sprint box, much less to get product owners to allocate time to test the documentation (throw someone unfamiliar with the code at a small task, armed only with the documentation, like coding another few necessary tests and documenting them, and then fix the bumps they hit while consuming it). Even tougher to move toward the kind of Knuth'ian, TeX'ish-quality and -sophistication documentation, which I consider necessary (though perhaps not sufficient) for taming increasing software complexity.

I hoped the kind of deep technical writing at large scale supported by Adobe FrameMaker would make its way into open-source alternatives like Scribus, but instead we're stuck with Markdown and Mermaid, which have their place but are painful when maintaining content over a long time, sprawling audience roles, and broad scopes. Unfortunate, since LLMs could support a quite rich technical writing and editing workflow sitting on top of a FrameMaker-feature'ish document processing system oriented toward supporting literate programming.