Snapshot Testing in the Backend

The first time I learned about snapshot testing was in the context of front-end testing. It made a lot of sense. Writing an assertion-based test to check if a component was correctly rendered as HTML is tricky. And difficulty grows with the complexity of the output. Snapshot testing seemed a reasonable technique to get the job done, even with some caveats.

So, What Is Snapshot Testing, Anyway?

Snapshot testing is a methodology in which we collect the subject under test’s output and use it as the criterion for subsequent test runs. We call this collected output a snapshot because it captures the behavior of the subject under test (SUT) at a given moment in time. This makes it a regression test: our goal is to be sure that future changes to the SUT don’t affect its behavior. Of course, we should check that the snapshot is correct.

We use the subject under test to generate the criterion for future tests. This sounds a bit like cheating, doesn’t it? But there are a lot of situations where assertion-based testing is complicated, confusing, and frustrating, especially when you need to make changes.

Snapshot testing combines a bit of manual testing with automation. The manual part is the review of the generated snapshot to ensure that it’s correct. Once we are happy with it, the snapshot becomes the criterion for subsequent runs of the same test.

Let’s consider an example: the generation of files and complex objects, like HTML, JSON, XML, CSV, code, long plain texts, or even console output. These outputs can be very difficult to test with assertions. In the best case, you will have an example document describing the expected output; a snapshot replaces this with the actual result of the SUT.

Our History with Snapshot Testing

I’m not a tester but a developer who uses testing as a tool in the process of designing, maintaining, and delivering software. I’m a member of a team responsible for a backend service that acts as a communications hub between other services in the company, consuming and producing events from and to our message brokers.

We have to deal with tons of huge objects. We expose a REST API that receives complex requests and produces big and intricate objects as responses. We also listen to many events from other services containing plenty of data, and we publish lots of events enriched with even more data. We needed a solution to ensure that we were creating all these objects and messages properly.

First, we tried assertion-based testing. Unfortunately, this kind of test introduces a bunch of problems:

  • Writing the expectations is tedious and daunting because we must create all those big and complex objects by hand.
  • Maintenance is a nightmare. If we need to change something, it can be tough and error-prone.
  • Non-deterministic output was another big problem. Some fields, like unique identifiers and timestamps, are populated with unpredictable values, requiring special treatment and design changes.

In general, assertion-based testing with this sort of complex object is tedious and fragile. It’s possible to do it, of course. But, the cost of introduction and maintenance is too high to be practical.

Trying the Snapshot Approach in our Projects

I was familiar with some snapshot testing techniques to deal with legacy code, such as approval testing and golden master. At some point, an obvious question arose: why not try these techniques when testing our events?

The programming language of our main project is Go, but we also have PHP projects. We researched and found the [Approvals] library, which provides ports for several languages, including Go and PHP.

Approval testing is a technique introduced by Llewellyn Falco. It is based on snapshot testing; the main difference is that approval testing always requires validating the snapshot before using it. Apart from that, it’s the same process.

So, we introduced this library to implement those tests, combining it with validation of the JSON messages against OpenAPI definitions. This setup is working well: we ensure both the contract of each event and that we are building it with the correct data. Thanks to snapshot testing, we were able to write more tests in less time and with fewer problems.

Sadly, we found some practical inconveniences using Approvals in Go, so I started to develop a new library for the task, called Golden.

I will use Golden to show some examples of how to test with snapshots and the usual problems you will have to address. Of course, you can use any similar library in the ecosystem of your language of choice; the basic principles are the same.

Pros

  • It is easy to put snapshot testing in place when you need it. You don’t have to prepare criteria in advance because you will be using the current output of the SUT.
  • It’s handy when introducing testing in projects lacking it. You don’t need to know a lot about the SUT. Furthermore, the tests will help you understand the code behavior.
  • It solves the problem of testing big and complex objects, files, or documents. If you work on generating this kind of outcome, snapshot testing is the way.

Cons

  • It should not be used as a replacement for assertion-based testing when the outputs are simple or small.
  • In general, it doesn’t fit well in test-driven development methodologies.
  • It requires some discipline to keep things clean. It’s easy to leave unused snapshots behind when you modify or remove tests.
  • Related to this, when you want to change the SUT’s behavior, you have to delete the snapshot so it can be regenerated.

Caveats

Snapshot testing shouldn’t be the choice where assertion-based testing fits better. Simple and well-defined outputs are best tested using assertions. Property-based testing also works better with assertions because they help describe the desired properties accurately.

Limit snapshot testing to those use cases in which you need to generate complex outputs that probably need human supervision, like JSON objects, HTML docs, XML files, generated code, serialization, etc.

Using Snapshot Testing

We will work with the [Theatrical Players Refactoring Kata] by Emily Bache to illustrate the use of snapshot testing. The exercise already provides tests, but we will rewrite them using the Golden library. This is a good example use case for snapshot testing because the generated output is a text that represents a Statement for an Invoice, with several details.

Basic usage

The basic use case is when you need to test an existing piece of code that generates a complex output. Typical use cases are serializing objects, API responses, etc.

The subject under test is StatementPrinter, an object that generates a text Statement for an Invoice. In this first example, we are going to print an empty Invoice, to show the basic usage and what to expect.

You only have to invoke the Verify method, like this:

func TestEmptyStatementPrinter(t *testing.T) {
	invoice := theatre.Invoice{}

	plays := make(map[string]theatre.Play)

	printer := theatre.StatementPrinter{}

	statement, err := printer.Print(invoice, plays)
	if err != nil {
		t.Fatalf("error: %s", err.Error())
	}

	golden.Verify(t, statement)
}

This will generate a snapshot file containing the output and making the test pass. You don’t need to prepare anything in advance. The snapshot file will be added to the filesystem in the same package as the test, inside a dedicated folder: testdata/TestEmptyStatementPrinter.snap.

Statement for 
Amount owed is $0.00
You earned 0 credits

Let’s try a more interesting test. We are going to add some data and print a realistic invoice. This example covers all possible execution flows in the code. As you can see, most of the code prepares the data for the scenario; the interesting line is the last one, and it is very simple.

func TestPrintStatementForInvoice(t *testing.T) {
	plays := map[string]theatre.Play{
		"hamlet": {
			Name: "Hamlet",
			Type: "tragedy",
		},
		"as-you-like": {
			Name: "As You Like",
			Type: "comedy",
		},
	}

	invoice := theatre.Invoice{
		Customer: "Smith Ltd.",
		Performances: []theatre.Performance{
			{PlayID: "hamlet", Audience: 25},
			{PlayID: "hamlet", Audience: 30},
			{PlayID: "hamlet", Audience: 35},
			{PlayID: "as-you-like", Audience: 15},
			{PlayID: "as-you-like", Audience: 25},
			{PlayID: "as-you-like", Audience: 25},
		},
	}

	printer := theatre.StatementPrinter{}

	statement, err := printer.Print(invoice, plays)
	if err != nil {
		t.Fatalf("error: %s", err.Error())
	}

	golden.Verify(t, statement)
}

So, we have the output in the statement variable. And this is the generated snapshot, testdata/TestPrintStatementForInvoice.snap:

Statement for Smith Ltd.
  Hamlet: $400.00 (25 seats)
  Hamlet: $400.00 (30 seats)
  Hamlet: $450.00 (35 seats)
  As You Like: $345.00 (15 seats)
  As You Like: $500.00 (25 seats)
  As You Like: $500.00 (25 seats)
Amount owed is $2,595.00
You earned 18 credits

You can manually edit the snapshot if you find something wrong. The next time you run the test, it will fail, and you can modify the code to produce the correct output.

In essence, this is all that you need to start snapshot testing. But, there is more.

Approval mode

One interesting thing about the approvals library is the fact that it requires human review of the snapshot. I decided to implement that behavior in Golden but in a different way.

The use case is similar: you want to check the complex output generated by the subject under test. But this time, you want to review the content to be sure that it is the expected output.

Let’s take another classic exercise: the [Gilded Rose Refactoring Kata], also in the version by Emily Bache. As in the Theatrical Players kata, we are going to ignore the provided tests and replace them with our own. But this time, we are not sure about the test data, so we will use the approval mode to discover it. This is the first iteration:

func TestGildedRose(t *testing.T) {
	items := []*Item{
		{"foo", 0, 0},
	}

	UpdateQuality(items)

	if items[0].name != "foo" {
		t.Errorf("Name: expected %s but got %s", "foo", items[0].name)
	}

	golden.Verify(t, items, golden.WaitApproval())
}

This will generate the corresponding snapshot file, but this time the test will not pass by default. You need to explicitly approve the snapshot by removing the golden.WaitApproval() parameter. Other libraries require changing the snapshot name to mark it as approved.
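The logic behind an approval mode can be sketched like this. This is an illustrative function, not Golden’s actual code: while approval is pending, even a matching snapshot is reported as a failure.

```go
package main

import (
	"errors"
	"fmt"
)

// verify sketches how an approval mode can work on top of plain
// snapshot comparison. (Illustrative only; not Golden's actual code.)
func verify(stored, output string, waitingApproval bool) error {
	if stored != output {
		return errors.New("snapshot mismatch")
	}
	if waitingApproval {
		// The output matches, but a human has not approved the
		// snapshot yet, so the test still fails.
		return errors.New("snapshot is waiting for approval")
	}
	return nil
}

func main() {
	fmt.Println(verify("[]", "[]", true))  // matches, but still waiting for approval
	fmt.Println(verify("[]", "[]", false)) // approved: the test passes
}
```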

The snapshot is not very interesting, by the way:

[
  {}
]

What happened? We should have at least one item, shouldn’t we? Well, Golden serializes the subject as JSON under the hood, but the Item struct is not ready for that: it should implement the json.Marshaler interface.

func (i Item) MarshalJSON() ([]byte, error) {
	tmp := struct {
		Name    string `json:"name"`
		SellIn  int    `json:"sellIn"`
		Quality int    `json:"quality"`
	}{
		Name:    i.name,
		SellIn:  i.sellIn,
		Quality: i.quality,
	}

	return json.Marshal(tmp)
}

After the change, we run the test again. Let’s check the snapshot again:

[
  {
    "name": "foo",
    "sellIn": -1,
    "quality": 0
  }
]

This certainly looks better. If this were enough, at this point we could remove the golden.WaitApproval() option and we would be done.

But let’s say we run the test with coverage this time. We get something like this:

[Coverage report: the test exercises only a small part of UpdateQuality]

The test exercises only a small part of the code (and the image only shows half of it). We need more Items in the list to cover all possible execution flows. No problem: we can analyze the values in the conditionals to make guesses about interesting values. For example, we need some item with quality > 0 to enter the first nested condition.

Let’s change the test:

func TestGildedRose(t *testing.T) {
	items := []*Item{
		{"foo", 0, 0},
		{"foo", 0, 10},
	}

	UpdateQuality(items)

	if items[0].name != "foo" {
		t.Errorf("Name: expected %s but got %s", "foo", items[0].name)
	}

	golden.Verify(t, items, golden.WaitApproval())
}

Now, we have this snapshot, and the second item is shown:

[
  {
    "name": "foo",
    "sellIn": -1,
    "quality": 0
  },
  {
    "name": "foo",
    "sellIn": -1,
    "quality": 8
  }
]

And, let’s see how coverage changed:

[Coverage report: the additional item covers more execution paths in UpdateQuality]

That’s wonderful: we have been able to increase the coverage with the help of the approval mode. Of course, we are not done yet. We should continue adding examples to the test so we can cover all the possible execution flows. At some point, we will reach 100% coverage with no changes in the snapshot. That is the moment to approve it, by removing the golden.WaitApproval() option.

func TestGildedRose(t *testing.T) {
	items := []*Item{
		{"foo", 0, 0},
		{"foo", 0, 10},
		{"Aged Brie", 0, 10},
		{"Aged Brie", -1, 10},
		{"Backstage passes to a TAFKAL80ETC concert", 10, 40},
		{"Backstage passes to a TAFKAL80ETC concert", 5, 40},
		{"Backstage passes to a TAFKAL80ETC concert", -1, 40},
	}

	UpdateQuality(items)

	if items[0].name != "foo" {
		t.Errorf("Name: expected %s but got %s", "foo", items[0].name)
	}

	golden.Verify(t, items)
}

We will probably need some more items and different conditions, but you get the point.

Approval vs Verify

What is the difference? Why would you work this way? We found some situations in which the approval mode is useful:

  • When you need a domain expert to review the snapshot. For example, if the code under test produces a CSV file, a user should verify that all the fields are present and the data format is correct. You can ask the user to check the file before declaring it approved.
  • When you are building the code iteratively, adding pieces in steps to produce different parts of the desired output. While the test runs in approval mode, it doesn’t pass, even if there are no differences from the previous snapshot.
  • When you are trying to figure out the best values for creating test scenarios in code that you are not familiar with.

Managing the Non-Deterministic Output

I mentioned before the problem with non-deterministic output.

In snapshot testing, you will need to replace the non-deterministic parts of the output generated by the SUT, mostly date-times or identifiers. This process is usually called scrubbing.

In simple words, scrubbing consists of applying replacement functions that match the non-deterministic targets, usually with regular expressions, and replace them with placeholders or arbitrary data that resembles the problematic values. For certain formats, you could use other approaches; for example, in JSON you could replace the content of specific fields.

Golden has you covered with Scrubbers. Let’s see an example. In this test, we create a subject that includes the current time, which will change every time we run the test. The scrubber will look for the pattern and replace it with a placeholder in the snapshot.

func TestNonDeterministic(t *testing.T) {
	t.Run("should scrub date", func(t *testing.T) {
		scrubber := golden.NewScrubber(`\d{2}:\d{2}:\d{2}\.\d{3}`, "<Current Time>")

		// Here we have a non-deterministic subject
		subject := fmt.Sprintf("Current time is: %s", time.Now().Format("15:04:05.000"))

		golden.Verify(t, subject, golden.WithScrubbers(scrubber))
	})
}

This is the result:

Current time is: <Current Time>

You can create and use all the scrubbers you need. You could even use them to obfuscate sensitive data such as credit card numbers or emails.
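A scrubber is essentially a compiled pattern plus a replacement. A sketch of the sensitive-data case with the standard regexp package might look like this; the card pattern is a rough assumption for illustration, not a validated card matcher:

```go
package main

import (
	"fmt"
	"regexp"
)

// A rough pattern for 13-16 digit card numbers, optionally separated
// by spaces or hyphens. Assumed for illustration; real card detection
// is more involved.
var cardPattern = regexp.MustCompile(`\b\d(?:[ -]?\d){12,15}\b`)

// scrubCards masks anything that looks like a card number before the
// output is stored in a snapshot.
func scrubCards(s string) string {
	return cardPattern.ReplaceAllString(s, "<Card Number>")
}

func main() {
	fmt.Println(scrubCards("Paid with card 4111 1111 1111 1111, thanks."))
}
```

The same idea works for emails, tokens, or any other data you don’t want to persist in your repository.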

Wrapping up

In this article, I’ve shown how to apply snapshot testing techniques in the backend. Also, I demonstrated it using the Golden snapshot testing library for Go.

Snapshot testing is handy when you need to put under test code that generates complex outputs, like documents, serializations, JSON files, etc. It can be useful when working with legacy code, provided that you can capture the outcomes of the SUT.

References

[Approval tests] https://approvaltests.com/

[Gilded Rose Refactoring Kata] https://github.com/emilybache/GildedRose-Refactoring-Kata

[Golden] https://github.com/franiglesias/golden

[Golden for PHP] https://github.com/franiglesias/php-golden

[Theatrical Players Refactoring Kata] https://github.com/emilybache/Theatrical-Players-Refactoring-Kata

Written by

Fran Iglesias is a developer with a strong interest in agile development practices, which includes using testing in every step of the process, from test-driven development to bug fixing, and everything in between. Fran has worked in several teams where a big part of his contribution was increasing the number and quality of tests, or setting up the projects to achieve this. He also writes about the subject in his technical blog (The Talking Bit) and has published several books. Currently, he is a backend developer at The Hotels Network, where he extensively uses snapshot testing techniques.
