Triangle Testing: Trying to find the significance

In our last few columns we’ve talked about how to think about developing recipes and how to tweak the recipes you’ve developed to make them exactly the beer you want to drink. But what if you just can’t decide if those tweaks are working for you? Is one version of a recipe more pleasing to your tastes than another? Or does that tweak you made really make a difference at all? That’s what we’re diving into this time around.

Sometimes when you make changes to a recipe the results are so obvious that you simply need to decide if you like it or not. That’s called hedonic testing. Do you like it or don’t you? To a great many people, homebrewers are already hedonists, so yes, you know how to do that.

And for the record — no matter what else happens with all the other forms of testing that you do — it’s always going to finally come down to a sense of enjoyment. Do I enjoy the beer more? Do I enjoy the process more? You’re not trying to make a more stable, more universally loved beer or shave pennies and seconds off your brew day for improved profit margins. You’re trying to make beer you enjoy.

Remember, we discussed multiple ways to tweak a recipe in the November 2019 issue’s column, aptly named “Tweaking Out.” Some tweaks announce themselves. Turn an APA into an American brown by adding some sort of dark malt, for example, and it’s going to be pretty apparent if it’s darker or not! Other times, though, it’s not so apparent. Subbing one hop or malt variety for another, changing mash time or temperature, or making any kind of subtle change requires comparison between the before and after beers. And since confirmation bias is so hard to avoid, the most reliable way to know if your tweak made a perceptible difference is to do a difference test. While there are several types of discrimination tests, for our purposes the triangle test has proven to be the perfect fit. A basic triangle test is simple to perform, provides solid results, and is fun (well, at least for the pourer).

What are these triangle tests?

A chemist at Guinness Brewery named William Sealy Gosset, who went by the pen name “Student,” developed what is now known as Student’s t-test to statistically test a hypothesis. His data analysis makes it possible to use a very small data pool, like a small group of tasters, to tease out potentially statistically significant findings. Gosset developed the t-test to monitor the quality of the stout Guinness makes. Initially Guinness and Gosset were focused on changing barley varieties. Could tasters tell a difference in their stout if they used a new barley variety (maybe it was cheaper, maybe it was hardier, maybe it had more or less protein or starch)? All of these are very important concerns when you’re running a huge multinational corporation producing a much-loved product. Since then, several variations of the blind tasting test have been developed, but for our use we’ll keep it simple and use the triangle sensory test.

Have someone pour you two samples of one beer and one of the other. You should have 3 glasses each alike in shape and clarity (opaque is best for most purposes). Have the pourer mark them in such a way that they can tell which is which. Avoid “ABC” or “123” — easy sequences like that with a natural order have been shown to influence results. See if you can pick out the one that’s different. If you can, the tweak you made had a large enough impact on your beer for you to tell the difference. It’s up to you to decide, though, if the difference it made is what you wanted the beer to be! Be sure to do the test more than once so you can be sure your choice wasn’t a fluke. Sorry, but that means you’ll have to drink more beer. Yeah, a shame . . .

Heck, you can even be stranger and get your friends involved. We highly recommend doing triangle tests as a club activity. Get people working those taste buds! It’s an interesting educational experience and one that can reinforce how tricky human perceptions are. Not to mention that the more tasters and test iterations you have, the more meaningful your conclusions can be. Truthfully, this is also the best way for the test to be done. Why? Assuming you don’t tell them, these other tasters won’t know what you’ve tweaked. The human brain is such a powerful pattern detection system that, armed with a little bit of knowledge (“I changed my dry hop schedule from 10 days to 7 days”), it will start trying to tease out any little sensation and highlight it as the crucial piece of evidence. In other words, you can trust the choice of the unknowing more than you can your own.

The game remains the same. Create two versions of your beer: tweaked and untweaked. Make sure you keep the beers as similar as possible, including little things like carbonation levels! Pick up some cheap opaque plastic cups. Mark them with symbols like circle, square, triangle. Randomly choose one of the symbols to be the odd beer out. You can use a die to help here. Roll the die: a 1 or 2 means circle, 3 or 4 means square, 5 or 6 means triangle. Then choose whether the tweaked beer or untweaked beer is the odd beer. This may be dictated by circumstances like available beer volume. If not, just roll the die again to pick the beer served only once (odd number means tweaked, even means untweaked). Pour a flight of these for your club and present three cups to each member.

Give them no instructions beyond: “Please taste these samples. Please tell me if you can detect which one is different and if so, indicate which one. No talking amongst your fellow tasters.”

Stand back and watch the mayhem. It’s fun!

Why? Because people will struggle with the tasting and non-serious struggle is kind of funny. They’ll be perplexed and outthink themselves. They’ll be so rock solid certain of their choice that even when told the “correct” result, they’ll insist they noticed something.

Drew’s done this test multiple times with a simple setup that every homebrewer claims should be no sweat: can you reliably tell the difference between Budweiser and Bud Light? Surprisingly, despite their confidence, blind tasters have a hard time telling the difference. For a more educational, as opposed to cruel, approach, homebrew clubs and judges can purchase off-flavor testing kits and run triangle tests with each off flavor. Help your tasters learn what they can’t taste!

When doing a triangle test, please remember to be encouraging. The point of the test is that you’re likely to fail if the difference is slight. The whole reason the test works is that by offering three cups instead of two, we drop the chances of a taster being randomly right from 50% to 33%. That drop of 17 percentage points is where the test derives its significance.
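If you’d like to see that one-in-three figure for yourself, here is a minimal Python sketch that simulates tasters who cannot tell the beers apart and simply guess. The function name, the seed, and the number of trials are our own illustrative choices, not part of any standard procedure.

```python
import random

CUPS = ["circle", "square", "triangle"]

def simulate_guessing(n_trials, seed=0):
    """Return the fraction of trials in which a pure guesser finds the odd cup."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        odd_cup = rng.choice(CUPS)  # the pourer's random choice of odd cup
        guess = rng.choice(CUPS)    # the taster can't tell, so picks blindly
        if guess == odd_cup:
            hits += 1
    return hits / n_trials

rate = simulate_guessing(100_000)
print(f"Guessers picked the odd cup {rate:.1%} of the time")
```

Over enough trials the rate should settle close to one in three, which is the baseline any real panel has to beat.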

But how do we tell if our findings from our triangle test have any significance? Well, for that we refer you back to our very first column in May-June 2019 titled “A Bitter Challenge,” where we introduced the p-value (you can find the chart on page 96). It’s a magical and very misunderstood tool, and it all comes down to that silly name, “p-value.” Everyone gets it confused and thinks it means the probability that we’re right. What it really tells you is how likely it is that you would see results at least this strong if your tasters were doing nothing but guessing. In a truly scientific world, a low p-value is a way of saying that you’ve found interesting results and that more testing is needed. There is an outstanding PBS Nova episode called “Prediction by the Numbers” that illustrates how statistics work. It’s available on several streaming services and we definitely recommend the program to those who would like to dive deeper.

A significance chart — in this chart, and for many experiments, we use the common p-value (probability value) threshold for significance at 0.05. Anything above that and the test is considered inconclusive (red zone). At that level or below, the findings would be considered significant (green zone).
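For those who want to run the numbers without a chart, the sketch below computes a p-value for a triangle test in Python. It assumes a one-sided binomial test with a one-in-three chance of guessing correctly, which is the usual analysis for triangle tests; we are not claiming it is exactly how the printed chart was built. The function names are ours.

```python
from math import comb

CHANCE = 1 / 3  # probability of picking the odd cup by pure guessing
ALPHA = 0.05    # the significance threshold used in this column

def triangle_p_value(n_tasters, n_correct):
    """Probability that n_correct or more tasters are right by guessing alone."""
    return sum(
        comb(n_tasters, k) * CHANCE**k * (1 - CHANCE) ** (n_tasters - k)
        for k in range(n_correct, n_tasters + 1)
    )

def min_correct_for_significance(n_tasters, alpha=ALPHA):
    """Smallest number of correct picks with p <= alpha, or None if impossible."""
    for n_correct in range(n_tasters + 1):
        if triangle_p_value(n_tasters, n_correct) <= alpha:
            return n_correct
    return None

panel = 12
needed = min_correct_for_significance(panel)
print(f"{panel} tasters: at least {needed} correct, "
      f"p = {triangle_p_value(panel, needed):.3f}")
```

Under these assumptions, a 12-taster panel needs at least 8 correct picks, which works out to a p-value of about 0.019. Seven correct would give about 0.066, just on the inconclusive side of the 0.05 line.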

Unfortunately, in the homebrew world, too many homebrewers take a low (significant) p-value to mean a conclusion has been reached. Nothing could be further from the truth! A significant p-value simply means that the idea is worthy of further testing, not that a conclusion has been reached. A low p-value just shows that your results would be unlikely if the null hypothesis (that tasters can’t detect a difference between the beers) were true. In the world of academics, the weight of the p-value is being hotly debated. Expect a lot of ripples to flow through the scientific community.

Science is dependent on repetition. A single trial doesn’t prove or disprove anything. An experiment needs to be published along with the methodology and results, so it can be repeated by other researchers. That’s where you come in. When you need to make a tweak to your beer, it’s useful to look at the experiments that have been done by other homebrewers. Then try what they found and see how it works for you. It might, or it might not . . . that’s how science works! Be sure to realize that if you don’t duplicate their methods exactly, you might have different results. But no matter what happens, you’ll learn something!

To bring it full circle — beyond all the complicated stuff, distant to the groups of tasters, remote from the math — at the heart of all this you’re trying to figure out, “did this matter? Did it make the beer better for me?” We’re rarely looking for grand and gaudy truths. We are but simple brewers trying to make our beer better for us. But use your friends, your partner, your kids — get them to help you get out of your own head and figure out if the things you’re doing really do work. You’ve been given a powerful tool — let’s make some beer with it!

The Basic Guide to Triangle Testing

Drawing three geometric shapes helps distinguish to the pourer what beer is in the cup and makes for a non-suggestive reference for triangle tests.

What You Need:

  • 2 beers, treated the same except the variable tweaked
  • 3 opaque sample cups, either different colors or marked with a symbol
  • 1 die

Step-by-step for making a proper objective analysis:

  1. Choose which cup will be the “different” cup by rolling a die. (A result of 1 or 2 means the first cup, 3 or 4 – the second, 5 or 6 – the third.) Repeat the process to choose which beer will go in that cup. (Even number on the die means the “normal” beer. Odd number means the tweaked beer.)
  2. Carefully pour the chosen beer in the chosen cup. Fill the other two cups with the remaining beer.
  3. Present the beers to your tasters, ideally in a nice quiet space. Ask them to taste the samples and see if they can tell which beer is different. Record their choice.
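If you don’t have a die handy, the two rolls in step 1 can be mimicked in a few lines of Python. This is only a sketch of the steps above; the symbol names follow the circle, square, triangle marking suggested earlier, and the function name is our own invention.

```python
import random

SYMBOLS = ["circle", "square", "triangle"]  # first, second, third cup

def plan_pour(rng):
    """Mimic the two die rolls and return which beer goes in each cup."""
    cup_roll = rng.randint(1, 6)
    odd_cup = SYMBOLS[(cup_roll - 1) // 2]  # 1-2 first, 3-4 second, 5-6 third
    beer_roll = rng.randint(1, 6)
    odd_beer = "tweaked" if beer_roll % 2 == 1 else "normal"  # odd roll = tweaked
    other_beer = "normal" if odd_beer == "tweaked" else "tweaked"
    return {s: (odd_beer if s == odd_cup else other_beer) for s in SYMBOLS}

plan = plan_pour(random.Random())
for symbol, beer in plan.items():
    print(f"{symbol}: {beer}")
```

Keep the printout away from your tasters. Only the pourer should know which symbol holds the odd beer.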

Triangle Testing in Action

Like we mentioned in last month’s column on tweaking a recipe, when Denny is developing a new recipe he likes to brew multiple test versions and compare them until he gets the beer he wants. Triangle testing is vital to assessing the results. When you know the tweaks you’ve made and the desired results, it’s pretty difficult to gauge the tweaks objectively.

Denny has been working on a recipe for an American mild ale for several years. The objective was to make a beer with the qualities of a British mild (malt character, low integrated hop flavor, body, 1.035 OG, under 4% alcohol) but with only American ingredients. For some ingredients, it was an easy choice. Wyeast 1450 would be the perfect yeast (why do you think they call it Denny’s Favorite?). So it was time to start figuring out what else would go into the beer. The first several batches were made with Great Western malts. Good flavor, but not enough flavor for such a low-gravity beer, and a thin body. His wife referred to the beer as “hop water.”

So he began by manipulating the base malts, starting off with adding more Munich. Doing a triangle test against the first batch showed increased flavor, but not enough and not much increase in body and mouthfeel. Then he tried increasing mash temperature from 153 °F (67 °C) to 168 °F (76 °C). There was pretty much no change in attenuation, body, or mouthfeel. In addition, no matter what he did to increase maltiness, the hop character still came through as harsh even though the IBUs were in the mid-20s.

The next tweak was to try malts from a different maltster. Using craft malt was the breakthrough here. Using Lamonta and Metolius malts (Mecca Grade Estate Malt’s versions of pale ale and Munich malts — if Mecca Grade Estate Malt isn’t available where you live, look around for a local craft malt) made an enormous difference in the amount of flavor in the beer. This was an easy triangle test. It was immediately obvious that there was a difference, which Denny preferred.

The next tweak was to try American Noble Hop™ pellets from Yakima Chief Hops. American Noble Hops™ are made from the bract material left from the cryo hop process. They are very low alpha and have a unique “soft” character to them, kind of like what you get from European noble hops. Tasting this version against the previous version showed that he had achieved the integrated hop flavor and bitterness that he was going for.

He needed a bit more body and sweetness, so he decided to try adding some 60 °L crystal malt. He didn’t have any luck finding that from a craft maltster, but Briess makes a nice organic C60. Adding a hefty dose of that did the trick. Triangle tests performed this time revealed the difference was significant and Denny had a clear preference.

Without using the triangle test when he was developing the recipe, Denny would have had to just make a guess if some of these changes had made a difference. Of the eight test batches he brewed, about half of them resulted in beers with so little change that he would have had to flip a coin. But using triangle testing he was able to clearly tell what factors had made a perceptible difference and if the resulting beer was preferred to the previous version.

You may not be crazy enough to brew eight test batches of a recipe like Denny did, but triangle testing can be a valuable tool in your toolbox when you’re coming up with a new recipe, tweaking an old one, or even comparing a “clone” of a commercial beer against the real thing. And at the very least, you’ll see that tasting beer is a lot harder than drinking beer!

Denny’s American Mild (Final Version, #8)

(5 gallons/19 L, all-grain)
OG = 1.041 FG = 1.012
IBU = 20 SRM = 13 ABV = 3.8%

4 lbs. (1.8 kg) Mecca Grade Estate Malts Lamonta malt (3 °L)
4 lbs. (1.8 kg) Mecca Grade Estate Malts Metolius malt (14 °L)
1 lb. (454 g) Briess organic crystal malt (60 °L)
3 AAU American Noble Citra® pellet hops (first wort hop) (1 oz./28 g at 3% alpha acids)
2.2 AAU American Noble Simcoe® pellet hops (60 min.) (1 oz./28 g at 2.2% alpha acids)
1.35 AAU American Noble Mosaic® pellet hops (60 min.) (0.5 oz./14 g at 2.7% alpha acids)
0.88 AAU American Noble Simcoe® pellet hops (1 min.) (0.4 oz./11 g at 2.2% alpha acids)
3 AAU American Noble Citra® pellet hops (1 min.) (1 oz./28 g at 3% alpha acids)
Wyeast 1450 (Denny’s Favorite) or White Labs WLP051 (California V Ale) or Mangrove Jack’s M36 (Liberty Bell Ale) yeast
1/2 cup corn sugar (for priming)

Step by Step
Due to the low gravity of this wort, no yeast starter is needed.

On brew day, mash in the crushed malt at 158 °F (70 °C) in 7.5 gallons (28.4 L) of water in a mash tun or brew-in-a-bag setup. Hold at this temperature for 60 minutes. Raise the mash temperature to 170 °F (77 °C), hold for 5 minutes, then recirculate or remove the grain bag. Do not sparge. If using a separate mash tun, run off the wort into the kettle. Bring the wort to a boil and boil for 60 minutes, adding hops at the times indicated.

Chill the wort down to yeast-pitching temperature, aerate, and pitch the yeast. Ferment around 66 °F (19 °C). Condition for one week then package as normal.

Issue: December 2019