
Brewing Experiments

Information on how to brew at home ultimately comes from a variety of sources. For starters, we have the long history of brewing to draw from. This history stretches back long before the scientific method was developed and much of what we know today about brewing must have been discovered by trial and error, perhaps with the occasional educated guess thrown in. More recently, brewing has been studied scientifically and much of the information we have about brewing comes from the careful study of English ales, German Pilsners and especially American-style Pilsners.

Of course, much of what we do as homebrewers has not been studied scientifically. The scale we brew at is not economically important (to the people who fund brewing science studies) and the wide variety of adapted pieces of equipment we brew on and the improvised techniques we employ would keep a large research lab busy for decades. And finally, let’s face it, a lot of what we “know” about homebrewing is just someone’s opinion. But what if you wanted to answer a homebrewing question or settle a homebrew argument once and for all? How would you approach this?

The only real way of gaining knowledge of the physical world is through scientific experimentation. And, with a little effort, any homebrewer can make a real contribution to the (almost non-existent) science of homebrewing. In this article, I’ll describe the basic method for testing a simple idea (or hypothesis) scientifically. To do so, we’ll imagine that two homebrewers from “BYOzarro World” want to determine which of two yeast strains will work better for brewing a batch of their beloved traditional Norwegian ale. We’ll follow them through a variety of ill-considered approaches and eventually end up with a decent experimental method for testing this question.

The hypothesis

To set up our story, let’s say that there are two companies in BYOzarro World that supply homebrewers with yeast — Black Labs and Zyeast. Black Labs has a strain of yeast they call BLP003 (Oslo Ale) yeast and Zyeast has a strain they call Zyeast 2112 (Norwegian Ale) yeast. Ole and Sven, both avid homebrewers and lovers of traditional Norwegian ale, wonder if there is a difference between the two and decide to find out.

Take one

As a first approach, Ole goes to his homebrew shop and picks up a tube of Black Labs yeast. Black Labs is the only yeast Ole’s shop carries and Ole has always been happy with it. He brews a batch of traditional Norwegian ale and tries it. “Wow, this is great,” he tells Sven. “I think Black Labs definitely makes the best yeast for traditional Norwegian ale.”

Is Ole’s “experiment” a good test of their hypothesis? No, in fact this isn’t a test of the question at all. Even if the Black Labs strain did a good job in Ole’s beer, no information about the Zyeast strain was gained in Ole’s “experiment.”

Take two

After trying Ole’s beer, Sven suggests that they should each brew a beer — Ole using the Black Labs strain and Sven using the Zyeast strain — and compare the two. Ole agrees and brews another batch of beer. Sven grabs some Zyeast from his local shop and makes a Norwegian ale, using his grandfather’s recipe for Norwegian Christmas ale (the one with dark grains and lingonberries). They meet a month later at Ole’s house and taste the beers side by side. “You see,” said Ole, “the Black Labs strain is better.”

At this point, Ole’s son Lars — who has been watching the taste-off — chimes in. “You know,” he says, “your experiment really isn’t a good test of your question. You are comparing beers made from two different recipes, brewed by two different brewers, on two different systems, with two different water sources and many other differences. You need to construct your experiment such that the only difference between batches of beer is the yeast strain — and you really need to do the experiment more than once to show that your results are repeatable.”

“Repeatable?” says Ole quizzically. “If I do it once, isn’t that enough?”

“Not really,” says Lars. “What if one of your batches got contaminated or had some problem that influenced the experiment? The results from a single trial could just be a fluke. If there is a difference between Black Labs and Zyeast, you need to show that the difference shows up repeatably.”

At first Ole is reluctant. “That seems like a lot of brewing,” he says.

“Yes,” says Sven, “but that also means we get to drink a lot of Norwegian ale.”

“In that case,” says Ole, “I’ll do it . . . for science.”

So, the three put their heads together and begin to devise their experiment.

Take three

“I can brew a batch each weekend,” says Ole. “This weekend with Black Labs yeast and the next weekend with Zyeast. Then I can brew the next two weekends and repeat the experiment. I’ll use the same recipe and same ingredients each time.”

Sven counters with, “Why not use the exact same wort for all four beers? Why not brew one big batch of beer and split the wort into four identical carboys? Pitch two of the carboys with two different tubes of Black Labs yeast and the other two with two different smack packs of Zyeast. That way, the only difference is the yeast!”

“Well not quite,” Lars said, “You still need to aerate all four carboys separately. And, even if you set the carboys side-by-side, it’s always possible some difference between them will exist.”

“I don’t see how,” said Ole. “If I put them, say, both in the basement right next to each other, they are in exactly the same environment, aren’t they?”

“They might be,” said Lars, “but what if one carboy is a little closer to the radiator downstairs and gets just a little bit hotter? What if light from the window falls on one of the carboys for an hour or so during the day, but not the other? There are a lot of ways the two could experience different conditions. You’ll just have to try to eliminate every little difference.”

Just then, Sven has a thought. “Hey,” he says, “Black Labs yeast comes in a 35 mL tube, while Zyeast comes in a 125 mL smack pack. Won’t that affect our experiment?”

“It could,” said Lars, “but maybe you could eliminate, or at least minimize, that difference by making four yeast starters.”

“OK,” said Ole. “That sounds like a plan. Are we set? I’ll brew the beers and give them a taste, with a little help from Sven, and we’ll see that Black Labs really is the best!”

At this point, Ole’s wife Lena chimes in. “You know Ole,” she said, “You already have a strong preference for Black Labs, isn’t that right?”

“Ya sure, you betcha. I always bet on Black,” replied Ole.

“Perhaps then, you aren’t the best candidate to both brew and judge the beers. You’d have many opportunities to consciously or unconsciously influence the experiment.”

“How so?” asked Ole.

“Well, if you knew which carboys had which yeast, you might give the Black Labs a little extra attention because you ‘know’ that it contains the best beer. And, of course, if you knew which two beers were the Black Labs beers, you might — consciously or unconsciously — judge the Zyeast beers unfairly.”

“But,” said Ole, “Sven doesn’t have time to do the experiment. So, where does that leave us?”

“Well,” said Lena, “maybe Sven could make the four yeast starters and label them so that you won’t know which is which. That way, when you brew the beer, you won’t have an opportunity to play favorites. Then, once the beers are ready, you could taste them first, and only get the code from Sven after you’ve written down your conclusions.”

“You could also send the beers to a few different homebrew contests and bring them to a homebrew club meeting to get other people’s opinions,” said Lars.

Is it a plan?

OK, so let’s leave BYOzarro World and discuss their planned approach. As it stands, it’s a fairly decent experiment that could yield useful results (although, as we’ll see, that’s not guaranteed). In their experiment, the null hypothesis is that the two yeast strains cannot be distinguished by the experiment. Ole, however, thinks there may be a difference between the two yeast strains. In the lingo, this is called the alternative hypothesis.

Ole and Sven are definitely on the right track in their experimental design. They have eliminated most of the major differences between their four planned trials (the four carboys of traditional Norwegian ale). Splitting the wort means that the four yeast samples are working on wort made from the same recipe, on the same equipment by the same brewer. Of course, there still could be some differences between these trials. One of the carboys could be contaminated, the fill levels and degree of aeration might not all be exactly equal and — unless they counted their yeast — the pitching rate might vary. Would any of these make a difference? Maybe so, maybe not, but — when designing an experiment — scientists try to make everything the same between trials so they don’t need to speculate about which differences might have an effect. Still, if reasonable care were taken to treat all four carboys equally, differences between the two strains should be detected, if they existed.

One very positive aspect of their experimental plan is that there are four independent trials: each yeast strain is used twice, so they can see if their experimental results are repeatable. But are four trials enough to distinguish between two yeast strains? The answer is . . . maybe. If the differences in performance between the two strains were fairly large, and the differences between batches made with the same yeast were fairly small, Ole and Sven would probably be able to detect a difference between the strains. However, if the differences between the strains were small, and the variability among tests of the same strain were fairly large, Ole and Sven might have to brew many, many batches of ale before a clear difference emerged. (To determine how many batches they would need, and when the data suggested that there was a difference, the two would have to learn about statistics.)
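To give a feel for how the size of the difference and the batch-to-batch variability trade off, here is a rough Python sketch. It uses a common textbook approximation for comparing two group means (a normal approximation, which tends to underestimate the number needed when the answer is small). The function name, the 5% significance level, the 80% power target and the example numbers are all assumptions chosen for illustration; none of them come from Ole and Sven's plan.

```python
import math
from statistics import NormalDist


def batches_per_strain(difference, batch_sd, alpha=0.05, power=0.80):
    """Approximate batches needed per yeast strain to detect a difference.

    difference: the true gap between the two strains' average measurement
    batch_sd:   how much batches pitched with the SAME yeast vary
                (standard deviation, in the same units as `difference`)
    Uses the normal-approximation formula n = 2 * ((z_a + z_b) * sd / d)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * batch_sd / difference) ** 2
    return max(2, math.ceil(n))  # at least two batches to see any repeatability


# Hypothetical units: percentage points of apparent attenuation.
print(batches_per_strain(difference=2.0, batch_sd=1.0))  # big gap, tight batches: 4
print(batches_per_strain(difference=1.0, batch_sd=2.0))  # small gap, noisy batches: 63
```

With these made-up inputs, a large difference and consistent batches call for only a handful of carboys per strain, while a small difference and noisy batches push the count into the dozens.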

In this experiment, Ole’s wife proposed that the experiment be conducted blind — without Ole knowing which trial contained which yeast — because of Ole’s known affinity for Black Labs yeast. Most scientific experiments, however, are not done blind. As a practical matter, most homebrewers performing experiments will likely perform and judge their own experiment. The availability of homebrew contests, however, at least allows that some comparison between beers can be made blindly. (However, you can’t request specific data from a homebrew contest. For example, if you wanted to compare diacetyl levels in four experimental trials, you couldn’t instruct the judges to compare the diacetyl level for your four ales.)
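The labeling scheme Lena suggested can be sketched in a few lines of Python. This is only one way to do it: the function name and the use of letter codes are assumptions for illustration. The idea is that whoever makes the starters (Sven, in the story) runs it, writes the letters on the starters and keeps the key to himself until the tasting notes are written down.

```python
import random


def blind_labels(strains, replicates=2, seed=None):
    """Assign code letters to yeast starters in random order.

    Returns a dict mapping each code letter to the strain it hides.
    Pass a seed only if you need to reproduce the same assignment later.
    """
    rng = random.Random(seed)
    # One entry per starter: each strain appears `replicates` times.
    starters = [strain for strain in strains for _ in range(replicates)]
    rng.shuffle(starters)
    codes = [chr(ord("A") + i) for i in range(len(starters))]
    return dict(zip(codes, starters))


key = blind_labels(["Black Labs BLP003", "Zyeast 2112"])
print(sorted(key))  # the brewer sees only these labels: ['A', 'B', 'C', 'D']
# `key` itself stays with the person who made the starters.
```

Shuffling the order matters: if the first two starters were always Black Labs, the code would be easy to guess.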

Finally, it should be possible to collect “hard” numerical data for most experiments, in addition to subjective observations (i.e. taste and smell). For example, Ole could record how long it took each carboy to show signs of fermentation. Likewise, he could record the specific gravity (and pH) of the wort each day until fermentation was complete.
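As a small illustration of keeping such records, the Python sketch below stores a lag time and gravity readings for each coded carboy and averages them by strain once the code has been revealed. The class and field names are hypothetical. Apparent attenuation (a standard brewing calculation) is added here as one convenient way to summarize gravity readings; the article itself mentions only gravity and pH. The numbers are invented placeholders, not results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Carboy:
    code: str                # blind label written on the carboy
    strain: str              # filled in after the key is revealed
    lag_hours: float         # time until first signs of fermentation
    original_gravity: float  # e.g. 1.050
    final_gravity: float     # e.g. 1.012

    def apparent_attenuation(self):
        """Percent of the original gravity points that fermented away."""
        og, fg = self.original_gravity, self.final_gravity
        return 100 * (og - fg) / (og - 1)


def summarize(carboys):
    """Average lag time and attenuation for each strain."""
    by_strain = {}
    for carboy in carboys:
        by_strain.setdefault(carboy.strain, []).append(carboy)
    return {
        strain: {
            "mean_lag_hours": mean(c.lag_hours for c in group),
            "mean_attenuation_pct": mean(c.apparent_attenuation() for c in group),
        }
        for strain, group in by_strain.items()
    }


# Invented placeholder readings, purely to show the bookkeeping.
log = [
    Carboy("A", "Zyeast 2112", 10, 1.050, 1.012),
    Carboy("B", "Black Labs BLP003", 14, 1.050, 1.014),
    Carboy("C", "Black Labs BLP003", 12, 1.050, 1.013),
    Carboy("D", "Zyeast 2112", 9, 1.050, 1.011),
]
for strain, stats in summarize(log).items():
    print(strain, stats)
```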

Issue: July-August 2007