A Guide to the Benjamini-Hochberg Procedure - Statology (2024)

Whenever you conduct a statistical test, it’s possible that you could get a p-value that is less than 0.05 purely by chance, even if your null hypothesis is true.

For example, suppose you want to know if a certain plant has a mean height greater than 10 inches. Your null and alternative hypotheses for the test would be:

H0: μ = 10 inches

HA: μ > 10 inches

To test this hypothesis, you may go out and collect a random sample of 20 plants to measure. Even if the true mean height of this species of plant is 10 inches, it’s possible that you could have selected a sample of 20 plants that were unusually tall, which will lead you to reject the null hypothesis.

Although the null hypothesis was true (the mean height of this plant really was 10 inches), you rejected it. In statistics, we call this a “false discovery.” You claimed to have made a discovery – a “significant result” – but it’s actually a false one.
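A small simulation can make this concrete. The sketch below is illustrative only: it assumes plant heights are normally distributed with a known standard deviation of 2 inches (a made-up value) and uses a one-sided z-test so that it runs on the Python standard library alone. It repeatedly draws samples of 20 plants from a population whose true mean really is 10 inches and counts how often the null hypothesis is rejected anyway.

```python
import random
from statistics import NormalDist, mean

# Hypothetical setup: heights are normal with true mean 10 and known SD 2.
TRUE_MEAN = 10.0
KNOWN_SD = 2.0
SAMPLE_SIZE = 20
ALPHA = 0.05


def one_sided_p_value(sample, null_mean=TRUE_MEAN, sd=KNOWN_SD):
    """P-value for H0: mu = null_mean vs HA: mu > null_mean (z-test, known SD)."""
    z = (mean(sample) - null_mean) / (sd / len(sample) ** 0.5)
    return 1.0 - NormalDist().cdf(z)


def false_rejection_rate(n_experiments=10_000, seed=0):
    """Fraction of simulated experiments that reject a true null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(TRUE_MEAN, KNOWN_SD) for _ in range(SAMPLE_SIZE)]
        if one_sided_p_value(sample) < ALPHA:
            rejections += 1
    return rejections / n_experiments


rate = false_rejection_rate()
print(f"Share of samples that falsely reject H0: {rate:.3f}")
```

Under these assumptions the printed share should land close to 0.05, which is the alpha level used for the test.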

Now imagine that you conduct 100 statistical tests at once. Using an alpha level of 0.05, there’s only a 5% chance of making a false discovery with any individual test when its null hypothesis is true. But because you’re conducting such a large number of tests, you would expect about 5 of the 100 to lead to false discoveries even if every null hypothesis were true.
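The arithmetic behind this can be sketched in a few lines. The snippet assumes that all 100 null hypotheses are true; the probability of at least one false discovery additionally assumes the tests are independent, which the text does not state.

```python
ALPHA = 0.05   # significance level used for each test
N_TESTS = 100  # number of tests run at once

# Expected number of false discoveries if every null hypothesis is true.
expected_false_positives = N_TESTS * ALPHA

# Chance of at least one false discovery, assuming independent tests.
prob_at_least_one = 1 - (1 - ALPHA) ** N_TESTS

print(f"Expected false discoveries: {expected_false_positives:.1f}")
print(f"P(at least one false discovery): {prob_at_least_one:.3f}")
```

With these numbers the expected count is 5, and the chance of at least one false discovery is about 0.994.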

In the modern world, false discoveries can be a common problem since technology has enabled researchers to conduct hundreds or even thousands of statistical tests at once.

For example, medical researchers can run statistical tests on tens of thousands of genes at once. Even with an alpha level of just 5% for each test, this means hundreds of tests could result in false discoveries.

One way to control the false discovery rate is to use something known as the Benjamini-Hochberg Procedure.

The Benjamini-Hochberg Procedure

The Benjamini-Hochberg Procedure works as follows:

Step 1: Conduct all of your statistical tests and find the p-value for each test.

Step 2: Arrange the p-values in order from smallest to largest, assigning a rank to each one – the smallest p-value has a rank of 1, the next smallest has a rank of 2, etc.

Step 3: Calculate the Benjamini-Hochberg critical value for each p-value, using the formula (i/m)*Q

where:

i = rank of p-value

m = total number of tests

Q = your chosen false discovery rate

Step 4: Find the largest p-value that is less than its own critical value. Designate this p-value and every smaller p-value as significant, even if some of the smaller p-values exceed their own critical values.
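The four steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name benjamini_hochberg and the sample p-values are made up for this example. The comparison uses "less than or equal to", which is how the procedure is usually stated; it differs from a strict "less than" only when a p-value exactly equals its critical value.

```python
def benjamini_hochberg(p_values, q):
    """Return one boolean per p-value (in input order): True if significant."""
    m = len(p_values)
    # Step 2: indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda idx: p_values[idx])

    # Steps 3-4: find the largest rank whose p-value is at or below (i/m)*Q.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * q:
            largest_rank = rank

    # Everything ranked at or below that rank is declared significant.
    significant = [False] * m
    for idx in order[:largest_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values, not taken from the article's example.
p_values = [0.90, 0.07, 0.50, 0.06, 0.14]
flags = benjamini_hochberg(p_values, q=0.25)
print(flags)  # [False, True, False, True, True]
```

In this made-up data the smallest p-value (0.06) exceeds its own critical value of 0.05, yet it is still flagged as significant because the third-ranked p-value (0.14) falls below its critical value of 0.15.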

The following example illustrates how to conduct this procedure with concrete values.

Example

Suppose researchers are interested in determining whether or not 20 different variables are linked to heart disease. They conduct 20 individual statistical tests at once and receive a p-value for each test. The following table shows the p-values for each test, ranked in order from smallest to largest.

[Table image: p-values for the 20 tests, ranked from smallest to largest]

Suppose researchers are willing to accept a 20% false discovery rate. Thus, to calculate the Benjamini-Hochberg critical value for each p-value, we can use the following formula: (i/20)*0.2, where i = rank of p-value.

The following table shows the Benjamini-Hochberg critical value for each individual p-value:

[Table image: Benjamini-Hochberg critical value for each of the 20 ranked p-values]
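The critical values in this example follow directly from the formula (i/20)*0.2, so they can be recomputed without the p-values themselves. A minimal sketch (the variable names are chosen here for illustration):

```python
M = 20   # total number of tests in the example
Q = 0.2  # false discovery rate chosen by the researchers

# Critical value for each rank, rounded to three decimals for display.
critical_values = {rank: round((rank / M) * Q, 3) for rank in range(1, M + 1)}

for rank, value in critical_values.items():
    print(f"rank {rank:2d}: critical value {value:.3f}")
```

Each critical value works out to rank/100, so rank 1 gets 0.010, rank 4 gets 0.040, and rank 20 gets 0.200.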

The test with the largest p-value that is less than its Benjamini-Hochberg critical value is Variable #11, which has a p-value of 0.039 and a B-H critical value of 0.040.

Thus, this test and all tests with a smaller p-value will be considered significant.


Note that even though Variable #17 and Variable #3 didn’t have p-values that were smaller than their B-H critical values, they are still considered to be significant since they have smaller p-values than Variable #11.

How to choose a false discovery rate

One of the most important steps in the Benjamini-Hochberg procedure is choosing a false discovery rate. You should choose your false discovery rate before you actually collect any data or conduct any statistical tests.

Typically you will conduct a large number of statistical tests during the exploratory phase of your analysis, which you will then follow up with more tests to further investigate your findings.

If the follow-up tests are low-cost, you may consider setting a higher false discovery rate, because any false discoveries are likely to be caught by those later tests.

Also, if the cost of missing an important discovery is high then you may want to set your false discovery rate higher so that you don’t miss anything important.

The appropriate false discovery rate therefore varies from one situation to the next, depending on the cost of follow-up research and the cost of missing an important discovery.
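To see how this choice plays out, the sketch below applies the Benjamini-Hochberg procedure to one set of p-values at three different false discovery rates. The p-values are hypothetical and the helper name count_discoveries is made up for this illustration.

```python
def count_discoveries(p_values, q):
    """Number of tests the Benjamini-Hochberg procedure declares significant."""
    m = len(p_values)
    largest_rank = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        # Track the largest rank whose p-value is at or below (i/m)*Q.
        if p <= (rank / m) * q:
            largest_rank = rank
    return largest_rank


# Hypothetical p-values from an exploratory analysis (illustrative only).
p_values = [0.001, 0.012, 0.028, 0.045, 0.08, 0.2, 0.4, 0.6, 0.75, 0.9]

discoveries = {q: count_discoveries(p_values, q) for q in (0.05, 0.10, 0.25)}
for q, count in discoveries.items():
    print(f"Q = {q:.2f}: {count} significant result(s)")
```

For this made-up data, raising Q from 0.05 to 0.10 to 0.25 increases the number of significant results from 1 to 3 to 5. A higher rate yields more candidates for follow-up, at the price of a larger expected share of false discoveries among them.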

Additional Resources

An Explanation of P-Values and Statistical Significance
What is the Family-wise Error Rate?

Article information

Author: Geoffrey Lueilwitz
