Welcome to our p-value calculator! You will never again have to wonder how to find the p-value, as here you can determine the one-sided and two-sided p-values from test statistics, following all the most popular distributions: normal, t-Student, chi-squared, and Snedecor's F.
P-values appear all over science, yet many people find the concept a bit intimidating. Don't worry – in this article, we will explain not only what the p-value is but also how to interpret p-values correctly. Have you ever been curious about how to calculate the p-value by hand? We provide you with all the necessary formulae as well!
🙋 If you want to revise some basics from statistics, our normal distribution calculator is an excellent place to start.
Formally, the p-value is the probability that the test statistic will produce values at least as extreme as the value it produced for your sample. It is crucial to remember that this probability is calculated under the assumption that the null hypothesis H0 is true!
More intuitively, the p-value answers the question:
Assuming that I live in a world where the null hypothesis holds, how probable is it that, for another sample, the test I'm performing will generate a value at least as extreme as the one I observed for the sample I already have?
It is the alternative hypothesis that determines what "extreme" actually means, so the p-value depends on the alternative hypothesis that you state: left-tailed, right-tailed, or two-tailed. In the formulas below, S stands for a test statistic, x for the value it produced for a given sample, and Pr(event | H0) is the probability of an event, calculated under the assumption that H0 is true:
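In standard notation, the three cases are usually written as follows. This is a reconstruction from the definitions just given, with S, x, and Pr(event | H0) as above:

```latex
\begin{align*}
\text{left-tailed:}  \quad & p = \Pr(S \le x \mid H_0) \\
\text{right-tailed:} \quad & p = \Pr(S \ge x \mid H_0) \\
\text{two-tailed:}   \quad & p = 2 \cdot \min\{\Pr(S \le x \mid H_0),\ \Pr(S \ge x \mid H_0)\}
\end{align*}
```

If the distribution of S is symmetric about zero, the two-tailed formula reduces to p = Pr(|S| ≥ |x| | H0).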
As a picture is worth a thousand words, let us illustrate these definitions. Here, we use the fact that the probability can be neatly depicted as the area under the density curve for a given distribution. We give two sets of pictures: one for a symmetric distribution and the other for a skewed (non-symmetric) distribution.
[Figure: p-values for a symmetric distribution: left-tailed, right-tailed, and two-tailed tests.]
In the last picture (two-tailed p-value for skewed distribution), the area of the left-hand side is equal to the area of the right-hand side.
To determine the p-value, you need to know the distribution of your test statistic under the assumption that the null hypothesis is true. Then, with the help of the cumulative distribution function (cdf) of this distribution, we can express the probability of the test statistic being at least as extreme as its value x for the sample:
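As a minimal sketch of this idea, the Python function below turns any cdf into a p-value. It assumes a continuous distribution, so that Pr(S ≥ x) = 1 − cdf(x). The names `p_value_from_cdf` and `exponential_cdf` are hypothetical, and the Exp(1) distribution is used only as a convenient skewed example, not because it appears in any test discussed here.

```python
import math
from typing import Callable


def p_value_from_cdf(cdf: Callable[[float], float], x: float, tail: str) -> float:
    """Return the p-value for the observed statistic x, given the null cdf.

    Assumes a continuous distribution, so Pr(S >= x) = 1 - cdf(x).
    """
    left = cdf(x)        # Pr(S <= x | H0)
    right = 1.0 - left   # Pr(S >= x | H0) for a continuous S
    if tail == "left":
        return left
    if tail == "right":
        return right
    if tail == "two":
        # Twice the smaller tail, capped at 1
        return min(1.0, 2.0 * min(left, right))
    raise ValueError("tail must be 'left', 'right', or 'two'")


def exponential_cdf(x: float) -> float:
    """cdf of the Exp(1) distribution, used here only as a skewed example."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_cdf(exponential_cdf, 3.0, tail), 4))
```

Passing the cdf as an argument keeps the tail logic in one place, so the same function works for any distribution whose cdf you can evaluate.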
The probability distributions that are most widespread in hypothesis testing tend to have complicated cdf formulae, and finding the p-value by hand may not be possible. You'll likely need to resort to a computer or to a statistical table, where people have gathered approximate cdf values.
Well, you now know how to calculate the p-value, but… why do you need to calculate this number in the first place? In hypothesis testing, the p-value approach is an alternative to the critical value approach. Recall that the latter requires researchers to pre-set the significance level, α, which is the probability of rejecting the null hypothesis when it is true (that is, the probability of a type I error). Once you have your p-value, you just need to compare it with any given α to quickly decide whether or not to reject the null hypothesis at that significance level. For details, check the next section, where we explain how to interpret p-values.
As we have mentioned above, the p-value is the answer to the following question:
Assuming that I live in a world where the null hypothesis holds, how probable is it that, for another sample, the test I'm performing will generate a value at least as extreme as the one I observed for the sample I already have?
What does that mean for you? Well, you've got two options: a high p-value means that your data are compatible with the null hypothesis, while a low p-value is evidence against the null hypothesis, because a result like yours would be very improbable if the null hypothesis were true.
However, it may happen that the null hypothesis is true, but your sample is highly unusual! For example, imagine we studied the effect of a new drug and got a p-value of 0.03. This means that even if the drug had no effect at all, about 3% of similar studies would still produce, by random chance alone, a test statistic at least as extreme as the one we obtained!
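The question "what is the p-value?" can also be answered as follows: the p-value is the smallest level of significance at which the null hypothesis would be rejected. So, if you now want to make a decision on the null hypothesis at some significance level α, just compare your p-value with α: if the p-value is less than or equal to α, you reject the null hypothesis; if it is greater than α, you don't have enough evidence to reject the null hypothesis.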
Obviously, the fate of the null hypothesis depends on α. For instance, if the p-value was 0.03, we would reject the null hypothesis at a significance level of 0.05, but not at a level of 0.01. That's why the significance level should be stated in advance and not adapted conveniently after the p-value has been established! A significance level of 0.05 is the most common value, but there's nothing magical about it, and too strong a faith in the 0.05 threshold can lead to misguided conclusions. It's always best to report the p-value and allow the reader to draw their own conclusions.
Also, bear in mind that subject area expertise (and common sense) is crucial. Otherwise, by mindlessly applying statistical principles, you can easily arrive at a statistically significant result even though the conclusion is 100% untrue.
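The comparison with α can be sketched in a few lines of Python. The function name `reject_null` is hypothetical, and the rule it encodes is the usual convention of rejecting when the p-value does not exceed α:

```python
def reject_null(p_value: float, alpha: float) -> bool:
    """Return True when H0 is rejected at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return p_value <= alpha


# The example from the text: the same p-value, two significance levels
p = 0.03
for alpha in (0.05, 0.01):
    verdict = "reject H0" if reject_null(p, alpha) else "do not reject H0"
    print(f"alpha = {alpha}: {verdict}")
```

With p = 0.03, the null hypothesis is rejected at α = 0.05 but not at α = 0.01, which is exactly why α has to be fixed before looking at the p-value.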
As our p-value calculator is here at your service, you no longer need to wonder how to find the p-value from all those complicated test statistics! Here are the steps you need to follow:
In terms of the cumulative distribution function (cdf) of the standard normal distribution, which is traditionally denoted by Φ, the p-value is given by the following formulae:
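As a sketch, the usual Z-score formulae are Φ(z) for a left-tailed test, 1 − Φ(z) for a right-tailed test, and 2·Φ(−|z|) for a two-tailed test. The Python snippet below evaluates them with the standard library's `statistics.NormalDist`. The function name `p_value_from_z` is hypothetical.

```python
from statistics import NormalDist

# Phi: cdf of the standard normal distribution N(0, 1)
PHI = NormalDist().cdf


def p_value_from_z(z: float, tail: str) -> float:
    """p-value for a Z-score under H0, where the statistic follows N(0, 1)."""
    if tail == "left":
        return PHI(z)                 # Pr(Z <= z)
    if tail == "right":
        return 1.0 - PHI(z)           # Pr(Z >= z)
    if tail == "two":
        return 2.0 * PHI(-abs(z))     # Pr(|Z| >= |z|), using symmetry
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(p_value_from_z(1.96, "two"))
```

For z = 1.96, the two-tailed p-value is roughly 0.05, which matches the familiar critical value from statistical tables.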
🙋 To learn more about Z-tests, head to Omni's Z-test calculator.
We use the Z-score if the test statistic approximately follows the standard normal distribution N(0,1). Thanks to the central limit theorem, you can rely on this approximation, and treat your distribution as normal, if you have a large sample (say, at least 50 data points).
A Z-test most often refers to testing the population mean, or the difference between two population means, in particular between two proportions. You can also find Z-tests in maximum likelihood estimations.
The p-value from the t-score is given by the following formulae, in which cdf_{t,d} stands for the cumulative distribution function of the t-Student distribution with d degrees of freedom:
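As a sketch, the usual t-score formulae are cdf_{t,d}(t) for a left-tailed test, 1 − cdf_{t,d}(t) for a right-tailed test, and 2·cdf_{t,d}(−|t|) for a two-tailed test. The Python code below uses only the standard library and approximates the cdf by integrating the density numerically with Simpson's rule. That numerical approach is a choice made for this illustration (all function names are hypothetical); in practice, you would more likely call a statistics library such as SciPy's `scipy.stats.t.cdf`.

```python
import math


def t_pdf(t: float, d: float) -> float:
    """Density of the t-Student distribution with d degrees of freedom."""
    log_norm = (math.lgamma((d + 1) / 2) - math.lgamma(d / 2)
                - 0.5 * math.log(d * math.pi))
    return math.exp(log_norm - (d + 1) / 2 * math.log1p(t * t / d))


def t_cdf(t: float, d: float, steps: int = 2000) -> float:
    """Approximate cdf using composite Simpson's rule on [0, |t|]."""
    if t == 0:
        return 0.5
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = abs(t) / steps
    total = t_pdf(0.0, d) + t_pdf(abs(t), d)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, d)
    area = total * h / 3  # Pr(0 <= T <= |t|)
    # The density is symmetric about 0, so half the mass lies on each side
    return 0.5 + area if t > 0 else 0.5 - area


def p_value_from_t(t: float, d: float, tail: str) -> float:
    """p-value for a t-score with d degrees of freedom."""
    if tail == "left":
        return t_cdf(t, d)
    if tail == "right":
        return 1.0 - t_cdf(t, d)
    if tail == "two":
        return 2.0 * t_cdf(-abs(t), d)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(p_value_from_t(2.228, 10, "two"))
```

For t = 2.228 with 10 degrees of freedom, the two-tailed p-value comes out close to 0.05, in line with standard t-tables.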
Use the t-score option if your test statistic follows the t-Student distribution. This distribution has a shape similar to N(0,1) (bell-shaped and symmetric) but has heavier tails; the exact shape depends on a parameter called the degrees of freedom. If the number of degrees of freedom is large (>30), which typically happens for large samples, the t-Student distribution is practically indistinguishable from the normal distribution N(0,1).
The most common t-tests are those for population means with an unknown population standard deviation, or for the difference between means of two populations, with either equal or unequal yet unknown population standard deviations. There's also a t-test for paired (dependent) samples.
🙋 To get more insights into t-statistics, we recommend using our t-test calculator.
Use the χ²-score option when performing a test in which the test statistic follows the χ²-distribution.
This distribution arises if, for example, you take the sum of squared variables, each following the normal distribution N(0,1). Remember to check the number of degrees of freedom of the χ²-distribution of your test statistic!
How do you find the p-value from a χ²-score? You can do it with the help of the following formulae, in which cdf_{χ²,d} denotes the cumulative distribution function of the χ²-distribution with d degrees of freedom:
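As a sketch, the usual χ²-score formulae are cdf_{χ²,d}(x) for a left-tailed test, 1 − cdf_{χ²,d}(x) for a right-tailed test, and 2·min{cdf_{χ²,d}(x), 1 − cdf_{χ²,d}(x)} for a two-tailed test. The Python code below evaluates the cdf with a power series for the regularized lower incomplete gamma function, using only the standard library. This series approach is an assumption made for the illustration and is intended for moderate scores only; the function names are hypothetical, and a library routine such as SciPy's `scipy.stats.chi2.cdf` would be the more robust choice.

```python
import math


def chi2_cdf(x: float, d: float) -> float:
    """cdf of the chi-squared distribution with d degrees of freedom.

    Uses P(a, y) = y^a e^(-y) / Gamma(a) * sum_{n>=0} y^n / (a (a+1) ... (a+n))
    with a = d/2 and y = x/2. Suitable for moderate x only.
    """
    if x <= 0:
        return 0.0
    a, y = d / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while term > total * 1e-16 and n < 100_000:
        term *= y / (a + n)  # ratio of consecutive series terms
        total += term
        n += 1
    return min(1.0, total * math.exp(a * math.log(y) - y - math.lgamma(a)))


def p_value_from_chi2(x: float, d: float, tail: str) -> float:
    """p-value for a chi-squared score with d degrees of freedom."""
    left = chi2_cdf(x, d)
    right = 1.0 - left
    if tail == "left":
        return left
    if tail == "right":
        return right
    if tail == "two":
        return min(1.0, 2.0 * min(left, right))
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(p_value_from_chi2(5.991, 2, "right"))
```

For a score of 5.991 with 2 degrees of freedom, the right-tailed p-value is approximately 0.05, as listed in standard χ² tables.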
The most popular tests which lead to a χ²-score are the following:
Finally, the F-score option should be used when you perform a test in which the test statistic follows the F-distribution, also known as the Fisher–Snedecor distribution. The exact shape of an F-distribution depends on two degrees of freedom.
To see where those degrees of freedom come from, consider the independent random variables X and Y, which both follow the χ²-distributions with d1 and d2 degrees of freedom, respectively. In that case, the ratio (X/d1)/(Y/d2) follows the F-distribution, with (d1, d2)-degrees of freedom. For this reason, the two parameters d1 and d2 are also called the numerator and denominator degrees of freedom.
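This construction can be checked with a small simulation. The Python sketch below builds χ² variates as sums of squared N(0,1) draws, forms the ratio (X/d1)/(Y/d2), and looks at the share of draws below 1. When d1 = d2, the ratio and its reciprocal have the same distribution, so that share should be close to one half. The function names, the seed, and the sample size are arbitrary choices made for the illustration.

```python
import random


def chi2_sample(d: int, rng: random.Random) -> float:
    """Sum of d squared independent N(0, 1) draws: a chi-squared(d) variate."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))


def f_sample(d1: int, d2: int, rng: random.Random) -> float:
    """(X/d1)/(Y/d2) with independent X ~ chi2(d1) and Y ~ chi2(d2)."""
    return (chi2_sample(d1, rng) / d1) / (chi2_sample(d2, rng) / d2)


rng = random.Random(0)  # fixed seed so the run is reproducible
d = 5
draws = [f_sample(d, d, rng) for _ in range(20_000)]
share_below_one = sum(v < 1.0 for v in draws) / len(draws)
print(f"share of F({d}, {d}) draws below 1: {share_below_one:.3f}")
```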
The p-value from the F-score is given by the following formulae, where we let cdf_{F,d1,d2} denote the cumulative distribution function of the F-distribution with (d1, d2) degrees of freedom:
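In the same notation as for the other distributions, and writing x for the observed F-score, the usual formulae read as follows (a reconstruction under the general cdf-based definitions, not a quotation):

```latex
\begin{align*}
\text{left-tailed:}  \quad & p = \mathrm{cdf}_{F,d_1,d_2}(x) \\
\text{right-tailed:} \quad & p = 1 - \mathrm{cdf}_{F,d_1,d_2}(x) \\
\text{two-tailed:}   \quad & p = 2 \cdot \min\{\mathrm{cdf}_{F,d_1,d_2}(x),\ 1 - \mathrm{cdf}_{F,d_1,d_2}(x)\}
\end{align*}
```

The F-distribution's cdf has no simple closed form, so in practice it is evaluated numerically, for example with SciPy's `scipy.stats.f.cdf(x, d1, d2)`, assuming that library is available.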
Below we list the most important tests that produce F-scores. All of them are right-tailed tests.
No, the p-value cannot be negative. This is because probabilities cannot be negative, and the p-value is the probability of the test statistic satisfying certain conditions.
A high p-value means that under the null hypothesis, there's a high probability that for another sample, the test statistic will generate a value at least as extreme as the one observed in the sample you already have. A high p-value doesn't allow you to reject the null hypothesis.
A low p-value means that under the null hypothesis, there's little probability that for another sample, the test statistic will generate a value at least as extreme as the one observed for the sample you already have. A low p-value is evidence in favor of the alternative hypothesis – it allows you to reject the null hypothesis.