## How the Johnson Transformation Works

The Johnson Transformation works by searching for a function g() that maps a non-normal random variable onto a new variable that is approximately normally distributed. Suppose we have a random variable X that is not normally distributed. We want a new random variable Y=g(X) such that Y is (at least approximately) normal. If we can find such a function g(), then we can use Y in place of X in any statistical analysis that assumes normality, and transform results back to the original scale when needed.
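As a toy illustration (plain Python, with g() chosen by hand rather than fitted the way Minitab would fit it): a lognormal variable X is strongly skewed, but Y = ln(X) is exactly normal, so its skewness is near zero.

```python
import math
import random

random.seed(42)

# Simulated right-skewed data: X is lognormal, so it is NOT normal,
# but g(x) = ln(x) maps it onto an exactly normal variable Y.
x = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]
y = [math.log(v) for v in x]

def skewness(data):
    """Sample skewness: approximately 0 for symmetric (e.g. normal) data."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in data) / n)
    return sum(((v - mean) / sd) ** 3 for v in data) / n

print(f"skewness of X: {skewness(x):.2f}")  # strongly right-skewed
print(f"skewness of Y: {skewness(y):.2f}")  # close to 0
```

In practice g() is not known in advance; the Johnson method estimates it from the data.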

There are several cases where the Johnson Transformation can be used, but the most common case is when X is skewed to the left or right. In these cases, the transformation can be used to create approximately normally distributed variables from skewed ones. The advantage of using the Johnson Transformation is that it is monotone, so it preserves the rank order of the original observations, while still providing data that can be analyzed using methods that assume normality.

There are three families of Johnson transformations, conventionally labelled **SB** (bounded), **SL** (lognormal), and **SU** (unbounded). Each family has its own characteristics and uses.

The SB family is used for variables that are bounded on both sides, such as percentages or proportions. It has the form z = γ + δ·ln((x − ε)/(λ + ε − x)), where γ and δ control the shape and ε and λ set the location and range of the bounds.

The SL family is used for variables that are bounded below, such as times or concentrations. It has the form z = γ + δ·ln(x − ε) and corresponds to a shifted lognormal distribution.

The SU family is used for unbounded variables. It has the form z = γ + δ·sinh⁻¹((x − ε)/λ), which can accommodate heavy tails as well as skewness in either direction.

Software such as Minitab fits candidate transformations from all three families and selects the one whose transformed values look most normal.
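The forward transformations of the three Johnson families (SB, SL, SU) can be written as plain functions. This is only a sketch of the formulas; the parameter values below are illustrative, not fitted, whereas Minitab estimates γ, δ, ε, and λ from the data.

```python
import math

def johnson_sb(x, gamma, delta, epsilon, lam):
    """SB (bounded): valid for epsilon < x < epsilon + lam."""
    return gamma + delta * math.log((x - epsilon) / (lam + epsilon - x))

def johnson_sl(x, gamma, delta, epsilon):
    """SL (lognormal): valid for x > epsilon."""
    return gamma + delta * math.log(x - epsilon)

def johnson_su(x, gamma, delta, epsilon, lam):
    """SU (unbounded): valid for all real x."""
    return gamma + delta * math.asinh((x - epsilon) / lam)

# Illustrative parameter values (not fitted to any data set):
print(johnson_sb(0.5, 0.0, 1.0, 0.0, 1.0))  # 0.0: midpoint of the bounds
print(johnson_sl(2.0, 0.0, 1.0, 1.0))       # 0.0: ln(1) = 0
print(johnson_su(0.0, 0.0, 1.0, 0.0, 1.0))  # 0.0: asinh(0) = 0
```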

## What does the p-value mean in Johnson transformation?

The p-value in the Johnson transformation comes from a normality test (Minitab uses the Anderson-Darling test) applied to the transformed data. It reflects how likely it is that data at least this far from normal would arise by chance if the underlying population really were normal. Unlike in many hypothesis tests, here a high p-value is the goal: it indicates that the transformed data is consistent with a normal distribution.

Conversely, a low p-value indicates that the transformed data still departs significantly from normality. Minitab tries multiple candidate transformations, selects the one with the largest normality p-value, and by default only reports a transformation if that p-value exceeds 0.10. If no candidate clears the threshold, you should reconsider the transformation and try another approach.

By understanding p-values in the Johnson transformation, you can judge whether the transformed data is actually fit for analyses that assume normality, and draw more reliable conclusions from them.
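Minitab scores normality with the Anderson-Darling test. As a self-contained stand-in (this is an assumption of the sketch, not Minitab's method), the simpler Jarque-Bera test illustrates the same reading of the p-value: near zero flags non-normality, while a larger value is consistent with normality.

```python
import math
import random

def jarque_bera_p(data):
    """Jarque-Bera normality test; returns an approximate p-value.

    The JB statistic is asymptotically chi-square with 2 degrees of
    freedom, whose survival function is exp(-JB / 2).
    High p-value => data consistent with normality.
    """
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in data) / n)
    skew = sum(((v - mean) / sd) ** 3 for v in data) / n
    kurt = sum(((v - mean) / sd) ** 4 for v in data) / n
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return math.exp(-jb / 2.0)

random.seed(1)
raw = [random.lognormvariate(0.0, 1.0) for _ in range(500)]  # skewed
transformed = [math.log(v) for v in raw]                     # exactly normal

print(f"p before transform: {jarque_bera_p(raw):.4f}")  # near 0
print(f"p after  transform: {jarque_bera_p(transformed):.4f}")
```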

## Example of a Johnson Transformation in Minitab

You have a random variable that is not normally distributed, and you want to use it in statistical analysis that assumes normality.

How do you go about doing a Johnson Transformation in Minitab?

Solution: The steps for doing a Johnson Transformation in Minitab are as follows:

1. Open the Minitab software program and enter your data in a worksheet column.

2. Click on the “Stat” menu, then “Quality Tools”, and select “Johnson Transformation”.

3. Under “Data are arranged as”, choose “Single column” and enter the column that holds the original random variable.

4. Under “Store transformed data in”, enter a column in which to save the transformed values.

5. Optionally, click the “Options” button to change the p-value criterion used to select the best transformation; the default is 0.10.

6. Click the “OK” button to run the analysis. Minitab displays probability plots of the data before and after transformation, together with the selected transformation function.
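For readers without Minitab, the selection logic behind the steps above can be sketched in plain Python. This is not Minitab's actual algorithm (which fits the SB, SL, and SU families and scores them with the Anderson-Darling test); here a few fixed candidate transforms are scored with a Jarque-Bera p-value instead, as a stand-in.

```python
import math
import random

def jarque_bera_p(data):
    """Approximate normality p-value: chi-square(2) tail of the JB statistic."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in data) / n)
    skew = sum(((v - mean) / sd) ** 3 for v in data) / n
    kurt = sum(((v - mean) / sd) ** 4 for v in data) / n
    return math.exp(-(n / 6.0) * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0) / 2.0)

# Candidate transforms, keyed by name (all require positive data):
candidates = {
    "identity": lambda v: v,
    "sqrt": math.sqrt,
    "log": math.log,  # the SL family with gamma=0, delta=1, epsilon=0
}

random.seed(7)
x = [random.lognormvariate(0.0, 0.8) for _ in range(400)]

# Score each candidate and keep the one whose transformed data looks
# most normal (largest p-value), as Minitab does with Anderson-Darling.
scores = {name: jarque_bera_p([f(v) for v in x]) for name, f in candidates.items()}
best = max(scores, key=scores.get)
print(f"best transform: {best}")  # log wins for lognormal data
```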

Learn more about Minitab and the Johnson Transformation

## When to use Data Transformation:

### 1. When the data is non-normal

One reason to use data transformation is when the data is non-normal. Non-normal data is data that does not follow a normal distribution, which is a symmetrical bell-shaped curve. Many statistical tests assume that the data is normal, so transforming non-normal data can help to make the results of these tests more accurate.

### 2. When the data is skewed

Another reason to use data transformation is when the data is skewed. Skewed data is data that is not evenly distributed, with most of the values clustered around one side of the distribution. Skewed data can often be transformed into a more normal distribution, which can be helpful for statistical testing.

### 3. When the data has outliers

Outliers are extreme values that are significantly different from the rest of the data. When outliers are present, they can often have a significant impact on the results of statistical tests. Therefore, it can be helpful to transform the data in order to reduce the impact of outliers on the results.
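A small deterministic example (plain Python, with made-up data values) of how a log transform damps an outlier's pull on the mean:

```python
import math
import statistics

data = [1, 2, 3, 4, 1000]  # one extreme outlier

mean_raw = statistics.mean(data)
print(mean_raw)  # 202.0: dominated by the outlier

# After a log10 transform the outlier contributes 3.0, not 1000,
# so it no longer swamps the other observations:
logged = [math.log10(v) for v in data]
mean_log = statistics.mean(logged)
print(round(mean_log, 3))  # 0.876
```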

### 4. When you want to compare two or more groups of data

Another reason to use transformation is when you want to compare two or more groups of data. Some statistical tests require that the groups being compared have equal variance, which is a measure of how spread out the values are within a group. Transforming the data can often help to equalise the variance between groups, which can make comparisons more accurate.
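A deterministic sketch of variance equalisation (the numbers are made up): group B is group A scaled by 10, so its variance is 100 times larger, yet a log transform makes the two spreads identical, because log(10x) = log(10) + log(x) is just a shift.

```python
import math
import statistics

group_a = [10, 12, 14, 16, 18]
group_b = [100, 120, 140, 160, 180]  # group_a scaled by 10

# Before transforming, the variances differ by a factor of 100:
print(statistics.variance(group_b) / statistics.variance(group_a))  # 100.0

# After a log10 transform the spreads match (up to rounding error):
log_a = [math.log10(v) for v in group_a]
log_b = [math.log10(v) for v in group_b]
print(abs(statistics.variance(log_a) - statistics.variance(log_b)) < 1e-9)  # True
```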

### 5. When you want to stabilise variance over time

If you have data that was collected over time, you may want to use transformation in order to stabilise variance over time. Variance stabilisation helps to ensure that any changes in the mean are not due simply to changes in variance over time, which can make interpretation of results more difficult.

### Conclusion:

The **Johnson Transformation** is a mathematical transformation used to create new variables from existing variables. It is mainly used to create approximately normally distributed variables from non-normal ones. The transformation derives its name from the statistician Norman Lloyd Johnson, who introduced the system of distributions on which it is based in 1949. There are three families of Johnson transformations, SB, SL, and SU, each with its own characteristics and uses depending on the shape and bounds of the data set. Because of its wide array of applications, the Johnson Transformation has become an essential tool for statisticians worldwide.

Learn more on our Lean Six Sigma Black Belt Course