Good to see you again! Hope you are doing fine 😀

The Run Test is one of the most interesting statistical tests, and also one of the easiest to understand. If you have any difficulty following this lesson, let me know in the comment box below 👇👇👇 and I will clarify.

In this simple lesson, we are going to explain, in simple and clear terms, the concept of the run test in statistics.

**Content**

- What is the Run Test?
- What is a Run?
- Example of Runs
- Run Test Procedure
- Hypothesis
- Test Statistic and Decision Rule
- Critical Value
- Example 1 and Solution
- Example 2 and Solution
- Example 3 and Solution
- Solving With Formula for Large Samples
- Final Notes

### 1. What is the Run Test?

The run test is a statistical test used to determine whether the data obtained from a sample are random. That is why it is also called the Run Test for Randomness.

Randomness is judged from the number and nature of the runs present in the data of interest.

### 2. What is a Run?

A run is a sequence of similar or like events, items or symbols that is preceded by and followed by an event, item or symbol of a different type, or by none at all.

A series is unlikely to be random when it appears to contain either too many or too few runs. To decide this objectively, a run test needs to be carried out.

When performed, the run test helps us decide whether a sequence of events, items or symbols is the result of a random process.

### 3. Example of Runs

A data scientist carrying out research interviewed 10 people during a survey. We denote the genders of the people by M for men and F for women.

Assume the respondents were chosen in one of the following orders:

*Scenario 1*

M M M M M F F F F F

*Scenario 2*

F M F M F M F M F M

*Scenario 3*

F F F M M F M M F F

Scenario 1 has only 2 runs, so it cannot be considered random because there are too few runs.

Scenario 2 has 10 runs, the maximum possible for 10 observations, so it would not be considered random because there are too many runs.

Scenario 3 has 5 runs, so we need to perform a test to determine the randomness of the data.
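Counting runs by eye gets error-prone for longer sequences, so here is a minimal Python sketch that splits each scenario into its runs and counts them. The helper name `split_into_runs` is only an illustration and is not part of the original lesson.

```python
from itertools import groupby


def split_into_runs(sequence):
    """Return the runs of a sequence as a list of strings, in order.

    groupby() groups consecutive equal items, which is exactly
    the definition of a run given above.
    """
    return ["".join(group) for _, group in groupby(sequence)]


# The three survey scenarios, written without spaces.
scenarios = {
    "Scenario 1": "MMMMMFFFFF",
    "Scenario 2": "FMFMFMFMFM",
    "Scenario 3": "FFFMMFMMFF",
}

for name, seq in scenarios.items():
    runs = split_into_runs(seq)
    print(name, runs, "number of runs =", len(runs))
```

Each change of symbol starts a new run, so a strictly alternating sequence such as Scenario 2 has as many runs as it has observations.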

### 4. Run Test Procedure

First, we need to assume that the data available for analysis consist of a sequence of observations, recorded in order of occurrence, which we can categorize into two mutually exclusive types.

Next, determine the total sample size and the number of observations of each type, as presented below:

$n$ = total sample size

$n_1$ = the number of observations of one type

$n_2$ = the number of observations of the other type

**Hypothesis**

Then state the null and alternative hypotheses.

A. TWO-SIDED

$H_0$: the pattern of occurrence is random

$H_1$: the pattern of occurrence is not random

B. ONE-SIDED

$H_0$: the pattern of occurrence is random

$H_1$: the pattern of occurrence is not random (because there are too few runs to be attributed to chance)

C. ONE-SIDED

$H_0$: the pattern of occurrence is random

$H_1$: the pattern of occurrence is not random (because there are too many runs to be attributed to chance)

**Test Statistic and Decision Rule**

The test statistic is r = total number of runs.

The decision rule is also called the acceptance or rejection criterion. It depends on the calculated test statistic and on the lower and upper critical values obtained from statistical tables.

Table 1: Decision Rule

**Critical Value**

The critical values are determined from the statistical table using $n_1$ and $n_2$.

We can solve some examples to clarify this.
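The decision rule for the three kinds of hypotheses can be sketched in a few lines of Python. This is only a sketch: it assumes the common table convention that the null hypothesis is rejected when r is less than or equal to the lower critical value or greater than or equal to the upper one, so check the convention printed with your own table. The function name and the `alternative` labels are made up for illustration.

```python
def runs_test_decision(r, lower, upper, alternative="two-sided"):
    """Return True if H0 (the pattern is random) is rejected.

    r           -- observed number of runs (the test statistic)
    lower/upper -- critical values read from a runs-test table
    alternative -- "two-sided", "too-few" or "too-many" (hypothetical labels)

    Assumption: boundaries are inclusive, i.e. a value of r equal
    to a critical value leads to rejection.
    """
    if alternative == "two-sided":
        return r <= lower or r >= upper
    if alternative == "too-few":
        return r <= lower
    if alternative == "too-many":
        return r >= upper
    raise ValueError("alternative must be 'two-sided', 'too-few' or 'too-many'")
```

Keep in mind that the critical values for a one-sided test at a given significance level are generally not the same as those for a two-sided test, so they have to be looked up separately.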

### 5. Example 1

On a commuter train, the conductor wants to see whether the passengers enter the train in a random manner. He observes the first 25 people, with the following sequence of males (M) and females (F).

F F F M M F F F F M F M M M F F F F M M F F F M M

Test for randomness at α = 0.05.

**Solution Steps**

**Step 1:** State the null and alternative hypotheses

$H_0$: The pattern of occurrence of males and females entering the train is random

$H_1$: The pattern of occurrence of males and females entering the train is not random

**Step 2:** Find the test statistic (number of runs)

You can easily get this by grouping each run as shown below:

FFF MM FFFF M F MMM FFFF MM FFF MM

Test statistic, r = 10

$n_1$ = number of females = 15

$n_2$ = number of males = 10

**Step 3:** Find the critical value

We can find the lower and upper critical values from the statistical table for the runs test:

$n_1$ = 15, $n_2$ = 10

Lower critical value = 7

Upper critical value = 18

**Step 4:** Make your decision

Since r = 10, which is between 7 and 18, we do not reject the null hypothesis.

**Step 5:** Draw a conclusion

There is not enough evidence to reject the claim that the pattern of occurrence of males and females entering the train is determined by a random process.
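As a cross-check on the hand count, the same quantities for this example can be computed with a short Python sketch. The critical values 7 and 18 are simply copied from the solution above, and the inclusive comparison is an assumption about the table convention.

```python
from collections import Counter
from itertools import groupby

# The observed sequence of passengers, in order of boarding.
sequence = "F F F M M F F F F M F M M M F F F F M M F F F M M".split()

counts = Counter(sequence)
n1 = counts["F"]  # number of females
n2 = counts["M"]  # number of males

# Each group of consecutive equal symbols is one run.
r = sum(1 for _ in groupby(sequence))

# Critical values taken from the worked solution (table lookup, not computed).
lower, upper = 7, 18
reject_h0 = r <= lower or r >= upper

print("n1 =", n1, "n2 =", n2, "r =", r)
print("reject H0" if reject_h0 else "do not reject H0")
```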

### 6. Example 2

We have 20 people who enrolled in a drug abuse program. Test the claim that the ages of the people, in the order in which they enrolled, occur at random, at α = 0.05.

The data are as follows:

18, 36, 19, 22, 25, 44, 23, 27, 27, 35, 19, 43, 37, 32, 28, 43, 46, 19, 20, 22

**Solution Steps**

**Step 1:** State the hypotheses and identify the claim

The claim is the null hypothesis $H_0$; the alternative hypothesis is $H_1$.

$H_0$: The pattern of occurrence of ages of the people enrolled in a drug abuse program is determined by a random process

$H_1$: The pattern of occurrence of ages of the people enrolled in a drug abuse program is not random

**Step 2:** Find the test statistic (number of runs)

To find the number of runs, we first arrange the data in ascending order and find the median of the data set. Here the median is 27.

Then compare each value in the original sequence with the median. Replace the value with an A if it is above the median and with a B if it is below the median. Values equal to the median (the two 27s here) are dropped, which leaves 18 observations. (You can also use the mean instead of the median.)

I did this in Excel. Arranging the resulting letters according to runs gives the output below:

B A BBB A B A B AAAAAA BBB

From the above we have

Test statistic, r = 9

$n_1$ = number of A's = 9

$n_2$ = number of B's = 9

**Step 3:** Find the critical value

$n_1$ = 9, $n_2$ = 9

From the statistical table for the runs test, we get the critical values:

Lower critical value = 5

Upper critical value = 15

**Step 4:** Make the decision

Since the statistic r = 9 is between the lower and upper critical values, we do not reject $H_0$.

**Step 5:** Draw your conclusion

There is not enough evidence to reject the claim that the pattern of occurrence of ages of people in the program is determined by a random process.
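The median split used in this example does not need a spreadsheet; here is a minimal Python sketch of the same steps. It assumes that values equal to the median are dropped, which is consistent with the 18 letters in the sequence of runs shown above.

```python
from itertools import groupby
from statistics import median

ages = [18, 36, 19, 22, 25, 44, 23, 27, 27, 35,
        19, 43, 37, 32, 28, 43, 46, 19, 20, 22]

med = median(ages)

# Label each age relative to the median, keeping the original order.
# Assumption: ages equal to the median are left out of the sequence.
labels = ["A" if age > med else "B" for age in ages if age != med]

# Group consecutive equal labels into runs.
runs = ["".join(group) for _, group in groupby(labels)]

r = len(runs)            # test statistic: number of runs
n1 = labels.count("A")   # observations above the median
n2 = labels.count("B")   # observations below the median

print("median =", med)
print(" ".join(runs))
print("r =", r, "n1 =", n1, "n2 =", n2)
```

Note that $n_1$ and $n_2$ count individual observations above and below the median, not the number of A runs and B runs.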

### 7. Example 3

Table 1.0 shows the departures from normal of daily temperatures recorded at Atlanta, Georgia during February 1969. We would like to know whether we may conclude that the pattern of departures above and below normal is the result of a non-random process.

**Solution Steps**

**Step 1:** State the null and alternative hypotheses

$H_0$: The pattern of occurrence of negative and positive departures from normal is determined by a random process

$H_1$: The pattern of occurrence of negative and positive departures from normal is not determined by a random process (claim)

**Step 2:** Find the test statistic (number of runs)

To get the number of runs, we need to classify each departure from normal as above or below zero. Departures above 0 are recorded as A and those below 0 are recorded as B.

If we do this, we get the following arrangement:

AAAAAA B A B AA BBBBB AAAAAAAA BBBBBB

Test statistic (number of runs), r = 8

$n_1$ = number of A's = 17

$n_2$ = number of B's = 13

**Step 3:** Find the critical value

$n_1$ = 17, $n_2$ = 13

Using the statistical table, we find:

Lower critical value = 10

Upper critical value = 22

**Step 4:** Make the decision

Since r = 8, which is lower than the lower critical value of 10, we reject the null hypothesis ($H_0$).

**Step 5:** Draw the conclusion

There is enough evidence to support the claim that the pattern of occurrence of positive and negative departures from normal is not random.

### 8. Formula for Large Samples

What if the sample size is large? In this case we can use a formula instead of the runs table. This formula calculates the test statistic from the number of runs r together with $n_1$ and $n_2$.

The formula is given by:
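A standard form of the large-sample normal approximation for the runs test, assumed here to be the formula this section refers to, uses the mean $\mu_r$ and standard deviation $\sigma_r$ of the number of runs under randomness (these two symbols are introduced only for this sketch):

$$
\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad
\sigma_r = \sqrt{\frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}}, \qquad
z = \frac{r - \mu_r}{\sigma_r}
$$

For instance, with $n_1 = 15$ and $n_2 = 10$ as in Example 1, $\mu_r = \frac{2 \cdot 15 \cdot 10}{25} + 1 = 13$ and $\sigma_r^2 = \frac{300 \cdot 275}{625 \cdot 24} = 5.5$.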

Then we can look up the critical value in the table of the standard normal distribution.
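The large-sample calculation can be sketched in Python as follows. The sketch assumes the usual normal approximation, in which the number of runs has mean 2·n1·n2/(n1+n2) + 1 and variance 2·n1·n2·(2·n1·n2 − n1 − n2) / ((n1+n2)²·(n1+n2−1)) under randomness. The function name is hypothetical, and the critical value comes from Python's standard library instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist


def runs_z_statistic(r, n1, n2):
    """Return the z statistic of the runs test (normal approximation)."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (r - mean) / sqrt(variance)


alpha = 0.05
# Two-sided critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Values from Example 1: r = 10 runs, 15 females, 10 males.
z = runs_z_statistic(r=10, n1=15, n2=10)

print(f"z = {z:.3f}, critical value = {z_crit:.3f}")
print("reject H0" if abs(z) >= z_crit else "do not reject H0")
```

With small samples like those in the examples, the approximation is rough, so treat the table-based result as the reference and this one as a comparison.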

### 9. Final Notes

You have now completed the lesson on the run test. Thumbs up to you! One thing you can be sure of is that it does not get more complicated than this.

Just as a quiz, try to solve the three examples using the formula presented, and compare your results with those obtained without the formula.