Hello, good to see you!

I made this post to help you prepare for oral and written quizzes or exams on advanced or mathematical statistics.

So basically, I will explain the concepts in a very easy-to-understand way, to help you understand them and be able to answer in your own words.

**Question 1: What is the Partial F-Test and When is it Applied?**

The partial F-test is a statistical test used in multiple linear regression where extra variables have been included, to determine whether the extra variables, as a group, provide enough additional explanatory power.

It is used when the statistical significance of a group of variables is tested simultaneously, and it requires two regression models: a reduced model without the extra variables and a full model with them.
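As a rough sketch of how this could be computed (Python with NumPy; the data, function names, and threshold are my own illustration, not from the material above), the statistic compares the error sums of squares of the two nested models:

```python
import numpy as np

def sse(X, y):
    """Sum of squared errors from an OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(resid @ resid)

def partial_f(X_reduced, X_full, y):
    """Partial F-statistic for the q extra columns present in X_full."""
    n = len(y)
    q = X_full.shape[1] - X_reduced.shape[1]   # number of extra variables
    df_full = n - X_full.shape[1] - 1          # residual df of the full model
    sse_r, sse_f = sse(X_reduced, y), sse(X_full, y)
    return ((sse_r - sse_f) / q) / (sse_f / df_full)

# Synthetic example: x1 and x2 drive y, x3 is pure noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=100)
F = partial_f(X[:, :2], X, y)   # does the extra variable x3 add anything?
```

A large F here would mean the extra group is worth keeping; for the noise column above we would expect a small value, compared against the appropriate F-distribution critical value.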

**Question 2: Outline and Explain the Three Modelling Algorithms**

The three modelling algorithms are:

- Forward Selection
- Backward Elimination
- Stepwise Selection

*Forward Selection*

Step 1: In the first step of this algorithm, from the list of independent variables we select the one whose correlation coefficient with the target variable is the highest in absolute value.

Then calculate the F-statistic to check whether a sufficiently strong linear relationship exists between this variable and the target.

If there is no strong relationship, stop; otherwise keep the variable and go to step 2.

Step 2: In the second step, take the next variable: the one with the highest partial correlation coefficient with the target variable, given the variables already selected. Then calculate F-change for the extended linear regression. If the calculated value no longer shows a significant improvement, stop; otherwise keep the variable and repeat step 2.
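The two steps above can be sketched as a greedy loop (a minimal illustration in Python with NumPy; the threshold value and all names are my own assumptions, and real implementations use proper critical values rather than a fixed cutoff):

```python
import numpy as np

def forward_select(X, y, f_threshold=4.0):
    """Greedy forward selection: at each step add the variable with the
    highest F-change, stopping when no candidate passes the threshold."""
    n, p = X.shape

    def sse(cols):
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r = y - Z @ beta
        return float(r @ r)

    selected = []
    current_sse = float(((y - y.mean()) ** 2).sum())  # intercept-only model
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            new_sse = sse(selected + [j])
            df = n - len(selected) - 2            # residual df after adding j
            f_change = (current_sse - new_sse) / (new_sse / df)
            if best is None or f_change > best[1]:
                best = (j, f_change, new_sse)
        if best is None or best[1] < f_threshold:
            break                                 # stop rule: no significant gain
        selected.append(best[0])
        current_sse = best[2]
    return selected

# Synthetic data: x2 (strongest) and x0 matter, x1 and x3 are noise
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 2] + X[:, 0] + rng.normal(scale=1.0, size=200)
sel = forward_select(X, y)   # picks x2 first, then x0
```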

*Backward Elimination*

This algorithm works in the opposite direction to Forward Selection. In this case, we start by including all the variables from the start, then we identify the least suitable ones. Then we calculate the error. Based on the error value, we remove (eliminate) the least suitable variable from the selection. We repeat the calculation until the error is at a minimum value.
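In the same spirit, a minimal sketch of backward elimination (again Python with NumPy; the fixed threshold and all names are my own simplification):

```python
import numpy as np

def backward_eliminate(X, y, f_threshold=4.0):
    """Start with all variables; repeatedly drop the one whose removal
    costs the least, until every remaining variable is significant."""
    n = len(y)

    def sse(cols):
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r = y - Z @ beta
        return float(r @ r)

    selected = list(range(X.shape[1]))
    while selected:
        full_sse = sse(selected)
        df = n - len(selected) - 1
        # the least suitable variable: removing it increases the error least
        worst = min(selected, key=lambda j: sse([k for k in selected if k != j]))
        f_drop = (sse([k for k in selected if k != worst]) - full_sse) / (full_sse / df)
        if f_drop >= f_threshold:
            break                      # even the weakest variable is significant
        selected.remove(worst)
    return selected

# Synthetic data: x1 and x3 matter, x0 and x2 are noise
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))
y = 2 * X[:, 1] - X[:, 3] + rng.normal(scale=1.0, size=200)
kept = backward_eliminate(X, y)   # the noise columns get eliminated
```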

*Stepwise Selection*

This algorithm is a combination of Forward Selection and Backward Elimination. It involves gradually adding variables to, and removing them from, the list of selected independent variables repeatedly. The stop rule now has both a minimum and a maximum value. We stop when the F-statistic has reached a significant level.

**Question 3: What is the Adjusted Coefficient of Determination (R²) and What is it Used For? What is the Difference Between R² and Adjusted R²?**
The Adjusted Coefficient of Determination is used in multiple regression to determine how well a multiple regression equation fits the sample data.
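To make this concrete, here is a small sketch (Python with NumPy; synthetic data and names are my own) using the common formula adj R² = 1 − (1 − R²)(n − 1)/(n − p − 1):

```python
import numpy as np

def r2_and_adjusted(X, y):
    """R² and adjusted R² for an OLS fit of y on X (with intercept)."""
    n, p = X.shape
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # penalizes extra variables
    return r2, adj

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 1))
y = X[:, 0] + rng.normal(scale=1.0, size=50)
noise = rng.normal(size=(50, 1))                  # an irrelevant extra variable
r2_small, adj_small = r2_and_adjusted(X, y)
r2_big, adj_big = r2_and_adjusted(np.hstack([X, noise]), y)
# r2_big can never drop below r2_small, even though the added column is noise
```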

The main difference between R² and adjusted R² is that R² increases automatically as new independent variables are added to the regression equation (even if they don't contribute to the explanatory power of the equation).
However, the adjusted R² increases only if the newly added independent variables contribute to the explanatory power of the regression equation. Therefore, the adjusted R² is a more useful measure of regression fit than R² alone.

**Question 4: What is Heteroscedasticity?**

Heteroscedasticity is a concept in regression that means unequal scatter: a systematic variation in the spread of the residuals over the range of measured values. Remember that a residual is the error measured between the regression line and a data point. Heteroscedasticity is the opposite of homoscedasticity (which means constant variance in the residuals).
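A quick way to see the pattern (a synthetic sketch in Python with NumPy, my own illustration): generate data whose error spread grows with x, fit a line, and compare the residual spread in the lower and upper halves of the x range:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 10, size=500))
# error scale grows with x: a classic heteroscedastic pattern
y = 1.5 * x + rng.normal(scale=0.2 + 0.3 * x)

# fit a straight line and inspect the residuals
coef = np.polyfit(x, y, 1)
resid = y - np.polyval(coef, x)

spread_low = resid[:250].std()    # residual spread for small x
spread_high = resid[250:].std()   # residual spread for large x
# spread_high is clearly larger: unequal scatter
```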

**Question 5: What are the Types of Residuals?**

- Common Residual
- Deleted Residual
- Standardized Residual
- Studentized Residual

**Question 6: What is Covariance Ratio?**

The covariance ratio is given by:

CovR = CovTL / CovTL(j)

where:

CovTL is the covariance between the target variable and the linear regression, and

CovTL(j) is the covariance between the target variable and the linear regression obtained by omitting the j-th case.

**Question 7: What is Multicollinearity in Regression?**

Multicollinearity means a linear correlation relationship between two or more explanatory variables.

**Question 8: Mention Some Problems Caused by Multicollinearity**

- the effects of the explanatory variables on the dependent variable cannot really be separated
- one explanatory variable can assume the role of another explanatory variable
- the estimation of the regression coefficients becomes unreliable
- in extreme situations, the analysis cannot be performed on the data at all
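The "unreliable coefficients" problem can be shown with a small simulation (Python with NumPy; the data and names are my own illustration): refit the same model many times, once with independent predictors and once with two nearly identical predictors, and compare how much the estimated coefficient jumps around:

```python
import numpy as np

rng = np.random.default_rng(6)

def x1_coef_estimates(collinear, n_sims=200, n=100):
    """Repeatedly refit y = x1 + x2 + noise and collect the estimate of the
    x1 coefficient; x2 is either independent of x1 or nearly a copy of it."""
    coefs = []
    for _ in range(n_sims):
        x1 = rng.normal(size=n)
        if collinear:
            x2 = x1 + rng.normal(scale=0.05, size=n)   # almost a duplicate of x1
        else:
            x2 = rng.normal(size=n)
        y = x1 + x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        coefs.append(beta[1])
    return np.array(coefs)

spread_indep = x1_coef_estimates(collinear=False).std()
spread_coll = x1_coef_estimates(collinear=True).std()
# with near-duplicate predictors, the estimate varies far more run to run
```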

We continue with the next part. You can ask a question in the box below or in the form on the left of this page.

Thank you!