SmartstatXL offers various types of regression analysis to model the relationship between independent and dependent variables. One type of analysis that can be carried out with SmartstatXL is Ordinal Regression.

Ordinal Regression is specifically used to model dependent variables that are ordinal in nature, i.e., they have categories with a specific order. Although it is a part of regression analysis, ordinal regression distinguishes itself by focusing on ordinal dependent variables, while the independent variables can be ordinal, interval, or ratio.

In statistics, the dependent variable is also often referred to as the response, endogenous variable, prognostic variable, or regressand. Meanwhile, independent variables may be known as exogenous variables, predictor variables, or regressors. As a practical example of an ordinal variable, consider movie ratings given on a scale of 1 to 5.

Key Features of Ordinal Regression Analysis in SmartstatXL

SmartstatXL offers various advanced features to support ordinal regression analysis, including:

  1. Regression Diagnostics:
    • Information about outlier data.
  2. Reference Settings:
    • Ability to specify the order of response values as the reference, either based on the lowest or highest order.
  3. Analysis Output:
    • Regression Equation.
    • Regression/Goodness-of-Fit Statistics: R², Cox-Snell R², Nagelkerke R², AIC, AICc, BIC, and Log Likelihood.
    • Coefficient Estimates: Including Coefficient Value, Standard Error, Wald Stat, p-value, Upper/Lower confidence limits, and VIF.
    • Deviance Analysis Table.
    • Confusion Matrix: Presents the Classification Table and various related metrics.

Case Example

This research aims to identify the factors influencing the decisions of third-year students to continue on to graduate school. They were asked how likely they were to continue: "not likely," "likely," or "very likely." The outcome variable for this research is divided into three categories based on their answers. Additionally, data about their parents' educational background, the type of university (public or private), and the students' Grade Point Average (GPA) were also collected. Interestingly, the researcher believes that the "distances" between the three answer categories are not uniform: for instance, the difference between "not likely" and "likely" may be smaller than the difference between "likely" and "very likely."

In this data, there is a variable named "apply" that reflects the students' answers with code 0 for "not likely," 1 for "likely," and 2 for "very likely." Aside from "apply," there are three other variables used as supporting factors in the analysis: "pared" indicating whether either parent has a postgraduate degree (0 for no and 1 for yes); "public" signifying the type of university (0 for private and 1 for public); and "gpa" indicating the students' average grade.

Source: https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/
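
The variable coding can be sketched as follows; the three records below are hypothetical illustrations of the coding scheme, not rows taken from the actual dataset:

```python
# Coding scheme for the graduate-school application data:
# apply:  0 = "not likely", 1 = "likely", 2 = "very likely"
# pared:  1 if either parent has a postgraduate degree, else 0
# public: 1 if the university is public, 0 if private
# gpa:    Grade Point Average (continuous)
APPLY_LABELS = {0: "not likely", 1: "likely", 2: "very likely"}

rows = [  # hypothetical example records
    {"apply": 0, "pared": 0, "public": 1, "gpa": 3.10},
    {"apply": 1, "pared": 1, "public": 0, "gpa": 2.89},
    {"apply": 2, "pared": 0, "public": 0, "gpa": 3.26},
]

for r in rows:
    print(APPLY_LABELS[r["apply"]], r["pared"], r["public"], r["gpa"])
```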

Steps for Ordinal Regression Analysis

  1. Activate the worksheet (Sheet) to be analyzed.
  2. Place the cursor on the dataset (for dataset preparation, see Data Preparation guide).
  3. If the active cell is not on the dataset, SmartstatXL will automatically attempt to identify the dataset.
  4. Activate the SmartstatXL Tab.
  5. Click the menu Regression > Ordinal Regression.

  6. SmartstatXL will display a dialog box to confirm whether the dataset is correct (usually, the dataset is selected automatically and correctly).
  7. If it is correct, click the Next button.
  8. Next, the Ordinal Regression Analysis Dialog Box will appear. Select the Factor Variables (Independent) and one or more Response Variables (Dependent).
  9. Press the "Next" button
  10. Select the regression output as shown in the following display:

    The reference option can be either the first (lowest) or the last (highest) outcome level. In this example, the outcome consists of 3 levels/classes, namely 0, 1, and 2. Here, the first level (number 0) is used as the reference.
  11. Press the OK button to generate the output in the Output Sheet.

Analysis Results

Analysis Information: type of regression used, response, and predictors

Type of Analysis: The regression used is Ordinal Regression. This means the response variable (or dependent) has an order or categories that can be sorted, like in this case, which are "not likely," "likely," and "very likely."

Response and Predictor Variables:

  • The response variable is "apply," which reflects the students' decision to proceed to graduate school.
  • The predictor variables used are: "pared," "public," and "gpa."

Regression Equation

In ordinal logistic regression, each regression equation describes the cumulative log-odds of a response falling at or below a given category. In this case, the reference category is the lowest one, "not likely" (code 0), and the two equations correspond to the two thresholds separating the three categories.

Here are the regression equations:

  1. Threshold between "not likely" and "likely": Y = 2.2033 − 1.0477×pared + 0.0587×public − 0.6157×gpa
  2. Threshold between "likely" and "very likely": Y = 4.2988 − 1.0477×pared + 0.0587×public − 0.6157×gpa

In each equation, Y is the cumulative log-odds that a student's answer falls at or below the threshold. The slope coefficients are identical in both equations; only the intercepts differ. This reflects the proportional odds assumption of ordinal regression.
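
These equations can be turned into category probabilities with the logistic (inverse-logit) function: the probability of falling at or below each threshold is the inverse logit of Y, and individual category probabilities follow by differencing. A minimal sketch in Python, assuming the standard cumulative-logit form; the example student (pared = 0, public = 0, gpa = 3.26) is the first observation shown in the residual tables later in this document:

```python
import math

def inv_logit(y):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-y))

def category_probs(pared, public, gpa):
    # Shared predictor part of both equations (coefficients from the output)
    eta = -1.0477 * pared + 0.0587 * public - 0.6157 * gpa
    p_le_0 = inv_logit(2.2033 + eta)   # P(apply <= "not likely")
    p_le_1 = inv_logit(4.2988 + eta)   # P(apply <= "likely")
    # Individual category probabilities by differencing the cumulative ones
    return p_le_0, p_le_1 - p_le_0, 1.0 - p_le_1

p0, p1, p2 = category_probs(pared=0, public=0, gpa=3.26)
print(round(p0, 4), round(p1, 4), round(p2, 4))  # ≈ 0.5489 0.3593 0.0918
```

The three probabilities reproduce the predicted values 0.5488 and 0.3593 reported in the residual tables, which supports this reading of the equations.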

Coefficient Interpretation:

  • pared: For each one-unit increase in "pared" (e.g., from not having a parent with a postgraduate degree to having one), the log-odds of a response falling at or below a given category decrease by 1.0477 units, holding other variables constant. In other words, such students are more likely to answer "likely" or "very likely."
  • public: For each one-unit increase in "public" (e.g., from a private to a public university), the log-odds of a response falling at or below a given category increase by 0.0587 units, holding other variables constant, i.e., a slightly lower tendency to answer "likely" or "very likely."
  • gpa: For each one-unit increase in "gpa," the log-odds of a response falling at or below a given category decrease by 0.6157 units, holding other variables constant, so students with higher GPAs are more likely to answer "likely" or "very likely."

Model Statistics:

  • R2 is 0.033, which means that this model explains approximately 3.3% of the variability in the response data.
  • The Chi-Squared value is 24.180 with a significance of 0.00. This indicates that the overall model is significant in predicting the response variable.

From these analysis results, the R2 value is small, but the Chi-Squared shows significance. Both statistics provide different information about the model and are important in evaluating the quality and fit of the model. Let's elaborate further:

  1. R2 (Coefficient of Determination):
    • R2 measures how well the regression model explains the variability in the response data. In the context of ordinal logistic regression, this can be interpreted as how well the model explains changes in the probability of response categories.
    • A small R2 value, like 0.033 in this case, indicates that the model only explains about 3.3% of the variability in the response data. This means there is a lot of variability unexplained by the model, which might be due to other factors not included in the model or the intrinsic nature of the data itself.
  2. Chi-Squared:
    • The Chi-Squared test in the context of logistic regression measures how well our model predicts the response variable compared to a model without predictors (null model).
    • In this case, a significant Chi-Squared value indicates that the model with predictors ("pared", "public", and "gpa") is better at predicting "apply" compared to a model without any predictors.

Why the Discrepancy?

  • It is possible to have a model with a relatively small R2 but still significant in the Chi-Squared test. This might mean that although our model provides a significant improvement compared to a model without predictors, it still doesn't explain most of the variability in the response data.
  • Another factor that can influence this is the sample size. With a large sample, you might achieve statistical significance (as indicated by the Chi-Squared test) even if the actual effect is small (as indicated by a low R2).

In practice, it is important to consider both of these statistics as well as the research context when evaluating the quality and fit of the regression model.

Model Goodness of Fit

The analysis results show various goodness-of-fit metrics for the regression model. Let's explain each of these metrics:

  1. R2 (Pseudo R2): 0.0326
    • In the context of logistic regression, R2 is often referred to as Pseudo R2 and is not the square of the correlation coefficient as in linear regression. This value indicates how well the model explains the variability in the response data. In this case, the model accounts for about 3.26% of the variability.
  2. Cox-Snell R2: 0.0587
    • This is one method for calculating Pseudo R2 in logistic regression. This value suggests that the model explains about 5.87% of the variability.
  3. Nagelkerke R2: 0.0696
    • This is another adaptation of R2 for logistic regression and often gives a higher value compared to other methods. In this case, the model explains about 6.96% of the variability.
  4. AIC (Akaike Information Criterion): 727.0249
    • AIC is a metric that measures the relative quality of a statistical model for a given data set. A model with a lower AIC is considered better. AIC takes into account both the complexity of the model and how well it fits the data.
  5. AICc (Akaike Information Criterion with correction): 727.1772
    • Similar to AIC but with a correction for sample size. It is usually used when the ratio of the number of observations to the number of parameters in the model is small (e.g., less than 40).
  6. BIC (Bayesian Information Criterion): 746.9822
    • Like AIC, but imposes a larger penalty for more complex models. A model with a lower BIC is considered better.
  7. Log Likelihood: -358.5124
    • This measures how well the model fits the data; it is the logarithm of the probability of the observed data under the fitted model. The higher (closer to zero) the log likelihood, the better the fit.

From the above results, it appears that the regression model's goodness of fit is relatively low based on the R2 values and other variants. Nevertheless, AIC and BIC can be used to compare this model with other alternative models to determine which has the best goodness of fit.
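
These goodness-of-fit figures can be reproduced from the log likelihood alone. A sketch in Python, assuming n = 400 observations, k = 5 estimated parameters (2 intercepts + 3 slopes), and the intercept-only deviance of 741.2053 taken from the deviance analysis table; the reported Pseudo R2 matches McFadden's formula:

```python
import math

n = 400                  # number of observations
k = 5                    # parameters: 2 intercepts + 3 slopes
ll_model = -358.5124     # log likelihood of the fitted model
ll_null = -741.2053 / 2  # intercept-only model (deviance = -2 * log likelihood)

mcfadden = 1 - ll_model / ll_null
cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n))

aic = -2 * ll_model + 2 * k
aicc = aic + 2 * k * (k + 1) / (n - k - 1)
bic = -2 * ll_model + k * math.log(n)

print(round(mcfadden, 4))    # 0.0326
print(round(cox_snell, 4))   # 0.0587
print(round(nagelkerke, 4))  # 0.0696
print(round(aic, 2))         # 727.02
print(round(aicc, 2))        # 727.18
print(round(bic, 2))         # 746.98
```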

Regression Coefficient Estimation

Here is the interpretation for the coefficient estimates of the ordinal logistic regression model:

  1. Intercept1: ->1 (2.203)
    • This is the intercept (threshold) separating the "not likely" category from the higher categories.
    • The value 2.203 is the log-odds that a student answers "not likely" rather than a higher category when all predictor variables are zero.
    • A Wald Stat of 7.989 with a p-value of 0.005 indicates that this intercept is significant at the 1% level.
  2. Intercept2: ->2 (4.299)
    • This is the intercept (threshold) separating the "likely" category from "very likely."
    • The value 4.299 is the log-odds that a student answers "not likely" or "likely" rather than "very likely" when all predictor variables are zero.
    • A Wald Stat of 28.565 with a p-value less than 0.0001 indicates that this intercept is highly significant at the 1% level.
  3. pared (-1.048)
    • This negative coefficient lowers the log-odds of being in a lower response category. If one of the parents has a graduate degree, the student is therefore more likely to answer "likely" or "very likely" to continue to graduate school.
    • A Wald Stat of 15.537 with a p-value less than 0.0001 indicates that this variable is highly significant at the 1% level.
  4. public (0.059)
    • This positive coefficient slightly raises the log-odds of being in a lower response category; students from public universities are thus marginally less likely to answer "likely" or "very likely."
    • However, the Wald Stat is only 0.039 with a p-value of 0.844, indicating that this variable is not significant.
  5. gpa (-0.616)
    • This negative coefficient indicates that every 1-unit increase in GPA lowers the log-odds of being in a lower response category; students with higher GPAs are more likely to answer "likely" or "very likely."
    • A Wald Stat of 5.581 with a p-value of 0.018 indicates that this variable is significant at the 5% level.

From the analysis results above, it can be concluded that the factors "pared" and "gpa" have a significant impact on the students' decisions to continue to graduate school, while the "public" factor is not significant.
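
A common way to make these coefficients tangible is to exponentiate them into odds ratios. A short sketch in Python; note the sign convention assumed here: because the reported coefficients act on the log-odds of the lower categories, each coefficient is negated before exponentiating to obtain the odds ratio for a higher response:

```python
import math

# Coefficients as reported (acting on the log-odds of the lower categories)
coefs = {"pared": -1.0477, "public": 0.0587, "gpa": -0.6157}

# Odds ratio for a HIGHER response per one-unit increase in each predictor:
# exponentiate the negated coefficient (cumulative-logit sign convention).
odds_ratios = {name: math.exp(-b) for name, b in coefs.items()}
for name, oratio in odds_ratios.items():
    print(f"{name}: {oratio:.3f}")  # pared: 2.851, public: 0.943, gpa: 1.851
```

Read this way, a student with a parent holding a postgraduate degree has about 2.85 times the odds of giving a higher answer, and each additional GPA point multiplies those odds by about 1.85.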

Deviance Analysis

The analysis results show the deviance analysis table for the ordinal logistic regression model. Deviance is a measure of how well the model fits the data, similar to the sum of squared residuals in linear regression. Let's interpret the table:

  1. Regression:
    • DF (Degrees of Freedom) = 4: the degrees of freedom attributed to the predictor terms in the model.
    • Deviance = 24.180: the reduction in deviance achieved by adding the predictors, i.e., the difference between the deviance of the intercept-only model and that of the fitted model. A larger reduction indicates a greater improvement in fit.
    • P-value = 0.000: the result of the deviance (likelihood-ratio) test comparing the fitted model with a model that only has an intercept. A very low p-value (less than 0.01) indicates that the model is significantly better than a model without predictors.
    • Chi.05 and Chi.01: the critical values of the chi-square distribution with 4 degrees of freedom at the 5% and 1% significance levels. Because the regression deviance (24.180) is higher than both of these values, the model is significant at both significance levels.
  2. Error:
    • This is the residual deviance of the fitted model: the variability left unexplained after the predictors are included. It equals −2 times the log likelihood of the fitted model.
    • DF: 395
    • Deviance: 717.0249
  3. Total:
    • This is the deviance of the intercept-only model: the total variability to be explained.
    • DF: 399
    • Deviance: 741.2053
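
The three rows fit together arithmetically: the regression deviance is the drop from the intercept-only (Total) deviance to the fitted model's (Error) deviance. A quick check in Python, using the standard chi-square critical values for 4 degrees of freedom:

```python
total_deviance = 741.2053   # intercept-only model
error_deviance = 717.0249   # fitted model (residual deviance)

# Deviance explained by the predictors
regression_deviance = total_deviance - error_deviance
print(round(regression_deviance, 4))  # 24.1804

# Standard chi-square critical values for df = 4
chi_05, chi_01 = 9.488, 13.277
print(regression_deviance > chi_05, regression_deviance > chi_01)  # True True
```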

From this deviance analysis table, it can be concluded that the regression model has a significant fit with the data, and there is strong evidence that at least one predictor in the model is significant in explaining the variability in the response variable ("apply").

Classification Table (Confusion Matrix)

Confusion Matrix

The analysis results show the classification table, also known as a confusion matrix, for the ordinal logistic regression model. This table illustrates how well the model predicts the response categories based on actual data. Let's interpret the table:

  1. Actual vs. Prediction for category 0 ("unlikely"):
    • Out of a total of 220 respondents who actually fall into the "unlikely" category, the model correctly predicts 201 of them, while 19 others are misclassified into the "likely" category. None are classified as "very likely".
    • The classification accuracy for this category is 201/(201+19+0) × 100% = 91.36%.
  2. Actual vs. Prediction for category 1 ("likely"):
    • Out of a total of 140 respondents who actually fall into the "likely" category, the model correctly predicts only 30 of them. As many as 110 respondents are misclassified as "unlikely," and none are classified as "very likely."
    • The classification accuracy for this category is 30/(110+30+0) × 100% = 21.43%.
  3. Actual vs. Prediction for category 2 ("very likely"):
    • The model fails to correctly predict any respondents who actually fall into the "very likely" category. A total of 27 respondents are classified as "unlikely," and 13 others as "likely."
    • The classification accuracy for this category is 0.00%.
  4. Overall percentage correctly classified:
    • Out of all respondents, the model correctly classifies 57.75% of them.

From this classification table, we can see that the model performs well in predicting the "unlikely" category but does not perform well in predicting the "likely" and "very likely" categories. This may indicate that the model needs improvement or that additional variables may be required to enhance the model's performance in predicting these categories.

Other Classification Metrics

These metrics provide an overview of how well the model predicts each category.

  1. Recall (Sensitivity or True Positive Rate):
    • Indicates the proportion of actual positives correctly identified by the model out of all actual positives.
    • For category 0 ("unlikely"), the recall is 91.36%. This means the model correctly identifies 91.36% of all actual "unlikely" cases.
    • For category 1 ("likely"), the recall is 21.43%. This means the model only identifies 21.43% of all actual "likely" cases.
    • For category 2 ("very likely"), the recall is 0%. This means the model fails to identify any actual "very likely" cases.
  2. Precision:
    • Indicates the proportion of positives identified by the model that are actually positive.
    • For category 0, the precision is 59.47%. This means that of all "unlikely" predictions made by the model, 59.47% are actually "unlikely."
    • For category 1, the precision is 48.39%. This means that of all "likely" predictions made by the model, 48.39% are actually "likely."
    • For category 2, no precision is given (indicated by "*") because the model makes no predictions for this category.
  3. F1 Score:
    • Is the harmonic mean of recall and precision. It provides a balance between recall and precision.
    • For category 0, the F1 Score is 72.04%.
    • For category 1, the F1 Score is 29.70%.
    • For category 2, the F1 Score is not provided because the model makes no predictions for this category.

From these classification metrics, it can be seen that the model performs best for the "unlikely" category, but performance for the "likely" and "very likely" categories needs improvement.
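
All of the metrics above can be recomputed directly from the confusion matrix. A sketch in Python (rows are actual categories 0-2, columns are predicted categories):

```python
# Confusion matrix: rows = actual, columns = predicted (categories 0, 1, 2)
matrix = [
    [201, 19, 0],   # actual "not likely"
    [110, 30, 0],   # actual "likely"
    [ 27, 13, 0],   # actual "very likely"
]

n_total = sum(sum(row) for row in matrix)
accuracy = sum(matrix[i][i] for i in range(3)) / n_total

metrics = {}
for i in range(3):
    actual_total = sum(matrix[i])                    # row sum
    predicted_total = sum(row[i] for row in matrix)  # column sum
    recall = matrix[i][i] / actual_total
    # Precision is undefined when the model never predicts this category
    precision = matrix[i][i] / predicted_total if predicted_total else None
    f1 = (2 * precision * recall / (precision + recall)
          if precision and (precision + recall) > 0 else None)
    metrics[i] = (recall, precision, f1)

print(f"overall accuracy: {accuracy:.2%}")  # 57.75%
for i, (recall, precision, f1) in metrics.items():
    print(i, recall, precision, f1)
```

Category 2 never appears among the predictions, which is why its precision and F1 Score are undefined (shown as "*" in the output).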

Residual Table and Outlier Data Examination

Class: 1

Class: 2

The results are snippets of tables for two classes (1 and 2) containing various prediction and residual metrics. Let's explain each column in these tables:

  1. pared, public, gpa: These are the values of the predictors in the dataset. "pared" indicates whether either parent has a postgraduate degree or not, "public" indicates the type of university (public or private), and "gpa" is the student's Grade Point Average.
  2. apply: This is the actual value of the response variable. In this case, the value of "apply" indicates whether the student decided to proceed to graduate school or not, with category 0 for "unlikely," 1 for "likely," and 2 for "very likely."
  3. Predicted: This is the model's predicted value for each observation. It shows the probability of each observation falling into a particular category.
  4. Residual: This is the difference between the actual and predicted values. Residuals give an idea of how large the prediction error is for each observation.
  5. Pearson Residual: These are the residuals normalized based on the variance of the prediction. Pearson residuals provide a measure of the relative error of each prediction.
  6. Deviance Residual: Like Pearson residuals, but based on the model's deviance. Deviance residuals provide another perspective on how large the prediction error is for each observation.

From these table snippets:

  • For Class 1, the first observation with "pared" 0, "public" 0, and "gpa" 3.26 actually falls into the "very likely" category (apply = 2). The model, however, assigns a probability of about 0.5488 to the first category, "not likely," for this observation.
  • For Class 2, the same observation is assigned a probability of about 0.3593 for the second category, "likely."

By examining the residuals, Pearson residuals, and deviance residuals, we can assess how well the model performs for each observation and determine which observations may be outliers or have high prediction errors.
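
The residual columns for a single class can be reproduced with common textbook formulas: the raw residual is y − p, the Pearson residual divides it by √(p(1 − p)), and the deviance residual is the signed square root of the observation's deviance contribution. A sketch in Python for the first observation in the Class 1 table (the exact formulas SmartstatXL uses may differ slightly):

```python
import math

# First observation, Class 1 (category "not likely"):
# actual indicator y = 0 (the student actually answered "very likely"),
# predicted probability p for this class from the model.
y, p = 0, 0.5488

residual = y - p
pearson = residual / math.sqrt(p * (1 - p))

# Deviance residual: signed square root of -2 * log-likelihood contribution
ll = y * math.log(p) + (1 - y) * math.log(1 - p)
deviance = math.copysign(math.sqrt(-2 * ll), residual)

print(round(residual, 4), round(pearson, 4), round(deviance, 4))
```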

Conclusion

  1. Regression Model: An ordinal regression model was used to predict the likelihood of a student proceeding to graduate school based on several predictors: parents' educational background ("pared"), type of university ("public"), and Grade Point Average ("gpa").
  2. Model Fit:
    1. The model has a Pseudo R2 of 0.0326, indicating that the model explains approximately 3.26% of the variability in the response data. Nevertheless, based on the Chi-Squared test, the model is significant in explaining variability in the data.
    2. Other goodness-of-fit metrics such as Cox-Snell R2 and Nagelkerke R2 also indicate that the model has a relatively low fit.
  3. Coefficient Estimation:
    1. The variables "pared" and "gpa" have a significant effect on a student's decision to proceed to graduate school.
    2. The variable "public," although included in the model, is not significant in influencing that decision.
  4. Classification Performance:
    1. The model has an overall classification accuracy of 57.75%.
    2. In predicting the "unlikely" category, the model has a recall of 91.36%, but its performance significantly declines for the "likely" category with a recall of only 21.43%. For the "very likely" category, the model fails to correctly predict any observations.
  5. Deviance Analysis:
    1. Based on the deviance analysis, the regression model has a significant fit with the data, indicating the presence of at least one significant predictor in explaining the variability in the response variable.
  6. Residual Analysis:
    1. From the residual tables for classes 1 and 2, we can see the differences between the model's predictions and the actual observations. This allows us to assess how well the model performs for each observation and determine which observations may have high prediction errors.

Writing Results and Discussion in Scientific Work

In this study, an ordinal regression analysis was conducted to identify factors influencing third-year students' decisions to proceed to graduate school. Based on the analysis, it was found that variables such as parents' educational background (pared) and Grade Point Average (gpa) have a significant impact on the decision. However, the type of university (public) did not show a significant influence.

Although the model shows statistical significance in the deviance analysis, its classification performance needs improvement. This is evident from the overall classification accuracy that only reached 57.75%. Additionally, the model performs well in predicting the "unlikely" category, but its performance significantly declines for the "likely" and "very likely" categories.

Based on the residual analysis, several observations were found to have high prediction errors. This indicates that there may be other factors not yet included in the model, or a different modeling approach is needed.

To improve the model's performance, it is recommended to consider the addition of other potentially relevant predictor variables. Furthermore, cross-validation techniques and regularization can be applied to ensure the model's robustness and generalization. Further evaluation of model assumptions and residual analysis are also needed to address specific areas for improvement.