Creating a residual vs. predicted values plot is a common diagnostic tool in regression analysis. It helps you visually assess the relationship between the predicted values from your regression model and the corresponding residuals (the differences between the observed and predicted values).
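A minimal sketch of how such a plot can be produced is shown here. The data and the straight-line model are assumptions for illustration (synthetic values fitted with `np.polyfit`), since the dataframe and regression model used in this analysis are not shown, and the output filename is hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (an assumption, not the dataframe analysed in this write-up).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 3.0 * x + rng.normal(0, 1, 100)

# Fit a straight line by least squares and compute the predictions.
slope, intercept = np.polyfit(x, y, 1)
predicted = intercept + slope * x

# Residual = observed value minus predicted value.
residuals = y - predicted

# Scatter the residuals against the predictions, with a reference line at zero.
fig, ax = plt.subplots()
ax.scatter(predicted, residuals, alpha=0.7)
ax.axhline(0, color="red", linestyle="--")
ax.set_xlabel("Predicted values")
ax.set_ylabel("Residuals (observed - predicted)")
ax.set_title("Residuals vs. predicted values")
fig.savefig("residuals_vs_predicted.png")  # hypothetical output file
```

With a real model, `y` and `predicted` would be replaced by the observed target column and the model's predictions.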
Plot for residual vs predicted values:
With the above graph I am able to spot outliers in my dataframe and see how far the predictions for them fall from the actual values. The y = 0 line marks a residual of zero, i.e. a perfect prediction, and the vertical distance of a point from this line is the magnitude of the prediction error. Larger vertical distances indicate more substantial deviations, meaning the model made a larger prediction error for those specific cases.
Points above the line represent positive residuals (model underpredicted), while points below the line represent negative residuals (model overpredicted).
The same reading applies to outliers: those far above the line are cases where the model strongly underestimated the actual values, while those far below the line are cases where it strongly overestimated them.
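The sign rule above can be applied numerically as well as visually. The sketch below uses a handful of made-up observed and predicted values (not the data from this analysis) and flags outliers with a modified z-score based on the median absolute deviation. The 3.5 cutoff is a common rule of thumb and is an assumption here, not the criterion used for the plot.

```python
import statistics

# Made-up values for illustration only.
observed = [10.0, 12.0, 15.0, 30.0, 18.0, 5.0]
predicted = [11.0, 12.5, 14.0, 17.0, 18.5, 16.0]

# Residual = observed minus predicted.
residuals = [obs - pred for obs, pred in zip(observed, predicted)]

# Robust scale estimate: median absolute deviation (MAD) of the residuals.
median = statistics.median(residuals)
mad = statistics.median([abs(r - median) for r in residuals])

# Flag points whose modified z-score exceeds the cutoff (assumed to be 3.5),
# and label them by the sign of the residual.
CUTOFF = 3.5
flagged = [
    (i, "underpredicted" if r > 0 else "overpredicted")
    for i, r in enumerate(residuals)
    if abs(0.6745 * (r - median) / mad) > CUTOFF
]

print(flagged)  # [(3, 'underpredicted'), (5, 'overpredicted')]
```

Index 3 has a residual of +13 (actual value far above the prediction) and index 5 has a residual of -11 (actual value far below the prediction), matching the positive/negative reading described above.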
From the above plot we can also infer the presence of heteroscedasticity: the spread of the residuals is not constant across the range of predicted values.
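A visual impression of heteroscedasticity can be backed up with a rough numeric check. The sketch below is one simple option, assuming synthetic data whose noise grows with the predictor: it sorts the residuals by predicted value, splits them into a lower and an upper half, and compares the residual variance of the two halves. This is only in the spirit of a Goldfeld-Quandt comparison, not a formal test, and it is not necessarily the check used in this analysis.

```python
import random
import statistics

random.seed(0)

# Synthetic data (an assumption): the noise standard deviation grows with x,
# so the errors are heteroscedastic by construction.
x = [1 + 9 * i / 199 for i in range(200)]
y = [2.0 + 3.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]

# Least-squares fit of a straight line.
x_mean, y_mean = statistics.fmean(x), statistics.fmean(y)
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
    (xi - x_mean) ** 2 for xi in x
)
intercept = y_mean - slope * x_mean

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Order the residuals by predicted value and split them into two halves.
pairs = sorted(zip(predicted, residuals))
half = len(pairs) // 2
low_half = [r for _, r in pairs[:half]]
high_half = [r for _, r in pairs[half:]]

# A ratio well above (or below) 1 suggests the spread is not constant.
variance_ratio = statistics.pvariance(high_half) / statistics.pvariance(low_half)
print(f"variance ratio (high / low predicted values): {variance_ratio:.2f}")
```

A ratio close to 1 would be consistent with constant spread. For a formal conclusion, an established test such as Breusch-Pagan would be the more appropriate tool.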