I need to fix it. For question 2a, check the “id” variable to be sure you have the correct case numbers that need to be removed. There are three cases that should be removed that you did not list. Also, 18 and 1129 will automatically be removed since there is missing data for these cases (no MAH_1 value). Some of your values are different from what I have. Be sure you run the multiple regression with the profile-b data set and only with cases that are MAH_1 ≤ 22.458. For question 2h, you should not include the values for the variables that are not statistically significant. The regression equation should only include those variables that are statistically significant. For 2i, also mention that these two variables are not statistically significant and provide the p (sig) value.
1. The following output was generated from conducting a forward multiple
regression to identify which IVs (urban, birthrat, lnphone, and lnradio) predict
lngdp. The data analyzed were from the SPSS country-a.sav data file.
a. Evaluate the tolerance statistics. Is multicollinearity a problem?
To evaluate the presence of multicollinearity, we can use the tolerance statistic,
calculated as 1 - R^2, where R^2 comes from regressing a given independent variable on the
other independent variables. A small tolerance indicates that the variable is almost a
perfect linear combination of the other independent variables already in the equation.
Usually, a value of 0.1 serves as the cutoff point. Looking at the table, we can see that
multicollinearity is not a problem because the tolerance statistics are greater than .1 for
all the independent variables in both models.
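As a minimal sketch of the rule above, the tolerance of each predictor can be computed from the R squared obtained when that predictor is regressed on the other predictors, and its reciprocal is the variance inflation factor (VIF). The R squared values below are hypothetical placeholders, not figures from the country-a output, and the function name is only illustrative.

```python
def tolerance(r_squared_j):
    """Tolerance of predictor j: 1 minus the R squared from regressing j on the other IVs."""
    return 1.0 - r_squared_j


CUTOFF = 0.1  # cutoff point used in the answer above

# Hypothetical R squared values (assumptions, NOT taken from the SPSS output).
hypothetical_r2 = {"lnphone": 0.62, "birthrat": 0.62, "urban": 0.95}

flags = {}
for name, r2 in hypothetical_r2.items():
    tol = tolerance(r2)
    flags[name] = tol < CUTOFF  # True means multicollinearity is a concern
    print(f"{name}: tolerance = {tol:.2f}, VIF = {1 / tol:.2f}, problem = {flags[name]}")
```

With these made-up inputs only the third predictor falls below the cutoff; in the actual output every tolerance is reported to be above .1.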
b. What variables create the model to predict lngdp? What statistics support your
The model summary output indicates that the variables entered by the forward multiple
regression are, respectively, lnphone (for the simple regression) and lnphone +
birthrat (for the multiple regression).
If we look at the p-values, we can see that both coefficients are statistically
significant in explaining the variation of lngdp. However, the coefficient of birthrat is
significant at the 5% level, whereas that of lnphone is significant at the 1% level.
Moreover, despite its significance, the coefficient of birthrat is rather small in
magnitude, and the R^2 change between the regression including only lnphone and the one
with birthrat added is only 0.004. This suggests that the explanatory power of birthrat
is low.
c. Is the model significant in predicting lngdp? Explain.
Regression results indicate an overall model of two predictors (lnphone and birthrat) that
significantly predicts lngdp.
The R squared = .890, the Adjusted R squared = .888
d. What percentage of variance in lngdp is explained by the model?
The model accounted for 89% of the variance in lngdp (R squared = .890).
e. Write the regression equation for lngdp.
lngdp = 6.878 + .663*(lnphone) – .013*(birthrat)
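To show how the equation is used, here is a minimal sketch that plugs values into it. The coefficients are the ones reported above; the function name and the example inputs are hypothetical and do not come from the country-a data.

```python
def predict_lngdp(lnphone, birthrat):
    """Predicted lngdp from the unstandardized regression equation reported above."""
    return 6.878 + 0.663 * lnphone - 0.013 * birthrat


# Hypothetical country: lnphone = 2.0, birthrat = 30.0 (illustrative inputs only).
example = predict_lngdp(2.0, 30.0)
print(round(example, 3))
```

When both predictors are zero the prediction equals the constant, 6.878, which is a quick way to check that the signs were entered correctly.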
2. This question utilizes the data sets profile-a.sav and profile-b.sav,
You are interested in examining whether the variables shown here in brackets
[years of age (age), hours worked per week (hrs1), years of education (educ), years
of education for mother (maeduc), and years of education for father (paeduc)] are
predictors of individual income (rincmdol). Complete the following steps to conduct
a. Using profile-a.sav, conduct a preliminary regression to calculate Mahalanobis
distance. Identify the critical value for chi-square. Conduct Explore to identify outliers.
Which cases should be removed from further analysis?
In order to calculate Mahalanobis distance, I conducted a preliminary regression.
Model Summary
a. Predictors: (Constant), Highest Year of School Completed, Father, Number of Hours
Worked Last Week, Age of Respondent, Highest Year of School Completed, Highest Year of
School Completed, Mother
b. Dependent Variable: RESPONDENTS INCOME
The model summary gives the general statistics of the regression in which all the IVs
were included in the model.
a. Dependent Variable: RESPONDENTS INCOME
b. Predictors: (Constant), Highest Year of School Completed, Father,
Number of Hours Worked Last Week, Age of Respondent, Highest Year
of School Completed, Highest Year of School Completed, Mother
The ANOVA table shows that the model significantly predicts the dependent variable
rincmdol; the F-test for overall significance indicates that at least one of the
predictors is statistically significant, F(5, 642) = 64.994, p < .001.

Coefficients (a)
                                            B        Std. Error   Beta    t        Sig.
(Constant)                                  -5.487   1.302                -4.215   .000
Age of Respondent                           .133     .016         .291    8.585    .000
Highest Year of School Completed            .507     .071         .256    7.145    .000
Number of Hours Worked Last Week            .142     .012         .385    11.788   .000
Highest Year of School Completed, Mother    .005     .074         .003    .066     .948
Highest Year of School Completed, Father    .041     .055         .030    .733     .464
a. Dependent Variable: RESPONDENTS INCOME
B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

The coefficient table lists the coefficients used to form the regression equation.

Residuals Statistics (a)
                                     Minimum   Maximum   Mean     Std. Deviation   N
Predicted Value                      4.16      24.16     13.64    3.080            648
Std. Predicted Value                 -3.077    3.415     .000     1.000            648
Standard Error of Predicted Value    .176      1.005     .398     .128             648
Adjusted Predicted Value             4.23      24.48     13.64    3.083            648
Residual                             -15.499   13.759    .000     4.329            648
Std. Residual                        -3.567    3.166     .000     .996             648
Stud. Residual                       -3.583    3.188     .000     1.001            648
Deleted Residual                     -15.637   13.945    -.003    4.375            648
Stud. Deleted Residual               -3.616    3.211     -.001    1.003            648
Mahal. Distance                      .059      33.575    4.992    4.223            648
Cook's Distance                      .000      .042      .002     .004             648
Centered Leverage Value              .000      .052      .008     .007             648
a. Dependent Variable: RESPONDENTS INCOME

Case Processing Summary
                        Valid            Missing          Total
                        N     Percent    N     Percent    N      Percent
Mahalanobis Distance    677   45.1%      823   54.9%      1500   100.0%

The sample consisted of 1500 cases, 823 of which had missing values.

Descriptives: Mahalanobis Distance
                                                Statistic    Std. Error
Mean                                            4.9925522    .16121086
95% Confidence Interval for Mean, Lower Bound   4.6760180
95% Confidence Interval for Mean, Upper Bound   5.3090864
5% Trimmed Mean                                 4.5041201
Median                                          3.8432691
Variance                                        17.595
Std. Deviation                                  4.19458144
Minimum                                         .05899
Maximum                                         33.57526
Range                                           33.51627
Interquartile Range                             4.15036
Skewness                                        2.310        .094
Kurtosis                                        7.910        .188

The skewness statistic has a z-score of 2.310/.094 = 24.574.
Based on this, we can conclude that the skewness is substantial and the distribution is non-normal. The kurtosis value is in line with that: 7.910/.188 = 42.074, which is also far beyond what a normal distribution would produce. The box plot confirms the non-normality and shows outliers at the upper end of the distribution.
Using a chi-square table, the critical value was found to be 22.458. Any case with a
Mahalanobis distance greater than 22.458 should be eliminated from the regression
analysis. Therefore, cases 406, 508, 18, 1129, and 351 were eliminated following this
reasoning.
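The screening step can be sketched in a few lines. The critical value 22.458 matches the upper-tail chi-square value at p = .001 with 6 degrees of freedom; the answer does not state the df or alpha, so treating them as 6 and .001 is an assumption. The case ids and distances below are hypothetical, and the filter keeps cases with MAH_1 ≤ 22.458, the form used in the feedback at the top.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form holds only for even df")
    half = x / 2.0
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


CRITICAL = 22.458

# Assumed df = 6 and alpha = .001: the tail area at the critical value should be near .001.
tail = chi2_sf_even_df(CRITICAL, 6)
print(f"P(chi-square > {CRITICAL}) = {tail:.5f}")

# Hypothetical (id, MAH_1) pairs; None stands for a missing MAH_1 value.
cases = [(101, 4.99), (102, 33.575), (103, None), (104, 22.458)]
kept = [cid for cid, mah in cases if mah is not None and mah <= CRITICAL]
flagged = [cid for cid, mah in cases if mah is not None and mah > CRITICAL]
print("kept:", kept, "flagged:", flagged)
```

A case with no MAH_1 value is neither kept nor flagged here, which mirrors the point in the feedback that such cases drop out of the regression on their own.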
For all subsequent analyses, use profile-b.sav. Make sure that only cases where
MAH_1<22.458 are selected.
b. Create a scatterplot matrix. Can you assume linearity and normality?
The scatterplot matrix with the transformed variables displays elliptical shapes, suggesting that the variables are linearly related and normally distributed.

Tests of Normality
                                            Kolmogorov-Smirnov (a)      Shapiro-Wilk
                                            Statistic   df    Sig.      Statistic   df    Sig.
Age of Respondent                           .057        609   .000      .975        609   .000
Highest Year of School Completed            .151        609   .000      .944        609   .000
Number of Hours Worked Last Week            .184        609   .000      .960        609   .000
Highest Year of School Completed, Mother    .270        609   .000      .891        609   .000
Highest Year of School Completed, Father    .180        609   .000      .964        609   .000
RESPONDENTS INCOME                          .115        609   .000      .952        609   .000
a. Lilliefors Significance Correction

The Shapiro-Wilk test is particularly useful for testing whether the variables depart from normality. The null hypothesis of this test is that the variables are normally distributed. From the results, we can reject the null hypothesis for all the variables, concluding that none of them is normally distributed.

From the plot we can see that the residuals of the regression do not cluster on a horizontal line; in fact, there is an even distribution above and below the reference. From this, it seems that there is a moderate violation of linearity and homoscedasticity, which however should not invalidate the analysis.
d. Conduct multiple regression using the Enter method. Evaluate the tolerance statistics. Is multicollinearity a problem?
Multicollinearity is not a problem because all tolerance statistics are greater than .1.

Descriptive Statistics
                                            Mean     Std. Deviation   N
RESPONDENTS INCOME                          13.25    5.058            609
Age of Respondent                           39.45    11.547           609
Highest Year of School Completed            14.25    2.587            609
Number of Hours Worked Last Week            42.88    14.059           609
Highest Year of School Completed, Mother    11.81    2.802            609
Highest Year of School Completed, Father    11.65    3.862            609

Correlations (N = 609 for every pair)
Pearson Correlation
                                                     rincmdol   age     educ    hrs1    maeduc   paeduc
RESPONDENTS INCOME (rincmdol)                        1.000      .270    .335    .522    .036     .050
Age of Respondent (age)                              .270       1.000   -.017   .053    -.305    -.275
Highest Year of School Completed (educ)              .335       -.017   1.000   .145    .321     .370
Number of Hours Worked Last Week (hrs1)              .522       .053    .145    1.000   .037     .049
Highest Year of School Completed, Mother (maeduc)    .036       -.305   .321    .037    1.000    .578
Highest Year of School Completed, Father (paeduc)    .050       -.275   .370    .049    .578     1.000
Sig. (1-tailed)
                                                     rincmdol   age     educ    hrs1    maeduc   paeduc
RESPONDENTS INCOME (rincmdol)                        .          .000    .000    .000    .185     .109
Age of Respondent (age)                              .000       .       .337    .097    .000     .000
Highest Year of School Completed (educ)              .000       .337    .       .000    .000     .000
Number of Hours Worked Last Week (hrs1)              .000       .097    .000    .       .180     .112
Highest Year of School Completed, Mother (maeduc)    .185       .000    .000    .180    .        .000
Highest Year of School Completed, Father (paeduc)    .109       .000    .000    .112    .000     .

The correlation table indicates that number of hours worked has the highest correlation with income (.522) and highest year of school completed has the second highest (.335). It also indicates that mother's education (.036) and father's education (.050) have the lowest correlations with income. All variables were entered using the Enter method.

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .635 (a)   .404       .399                3.922
a. Predictors: (Constant), Highest Year of School Completed, Father, Number of Hours Worked Last Week, Age of Respondent, Highest Year of School Completed, Highest Year of School Completed, Mother
b. Dependent Variable: RESPONDENTS INCOME

ANOVA (a)
Model 1       Sum of Squares   df    Mean Square   F        Sig.
Regression    6280.935         5     1256.187      81.677   .000 (b)
Residual      9274.119         603   15.380
Total         15555.054        608
a. Dependent Variable: RESPONDENTS INCOME
b. Predictors: (Constant), Highest Year of School Completed, Father, Number of Hours Worked Last Week, Age of Respondent, Highest Year of School Completed, Highest Year of School Completed, Mother

The ANOVA table suggests that the model significantly predicts the dependent variable of income; the F test of overall significance indicates that at least one variable is useful in predicting income, F(5, 603) = 81.677, p < .001. The coefficient table lists the coefficients used to form the regression equation.

Collinearity Diagnostics (a)
                                                Variance Proportions
Dimension   Eigenvalue   Condition Index   (Constant)   age    educ   hrs1   maeduc   paeduc
1           5.716        1.000             .00          .00    .00    .00    .00      .00
2           .131         6.611             .00          .22    .00    .06    .03      .17
3           .082         8.342             .00          .19    .00    .85    .00      .00
4           .034         12.888            .03          .20    .07    .04    .26      .80
5           .024         15.366            .00          .13    .63    .02    .49      .00
6           .012         21.772            .96          .26    .29    .03    .22      .01
a. Dependent Variable: RESPONDENTS INCOME

Residuals Statistics (a)
                        Minimum   Maximum   Mean    Std. Deviation   N
Predicted Value         3.26      23.15     13.25   3.214            609
Residual                -15.487   8.673     .000    3.906            609
Std. Predicted Value    -3.106    3.082     .000    1.000            609
Std. Residual           -3.949    2.211     .000    .996             609
a. Dependent Variable: RESPONDENTS INCOME

e. Does the model significantly predict rincmdol? Explain.
The results indicate the model significantly predicts rincmdol.
The explanatory power is given by R squared = .404 (not too high), Adjusted R squared = .399, F(5, 603) = 81.677, p < .001.
f. Which variables significantly predict rincmdol? Which variable is the best predictor of the DV?
The variables age (B=.110, Beta=.252, t=7.485, p<.001), educ (B=.531, Beta=.271, t=7.818, p<.001), and hrs1 (B=.169, Beta=.469, t=14.741, p<.001) significantly predict the DV. The variable hrs1 is the best predictor of rincmdol, as indicated by its beta weight and the respective t and p-values.
g. What percentage of variance in rincmdol is explained by the model?
The model accounted for 40.4% of the variance in rincmdol.
h. Write the regression equation for the standardized variables.
Income = -6.052 + .252*age + .272*educ + .469*hrs + .017*maeduc - .014*paeduc
i. Explain why the variables of mother’s and father’s education are not significant predictors of rincmdol.
The bivariate and partial correlation coefficients of these two variables with the DV are very low. Therefore, there is little evidence that these variables are important in explaining the DV.
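The feedback at the top asks for the regression equation to include only the statistically significant variables. A minimal sketch under that reading follows, using the beta weights reported in 2f (age .252, educ .271, hrs1 .469) and the means and standard deviations from the Descriptive Statistics table. The helper names are hypothetical, a standardized equation is assumed to have no constant term, and the instructor's own values may differ.

```python
# Beta weights of the significant predictors, as reported in answer 2f.
BETAS = {"age": 0.252, "educ": 0.271, "hrs1": 0.469}

# Means and standard deviations from the Descriptive Statistics table (N = 609).
MEANS = {"age": 39.45, "educ": 14.25, "hrs1": 42.88}
SDS = {"age": 11.547, "educ": 2.587, "hrs1": 14.059}


def z_score(name, value):
    """Standardize a raw value using the sample mean and standard deviation."""
    return (value - MEANS[name]) / SDS[name]


def predicted_z_income(respondent):
    """Predicted standardized income: sum of beta * z over the significant predictors."""
    return sum(BETAS[name] * z_score(name, respondent[name]) for name in BETAS)


# A respondent at the sample mean on every predictor has a predicted z of zero.
at_mean = dict(MEANS)
print(predicted_z_income(at_mean))

# Hypothetical respondent one standard deviation above the mean on hrs1 only.
one_sd_hours = dict(MEANS, hrs1=MEANS["hrs1"] + SDS["hrs1"])
print(round(predicted_z_income(one_sd_hours), 3))
```

Because the variables are standardized, each beta weight reads directly as the predicted change in income, in standard deviations, for a one standard deviation change in that predictor.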