# [[EDUC 7215]] Final Assignment

Jethro Jones

## Assignment Directions

Final Exam (100 points)

The dataset used for this final exam was created from a Value-Belief-Norm (VBN) Theory-based survey designed to assess the factors that influence pro-environmental behavior in the Missouri area. It is the same dataset that you used for your homework assignments. You will recall that PEB is the arithmetic mean of 20 questions from the pro-environmental behavior scale developed by Markle (2013). In previous homework assignments, you analyzed NEP (New Environmental Paradigm), which is the arithmetic mean of 15 questions from the hypothesized facet scale developed by Hawcroft and Milfont (2010), and you also analyzed PN (Personal Norms), which is the arithmetic mean of 9 questions from the personal normative scale developed by Steg et al. (2005). You also discovered that PEB is influenced by education, which is a nominal scale classification variable with four levels (2 = high school/GED, 3 = associates degree/certificate, 4 = college degree, 5 = masters/PhD/professional degree). So, we can hypothesize a linear model where PEB is a linear function of NEP, PN, and Education:

$$\text{PEB} = \beta_0 + \beta_1\,\text{NEP} + \beta_2\,\text{PN} + \beta_3\,\text{Ed2} + \beta_4\,\text{Ed3} + \beta_5\,\text{Ed4} + \varepsilon$$

Where:

- PEB = Pro-environmental behavior (ratio scale dependent variable),
- NEP = New Environmental Paradigm (ratio scale independent variable),
- PN = Personal Norms (ratio scale independent variable),
- Ed2 = High school diploma or GED (nominal scale independent variable),
- Ed3 = Associates degree or certificate/license (nominal scale independent variable),
- Ed4 = Four-year college degree (nominal scale independent variable),
- β₀ through β₅ = model parameters to be estimated, and
- ε = residuals that are independent and normally distributed with equal variances.

You will recognize that this model is like the model in Assignment 8, except that the nominal scale variable Education replaced the nominal scale variable Gender.
Since Education is a classification variable with four levels, we need to add three slopes (β₃, β₄, β₅) to the model and use the fourth level (5 = masters/PhD/professional degree) as the reference level. We only needed one slope in the model with Gender from Assignment 8, because Gender only had two levels.

For this final exam, use multiple linear regression (MLR) to test if this model describes the relationship between PEB and NEP, PN, and Education. You will answer the research question, “Is pro-environmental behavior (PEB) a linear function of NEP, PN, and Education for survey respondents in the Missouri area?” You will need to address the two assumptions of MLR, normality and equal variances of the residuals, through visual assessments of the histogram and Q-Q plots (for normality) and the residual plot (for equal variances). After you have analyzed the assumptions, you will test the following hypotheses to see if each model parameter (β₀, β₁, β₂, β₃, β₄, β₅) differs from zero at the type 1 error rate of 0.05 or 5%.

| Intercept | Slope of NEP | Slope of PN | Slope of Ed2 | Slope of Ed3 | Slope of Ed4 |
| --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- |
| H<sub>0</sub>: β₀ = 0 | H<sub>0</sub>: β₁ = 0 | H<sub>0</sub>: β₂ = 0 | H<sub>0</sub>: β₃ = 0 | H<sub>0</sub>: β₄ = 0 | H<sub>0</sub>: β₅ = 0 |
| H<sub>a</sub>: β₀ ≠ 0 | H<sub>a</sub>: β₁ ≠ 0 | H<sub>a</sub>: β₂ ≠ 0 | H<sub>a</sub>: β₃ ≠ 0 | H<sub>a</sub>: β₄ ≠ 0 | H<sub>a</sub>: β₅ ≠ 0 |

You will also discuss the model fit statistics (RMSE, adjusted R-square) and the Variance Inflation Factors (VIFs). You should also include a Methods paragraph and Summary Statistics for PEB, NEP, and PN by Education.
I also recommend that you set the Parameterization of Effects to Reference Coding on the Data tab in SAS, which will help with interpretation of the three slopes for Education (recall that you did this in chapter 13 for classification variables in the binary logistic regression model).

For this final exam, you can work together, but I will not be available to answer specific questions or check your results. I will not accept late submissions since it is the end of the semester. The due date for submission is Saturday, May 17, 2025 by 11:59pm.

## Assignment

- [ ] MLR to test if this describes relationship between PEB and NEP, PN, and Education.
- [ ] Research Question: Is pro-environmental behavior (PEB) a linear function of NEP, PN, and Education for survey respondents in the Missouri area?
- [ ] two assumptions of MLR, normality and equal variances of the residuals, visual assessment of
    - [ ] Histogram and QQ plot (for normality)
    - [ ] Residual plot (for equal variances)
- [ ] Test the hypotheses
- [ ] error rate of 0.05 or 5%
- [ ] parameterization of effects to reference coding in data tab
- [ ] Model fit stats (RMSE, adjusted R-square)
- [ ] Variance inflation factors
- [ ] Summary stats for PEB, NEP, and PN by education

## Methods

I clicked Tasks and Utilities, then Tasks, then Linear Models, then Linear Regression. I selected our work set from my work library. Under Roles, I selected PEB as the dependent variable, Education as the classification variable, and NEP and PN as the continuous variables. Under the Model tab, I selected PN, NEP, and Education and clicked Add. Under Options, I selected Individual plots for both the diagnostic and residual plots. After I clicked Run, I opened the results in a new tab, saved the graphs, and took the screenshots shown below. I used an AI renaming tool to rename the screenshots and graphs appropriately.
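To make the reference coding of Education concrete, here is a minimal Python sketch of how a four-level classification variable becomes three 0/1 indicator columns, with level 5 as the reference. This is only an illustration of the idea, not what SAS runs internally; the function name `reference_code` is hypothetical, and the level codes are taken from the assignment directions.

```python
# Illustrative sketch of reference (dummy) coding for Education.
# Level codes follow the assignment: 2 = HS/GED, 3 = associates,
# 4 = college, 5 = masters/PhD/professional (the reference level).
# The function name is hypothetical, chosen for this example only.

REFERENCE_LEVEL = 5
NON_REFERENCE_LEVELS = (2, 3, 4)


def reference_code(education: int) -> dict:
    """Return the Ed2, Ed3, Ed4 indicator values for one respondent."""
    if education != REFERENCE_LEVEL and education not in NON_REFERENCE_LEVELS:
        raise ValueError(f"unexpected education level: {education}")
    # Each indicator is 1 only when the respondent is at that level.
    # The reference level gets 0 on all three indicators.
    return {f"Ed{level}": int(education == level) for level in NON_REFERENCE_LEVELS}


if __name__ == "__main__":
    for level in (2, 3, 4, 5):
        print(level, reference_code(level))
```

Because the reference group is coded 0 on every indicator, each Education slope is read as the difference in mean PEB between that group and the masters/PhD/professional group, holding NEP and PN constant.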
I then used Tasks and Utilities, then Tasks, then Statistics, then Summary Statistics to generate summary statistics for PEB, NEP, and PN by Education. Under Options, then Basic Statistics, I checked Mean, Standard Deviation, Minimum Value, Maximum Value, and Median. Under Additional Statistics, I checked 95% Confidence Limits for the mean. Under Plots, I checked the Comparative box plot. After I clicked Run, I opened the results in a new tab, saved the graphs, and took the screenshots shown below. I used an AI renaming tool to rename the screenshots and graphs appropriately.

Here are my screenshots from SAS

![[2025-05-05 SAS_Studio_Regression_Analysis-KO3ei5e9KO.png]]

![[2025-05-05 SAS_Model_Effects_Builder.png]]

![[2025-05-05 SAS_Studio_Interface.png]]

![[2025-05-05 SAS_Studio_Statistical_Analysis.png]]

![[2025-05-05 SAS_Studio_Regression_Analysis.png]]

## Hypothesis

| Intercept | Slope of NEP | Slope of PN | Slope of Ed2 | Slope of Ed3 | Slope of Ed4 |
| --------------------- | --------------------- | --------------------- | --------------------- | --------------------- | --------------------- |
| H<sub>0</sub>: β₀ = 0 | H<sub>0</sub>: β₁ = 0 | H<sub>0</sub>: β₂ = 0 | H<sub>0</sub>: β₃ = 0 | H<sub>0</sub>: β₄ = 0 | H<sub>0</sub>: β₅ = 0 |
| H<sub>a</sub>: β₀ ≠ 0 | H<sub>a</sub>: β₁ ≠ 0 | H<sub>a</sub>: β₂ ≠ 0 | H<sub>a</sub>: β₃ ≠ 0 | H<sub>a</sub>: β₄ ≠ 0 | H<sub>a</sub>: β₅ ≠ 0 |

Error rate = 0.05 or 5%

## Analysis

The summary statistics show that as education increases, so do PEB (2.6 → 2.7 → 2.88 → 3.09) and NEP (4.63 → 4.64 → 4.76 → 4.91). PN follows the same overall trend (4.63 → 4.37 → 4.71 → 4.99), with a small dip when education = 3 (associates). The lower and upper 95% confidence limits for each mean show no wild variation, so the group means appear reasonably stable.

![[2025-05-05 Statistical Summary Table.png]]

PN scores tend to increase with educational attainment, with the most pro-environmentally oriented norms observed among those with advanced degrees.
This supports the hypothesis that education may strengthen internal moral obligations toward environmental behaviors.

![[2025-05-05 Distribution of PN by education.png]]

NEP scores consistently rise with education, showing that individuals with more education tend to hold stronger pro-environmental beliefs. This supports the inclusion of education in the regression model and may signal its moderating effect on environmental attitudes.

![[2025-05-05 Distribution of NEP by education.png]]

Pro-environmental behaviors increase with educational attainment, supporting the hypothesis that education influences not only environmental attitudes (NEP, PN) but also actions (PEB). This trend justifies the inclusion of Education as a categorical predictor in the multiple regression model.

![[2025-05-05 Distribution of PEB by education.png]]

This table confirms that the analysis used the data set I chose and that 379 observations were included (n = 379). It also reports 4 effects and 6 parameters.

![[2025-05-05 DataSetSummary-SeealeRQuT.png]]

The Least Squares Summary suggests that the full model including NEP, PN, and Education provides the best balance of accuracy and simplicity. Each added variable improves model fit, and because the criterion value in the table is lowest at step 3, the full model is the best one to use for this assessment.

![[2025-05-05-Least-Squares-Summary.png]]

According to the analysis of variance, the overall model is statistically significant (p < .0001) and explains a substantial portion of the variance in PEB (R² = 0.39).

Reject H₀ for Ed2 and Ed3; fail to reject H₀ for Ed4.

![[2025-05-05_Least_Squares_Model_Analysis.png]]

Looking at the table above, parameter estimates:

**Intercept**: Expected PEB score when all predictors = 0 (not substantively meaningful since NEP/PN don’t range that low).

- Estimate: **1.156**
- p < 0.0001 → significant

**NEP**: For each 1-point increase in NEP, **PEB increases by 0.196**, controlling for PN and Education.
- Estimate: +0.196
- p < 0.0001 → significant; reject H₀: β₁ = 0
- Supports NEP as a positive predictor of PEB.

**PN**: For each 1-point increase in PN, PEB increases by 0.194, controlling for NEP and Education.

- Estimate: +0.194
- p < 0.0001 → significant; reject H₀: β₂ = 0
- Strong effect. PN has the largest t value, making it the strongest predictor.

**Education Effects** (reference = level 5: graduate/professional degree)

| Education Group | Coef (β) | p-value | Interpretation |
| ------------------ | -------- | ------- | ------------------------------------------------------ |
| **Ed 2 (HS/GED)** | –0.348 | <0.0001 | Significantly lower PEB than Ed 5 |
| **Ed 3 (Assoc.)** | –0.185 | 0.0068 | Also significantly lower PEB than Ed 5 |
| **Ed 4 (College)** | –0.125 | 0.0633 | Not significant at α = 0.05, but suggestive (p ≈ 0.06) |

Reject H₀ for Ed2 and Ed3; fail to reject H₀ for Ed4.

NEP and PN are both strong, positive, statistically significant predictors of PEB. Education matters, especially when comparing those with less than a college degree to those with graduate degrees. The difference between college degree and graduate degree holders is not statistically significant (p = 0.0633), but the trend still aligns.

---

For the table below:

- VIFs under 5 (and especially under 2) indicate very low multicollinearity.
- Tolerance values > 0.2 are also **safe**.

No multicollinearity issues were detected: all predictors provide unique information in the model. Two condition indices exceed 10 (one reaching 17.13), but no two or more variables have high (>0.5) proportions of variation on the same high-index component (e.g., Component 6), so no serious multicollinearity problem is present.

![[2025-05-05_Model1_ParameterEstimates_CollinearityDiagnostics.png]]

The distribution of residuals for PEB appears roughly symmetric and bell shaped. It deviates slightly from the normal curve, but not enough to raise concern.
![[2025-05-05 Distribution of Residuals for PEB.png]]

The Residual by Predicted plot for PEB shows no clear pattern, which supports the assumption of equal variance.

![[2025-05-05 Residuals_Predicted_PEB.png]]

The RStudent by Predicted plot for PEB shows no evidence of serious outliers or leverage points. The residuals do not increase or decrease in spread across the predicted values, which adds further confidence that the model is well-behaved.

![[2025-05-05 RStudent Predicted PEB Scatterplot.png]]

The Observed by Predicted plot for PEB shows no evidence of outliers or leverage points, which again suggests that the model is well-behaved. The predicted values track the actual PEB values, consistent with the fit reflected in the R² (0.39) and RMSE (~0.48).

![[2025-05-05 Observed_vs_Predicted_PEB_Scatterplot.png]]

In the Cook's D plot for PEB, no observations exert disproportionate influence on the model’s estimates. There are no problematic outliers, and all data points contribute reasonably to the regression line.

![[2025-05-05 CooksDPlot.png]]

While there are some individual points with slightly high residuals or leverage, only one observation qualifies as both, and none are extreme enough to distort the model. There’s no evidence of influential data points requiring removal, confirming what we saw in Cook's D above.

![[2025-05-05 Outlier Leverage Diagnostics PEB.png]]

The Q-Q plot also supports normality: the points fall close to the reference line.

![[2025-05-05 QQPlotResidualsPEB.png]]

The residual-fit spread plot suggests the model captures variance in PEB appropriately and does not miss nonlinear patterns. There’s no evidence of systematic bias or inadequate fit.

![[2025-05-05 ResidualFitSpreadPlotPEB.png]]

The following graphs continue to support the above-mentioned results. There is no evidence of bias in the residuals.
![[2025-05-05 ResidualsForPEBScatterPlot.png]]

![[2025-05-05 ResidualsForPEBScatterPlot-KTl_foLDxr.png]]

![[2025-05-05 ResidualsForPEB.png]]

![[2025-05-05 ResidualsForPEB-vK4TxUeT6G.png]]

![[2025-05-05 Residuals_for_PEB.png]]

## Conclusion

A multiple linear regression was conducted to determine whether pro-environmental behavior (PEB) could be predicted by NEP (New Environmental Paradigm), PN (Personal Norms), and education level. The overall model was statistically significant, F(5, 373) = 47.85, _p_ < .0001, and explained approximately 39.1% of the variance in PEB scores (R² = 0.391; Adjusted R² = 0.383).

**Hypothesis Testing of Model Parameters**

To evaluate whether each variable significantly contributed to the prediction of pro-environmental behavior (PEB), the following hypotheses were tested at the α = 0.05 level:

| Parameter | Null Hypothesis (H₀) | Alternative Hypothesis (Hₐ) | Result |
| -------------------- | -------------------- | --------------------------- | ------ |
| **Intercept (β₀)** | β₀ = 0 | β₀ ≠ 0 | **Reject H₀** (*p* < .0001). The intercept is significantly different from zero. |
| **NEP (β₁)** | β₁ = 0 | β₁ ≠ 0 | **Reject H₀** (*p* < .0001). NEP significantly predicts PEB. |
| **PN (β₂)** | β₂ = 0 | β₂ ≠ 0 | **Reject H₀** (*p* < .0001). PN significantly predicts PEB. |
| **Education 2 (β₃)** | β₃ = 0 | β₃ ≠ 0 | **Reject H₀** (*p* < .0001). Individuals with a high school diploma or GED have significantly lower PEB than the reference group. |
| **Education 3 (β₄)** | β₄ = 0 | β₄ ≠ 0 | **Reject H₀** (*p* = .0068). Individuals with an associate’s degree have significantly lower PEB than the reference group. |
| **Education 4 (β₅)** | β₅ = 0 | β₅ ≠ 0 | **Fail to reject H₀** (*p* = .0633). The difference in PEB for college degree holders is not statistically significant at the 0.05 level. |

Five of the six parameters (the intercept, NEP, PN, Ed2, and Ed3) differed significantly from zero. Only the college degree group (Ed4) did not differ significantly from the reference group (graduate degree holders), though the trend was in the expected direction.

- **NEP** was a significant positive predictor of PEB, β = 0.196, _t_(373) = 5.44, _p_ < .0001.
- **PN** also significantly predicted PEB, β = 0.194, _t_(373) = 7.74, _p_ < .0001.
- **Education level** showed meaningful effects when compared to the reference group (graduate/professional degree):
    - Participants with a high school diploma or GED (Ed2) had significantly lower PEB scores, β = –0.348, _p_ < .0001.
    - Those with an associate’s degree (Ed3) also scored lower, β = –0.185, _p_ = .0068.
    - The college degree group (Ed4) trended lower but did not reach statistical significance, β = –0.125, _p_ = .0633.
- The assumptions of normality and equal variances were supported by the histogram of residuals and Q-Q plot (normality) and the residual vs. predicted plot (equal variances).
- Multicollinearity was not a concern (all VIFs < 1.5, tolerances > 0.7).
- Cook’s D values were all well below 1, indicating no influential outliers.
- Leverage and RStudent plots identified one case with both moderate leverage and residual, but not beyond concerning thresholds.
- The Observed vs. Predicted plot showed a clear linear pattern, indicating that predicted values tracked actual PEB scores.

These findings suggest that environmental education and moral framing may be key strategies for encouraging pro-environmental behavior.
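As a rough arithmetic check on the reported results, the sketch below plugs the rounded coefficients into the fitted equation and recomputes the adjusted R² and the F statistic from R² = 0.391, n = 379, and 5 predictors. The function names are hypothetical, and the NEP and PN values of 4.0 are illustrative inputs, not values taken from the data set.

```python
# Rough check of the reported model, using the rounded estimates from the
# parameter table. Function names and the example inputs are illustrative
# assumptions, not part of the SAS output.

COEFFICIENTS = {
    "Intercept": 1.156,
    "NEP": 0.196,
    "PN": 0.194,
    "Ed2": -0.348,  # high school/GED vs. graduate/professional
    "Ed3": -0.185,  # associates vs. graduate/professional
    "Ed4": -0.125,  # college vs. graduate/professional
}


def predict_peb(nep: float, pn: float, education: int) -> float:
    """Predicted PEB from the fitted equation; education 5 is the reference."""
    if education not in (2, 3, 4, 5):
        raise ValueError(f"unexpected education level: {education}")
    # The reference level (5) contributes no education term.
    education_effect = COEFFICIENTS.get(f"Ed{education}", 0.0)
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["NEP"] * nep
        + COEFFICIENTS["PN"] * pn
        + education_effect
    )


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-square for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic recovered from R-square."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


if __name__ == "__main__":
    # Same NEP and PN, different education levels (illustrative inputs).
    print(predict_peb(4.0, 4.0, 2))
    print(predict_peb(4.0, 4.0, 5))
    print(round(adjusted_r2(0.391, 379, 5), 3))
    print(round(f_statistic(0.391, 379, 5), 2))
```

With these inputs the adjusted R² rounds to 0.383, matching the reported value. The recomputed F comes out near 47.9 rather than exactly 47.85 because it starts from the rounded R². Two respondents with identical NEP and PN scores differ in predicted PEB by exactly the Education coefficient, which is what the reference-coded slopes mean.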