Report on the Development of a Predictive Model for Student Teaching Satisfaction
Executive Summary
This report details a study focused on enhancing educational evaluation through machine learning, directly contributing to the achievement of Sustainable Development Goal 4 (SDG 4): Quality Education. By leveraging data-driven insights, this research aims to foster more inclusive, equitable, and effective learning environments. A dataset of student evaluations from Turkey was used to compare ten distinct machine learning algorithms for their ability to predict student satisfaction. The Support Vector Machine (SVM) algorithm proved the most effective model, with the highest accuracy of the ten (0.9765). To ensure transparency and accountability, the SHAP (SHapley Additive exPlanations) framework was employed to interpret the SVM model’s predictions, identifying the key factors that influence student satisfaction. Finally, a practical online prediction tool was developed using the Shiny application framework, translating research findings into an accessible resource for educators. This work provides a robust, data-driven methodology to support personalized teaching and curriculum optimization, advancing the goals of quality education for all.
1.0 Introduction: Aligning Educational Advancement with Sustainable Development Goals
1.1 The Imperative for Quality Education (SDG 4)
Educational Data Mining (EDM) has emerged as a critical tool for transforming the educational landscape in alignment with global sustainability targets. The core objective of SDG 4 is to ensure inclusive and equitable quality education and promote lifelong learning opportunities for all. Traditional teaching evaluation methods, often characterized by high subjectivity and low efficiency, present significant barriers to achieving this goal. The integration of machine learning and data science, a key component of SDG 9 (Industry, Innovation, and Infrastructure), offers a pathway to overcome these limitations. By analyzing complex educational data, EDM provides the technical support necessary to understand student needs, optimize teaching strategies, and ultimately enhance the quality and equity of education.
1.2 Research Objectives for Sustainable Education
This study aims to contribute to the advancement of SDG 4 through the following objectives:
- To systematically evaluate and compare ten machine learning algorithms to identify the most accurate model for predicting student teaching satisfaction.
- To understand the multifaceted factors influencing student satisfaction, including course characteristics, teaching resources, and instructor feedback.
- To enhance the transparency of predictive models by interpreting their outcomes, ensuring that data-driven decisions are fair and understandable.
- To develop a practical, accessible tool that empowers educators to use predictive insights for decision-making, bridging the gap between research and implementation.
2.0 Literature Review: The Role of Data Science in Modern Education
2.1 Limitations of Traditional Evaluation in the Context of SDG 4
Traditional evaluation methods have demonstrated significant limitations that impede progress toward SDG 4. These methods often focus narrowly on academic performance, neglecting crucial non-cognitive areas such as practical skills and emotional development. Furthermore, their inherent subjectivity and delayed feedback mechanisms reduce their effectiveness in fostering instructional improvement. The evolution toward intelligent, data-driven evaluation systems is essential for creating educational environments that are responsive, fair, and capable of supporting the holistic development of every student, a cornerstone of SDG 10 (Reduced Inequalities).
2.2 Machine Learning Applications in Education
A growing body of research demonstrates the potential of machine learning to revolutionize education. Previous studies have focused on several key areas:
- Prediction of Academic Performance: Algorithms such as Random Forest and Support Vector Machines have been used to predict student outcomes based on learning data, online behavior, and internet usage.
- Student Risk Identification and Intervention: Models have been developed to identify students at risk of dropping out or facing learning difficulties, enabling timely interventions that promote educational equity.
- Analysis of Key Factors Affecting Performance: Research has identified core factors like class size, parental engagement, and demographic attributes that influence academic outcomes, providing insights for policy-making.
- Development of Intelligent Teaching Systems: Machine learning has been used to construct intelligent systems that optimize teaching strategies and enhance instructional effectiveness.
2.3 Identifying Research Gaps for Enhanced Educational Outcomes
Despite significant progress, existing research reveals several gaps. There has been insufficient focus on subjective evaluation dimensions like student satisfaction, a critical component of educational quality. Moreover, the “black-box” nature of many high-performing algorithms limits their practical value for educators. This study addresses these shortcomings by focusing on student satisfaction, systematically comparing a wide range of algorithms, employing the SHAP framework for interpretability, and developing a practical application to translate findings into action, thereby providing a more comprehensive contribution to SDG 4.
3.0 Methodology: A Framework for Data-Driven Educational Evaluation
3.1 Dataset and Feature Engineering for Equitable Analysis
The study utilized a dataset from the Turkish Student Evaluation Project, comprising 5,820 records and 33 feature variables. To ensure a robust and unbiased analysis supportive of SDG 10, the data was first processed using the K-means clustering algorithm, which segmented the dataset into three balanced categories. Subsequently, the Boruta algorithm, a feature selection method, was applied to identify the 30 most significant features for predicting student satisfaction. This rigorous preprocessing ensures that the subsequent models are built on the most relevant and impactful data.
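The clustering step can be illustrated with a minimal sketch of Lloyd's K-means algorithm in plain Python. The toy responses, the function names, and the farthest-first initialisation are assumptions made for illustration; the report does not state which implementation, initialisation, or distance measure was used.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    # Farthest-first initialisation (an assumption, chosen to be deterministic).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Hypothetical two-question Likert responses (1-5), not taken from the dataset.
responses = [(1, 1), (1, 2), (3, 3), (3, 4), (5, 5), (5, 4)]
centroids, labels = kmeans(responses, k=3)
print(labels)
```

In a real analysis the three resulting cluster labels would presumably serve as the satisfaction categories that the classifiers are trained to predict.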
3.2 Machine Learning Models and Evaluation Protocol
Ten machine learning algorithms were employed to construct predictive models. The dataset was partitioned into a 70% training set and a 30% testing set, with 5-fold cross-validation used to optimize model performance. The algorithms included:
- Random Forest (RF)
- Gradient Boosting Machine (GBM)
- Naive Bayes (NB)
- K-Nearest Neighbors (KNN)
- Neural Network (Nnet)
- Flexible Discriminant Analysis (FDA)
- Support Vector Machine (SVM)
- Classification and Regression Trees (CART)
- Sparse Linear Discriminant Analysis (SLDA)
- AdaBoost (ADA)
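The 70/30 partition and 5-fold cross-validation described above can be sketched with index lists alone. This is an illustration under assumptions: the report does not say whether the split was stratified or which seed was used, and the sketch assumes cross-validation is run inside the training portion. The function names are hypothetical.

```python
import random


def train_test_split_indices(n, test_fraction=0.3, seed=42):
    """Shuffle row indices and hold out a fraction of them for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 5,820 records, as reported for the dataset.
train_idx, test_idx = train_test_split_indices(5820)
print(len(train_idx), len(test_idx))

for fold_train, fold_val in k_fold(train_idx):
    # Each model would be fitted on fold_train and tuned against fold_val.
    print(len(fold_train), len(fold_val))
```

Keeping the test indices out of the cross-validation loop means the final accuracy is measured on records that played no part in tuning.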
Model performance was assessed using a comprehensive set of metrics, including Accuracy, Sensitivity, Specificity, Precision, Recall, and F1 Score, to ensure a reliable evaluation consistent with the high standards required for tools impacting SDG 4.
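For a three-class problem, these metrics are commonly computed per class in a one-vs-rest fashion and then averaged. The sketch below assumes macro-averaging, which the report does not confirm, and uses made-up labels. Under this definition sensitivity and recall are the same quantity, so the sketch reports them under one key.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged one-vs-rest metrics for a multiclass task."""
    classes = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    per_class = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        tn = n - tp - fp - fn
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        per_class.append((recall, specificity, precision, f1))

    # Average each metric over the classes (macro-averaging, an assumption).
    recall, specificity, precision, f1 = (
        sum(col) / len(classes) for col in zip(*per_class)
    )
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "sensitivity_recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical labels for three satisfaction categories.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```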
3.3 Ensuring Transparency and Accountability with SHAP
To address the critical need for model interpretability in educational settings, the SHAP framework was utilized. SHAP is a game theory-based technique that explains the output of any machine learning model by assigning an importance value to each feature for a particular prediction. This approach provides both global and local explanations, making the model’s decision-making process transparent and fostering trust among educators who use these tools.
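The idea behind SHAP can be shown by computing exact Shapley values by brute force for a tiny model. This is a sketch under assumptions, not the study's procedure: the linear toy model, its weights, and the baseline are hypothetical, features absent from a coalition are replaced by baseline values, and only the feature names Q5, Q1, and Q7 come from the report. The SHAP library itself relies on faster approximations, because this enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition take their baseline value (an assumption).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical satisfaction score; the weights are made up."""
    return 0.5 * z[0] + 0.3 * z[1] + 0.2 * z[2]


features = ["Q5", "Q1", "Q7"]
x = [5, 4, 2]         # one student's hypothetical answers
baseline = [3, 3, 3]  # hypothetical reference answers

phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(features, phi):
    print(name, round(contribution, 3))
```

The contributions sum to the difference between the prediction for this student and the prediction for the baseline, which is the additivity property that makes local explanations of this kind easy to read.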
4.0 Results and Analysis: Evidence for Enhancing Quality Education
4.1 Comparative Performance of Predictive Models
The comparative analysis revealed that the SVM model delivered the most favorable performance across all evaluation metrics. Key results for the SVM model were:
- Accuracy: 0.9765
- Sensitivity: 0.9887
- Specificity: 0.9789
- Precision: 0.9765
- Recall: 0.9777
- F1 Score: 0.9777
The high accuracy and robustness of the SVM model support its use as the basis for a reliable predictive tool for educators. McNemar's test further indicated that the SVM model's advantage over each of the other nine algorithms was statistically significant.
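McNemar's test compares two classifiers on the same test records by looking only at the cases where exactly one of them is correct. The sketch below uses the chi-square form with a continuity correction, which is one common variant; the report does not say which variant was applied, and the predictions shown are invented.

```python
from math import erfc, sqrt


def mcnemar(y_true, pred_a, pred_b):
    """McNemar's chi-square test with continuity correction.

    Returns (statistic, p_value) for two classifiers scored on the same records.
    """
    # b: A correct and B wrong; c: A wrong and B correct.
    b = sum(a == t and m != t for t, a, m in zip(y_true, pred_a, pred_b))
    c = sum(a != t and m == t for t, a, m in zip(y_true, pred_a, pred_b))
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree on correctness
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical predictions: model A errs once, model B errs ten times.
y_true = [1] * 20
pred_a = [1] * 19 + [0]
pred_b = [0] * 10 + [1] * 10

stat, p = mcnemar(y_true, pred_a, pred_b)
print(round(stat, 3), round(p, 4))
```

A small p-value suggests that the two models' error patterns differ by more than chance would explain; with few discordant pairs, an exact binomial version of the test is usually preferred.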
4.2 Interpreting Model Predictions for Actionable Insights
The SHAP analysis of the SVM model identified several key indicators that significantly influence student satisfaction. Features Q5, Q1, and Q7 emerged as the top three most impactful variables. This analysis revealed that factors such as the diversity of course content, the appropriateness of course difficulty, and the timeliness of instructor support are critical drivers of student satisfaction. These insights provide educators with clear, evidence-based guidance on where to focus their efforts to improve the learning experience, directly contributing to the objectives of SDG 4.
4.3 An Online Tool for Sustainable Educational Improvement
To translate these findings into a practical solution, an online prediction system was developed using the Shiny framework. This interactive tool allows users to input parameters related to the 30 identified features and receive a real-time prediction of learning satisfaction. This application serves as a powerful example of SDG 9 in action, leveraging technological innovation to create scalable and accessible solutions that empower educators and support the continuous improvement of educational quality.
5.0 Discussion: Implications for Sustainable Educational Practices
5.1 Translating Data Insights into Improved Learning Experiences (SDG 4)
The findings from the SHAP analysis provide a clear roadmap for enhancing educational quality. The importance of indicators related to course design, difficulty, and instructor support underscores the need for a student-centered approach. Educational institutions can use these insights to:
- Dynamically adjust course difficulty based on real-time student data.
- Prioritize the development of robust Q&A support systems, especially in online environments.
- Design training for instructors focused on communication and feedback skills.
- Align course objectives more clearly with career development goals, enhancing student motivation and contributing to SDG 8 (Decent Work and Economic Growth).
5.2 Advancing Towards Inclusive and Equitable Digital Education
The predictive models and tools developed in this study are instrumental in promoting educational equity. By identifying students who are likely to have low satisfaction, educators can intervene proactively with personalized support, preventing disengagement and reducing inequalities in learning outcomes (SDG 4.5, SDG 10). While the SVM model demonstrated superior performance, future research should continue to explore hybrid models that combine predictive power with causal inference to build a more comprehensive and sustainable teaching feedback system. Integrating unstructured data, such as classroom videos, and conducting long-term tracking studies will further refine our understanding of the dynamic factors driving student satisfaction.
6.0 Conclusion: Fostering Quality Education through Technological Innovation
This report has detailed the development and validation of a machine learning framework for predicting student teaching satisfaction. By systematically comparing ten algorithms, the study identified the SVM model as the most accurate and robust. Applying the SHAP framework provided crucial interpretability, revealing the core drivers of student satisfaction and offering data-driven support for optimizing teaching strategies. The creation of an interactive online prediction tool demonstrates a successful translation of research into a practical resource for educators.
This work makes a significant contribution to the Sustainable Development Goals, particularly SDG 4, by providing an intelligent, transparent, and actionable system for educational evaluation. The integration of machine learning and explainable AI represents a critical step toward building a future where quality education is more personalized, equitable, and effective for all learners.
Analysis of Sustainable Development Goals (SDGs) in the Article
1. Which SDGs are addressed or connected to the issues highlighted in the article?
The article primarily addresses and connects to the following Sustainable Development Goals:
- SDG 4 (Quality Education): The core focus of the article is on enhancing the quality of education. It explores the use of Educational Data Mining (EDM) and machine learning to improve teaching evaluation, predict student academic performance, and understand student satisfaction. The introduction describes EDM as “effectively improving students’ knowledge and optimizing the overall effectiveness of educational institutions.” The entire study is geared towards providing “educators with a scientific basis for evaluation and technical support for personalized teaching and curriculum optimization,” which are central tenets of ensuring quality education.
- SDG 9 (Industry, Innovation, and Infrastructure): The article is a clear example of leveraging innovation and technology to improve a societal sector. It discusses the application of “cutting-edge technologies such as big data and artificial intelligence” in educational evaluation. The research itself, which involves comparing ten machine learning algorithms and developing a new predictive tool (the Shiny app), contributes to scientific research and technological advancement in the field of education. This aligns with the goal of fostering innovation and upgrading technological capabilities.
2. What specific targets under those SDGs can be identified based on the article’s content?
Based on the article’s content, the following specific SDG targets can be identified:
Under SDG 4: Quality Education
- Target 4.1: By 2030, ensure that all girls and boys complete free, equitable and quality primary and secondary education leading to relevant and effective learning outcomes. The article directly supports the aim of achieving “effective learning outcomes.” It focuses on predicting student performance and satisfaction, which are crucial measures of educational effectiveness. By identifying at-risk students and factors influencing success, the methods described can help institutions intervene and improve learning outcomes for all students.
- Target 4.c: By 2030, substantially increase the supply of qualified teachers, including through international cooperation for teacher training in developing countries, especially least developed countries and small island developing States. The article’s findings and tools are designed to empower educators. It notes that for teachers, “evaluation results are an essential basis for reflecting on teaching methods, diagnosing teaching problems, optimizing curriculum design, and helping teachers to achieve accurate teaching.” By providing data-driven insights, the proposed system acts as a tool for professional development and improves the quality and effectiveness of teachers.
Under SDG 9: Industry, Innovation, and Infrastructure
- Target 9.5: Enhance scientific research, upgrade the technological capabilities of industrial sectors in all countries, in particular developing countries, including, by 2030, encouraging innovation and substantially increasing the number of research and development workers per 1 million people and public and private research and development spending. The study is a piece of scientific research that enhances the technological capabilities of the education sector. It systematically evaluates ten machine learning algorithms and develops an innovative application (“Shiny application for online prediction”). This work directly contributes to the body of scientific research and encourages the adoption of advanced technology and innovation in education.
3. Are there any indicators mentioned or implied in the article that can be used to measure progress towards the identified targets?
Yes, the article mentions and implies several indicators that can be used to measure progress towards the identified targets, even if they are not the official UN-designated indicators.
For Target 4.1 (Effective Learning Outcomes):
- Student Satisfaction Ratings: The primary outcome variable of the study is “predicting student satisfaction ratings.” This serves as a direct indicator of the perceived quality and effectiveness of the educational experience.
- Student Academic Performance: The literature review extensively discusses the prediction of “students’ academic performance,” “exam scores,” and identifying “students at risk of dropping out.” These are key indicators of learning outcomes.
- Model Performance Metrics: The article uses several metrics to evaluate its predictive models, such as “Accuracy, Sensitivity, Specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV), Precision, Recall, F1 Score.” The high accuracy of the SVM model (0.9765) is an indicator of the technological capacity to effectively monitor and predict learning outcomes.
For Target 4.c (Qualified and Effective Teachers):
- Optimization of Teaching Strategies: The conclusion states that the research provides “data-driven decision-making support for optimizing teaching strategies.” The ability to adjust teaching methods based on predictive feedback is an indicator of enhanced teacher quality.
- Curriculum Optimization: The developed Shiny app is intended to provide “technical support for personalized teaching and curriculum optimization.” Changes made to the curriculum based on these data-driven insights can be tracked as an indicator of progress.
For Target 9.5 (Scientific Research and Innovation):
- Application of Advanced Algorithms: The use and comparison of “ten machine learning algorithms” in an educational context is an indicator of the integration of advanced technology and scientific research.
- Development of Technological Tools: The creation of the “Shiny application for online prediction of learning effect satisfaction” is a tangible indicator of innovation, translating theoretical research into a practical tool.
4. Summary Table of SDGs, Targets, and Indicators
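SDGs | Targets | Indicators |
---|---|---|
SDG 4: Quality Education | 4.1: Ensure quality education leading to effective learning outcomes. | Student satisfaction ratings; student academic performance; model performance metrics |
SDG 4: Quality Education | 4.c: Increase the supply of qualified teachers. | Optimization of teaching strategies; curriculum optimization |
SDG 9: Industry, Innovation, and Infrastructure | 9.5: Enhance scientific research and upgrade technological capabilities. | Application of advanced algorithms; development of technological tools |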
Source: nature.com