Journal of Educational Innovations

Prediction of Reading Achievement using Multiple Regression based on Machine Learning

Document Type : Original Article

Authors
1 Ph.D. Student in Assessment & Measurement, Allameh Tabataba'i University, Tehran, Iran.
2 Associate Professor, Curriculum Studies Dept. Faculty of Psychology and Educational Sciences, Kharazmi University, Tehran, Iran.
Abstract
This study used a multiple regression model, fitted with machine learning tooling, to predict the reading literacy scores of 4th-grade students in Iran. Data were drawn from the 2021 Progress in International Reading Literacy Study (PIRLS) and covered 5,943 students selected by stratified random sampling to ensure representativeness. The analysis was carried out with the Scikit-learn package in Python. Attitude variables, notably self-confidence in reading and familiarity with digital devices, emerged as significant predictors of reading literacy scores, which underscores the role of psychological and technological factors in literacy development. The findings suggest that educational policymakers should prioritize initiatives that strengthen students' self-confidence in reading and their comfort with digital tools. Addressing these areas is expected to improve literacy outcomes and, in turn, to support educational achievement and lifelong learning skills.
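As a rough illustration of the kind of analysis the abstract describes, the sketch below fits a multiple regression with Scikit-learn. It is not the authors' pipeline: the data are synthetic, the two predictor names (`confidence`, `digital`) and the generating coefficients are assumptions made for this example, and PIRLS-specific steps such as sampling weights and plausible values are left out.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT PIRLS data); all scales here are hypothetical.
rng = np.random.default_rng(0)
n_students = 1000
confidence = rng.normal(10.0, 2.0, n_students)  # assumed "self-confidence in reading" scale
digital = rng.normal(10.0, 2.0, n_students)     # assumed "digital familiarity" scale
noise = rng.normal(0.0, 30.0, n_students)

# Assumed generating model, chosen only so the fit has something to recover.
score = 300.0 + 12.0 * confidence + 5.0 * digital + noise

# Two predictors, one outcome: a multiple regression design matrix.
X = np.column_stack([confidence, digital])
X_train, X_test, y_train, y_test = train_test_split(
    X, score, test_size=0.2, random_state=0
)

# Ordinary least squares fit on the training split.
model = LinearRegression().fit(X_train, y_train)

# Evaluate on the held-out split.
pred = model.predict(X_test)
r2 = r2_score(y_test, pred)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))

print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
print("held-out R^2:", r2)
print("held-out RMSE:", rmse)
```

Each fitted coefficient estimates the change in predicted score per one-unit change in that predictor, holding the other constant. Evaluating on a held-out split is what distinguishes this predictive use of regression from a purely explanatory fit.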
Keywords


Articles in Press, Accepted Manuscript
Available Online from 11 October 2025

  • Received: 20 March 2025
  • Revised: 17 August 2025
  • Accepted: 11 October 2025