Automated Essay Assessment using Machine Learning: A Case Study on Newton's Laws of Motion
DOI: https://doi.org/10.57092/ijetz.v4i3.511
Keywords: Essay Assessment, Machine Learning, Newton’s Laws, Physics Learning, Research and Development.
Abstract
This study aims to develop and evaluate a machine learning–based essay assessment website designed to measure students’ understanding of Newton’s Laws of Motion. A Research and Development (R&D) approach was employed using the ADDIE model (Analysis, Design, Development, Implementation, and Evaluation). The research involved physics teachers in Jambi Province: from a total population of 60 teachers, 20 were selected as respondents by random sampling. The product underwent two stages of expert validation: Stage 1 scored 3.51 (70.2%), categorized as feasible, while Stage 2 scored 3.95 (79.04%), categorized as highly feasible. Field evaluations conducted with teachers experienced in assessment yielded a score of 4.45 (89%), indicating a very high level of practicality and usability. These findings indicate that the developed system is effective, reliable, and user-friendly, supporting teachers in providing deeper, more constructive feedback and improving assessment efficiency. This research highlights the potential of integrating machine learning into educational assessment and offers an innovative solution for enhancing the quality of physics learning and school-based evaluation practice.
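The percentages quoted alongside the validation and practicality scores are consistent with dividing each mean score by the top of a 5-point rating scale. The abstract does not state the scale or the category cut-offs, so the sketch below treats the 5-point maximum as an assumption, and the function name `percent_of_max` is purely illustrative.

```python
def percent_of_max(mean_score: float, scale_max: float = 5.0) -> float:
    """Express a mean rating as a percentage of the scale maximum.

    scale_max=5.0 is an assumption; the abstract does not name the scale.
    """
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    return round(mean_score / scale_max * 100, 2)


# Mean scores reported in the abstract.
reported = {
    "Expert validation, Stage 1": 3.51,
    "Expert validation, Stage 2": 3.95,
    "Field evaluation (teachers)": 4.45,
}

for label, score in reported.items():
    print(f"{label}: {score} -> {percent_of_max(score)}%")
```

Under this assumption, Stage 1 and the field evaluation reproduce the reported 70.2% and 89%. Stage 2 comes out at 79.0% rather than the reported 79.04%, which would be consistent with the mean having been rounded to 3.95 after the percentage was computed.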
Copyright (c) 2025 Wawan Kurniawan, m feby khoiru sidqi sidqi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
























