Automated Essay Assessment using Machine Learning: A Case Study on Newton's Laws of Motion

Wawan Kurniawan; M Feby Khoiru Sidqi

doi:10.57092/ijetz.v4i3.511

Authors

Wawan Kurniawan Universitas Jambi
M Feby Khoiru Sidqi Universitas Jambi

Keywords:

Essay Assessment, Machine Learning, Newton’s Laws, Physics Learning, Research and Development.

Abstract

This study aims to develop and evaluate a machine learning–based essay assessment website designed to measure students’ understanding of Newton’s Laws of Motion. A Research and Development (R&D) approach was employed using the ADDIE model (Analysis, Design, Development, Implementation, and Evaluation). From a population of 60 physics teachers in Jambi Province, 20 teachers were selected as respondents by random sampling. The product underwent two stages of expert validation: Stage 1 scored 3.51 (70.2%), categorized as feasible, while Stage 2 scored 3.95 (79.04%), classified as highly feasible. Field evaluations conducted with teachers experienced in assessment yielded a score of 4.45 (89%), indicating a very high level of practicality and usability. These findings demonstrate that the developed system is effective, reliable, and user-friendly, supporting teachers in providing deeper, constructive feedback and improving assessment efficiency. This research highlights the potential of integrating machine learning into educational assessment and offers an innovative solution for enhancing the quality of physics learning and school-based evaluation practice.
