10 minute read

Published:

Predicting Student’s Performance 文献综述


一直以来学者们都在为提升学生表现预测问题的预测结果而努力,所采用的思路主要分为两大类[16]:第一,将学生表现的预测看作分类或回归问题,使用分类、回归算法进行预测;第二,将学生、题目、做题表现与推荐系统中的用户(user)、商品(item)、评分(rating)相对应,使用协同过滤等推荐系统算法进行预测。而图学习方法在学生表现预测问题中刚刚得到应用不久,且正在为越来越多的学者所关注,更与本研究所提出的模型密切相关,所以在文献综述最后一部分单独梳理利用图学习方法预测学生表现的研究。

分类或回归方法

传统方法

在早期的研究中,学者常使用的分类、回归算法较为基础,如线性回归(Linear Regression) [6]、决策树(Decision Tree) [7]、支持向量机(Support Vector Machine) [89]、人工神经网络(Artificial Neural Network) [17]。同时,学者还对这些算法在不同情景下的表现进行了比较,试图寻找最适合学生表现预测问题的算法类型和可能的提升方向。Hämäläinen et al. (2006)[18]使用支持向量机、贝叶斯网络等多种机器学习模型预测传统教育情境下的学生课程表现;Amrieh et al. (2015)[19]对比朴素贝叶斯、决策树、人工神经网络等模型,发现神经网络能得到更好的结果;Romero et al.(2008)[20]在比较20 余种决策树、模糊规则学习、神经网络算法的同时,认为学生表现预测问题所使用的模型不仅强调精确度,还需要关注教师对算法的可理解性。

以上研究中使用的特征较为简单,仅为学生个人信息(如性别、年龄、家庭状况等),以及学生学业信息(如GPA、考试分数、能力水平等)[2122]。这些特征能够胜任一学期一次的考试分数、GPA 等预测任务,但对在线做题测试平台上频率高的练习准确率预测则显得粒度不够精细。这时便需要加入学生做题的历史记录信息,还有每道题各自的信息(如难度、知识点等)。在之后的研究中序列信息往往成为与学生基本信息同样重要、甚至更加重要的新维度。

随着在线教育系统的广泛使用,学生在系统中的行为信息也得到学者的关注。从登录系统的次数、查看课程的次数、课程中讨论和举手的次数等数值信息[10],到近年来使用人机交互工具捕捉到的鼠标移动轨迹[23],丰富的行为数据提供了精细化、个性化的学生信息,能够进一步提高预测的表现。在线教育系统还存储了更多方面的数据供学者挖掘,例如Assielou et al.[24]在学生做题之前先采用成就情感问卷(Achievement Emotions Questionnaire)对学生的情感状态进行评估;Liu et al. (2019)[15]使用长短期记忆神经网络(LSTM)提取练习题文本信息等。

集成方法

学者也通过改进传统模型来提高预测精度,最常见的方法就是对简单的分类模型进行集成(Ensemble)。Amrieh et al. (2016)[10]发现Bagging、Adaboosting、随机森林等常用集成方法能够更为准确地进行预测,之后Hasibur Rahman et al.(2018)[25]使用Ensemble Filtering 方法得到了比Bagging 和Boosting 更好的结果。

Elbadrawy et al. (2014, 2015)[1112]考虑学生之间的相似性,将多个回归方程组合,提出个性化多回归模型(Personalized Multi-Regression Model),这一模型将相似性纳入预测中的思路,与后文将介绍的协同过滤算法类似。而Tran et al.(2017)[26]同样认为学生之间的相似性在预测任务中较为重要,作者将基于学生基本信息的传统分类模型与基于做题信息的协同过滤算法得到的结果进行加权平均,效果比二者单独预测更好。

知识追踪方法

对新增特征和原有特征的深入挖掘也拓宽了处理学生表现预测问题的思路和方法。部分学者关注不同特征的重要性、特征提取和预处理的方法改进[27],更主流的研究则集中在对学生做题历史记录的序列学习。

挖掘学生做题历史记录的一个常用方法是知识追踪(Knowledge Tracing),即对学生在学习过程中的知识点掌握程度建模,随着做题量的增加,学生对不同知识点的掌握程度也在动态变化,做对下一道练习题的概率将受题目对应知识点掌握程度的影响。它的理论基础是教育心理学的认知理论,所构建的认知模型(Cognitive Model)的本质是用隐变量表示学生对知识点的掌握程度。其中一个经典模型是 Item Response Theory(IRT) [28],各个知识点的掌握程度在模型中使用一维的数值变量表示。之后学者对其进行改进,将隐变量扩充为多维,提出Deterministic Input, Noisy And gate(DINA) [29]、FuzzyCDF[30]等模型。也有学者从学习理论入手,尝试将学习曲线[31]和遗忘曲线[32]理论融入认知模型中,更加真实地模拟学习过程。

除了基础理论方面,在技术上发展序列学习模型是研究的另一个重点。贝叶斯知识追踪模型(Bayesian Knowledge Tracing, BKT) [33]是这方面的代表,它用一系列二值变量来表示各个知识点是否掌握,然后利用隐马尔可夫模型(Hidden Markov Model)更新知识点的状态。进一步地,基于深度学习技术的深度知识追踪(Deep Knowledge Tracing, DKT)也被提出,Piech et al. (2015)[13]等人首次采用循环神经网络中的LSTM 模拟更新学生的知识掌握过程,Zhang et al. (2017)[34]将其拓展为记忆网络(Memory Network),提高了学生建模的精度。实验结果表明当前在大多数数据集中学生学习建模效果最佳的方法基本都是DKT 类模型[35],许多学者也不断从各个问题情境下改进完善DKT 模型[3639]。利用加入注意力机制的LSTM进行知识追踪并挖掘文本信息的EKT 模型[15]可以算是知识追踪方法中的现有最好(state-of-the-art)模型。


推荐系统方法

第二个主要思路是将学生表现预测任务看作推荐系统中的评分预测问题[4041]。需要注意的是,推荐系统在在线教育领域的应用不仅仅是在做题预测问题上,还可以用来向学生推荐合适的题目或课程。实际上后者的应用更为广泛,但本文所讨论的只是学生表现预测中的推荐系统思路,后文所指都为该种应用。

传统方法

在做题准确率预测问题中采用的推荐系统方法可大致分为4 种:基于邻域(Neighborhoodbased)的方法,例如基于用户和项目的协同过滤(Collaborative Filtering)[42];基于模型(Modelbased)的方法,以矩阵分解(Matrix Factorization) [43]为代表;基于内容(Contentbased)的方法[44];以及混合方法(Hybrid)。其中基于内容的方法实际上就是上文中基于分类或回归的方法,在这一部分不作赘述,而以上方法中应用最广、影响最大的是Thai-Nghe et al. (2011a)[45]使用的矩阵分解模型。

矩阵分解模型与第一类分类、回归模型最大的区别体现在预测算法上。矩阵分解即将𝑚 × 𝑛 维的学生题目共现矩阵分解为𝑚 × 𝑘 维、𝑘 × 𝑛 维的学生矩阵和题目矩阵,其中每个学生和每道题目都可以用一个𝑘 维隐向量表示。学生矩阵与题目矩阵相乘得到预测矩阵中的元素需要尽量接近共现矩阵中已知的元素,于是便转化为一个最小化误差的优化问题。ThaiNghe et al. (2010)[43]认为学生表现预测任务的关键在于对学生的知识掌握程度和猜对题目的概率建模,而矩阵分解模型中的隐向量恰好能够模拟学生的“本应做对却失误了”和“本不会做却猜对了”的做题行为。

与基于分类或回归的方法类似,基于推荐系统的方法也经历了特征的深入挖掘和模型的改进。ThaiNghe Nguyen 在他所提出的矩阵分解模型的基础上多年来不断改进:ThaiNghe et al. (2011a)[45]提出Tensor Factorization 模型,扩充时间维度,将模型由静态转换为动态,并验证其表现优于传统的分类或回归模型;ThaiNghe et al. (2015)[46]在学生表现矩阵的基础上加入知识点练习题矩阵和知识点学生矩阵,利用多关系矩阵分解模型(MultiRelational Factorization Model)综合学生、练习题、知识点三个角度的相互信息进行预测。这些新算法都能得到更为精确的预测结果。

深度学习方法

推荐系统凭借着的海量数据及深度学习技术,在电商、视频平台上应用的新模型层出不穷,但却很少有学者继续将基于深度学习的协同过滤族算法应用在学生表现预测任务中。这是因为许多新型深度学习推荐系统模型是多个子模型的集成,协同过滤和矩阵分解只是其中的一部分思路,这样的模型可以归类于前一种基于分类或回归的模型。例如DIEN 模型[47]模拟用户兴趣进化的过程,被应用于阿里巴巴公司推荐系统的点击率预测任务,但这一推荐系统算法本质上与上文提到的EKT 模型[15]使用LSTM 进行知识追踪是同一种思路,而不再归类于基于推荐系统的方法,所以在进入深度学习时代后,基于推荐系统的方法较少被学者提及。

图学习方法

自2017 年Kipf 提出图卷积神经网络(GCN)模型[48]后,图学习方法在各个领域迅速发展,而在2019 年才出现将图学习方法应用在学生表现预测问题中的研究。Hu et al.(2019)[49]使用包含注意力机制的GCN 预测学生在下一学期课程中得到的分数,开启图学习方法应用于学生表现预测的先河。

尽管图学习方法的应用时间较短,但无论是分类或回归方法,还是推荐系统方法中,都融入了图学习的思路。Nakagawa et al.(2019)[50]在DKT 模型的基础上,考虑了知识点之间、练习题之间的图结构关系,在学生每次答题后更新相对应的知识点掌握状态的同时,更新该知识点在知识图中相邻知识点的状态,构建出基于图神经网络技术的图知识追踪模型(Graph-based Knowledge Tracing, GKT)。而在推荐系统方法中,基于图学习技术,练习题之间的联系[51]、学生之间的社会网[52]等图结构信息也被进一步融入经典的矩阵分解模型,为预测提供了丰富的关联信息。

还有一些学者利用部分在线做题系统中存储的用户行为信息,关注“学生题目”二分图,其中边上存储有回答正确与否以及在做题过程中学生的行为信息。常见的处理方法是将边上的信息看作一个“行为”假结点,这样所有边上不再存有属性,只需发掘3 类结点的信息。Karimi et al.(2020)[53]关系图神经网络(Relational GNN)模型提取学生和题目的图嵌入向量,再使用LSTM 学习学生行为序列的嵌入向量,三者拼接作为最终预测的输入。Li et al.(2020)[14]研究中的用户行为信息则是学生在做题时的鼠标移动轨迹,作者构建能够提取鼠标移动特征、分析“学生-互动-练习题”异质网络的新型RRGCN 模型。


References

[1] Romero C, Ventura S. Educational data mining: A review of the state of the art[J/OL]. IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, 2010, 40(6): 601618.DOI: 10.1109/TSMCC.2010.2053532.

[2] Romero C, Ventura S. Data mining in education[J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2013, 3(1): 1227.

[3] Romero C, Ventura S. Educational data science in massive open online courses[J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2017, 7(1): e1187.

[4] PeñaAyala A. Educational data mining: A survey and a data miningbased analysis of recent works[J]. Expert systems with applications, 2014, 41(4): 14321462.

[5] RastrolloGuerrero J L, GomezPulido J A, DuranDominguez A. Analyzing and predicting students’performance by means of machine learning: a review[J]. Applied sciences, 2020, 10 (3): 1042.

[6] Şen B, Uçar E, Delen D. Predicting and analyzing secondary education placementtest scores: A data mining approach[J]. Expert Systems with Applications, 2012, 39(10): 94689476.

[7] Bin Mat U, Buniyamin N, Arsad P M, et al. An overview of using academic analytics to predict and improve students’ achievement: A proposed proactive intelligent intervention[C]//2013 IEEE 5th conference on engineering education (ICEED). IEEE, 2013: 126130.

[8] Kabakchieva D. Predicting student performance by using data mining methods for classification [J]. Cybernetics and information technologies, 2013, 13(1): 6172.

[9] Strecht P, Cruz L, Soares C, et al. A comparative study of classification and regression algorithms for modelling students’ academic performance.[J]. International Educational Data Mining Society, 2015.

[10] Amrieh A, Hamtini T, Aljarah I. Mining Educational Data to Predict Student’s academic Performance using Ensemble Methods[J]. International Journal of Database Theory and Application, 2016, 9(8): 119136.

[11] Elbadrawy A, Studham R S, Karypis G. Personalized multiregression models for predicting students’performance in course activities[J]. 2014.

[12] Elbadrawy A, Studham R S, Karypis G. Collaborative multiregression models for predicting students’ performance in course activities[C]//proceedings of the fifth international conference on learning analytics and knowledge. 2015: 103107.

[13] Piech C, Spencer J, Huang J, et al. Deep knowledge tracing[J]. arXiv preprint arXiv:1506.05908, 2015.

[14] Li H, Wei H, Wang Y, et al. Peerinspired student performance prediction in interactive online question pools with graph neural network[C]//Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 2020: 25892596.

[15] Liu Q, Huang Z, Yin Y, et al. EKT: Exerciseaware knowledge tracing for student performance prediction[J/OL]. arXiv, 2019, 4347(c). DOI: 10.1109/tkde.2019.2924374.

[16] Elbadrawy A, Polyzou A, Ren Z, et al. Predicting student performance using personalized analytics[J]. Computer, 2016, 49(4): 6169.

[17] Chen J F, Hsieh H N, Do Q H. Predicting student academic performance: A comparison of two metaheuristic algorithms inspired by cuckoo birds for training neural networks[J]. Algorithms, 2014, 7(4): 538553.

[18] Hämäläinen W, Vinni M. Comparison of machine learning methods for intelligent tutoring systems[C]//International conference on intelligent tutoring systems. Springer, 2006: 525534.

[19] Amrieh E A, Hamtini T, Aljarah I. Preprocessing and analyzing educational data set using xapi for improving student’s performance[C]//2015 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT). IEEE, 2015: 15.

[20] Romero C, Ventura S, Espejo P G, et al. Data mining algorithms to classify students[C]// Educational data mining 2008. 2008.

[21] Hassan H, Ahmad N B, Anuar S. Improved students’ performance prediction for multiclass imbalanced problems using hybrid and ensemble approach in educational data mining[J/OL]. Journal of Physics: Conference Series, 2020, 1529(5): 08. DOI: 10.1088/17426596/ 1529/5/0 52041.

[22] Adejo O, Connolly T. An integrated system framework for predicting students’academic performance in higher educational institutions[J]. International Journal of Computer Science and Information Technology (IJCSIT), 2017, 9(3): 149157.

[23] Wei H, Li H, Xia M, et al. Predicting student performance in interactive online question pools using mouse interaction features[C]//Proceedings of the Tenth International Conference on Learning Analytics & Knowledge. 2020: 645654.

[24] Assielou K A, Haba C T, Gooré B T, et al. Emotional impact for predicting student performance in intelligent tutoring systems (its)[J/OL]. International Journal of Advanced Computer Science and Applications, 2020, 11(7): 219225. DOI: 10.14569/IJACSA.2020.0110728.

[25] Hasibur Rahman M, Rabiul Islam M. Predict Student’s Academic Performance and Evaluate the Impact of Different Attributes on the Performance Using Data Mining Techniques[J/OL]. 2nd International Conference on Electrical and Electronic Engineering, ICEEE 2017, 2018 (December): 14. DOI: 10.1109/CEEE.2017.8412892. 30

[26] Tran T O, Dang H T, Dinh V T, et al. Performance prediction for students: A multistrategy approach[J/OL]. Cybernetics and Information Technologies, 2017, 17(2): 164182. DOI: 10.1 515/cait20170024.

[27] Dien T T, Luu S H, ThanhHai N, et al. Deep learning with data transformation and factor analysis for student performance prediction[J]. Int. J. Adv. Comput. Sci. Appl.(IJACSA), 2020, 11(8).

[28] Embretson S E, Reise S P. Item response theory[M]. Psychology Press, 2013.

[29] De La Torre J. Dina model and parameter estimation: A didactic[J]. Journal of educational and behavioral statistics, 2009, 34(1): 115130.

[30] Liu Q, Wu R, Chen E, et al. Fuzzy cognitive diagnosis for modelling examinee performance [J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2018, 9(4): 126.

[31] Anzanello M J, Fogliatto F S. Learning curve models and applications: Literature review and research directions[J]. International Journal of Industrial Ergonomics, 2011, 41(5): 573583.

[32] Ebbinghaus H. Memory: A contribution to experimental psychology[J]. Annals of neurosciences, 2013, 20(4): 155.

[33] Corbett A T, Anderson J R. Knowledge tracing: Modeling the acquisition of procedural knowledge[ J]. User modeling and useradapted interaction, 1994, 4(4): 253278.

[34] Zhang J, Shi X, King I, et al. Dynamic keyvalue memory networks for knowledge tracing[C]// Proceedings of the 26th international conference on World Wide Web. 2017: 765774.

[35] Wilson K H, Xiong X, Khajah M, et al. Estimating student proficiency: Deep learning is not the panacea[C]//In Neural Information Processing Systems, Workshop on Machine Learning for Education. 2016: 3.

[36] Shen S, Liu Q, Chen E, et al. Convolutional Knowledge Tracing: Modeling Individualization in Student Learning Process[J/OL]. SIGIR 2020 Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020. DOI: 10.114 5/3397271.3401288.

[37] Chen Y, Liu Q, Huang Z, et al. Tracking knowledge proficiency of students with educational priors[J/OL]. International Conference on Information and Knowledge Management, Proceedings, 2017, Part F1318: 989998. DOI: 10.1145/3132847.3132929.

[38] Huang Z, Liu Q, Chen Y, et al. Learning or Forgetting? A Dynamic Approach for Tracking the Knowledge Proficiency of Students[J/OL]. ACM Transactions on Information Systems, 2020, 38(2). DOI: 10.1145/3379507.

[39] Cully A, Demiris Y. Online Knowledge Level Tracking with DataDriven Student Models and Collaborative Filtering[J/OL]. IEEE Transactions on Knowledge and Data Engineering, 2020, 32(10): 20002013. DOI: 10.1109/TKDE.2019.2912367.

[40] ThaiNghe N, Horv T, SchmidtThieme L, et al. Personalized forecasting student performance [C]//2011 IEEE 11th International Conference on Advanced Learning Technologies. IEEE, 2011: 412414.

[41] Toscher A, Jahrer M. Collaborative filtering applied to educational data mining[J]. KDD cup, 2010.

[42] Pero Š, Horváth T. Comparison of collaborativefiltering techniques for smallscale student performance prediction task[M]//Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering. Springer, 2015: 111116.

[43] ThaiNghe N, Drumond L, KrohnGrimberghe A, et al. Recommender system for predicting student performance[J/OL]. Procedia Computer Science, 2010, 1(2): 28112819. http://dx.doi .org/10.1016/j.procs.2010.08.006.

[44] KlašnjaMilićević A, Ivanović M, Nanopoulos A. Recommender systems in elearning environments: a survey of the stateoftheart and possible extensions[J/OL]. Artificial Intelligence Review, 2015, 44(4): 571604. DOI: 10.1007/s104620159440z.

[45] ThaiNghe N, Drumond L, Horváth T, et al. Factorization techniques for predicting student performance[ J/OL]. Educational Recommender Systems and Technologies: Practices and Challenges, 2011: 129153. DOI: 10.4018/9781613504895. ch006.

[46] ThaiNghe N, SchmidtThieme L. Multirelational factorization models for student modeling in intelligent tutoring systems[C]//2015 Seventh international conference on knowledge and systems engineering (KSE). IEEE, 2015: 6166.

[47] Zhou G, Mou N, Fan Y, et al. Deep interest evolution network for clickthrough rate prediction [C]//Proceedings of the AAAI conference on artificial intelligence: volume 33. 2019: 59415948.

[48] Kipf T N, Welling M. Semisupervised classification with graph convolutional networks[J]. arXiv preprint arXiv:1609.02907, 2016.

[49] Hu Q, Rangwala H. Academic performance estimation with attentionbased graph convolutional networks[J]. arXiv preprint arXiv:2001.00632, 2019.

[50] Nakagawa H, Iwasawa Y, Matsuo Y. Graphbased knowledge tracing: modeling student proficiency using graph neural network[C]//2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI). IEEE, 2019: 156163.

[51] HuynhLy T N, Le H T, Nguyen T N. Integrating courses’ relationship into predicting student performance[J]. International Journal, 2020, 9(4).

[52] ASSIELOU K A, Cissé Théodore H, KADJO T L, et al. Multirelational and socialinfluence model for predicting student performance in intelligent tutoring systems (its)[J/OL]. International Journal of Engineering and Advanced Technology, 2020, 9(3): 20582066. DOI: 10.35940/ijeat.c5169.029320.

[53] Karimi H, Derr T, Huang J, et al. Online academic course performance prediction using relational graph convolutional neural network[C]//Proceedings of The 13th International Conference on Educational Data Mining (EDM 2020). 2020: 444450.

[54] Wang S, Hu L, Wang Y, et al. Graph learning approaches to recommender systems: A review [J]. arXiv preprint arXiv:2004.11718, 2020.

[55] Shi C, Hu B, Zhao W X, et al. Heterogeneous information network embedding for recommendation[ J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 31(2): 357370.

[56] Han X, Shi C, Wang S, et al. Aspectlevel deep collaborative filtering via heterogeneous information networks.[C]//IJCAI. 2018: 33933399.

[57] Liu Z, Zhou J. Introduction to graph neural networks[M]. 2020.

[58] Leskovec J. Cs224w: Machine learning with graphs[Z].

[59] Cheng H T, Koc L, Harmsen J, et al. Wide & deep learning for recommender systems[J/OL]. ACM International Conference Proceeding Series, 2016, 15Septemb: 710. DOI: 10.1145/29 88450.2988454.

[60] Hamilton W L, Ying R, Leskovec J. Inductive representation learning on large graphs[J]. arXiv preprint arXiv:1706.02216, 2017.

[61] Zhou G, Bian W, Wu K, et al. Can: Revisiting feature coaction for clickthrough rate prediction [J]. arXiv preprint arXiv:2011.05625, 2020.