The number of Open Educational Resources (OERs) has increased significantly over the last decade, giving learners access to a wider range of educational material at any time and anywhere in the world. However, this trend demands automatic approaches to process and evaluate OERs, with the end goal of identifying and recommending the most suitable educational materials for learners. Our work focuses on modelling learner engagement, which is arguably a more reliable measure than popularity or number of views, is more abundant than user ratings, and has been shown to be a crucial component in achieving learning outcomes. We focus on building models to find the characteristics and features involved in context-agnostic (i.e. population-based) engagement, as opposed to contextualised and personalised approaches that focus on individual learner engagement. We consider both context-agnostic and contextual engagement to be necessary for building effective recommender systems for education: for example, context-agnostic engagement can be used to build a prior that addresses the common cold-start problem, while contextual personalised models can be used once user data is abundant. However, research on context-agnostic engagement is surprisingly scarce and modality-specific. In this work, we explore the idea of building a predictive model for population-based engagement in education. We first propose two sets of relevant features for our predictive model: i) a set of cross-modal and language-based features that are easily applicable to OERs across multiple modalities and ii) a set of video-specific features. We then test different strategies for quantifying learner engagement signals. We further evaluate different machine learning models to predict population engagement in a large dataset of video lectures. We demonstrate the usefulness of our approach when compared to a personalised approach in a scenario of user data scarcity.
Additionally, we perform a sensitivity analysis of the best performing model, which shows promising performance and can be easily integrated into an educational recommender system for OERs.
This paper demonstrates the feasibility of building machine learning models to predict population engagement of video lectures. In Proceeding of, 2020.