Motivations for Learning Analytics
Institutions have been gathering data on students and others for many years, ranging from registration information to access control to activity records. Until recently, this data was not used extensively for analytics (Kay, Korn & Oppenheim, 2012). This is changing rapidly. The applications of analytics in education will be widespread and pervasive. Any assessment of the ethics of learning analytics requires a comprehensive understanding of these applications and their impact.
There’s a lot of optimism surrounding learning analytics, tempered with caution. “Although some of this excitement may be based on unrealistic expectations and limited knowledge of the complexities of the underpinning technologies,” writes Tuomi (2018), “it is reasonable to expect that the recent advances in AI and machine learning will have profound impacts on future labour markets, competence requirements, as well as in learning and teaching practices.”
We can look at the motivations for learning analytics to develop a sense of what to expect from the technology. Institutions may desire, for example (Kay, Korn & Oppenheim, 2012):
- responses to economic and competitive pressures
- agility of analysis
- good practice in modern enterprise management
- intelligent personalised services
- visualization of patterns and trends in large-scale data
In what follows we classify the potential applications of learning analytics based on what analytics can do and how they work. Modern analytics is based mostly on supervised machine learning and neural networks, and these in turn provide algorithms for pattern recognition, regression, and clustering (Raghu & Schmidt, 2020).
Built on these basic capabilities are four widely-used categories (Brodsky, et.al., 2015; Boyer and Bonnin, 2017), to which I add fifth and sixth categories:
- descriptive analytics, answering the question “what happened?”;
- diagnostic analytics, answering the question “why did it happen?”;
- predictive analytics, answering the question “what will happen?”;
- prescriptive analytics, answering the question “how can we make it happen?”;
- generative analytics, which use data to create new things; and
- deontic analytics, answering the question “what should happen?”.
Within each of these categories we can locate the various applications that fall under the heading ‘learning analytics’, as suggested by the motivations outlined above. We will find that development in most of these application areas has already started, so that this is less a prediction of future technology (though some applications may yet be years away) and more of a snapshot of the state of the art today.
Descriptive analytics include analytics focused on description, detection and reporting, including mechanisms to pull data from multiple sources, filter it, and combine it. The output of descriptive analytics includes visualizations such as pie charts, tables, bar charts or line graphs. Descriptive analytics can be used to define key metrics, identify data needs, define data management practices, prepare data for analysis, and present data to a viewer (Vesset, 2018).
Data Mart / Data Lake
In an institutional ‘business intelligence’ environment, data from multiple systems is combined into a structured ‘data warehouse’ or unstructured ‘data lake’. This data is organized along specific business lines to create a ‘data mart’ for each. Data marts in turn provide input for further analysis, reporting, or mining. (Panopoly, 2020) These support institutional reporting functions, tracking such data as marketing, enrollment, course completion, resource utilization, and numerous other factors related to instruction and learning. (Manjunath, et.al., 2011)
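The flow from source systems to warehouse to data mart can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source extracts, the table and column names, and the sample rows are all invented, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical extracts from two source systems (invented sample data).
registrations = [  # (student_id, program)
    ("s1", "Biology"), ("s2", "Biology"), ("s3", "History"),
]
lms_activity = [  # (student_id, course, completed: 1 = yes, 0 = no)
    ("s1", "BIO101", 1), ("s2", "BIO101", 0), ("s3", "HIS200", 1),
]

def build_warehouse():
    """Load both extracts into one in-memory SQLite 'warehouse'."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE registration (student_id TEXT, program TEXT)")
    db.execute(
        "CREATE TABLE activity (student_id TEXT, course TEXT, completed INTEGER)"
    )
    db.executemany("INSERT INTO registration VALUES (?, ?)", registrations)
    db.executemany("INSERT INTO activity VALUES (?, ?, ?)", lms_activity)
    return db

def completion_mart(db):
    """A 'data mart' for one business line: completion rate per program."""
    return db.execute(
        """
        SELECT r.program, COUNT(*) AS enrolled, AVG(a.completed) AS rate
        FROM registration r
        JOIN activity a ON a.student_id = r.student_id
        GROUP BY r.program
        ORDER BY r.program
        """
    ).fetchall()

mart = completion_mart(build_warehouse())
for program, enrolled, rate in mart:
    print(program, enrolled, rate)
```

The join is the step that makes the ethical questions concrete: neither source system alone links a student's program to their course completion, but the combined view does.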
Most modern institutions implement tracking systems in order to monitor and report on the condition of their assets, maintain physical security, and manage resources and costs. Assets are tracked by scanning barcode labels or by using tags that broadcast their location using GPS, BLE or RFID. These technologies can also be used to track people or vehicles (NUSTL, 2016:2.1) in order to manage staff, reduce costs, and for security and insurance purposes. Tracking can be used to capture fine-grained movements or activities. “Deployed properly, the tags could be used in a new class of wearable designed to track physical movement and shape change” (Nichols, 2018). This data can be used as input to assessment algorithms (see below), to train robots and AIs, and to provide diagnostic information for coaches and doctors. Google and other companies use tracking to support maps.
Call tracking is a function whereby institutions are able to identify and analyze trends in access requests. For example, a call tracking service “helps reveal what platforms, publishers, keywords, and channels drive high-intent customers to call and can help marketers create a more informed media allocation strategy” (Dooley, 2019). Similar functionality is provided by web-based tracking systems. Data can be collected from institutional websites (known as first-party tracking) or from a network of associated websites (known as third-party tracking) using cookies, browser fingerprinting, or small images (web beacons) attached to web pages (Princiya, 2018).
Tracking is playing an increasing role in schools and learning analytics. “There are classroom management tools like Google’s G Suite for Education that track school work and help teachers, parents and students communicate via messaging and email. Smaller apps such as ClassDojo, which claims to be in use at 90 percent of K-8 schools in the United States, tackle specific subjects or problems” (Kelly, 2019). “That app lets teachers communicate with parents and grant students virtual points for positive behaviors like teamwork or subtract them for negative actions like being out of their chair. Newer ‘personalized learning’ programs attempt to develop custom education plans for students based on data they collect about their interests and skills.”
This use of electronic data involves the comparison of one method with another. For example, “if it is known that the goals of an operation can be attained by more than one method, the various alternatives can in principle be simulated on the computer, and their relative costs and benefits thereby compared to find the most cost-effective one” (Ware, et.al., 1973: 9). Systems analysis is a core application of analytics today. “System analysts solve business problems through analysing the requirements of information systems and designing such systems by applying analysis and design techniques” (AU, 2020).
Trade-off analysis is one example: “In the context of the definition of a system, a trade-off study consists of comparing the characteristics of each system element and of each candidate system architecture to determine the solution that best globally balances the assessment criteria” (Faisandier, et.al., 2019). Other types of systems analysis include effectiveness analysis, value chain analysis, cost analysis and technical risks analysis.
As McKinsey reports, “Many higher-education institutions’ data analytics teams focus most of their efforts on generating reports to satisfy operational, regulatory, or statutory compliance” (Krawitz, Law & Litman, 2018). Descriptive analytics supports these functions, combining data from multiple sources, a capacity that is improving over time. In educational institutions, “AI systems have become better at automatically combining separate datasets, and can piece together much more information about us than we might realise we have provided.” (Clement-Jones, et.al., 2018).
Marketing staff can access and use ‘identity graphs’ compiled from separate sources in order to create a composite profile of an individual person (Smith and Khovratovich, 2016). Interactions, communities and affinity groups can be identified using social network analysis. Saqr, et.al. (2018) argue that “Social network analysis (SNA) may be of significant value in studying online collaborative learning.” For example, at the University of Wollongong, the Social Networks Adapting Pedagogical Practice initiative (SNAPP) “analyses conversations in online student discussion forums to show patterns and relationships in real time as social network diagrams” (Sclater, Peasgood and Mullan, 2016).
Dashboards present a ‘view’ of the data that is optimized for comprehension and utility. They are widely used in learning technology. A person’s learning activities, for example, can be graphed and displayed in comparison with other learners. This analysis can contain fine-grained detail, for example, attention metadata (Duval, 2011). Students have asked for a variety of real-time dashboard analytics, including calendar and pacing information, and review of time spent online (Ifenthaler and Widanapathirana, 2014). For example, the University of Iowa offers a student-facing analytics dashboard, Elements of Success (University of Iowa, 2020), where “the capacity to access summary data and curated visualizations allows students to better measure their progress and motivates them to take action when critical outcomes are not achieved” (O’Brien, 2020). The developers report, “learners benefited from regularly accessing the feedback” (van Horne, et.al., 2020).
In a well-designed dashboard “each activity section of the dashboard would be selected to further query and generate reporting about a certain activity,” as demonstrated in a mock-up by Dringus (2012). Today, a standardized format, the Experience API, is used to collect and store activity data in a Learning Record Store (LRS) (Corbí and Solans, 2014; Kevan and Ryan, 2016). These support dashboards such as LAViEW (Learning Analytics Visualizations & Evidence Widgets) that help learners analyze learning logs and provide evidence of learning (Majumdar, et.al., 2019). Similar functionality is also provided by IMS Global’s Caliper learning analytics (Oakleaf, et.al., 2017).
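To give a sense of what social network analysis of a discussion forum involves, the following sketch computes degree centrality from a list of who replied to whom. It is not how SNAPP works internally (the source does not say); the participant names and reply log are invented, and degree centrality is just one of the simplest measures available.

```python
from collections import defaultdict

# Hypothetical forum log: (author, replied_to) pairs; names are invented.
replies = [
    ("ana", "ben"), ("ben", "ana"), ("cho", "ana"),
    ("dev", "ana"), ("cho", "ben"),
]

def build_graph(reply_pairs):
    """Undirected graph: two people are linked if either replied to the other."""
    neighbours = defaultdict(set)
    for author, target in reply_pairs:
        if author != target:
            neighbours[author].add(target)
            neighbours[target].add(author)
    return neighbours

def degree_centrality(neighbours):
    """Share of the other participants each person is directly connected to."""
    n = len(neighbours)
    if n < 2:
        return {person: 0.0 for person in neighbours}
    return {person: len(links) / (n - 1) for person, links in neighbours.items()}

centrality = degree_centrality(build_graph(replies))
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality)
```

In this toy log one participant is linked to everyone else while another is linked to only one person, which is the kind of pattern a social network diagram would make visible to an instructor.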
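Activity data of this kind is typically recorded as actor-verb-object statements. The sketch below builds simplified records in that spirit and counts them per learner, as a dashboard widget might. It is an assumption-laden illustration: the helper names, e-mail addresses and activity URLs are invented, a plain Python list stands in for a Learning Record Store, and real Experience API statements carry more fields and validation than shown here.

```python
from collections import Counter

def make_statement(email, verb, activity_id, timestamp):
    """Build a simplified actor-verb-object record in the spirit of xAPI."""
    return {
        "actor": {"mbox": f"mailto:{email}"},
        "verb": {
            "id": f"http://adlnet.gov/expapi/verbs/{verb}",
            "display": {"en-US": verb},
        },
        "object": {"id": activity_id, "objectType": "Activity"},
        "timestamp": timestamp,
    }

# A stand-in for a Learning Record Store: just a list of statements.
QUIZ_1 = "http://example.org/activities/quiz-1"
QUIZ_2 = "http://example.org/activities/quiz-2"
record_store = [
    make_statement("ana@example.org", "attempted", QUIZ_1, "2020-03-01T10:00:00Z"),
    make_statement("ana@example.org", "completed", QUIZ_1, "2020-03-01T10:20:00Z"),
    make_statement("ana@example.org", "attempted", QUIZ_2, "2020-03-02T09:00:00Z"),
    make_statement("ben@example.org", "attempted", QUIZ_1, "2020-03-02T11:00:00Z"),
]

def activity_summary(statements, email):
    """Count one learner's statements by verb."""
    mbox = f"mailto:{email}"
    return Counter(
        s["verb"]["display"]["en-US"]
        for s in statements
        if s["actor"]["mbox"] == mbox
    )

ana_summary = activity_summary(record_store, "ana@example.org")
print(dict(ana_summary))
```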
Diagnostic analytics look more deeply into data in order to detect patterns and trends. Such a system could be thought of as being used to draw an inference about a piece of data based on the patterns detected in sample or training data, for example, to perform recognition, classification or categorization tasks.
Audio and Video Transcription
Voice recognition is used to transcribe audio input for text output or automated translation. Educational applications include note capture and audio assessment (Dittrich and Star, 2018). Text and voice recognition can be used to support students with disabilities (Raja, 2016) through such means as automated subtitles or lecture capture and transcription. Successful application areas include content identification, with the progress demonstrated through TinEye (https://tineye.com/) reverse image searching or Shazam (https://www.shazam.com/) music identification.
Facial and object recognition technology is being used to augment security in schools and institutions. For example, “A New York school district has announced it will begin using controversial facial recognition software for school safety purposes.” The district is using an application called AEGIS, developed by SN Technologies, a Canadian-based company that sells similar systems for hospitals, retail, banks and casinos. It is also offering weapons detection to the Lockport school district (Klein, 2020).
Spam is a constant problem for any public-facing online service that accepts input. Analytics that help filter unwanted messages (whether sent by humans or bots) are of significant value. Today these are generally available and widely used. Users can learn to train their own machine learning to filter spam (Gan, 2018) or use commercial systems such as Akismet (Barron, 2018).
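A common introductory approach to training a spam filter is a naive Bayes classifier over word counts. The sketch below is a minimal version with add-one smoothing; it is not the method used by Akismet or described by Gan (the source does not specify either), and the four training messages are invented and far too few for real use.

```python
import math
from collections import Counter

# Tiny invented training set; a real filter needs far more data.
training = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("lecture notes for week two", "ham"),
    ("question about the assignment deadline", "ham"),
]

def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the higher log-probability (add-one smoothing)."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(training)
print(classify("free prize now", *model))
print(classify("notes about the assignment", *model))
```

The classifier only knows the vocabulary it was trained on, which is why such filters are retrained as spammers change their wording.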
Pattern recognition is used for plagiarism detection. For example, Amigud, et.al. (2017) describe a “machine-learning based framework (that) learns students’ patterns of language use from data.” This enables them to detect tracts of text not written by the particular student. In a recent high-profile case, a neural network was able to determine how much of Shakespeare’s play Henry VIII was written by a man named John Fletcher. (Plecháč, 2019)
Video recognition and biometrics are used for security purposes and exam proctoring (Rodchua, 2017). Online proctoring companies are employing varying levels of automation. “For instance, Examity also uses AI to verify students’ identities, analyze their keystrokes, and, of course, ensure they’re not cheating. Proctorio uses artificial intelligence to conduct gaze detection, which tracks whether a student is looking away from their screens” (Heilweil, 2020). Similar companies include Honorlock and ProctorU. These are becoming widespread; Examity reports that its monitoring services were used by more than 500 colleges and employers in 2019 (Sawers, 2019).
Fake content is a potentially negative outcome of analytics (see below). However, there is also a nascent industry in developing algorithms that identify it. Amazon, Microsoft and Facebook have initiated a Deepfakes detection challenge (Facebook, 2019). Analytics tools exploit flaws in fake videos in order to expose them as artificial, for example, by detecting face-warping artifacts (Li and Lyu, 2019). Companies touting solutions include ZeroFOX, which offers a tool called Deepstar (Price, 2019) and Quantum Integrity, which launched a startup with Ecole Polytechnique Federale de Lausanne (EPFL) (Hoffstetter, 2019), as well as Google’s Assembler, a tool to spot fake or doctored images (Google Assembler, 2020).
Supporting Special Needs
Analytics can identify students with special needs. Tuomi (2018) reports on the early detection of dyslexia. “A well-published example is the Swedish company ‘Lexplore’ that has developed a system that quickly scans for students at risk and detects dyslexia by tracking reader eye movements.” Similarly, “AI techniques can facilitate early intervention and provide specialists with robust tools indicating the person’s autism spectrum disorder level.” Special needs analysis can also identify sensory and physical impairments, language or mathematical difficulties, attention deficit disorder (Drigas and Ioannidoun, 2012), and autism (Anagnostopoulou, et.al., 2020).
With detection may also come automated support for special needs. For example, a company called Brainpower supports autistic learners through the use of computerized glasses (UC Berkeley, 2019). Moreover, one study “found that children with ASD are tend to speak to a robot more than they do with an adult (and another) study has shown that robots can be used to produce positive outcomes in individual therapy of children with severe disabilities through using play methods” (Ibid.).
Analytics and AI can also provide direct support for people with accessibility needs. For example, “Tools for image recognition are helping people who are visually impaired better navigate both the internet and the real world” (Access Now, 2018:14). Automated transcription, translation, and sensors are additional forms of accessibility support.
Information about emotions (also known as ‘affect data’) can be identified in written text in social media or submissions from students (Medhat, et.al., 2014). A study found that “when using sentiment analyses of 51,000 student evaluation comments from 23 large OU modules, that substantial differences in lived and affective experiences could be identified” (Rienties & Jones, 2019: 114). Affect data can also be found in clickstream data, interaction patterns, and bodily signals (D’Mello, 2017:116-118). This technology is being commercialized; “With companies like Affectiva, BeyondVerbal and Sensay providing plug-and-play sentiment analysis software, the affective computing market is estimated to grow to $41 billion by 2022,” according to Harvard Business Review (Kleber, 2018).
Applications include instructional evaluation, institutional decision and policy-making, and learning system enhancement (Dolianiti, et.al., 2019:413-414). For example, one project uses facial recognition technology to evaluate whether students in a classroom are bored. A facial recognition system being used in China reports on six types of behaviors (“reading, writing, hand raising, standing up, listening to the teacher, and leaning on the desk”), identifies seven moods (“happy, upset, angry, fearful or disgusted”), and logs both the behavior and the facial expressions (Jun, 2018). Affect data can also be used to inform the management of student discussions and in teacher evaluations (D’Mello, 2017:118).
Analytics can be used to sample a population’s opinions. Drew (2016) describes “improving insight into citizens’ views, needs and experiences by analysing unstructured data, such as letters, phone calls or social media.” For example, the Office for National Statistics (ONS) “is exploring how to use Twitter data to understand the movements of particular populations (so people can plan local services) and the Ministry of Justice is analysing social media comments to see what people think about, and how they use, the courts.” (Ibid.)
Educators are interested in opinion sampling in order to assess campus environments, courses, programs, and individual professors. Beard (2020) reports, “In the UK, AI is being used today for things like monitoring student wellbeing, automating assessment and even in inspecting schools.” These applications are becoming more all-encompassing and specialized. For example, Kiciman, et.al. (2018) describe the use of social media analytics “to study the effect of early alcohol usage on topics linked to college success.” In another example, researchers used social media reports to accumulate data on sexual assaults on college campuses (Duong, et.al., 2020).
There is a large literature devoted to automated grading, beginning with Page (1966) and continuing through the Hewlett competition (Kaggle, 2012). Today the technology has at least “developed to the point where the systems provide meaningful feedback on students’ writing and represent a useful complement (not replacement) to human scoring” (Kaja and Bosnic, 2015).
Automated writing evaluation (AWE) has already been tested in language-learning classes and has met with a generally favourable response from students. “The correcting network can effectively help students to improve their English writing. Compared with the traditional teacher marking and giving feedback, it is immediate, clear, and time-saving!” (Lu, 2019).
Automated essay grading is a categorization task; a large number of essays are sorted into categories, labeled with grades, and then a candidate essay is matched to a category, and hence, a grade. However, students report that they would also value feedback on submitted assignments, which requires a more in-depth analysis (Ifenthaler and Widanapathirana, 2014). Automated grading can also be adjusted to reflect required content and feedback based on scoring rubrics, as, for example, in McGraw-Hill’s Connect digital course materials (Schaffhauser, 2020).
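The categorization step can be illustrated with a deliberately naive sketch: essays with the same grade are pooled into a word-count profile, and a candidate essay receives the grade of the profile it most resembles by cosine similarity. This is an assumption chosen for brevity, not the method of any system cited here; the sample essays and grades are invented, and word overlap alone says nothing about argument quality.

```python
import math
from collections import Counter

# Invented graded samples; real systems train on thousands of essays.
graded = [
    ("photosynthesis converts light energy into chemical energy in chloroplasts", "A"),
    ("plants use light and chlorophyll to make chemical energy from water", "A"),
    ("plants are green and grow in soil", "C"),
    ("plants grow when they get water", "C"),
]

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def grade_profiles(samples):
    """Pool the word counts of all essays that share a grade."""
    profiles = {}
    for text, grade in samples:
        profiles.setdefault(grade, Counter()).update(vectorize(text))
    return profiles

def predict_grade(essay, profiles):
    """Match a candidate essay to the most similar grade profile."""
    vector = vectorize(essay)
    return max(profiles, key=lambda grade: cosine(vector, profiles[grade]))

profiles = grade_profiles(graded)
print(predict_grade("light energy becomes chemical energy in chloroplasts", profiles))
```

A sketch like this returns a grade but no explanation, which is one reason the feedback students ask for requires deeper analysis than categorization.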
Ultimately, AI could replace grading altogether. Rose Luckin argues, “If technology tracked a student throughout their school days, logging every keystroke, knowledge point and facial twitch, then the perfect record of their abilities on file could make such testing obsolete” (Beard, 2020).
Usually we think of competencies as being assessed by means of tests and evaluations, but analytics offers a wider scope. As reported by the European Commission’s Joint Research Centre (JRC) Science Policy Report, “AI systems are well suited for collecting informal evidence of skills, experience, and competence from open data sources, including social media, learner portfolios, and open badges” (Tuomi, 2018). This creates the possibility of assessing competencies from actual performance data outside educational environments, for example, using technologies like analytics-based assessment of personal portfolios (van der Schaaf, et.al., 2017) or using data-driven skills assessment in the workplace (Lin, et.al., 2018).
Predictive analytics can support resource planning and event response. As Drew (2016) reports, examples include cases where “the Government Digital Service (GDS) has created predictive models of traffic to gov.uk pages to help spot issues with pages or services more quickly and the Food Standards Agency has used Twitter data to predict norovirus outbreaks.” The 2020 coronavirus outbreak in Wuhan was first detected by an artificial intelligence called BlueDot (Niiler, 2020). In the future, such systems could recommend appropriate measures be taken in advance of an urgent need.
Resource planning is an administrative task that depends on predictions of needs and usage. Tools like CourseSignals were originally designed for this. (Gasevic, Dawson & Siemens, 2015) Analytics also enable institutions to assess the use of existing resources, for example, whether use of the learning management system corresponds with improved learning outcomes. (Sclater, Peasgood and Mullan, 2016)
Predictive analytics can be used to assist in learning design. For example, “Rienties and Toetenel (2016) linked 151 modules taught in 2012–2015 at the OU followed by 111,256 students with students’ behaviour using multiple regression models and found that learning designs strongly predicted Virtual Learning Environment (VLE) behaviour and performance of students, as illustrated in Figure 7.4. Findings indicated that the primary predictor of academic retention was how teachers designed their modules, in particular the relative amount of so-called ‘communication activities’ (i.e., student to student, teacher to student, student to teacher).” (Rienties & Jones, 2019: 116)
Analytics can offer novel approaches to user testing using data that might be otherwise opaque to evaluators. For example, Lester Tong and his colleagues (Tong, et.al., 2020) investigated whether “individuals’ neural responses to videos could predict their choices to start and stop watching videos.” Tong explains, “Here we have a case where there is information contained in subjects’ brain activity that allows us to forecast the behavior of other, unrelated, people—but it’s not necessarily reflected in their self-reports or behavior” (Stanford, 2020).
A company called UserTesting writes on its website, “analytics can—and should—guide usability testing efforts… because analytics can reveal problems that usability testing could never uncover. Usability testing is typically conducted with a small, representative sample of visitors. It also takes place on a fairly limited number of pages. So if a site has 10,000 products, and there’s a problem with one page or product category, it’s highly unlikely that usability testing would reveal the problem” (UserTesting, 2013).
Aaron Powers (2018) identifies three major types of application of analytics for user testing: discovery (“when your project is early on, you don’t know what you don’t know, and are open to anything”), testing hypotheses (“When your team has specific ideas and you’ve collected specific data around those ideas”), and to simplify problems (in case “results are too complex for people to understand, or the problem just seems too big to pull it together”).
Identify Students At Risk of Failing
Working with a learning management system and different types of data, analytics tools can identify factors statistically correlated with students at risk of failing or dropping out (Scholes, 2016). For example, a Jisc report describes several such projects, including one at New York Institute of Technology (NYIT) that used four data sources: “admission application data, registration / placement test data, a survey completed by all students, and financial data” (Sclater, Peasgood and Mullan, 2016). In another example, “using the trace data collected by the Blackboard learning management system (LMS) and data from the institutional Student Information System (SIS), Course Signals uses a data-mining algorithm to identify students at risk of academic failure in a course” (Gasevic, Dawson & Siemens, 2015).
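The idea of finding factors statistically correlated with failure can be sketched with a plain Pearson correlation. This is a simplified illustration, not the Course Signals algorithm or the NYIT model, neither of which is described in the source; the factor names and the six student records are invented, and a correlation on its own does not show that a factor causes failure.

```python
import math

# Invented records: (logins_per_week, assignments_submitted, failed_course)
students = [
    (1, 2, 1), (2, 3, 1), (5, 8, 0),
    (6, 9, 0), (3, 4, 1), (7, 10, 0),
]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y) if spread_x and spread_y else 0.0

def factor_correlations(records, factor_names):
    """Correlate each factor column with the final (failed) column."""
    outcome = [record[-1] for record in records]
    return {
        name: pearson([record[i] for record in records], outcome)
        for i, name in enumerate(factor_names)
    }

correlations = factor_correlations(
    students, ["logins_per_week", "assignments_submitted"]
)
for name, value in correlations.items():
    print(f"{name}: {value:.2f}")
```

In this toy data both factors are strongly negatively correlated with failing, so low values on either would be candidates for an early-warning flag.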
The purpose of identifying at-risk students is to provide the institution with the opportunity to prevent students from failing or dropping out before it happens. Analytics have been effective in this regard; for example, “Georgia State University has a proven record of using predictive analytics to improve student retention and graduation rates,” according to university staff (Neelakantan, 2019; Banan, 2019). The institution benefits as well; an RPK report recently suggested that “expected increases in student retention rates could generate net revenue averaging $1 million annually per institution” (Desrochers & Staisloff, 2017).
Analytics can draw from campus information sources to support student advising. For example, the Berkeley Online Advising (BOA, 2020) project at the University of California at Berkeley “integrates analytical insights with relationship and planning tools for advisors of large cohorts and the students they support… Its underlying data is aggregated from multiple, disparate campus sources into S3 buckets, and transformed into Redshift views” (Heyer & Kaskiris, 2020). Additionally, the Comprehensive Analytics for Student Success (COMPASS) project at the University of California, Irvine, “focuses on bringing relevant student data to campus advisors, faculty, and administrators with the goal of providing actionable information to improve undergraduate student outcomes” (UCI Compass, 2020). As O’Brien (2020) writes, “These tools provide advisors with information that allows for proactive outreach and intervention when critical student outcomes are not met.”
“The goal of precision education is to identify at-risk students as early as possible and provide timely intervention based on teaching and learning experiences,” write Yang and Ogata (2020). They suggest that, analogous to precision medicine, precision education systems consider a wider array of variables than learning analytics: “students’ IQ, learning styles, learning environments, and learning strategies… precision education is a new challenge of applying artificial intelligence, machine learning, and learning analytics.”
Descriptive and predictive analytics can assist an institution in recruitment. A McKinsey report describes a project at Northeastern that led them to “introduce a number of changes to appeal to those individuals most likely to enroll once admitted, including offering combined majors.” The same report also described how UMUC, as a result of analytics, “invested in new call-center capabilities and within a year realized a 20 percent increase in new student enrollment” (Krawitz, Law & Litman, 2018).
An oft-cited application is the potential of learning analytics to make content recommendations, either as a starting point, or as part of a wider learning analytics-supported learning path. For example, the Personalised Adaptive Study Success (PASS) system supports personalisation for students at Open Universities Australia (OUA) (Sclater, Peasgood and Mullan, 2016). Additionally, students report desiring recommendations regarding potential learning activities, and suggestions for potential learning partners. (Schumacher, 2018) Content and learning path recommendations are based not only on the discipline being studied but also on the individual learning profile, academic history, and a variety of contextual factors. (Ifenthaler and Widanapathirana, 2014)
Adaptive learning is a step beyond learning recommendations in the sense that the learning environment itself changes (or ‘adapts’) to events in the learning experience (Sonwalkar, 2007). For example, “Adaptive learning systems — like IBM Watson and Microsoft Power BI — have the advantage of continually assessing college students’ skill and confidence levels.” (Neelakantan, 2019).
Early adaptive learning applications were expert systems based on explicit knowledge representations and user models, that is, they were based on statements and rules (Garrett & Roberts, 2004). More recently, the ‘black box’ methods characteristic of contemporary analytics, such as neural networks, have been employed. “Their popularity has stemmed from their ability to classify students, share characterizations, and simulate and track learners’ cognitive progress” (Almohammadi, et.al., 2017).
A widely publicized startup launched in 2015, Knewton, advertised that it could disrupt the textbook industry by creating adaptive learning out of open educational resources (OER) (del Castillo, 2015).
Adaptive Group Formation
Zawacki-Richter, et.al. (2019:4) write, AI in Education (AIEd) “can contribute to collaborative learning by supporting adaptive group formation based on learner models, by facilitating online group interaction or by summarising discussions.” While they suggest such an application as a tool to be used by a human tutor, fully-automated group formation is desirable, especially for mass-enrollment courses. Mujkanovica, et.al. (2012) demonstrated “the feasibility of using a set of individual characteristics of group members to form groups that are more likely to have desired group behaviours.” Alberola, et.al. (2016) demonstrate a simple tool that “combines artificial intelligence techniques such as coalition structure generation, Bayesian learning, and Belbin’s role theory to facilitate the generation of working groups in an educational context.”
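The tools cited above rely on techniques such as coalition structure generation and Bayesian learning, which are beyond a short example. As a much simpler stand-in, the sketch below forms groups from a single learner characteristic using a serpentine ("snake draft") ordering, so that each group mixes stronger and weaker scores. The learner names and scores are invented, and a real learner model would draw on many more characteristics.

```python
def form_groups(scores, group_count):
    """Deal learners into groups in serpentine order of score.

    scores: dict mapping learner name -> numeric score (hypothetical model output).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    groups = [[] for _ in range(group_count)]
    for index, learner in enumerate(ranked):
        round_number, position = divmod(index, group_count)
        # Reverse direction on every other round to balance group totals.
        if round_number % 2 == 0:
            slot = position
        else:
            slot = group_count - 1 - position
        groups[slot].append(learner)
    return groups

# Invented scores for six learners.
scores = {"ana": 90, "ben": 80, "cho": 70, "dev": 60, "eli": 50, "fay": 40}
groups = form_groups(scores, 2)
totals = [sum(scores[name] for name in group) for group in groups]
print(groups, totals)
```

Even this simple heuristic runs unattended on any number of learners, which is the property that makes automated group formation attractive for mass-enrollment courses.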
Though not specifically a learning application, the potential to employ placement matching services is of direct interest to students. These are applications that would match students with potential co-op placements, internships or job recruitment opportunities. These analytics are also of interest to recruiters and employers. Examples include services like Ziprecruiter (Lunden, 2018) and the Government of Canada Talent Cloud. (OECD, 2018)
Job interviews are being evaluated, at least in part, by artificial intelligence systems. For example, “According to Korea Economic Research Institute (KERI), nearly a quarter of the top 131 corporations in the country currently use or plan to use AI in hiring.” Similarly, the company SmartRecruiters says its analytics capabilities “arm talent acquisition teams with the data and reporting tools necessary to drive recruiting strategies forward” (SmartRecruiters, 2019).
What’s significant is that the AI uses gamification and sentiment analysis to evaluate an applicant’s personality, attitudes and adaptability. And “it then asks questions that can be tough: ‘You are on a business trip with your boss and you spot him using the company (credit) card to buy himself a gift. What will you say?’” (Cha, 2020).
CNN reports on similar companies operating in the United States, like HireVue, Yobs and Talview. The emphasis is on the value AI-based interview screening brings, “ushering a massive number of people through the interview process quickly and reviewing them in a fair, consistent way” (Metz, 2020).
It is well known that “many universities have introduced differential pricing by undergraduate program as an alternative to across-the-board tuition increases”. It is a response to oversubscribed programs, and “a policy lever through which state governments can alter the field composition of the workforce they are training with the public higher education system” (Stange, 2013). Analytics plays a significant role in setting dynamic or variable prices, with prominent examples being found in hotel pricing, airfare, and ecommerce. (Altexsoft, 2019).
According to David Parkes (2019), “Artificial intelligence (AI) is the pursuit of machines that are able to act purposefully to make decisions towards the pursuit of goals.” He argues, “Machines need to be able to predict to decide, but decision making requires much more. Decision making requires bringing together and reconciling multiple points of view. Decision making requires leadership in advocating and explaining a path forward. Decision making requires dialogue.”
The use of algorithmic systems to make decisions is not new. Drew (2016) describes “allowing data-led decisions by non-technical analysts/specialists”; for example, the government’s emergency planning committee “now has a dynamic, interactive visualization tool that allows non-specialists to help respond to emergencies.” Parkes and Vohra (2019) point to its use in cases involving recidivism prediction, credit scoring, applicant screening, setting bail and sentencing, lending and insurance, and the allocation of public services. Banks and financial institutions “rely heavily on quantitative analysis and models in most aspects of financial decision making. They routinely use models for a broad range of activities, including underwriting credits; valuing exposures, instruments, and positions; measuring risk; managing and safeguarding client assets; determining capital and reserve adequacy; and many other activities” (FRS, 2011:1).
What’s new are scale, ubiquity and accountability. AI enables a human decision-maker to make many more decisions and to use the same process for multiple types of decisions, but raises questions about who is ultimately accountable for a decision that has been made and how new information could be added to better inform the decision.
Generative analytics is different from the previous four categories in the sense that it is not limited to answering questions like “what happened” or “how can we make it happen”, but instead uses the data to create something that is genuinely new. In a sense, it is like predictive and prescriptive analytics in that it extrapolates beyond the data provided, but while in the former two we rely on human agency to act on the analytics, in the case of generative analytics the analytics engine takes this action on its own.
Chatbots and More
Chatbots are applications that interact in real time with humans, using natural language, and have the appearance of conducting a genuine conversation. They are now widely used on corporate websites and have been integrated into personal assistants such as Google Assistant and Siri. Existing educational applications include the Duolingo Chatbot and EdTech Foundry’s Differ. Recently, Microsoft released an open source version of QBot that takes questions, refers them to experts, and then listens in on the answers in order to learn how to answer them for itself (Fleming, 2020). Chatbots may also assist with career development, offering “informed, friendly and flexible high-quality, local contextual and national labour market information including specific course/training opportunities, and job vacancies” (Attwell, 2020).
In the future, in addition to emulating human conversation, chatbots will generate additional human responses, such as gestures and emotions. For example, there’s Magic Leap’s Mica, an AI-driven being “that comes across as very human” (Craig, 2018). “What is remarkable about Mica is not the AI, but the human gestures and reactions (even if they are driven by AI).” Meanwhile, though “fictionalized and simulated for illustrative purposes only”, products like Samsung’s Neon are being called ‘artificial humans’, “a computationally created virtual being that looks and behaves like a real human, with the ability to show emotions and intelligence.” (Craig, 2020)
The Washington Post uses an AI called Heliograf to write news and sports articles; in its first year it wrote around 850 items. “That included 500 articles around the election that generated more than 500,000 clicks.” (Moses, 2017) While this was a domain-specific application, “there is still a push by many newsrooms for general purpose bots that serve a similar purpose to a site’s homepage or search bar.” (Johri, Han & Mehta, 2016)
Today, AI-based text-generators have advanced to the point where the designers of systems such as OpenAI’s GPT2 held them back in order to evaluate the wider impact of computers that write as well as humans do. “To show what that means, OpenAI made one version of GPT2 with a few modest tweaks that can be used to generate infinite positive – or negative – reviews of products” (Hern, 2019). A “slimmed-down, accessible version of that same technology” can be tested at TalkToTransformer.com (Vincent, 2019).
Analytics and AI have self-generated computer science papers (Stribling, et.al., 2005), music (Galeon, 2016), art (Shepherd, 2016), books (Springer Nature, 2019) and inventions (Fleming, 2018). There are now commercial AI-based applications that generate educational resources, including articles (e.g., AiWriter), textbooks, test questions (e.g., WeBuildLearning), and more.
Deepfakes, and technologies like Deepfakes, represent “the emergence of generative technology capitalizing on machine learning promises (that) will enable the production of altered (or even wholly invented) images, videos, and audios that are more realistic and more difficult to debunk than they have been in the past.” (Chesney and Citron, 2018:1759). “New approaches promise greater sophistication, including Google DeepMind’s ‘Wavenet’ model, Baidu’s DeepVoice, and GAN models” (Ibid:1761).
Such technology can make educational content more interesting and engaging. For example, in 2015, an algorithm called DeepStereo developed for Google Maps was able to generate a video from a series of still photographs (Flynn, et.al., 2015). Also, “With deep fakes, it will be possible to manufacture videos of historical figures speaking directly to students, giving an otherwise unappealing lecture a new lease on life” (Chesney and Citron, 2018:1769).
Chesney and Citron write, “The educational value of deep fakes will extend beyond the classroom. In the spring of 2018, Buzzfeed provided an apt example when it circulated a video that appeared to feature Barack Obama warning of the dangers of deep-fake technology itself. One can imagine deep fakes deployed to support educational campaigns by public-interest organizations such as Mothers Against Drunk Driving” (Chesney and Citron, 2018:1770).
AI-based coaching is a tempting target for analytics companies. As Loutfi (2019) explains, because of the cost, workplace coaching is often limited to higher-level executives. However, companies such as LeaderAmp, which offers an AI-based coaching service, may change that. “As long as it’s grounded in the science, which suggests why a soft skill works and how to learn it, then it’s completely appropriate to use with AI because even a very expensive coach cannot be with a person they’re coaching all the time.”
Learning analytics may in particular help develop students’ self-regulated learning (SRL). “There is a distinct need for educational researchers and educators to focus on the development of SRL in digital environments, including how learning analytics may be deployed to support this imperative” (Lodge, 2018). There’s also the possibility, suggest the same authors, that analytics may help in the development of a deeper theoretical understanding of SRL.
It may seem far-fetched, but some pundits are already predicting the development of artificial intelligences and robots teaching in the classroom. In a recent celebrated case, a professor fooled his students with ‘Jill Watson’, an artificial tutor (Miller, 2016). “‘Yuki’, the first robot lecturer, was introduced in Germany in 2019 and has already started delivering lectures to university students at The Philipps University of Marburg.” (Ameen, 2019). While most observers still expect AI and analytics to be limited to a support role, these examples suggest that the role of artificial teachers might be wider than expected.
There is an additional question that needs to be answered, and has been increasingly entrusted to analytics: “what ought to happen?” Recently the question has been asked with respect to self-driving vehicles in the context of Philippa Foot’s ‘trolley problem’ (Foot, 1967). In a nutshell, this problem forces the reader to decide whether to take an action to save five and kill one, or to desist from action to save one, allowing (by inaction) five to be killed. It is argued that automated vehicles will face similar problems.
It may be argued that these outcomes are defined ahead of time by human programmers. For example, cars made for rich people have their ethical priorities preset: they will protect passengers, not bystanders (Morris, 2016). But not all ethical outcomes will be preprogrammed; arguably, an AI’s ethical stance will often emerge as a byproduct of other priorities and activities.
Automated systems have an impact on what content is acceptable (and what is not) in a society. We see this in effect on online video services. “On both YouTube and YouTube Kids, machine learning algorithms are used to both recommend and mediate the appropriateness of content” (UC Berkeley Human Rights Center Research, 2019). Though such algorithms are influenced by input parameters, their decisions are always more nuanced than designed, leading people to adapt to the algorithm, thereby redefining what is acceptable.
What counts as ‘appropriate’ behaviour may be shaped by analytics and AI. We have already seen research that shows that “educational robots are being brought to various settings such as school and home in a way that alters how children learn, interact and develop their personalities.” Just as texting changed social expectations on how people communicate, so also interactions with AI may well change how people make requests and draw conclusions. These and additional implications are being investigated by HUMAINT, “an interdisciplinary JRC project aiming to understand the impact of machine intelligence on human behaviour, with a focus on cognitive and socio-emotional capabilities and decision making” (Tuomi, 2018; HUMAINT, 2020).
Identifying the Bad
The patterns and behaviours exhibited by one or more persons may be evidence of bad intentions. An example of this is conspiracy-theory detection. A recent Ryerson-Royal Roads research project, for instance, is analyzing patterns to detect the spread of misinformation about COVID-19. “Coronavirus misinformation is spreading quickly on social media. We are starting to see that many of the tactics and tools used to spread politically-motivated misinformation are now being used to spread misinformation about COVID-19” (Habib, 2020). Researchers look for signs of ‘social bots’ and ‘coordinated inauthentic behaviour’. “These two forms of social manipulation, when left unchecked, can skew the conversation, manufacture anger where there is none, suppress opposition or dampen debate. These tactics may undermine our ability as citizens to make decisions and reach consensus as a society” (Ibid).
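To make the idea of pattern-based detection concrete, here is a minimal sketch in Python of one signal such researchers might look for: several distinct accounts posting the same message within a short time window. It does not describe the method of the project cited above. The function name, the tuple layout of `posts`, and the thresholds `window_seconds` and `min_accounts` are all illustrative assumptions.

```python
from collections import defaultdict


def find_coordinated_groups(posts, window_seconds=60, min_accounts=3):
    """Flag messages posted by several distinct accounts in a short time span.

    posts: iterable of (account, timestamp_seconds, text) tuples (assumed layout).
    Returns a list of (normalised_text, sorted_account_names) pairs.
    """
    by_text = defaultdict(list)
    for account, timestamp, text in posts:
        # Normalise case and whitespace so trivial edits do not hide duplicates.
        key = " ".join(text.lower().split())
        by_text[key].append((timestamp, account))

    flagged = []
    for text, entries in by_text.items():
        entries.sort()  # order by timestamp
        start = 0
        for end in range(len(entries)):
            # Slide the window start forward until it spans at most window_seconds.
            while entries[end][0] - entries[start][0] > window_seconds:
                start += 1
            accounts = {account for _, account in entries[start:end + 1]}
            if len(accounts) >= min_accounts:
                flagged.append((text, sorted(accounts)))
                break  # one flag per message is enough for this sketch
    return flagged


sample_posts = [
    ("acct_a", 0, "Miracle cure found!"),
    ("acct_b", 10, "miracle  cure found!"),
    ("acct_c", 20, "MIRACLE CURE FOUND!"),
    ("acct_d", 500, "Stay safe everyone"),
    ("acct_a", 1000, "Stay safe everyone"),
]
suspicious = find_coordinated_groups(sample_posts)
```

A match of this kind is only weak evidence, since ordinary users also repeat slogans and share the same links. In practice it would be one signal among many and would still need human review.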
Amplifying the Good
AI can select between what might be called ‘good’ content and ‘bad’ content, displaying a preference for the former. For example, in response to violence in conflict zones, researchers “argue the importance of automatic identification of user-generated web content that can diffuse hostility and address this prediction task, dubbed ‘hope-speech detection’” (Palakodety, et.al., 2019).
Defining What’s Fair
There is a line of research that proposes that AI can define what’s fair. An early example of this is software designed to optimize the design of congressional voting districts in such a way that minimizes gerrymandering (Cohen-Addad, Klein & Young, 2018). In another study, research suggested that “an AI can simulate an economy millions of times to create fairer tax policy” (Heaven, 2020). A tool developed by researchers at Salesforce “uses reinforcement learning to identify optimal tax policies for a simulated economy.” The idea in this case was to find tax policy that maximized productivity and income equality in a model economy.
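The trade-off between productivity and equality can be illustrated with a toy model. The sketch below is not the Salesforce system: it swaps reinforcement learning for a plain grid search over a single flat tax rate, and every modelling choice in it is an assumption made for illustration (effort falls linearly as the tax rate rises, revenue is returned as equal transfers, and the objective is total output multiplied by one minus the Gini coefficient).

```python
def gini(incomes):
    """Gini coefficient: 0 means perfect equality, larger means more unequal."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if total == 0:
        return 0.0
    weighted = sum((rank + 1) * value for rank, value in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


def simulate(skills, tax_rate):
    """Return (productivity, equality) for one flat tax rate in the toy economy."""
    # Assumed behavioural response: higher taxes reduce everyone's effort.
    effort = 1.0 - 0.5 * tax_rate
    pre_tax = [skill * effort for skill in skills]
    # Assumed policy: all revenue is handed back as equal lump-sum transfers.
    transfer = tax_rate * sum(pre_tax) / len(skills)
    post_tax = [(1 - tax_rate) * income + transfer for income in pre_tax]
    return sum(pre_tax), 1.0 - gini(post_tax)


def best_flat_tax(skills, steps=100):
    """Grid search for the rate maximising productivity times equality."""
    best_rate, best_score = 0.0, float("-inf")
    for k in range(steps + 1):
        rate = k / steps
        productivity, equality = simulate(skills, rate)
        score = productivity * equality
        if score > best_score:
            best_rate, best_score = rate, score
    return best_rate


rate_unequal = best_flat_tax([1.0, 1.0, 1.0, 10.0])
rate_equal = best_flat_tax([1.0, 1.0, 1.0, 1.0])
```

Even in this toy the ‘fair’ answer depends entirely on the objective chosen by the designer. When skills are equal, redistribution gains nothing and the search settles on a zero rate. When they are unequal, it settles on an intermediate rate that trades some output for equality.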
Moderation is a difficult task that involves judgements and interactions on a continual basis. Human moderation is expensive and requires training. However, AI moderation faces numerous challenges. For example, St. John’s professor Kate Klonick argues, “One of the things that’s really, really hard — and has always been hard — is when people post bad content that’s removable, but they post it in protest or to raise awareness. Generally, the biggest threat is going to be over-censorship rather than under-censorship.” (Field & Lapowsky, 2020) Twitter found similar issues. “We want to be clear: While we work to ensure our systems are consistent, they can sometimes lack the context that our teams bring.” (Gadde & Derella, 2020)
During the COVID-19 outbreak, companies began relying more on automated moderation. “We will temporarily start relying more on technology to help with some of the work normally done by reviewers,” YouTube announced in a blog post (YouTube, 2020). The sort of technologies it can use to do this might include, for example, a neural net that identifies rage on a Twitch stream (Yan, 2020).
Over time, moderation will become more interactive. As chatbots evolve they have the potential to intervene and mediate human conversation, encouraging engagement and tempering over-reaction. Current chatbots (Sennaar, 2019) are far from this ideal. However, an early example of this approach is suggested by marketing for EdTech Foundry’s Differ, “a class communication app that uses chatbots and artificial intelligence to increase student engagement, performance and retention,” which is part of a larger research program (Nilsen, 2019).
Additionally, moderation will focus on more than just text. For example, in 2019 Facebook announced Whole Post Integrity Embeddings, “which allows its AI systems to simultaneously analyze the entirety of a post — the images and the text — for signs of a violation.” (Field & Lapowsky, 2020)
Systems that track and analyze feelings and emotions can sway users toward good or healthy emotions. For example, some products and services specifically address mental health issues. “These applications aim to coach users through crises using techniques from behavioral therapy. (A program called) Ellie helps treat soldiers with PTSD. (Another program called) Karim helps Syrian refugees overcome trauma. Digital assistants are even tasked with helping alleviate loneliness among the elderly” (Kleber, 2018). Similar systems could interact with students and modify sentiments and emotions that may be interfering with their learning and socialization. “Applications will act like a Fitbit for the heart and mind, aiding in mindfulness, self-awareness, and ultimately self-improvement, while maintaining a machine-person relationship that keeps the user in charge” (Ibid).
Even as this work is being written, new applications of analytics in learning appear every day. The list of AI-generated content continues to expand, for example, and it is not a stretch to imagine learning resources being developed automatically on an as-needed basis in the foreseeable future.
So the list of applications above should be taken only as a tentative snapshot. To remain current, it is advisable to follow online repositories of analytics projects, for example, the University of Oklahoma’s Projects in Artificial Intelligence Registry (PAIR, 2020), which “serves as a global directory of active and archival AI projects and research and might eventually serve as a hub for various initiatives” (O’Brien, 2020).
In the meantime, the list of applications provided here serves as a baseline describing the objectives and desired outcomes of learning analytics, and therefore as a listing of the benefits expected from this activity, as a counter to the ethical risks and considerations raised in the next chapter. After all, if there were no benefit to be derived from analytics, there would be no ethical implications; we would simply treat analytics as a form of social vandalism. But the potential benefits, as we have seen, are real.