Big Data Analytics in Higher Education

Walaa Adel Mahmoud Mohamed
Arab Academy for Science and Technology and Maritime Transport, [email protected]

Abstract: Big Data provides an opportunity for educational institutions to use their information technology resources strategically to improve educational quality, guide students to higher rates of completion, and improve student persistence and outcomes.
This paper explores the attributes of big data that are relevant to educational institutions, investigates the factors influencing the adoption of big data and analytics in learning institutions, and seeks to establish the limiting factors hindering the use of big data in institutions of higher learning. Using the dataset “Academic Ranking of World Universities, 2003-2017”, we studied and analyzed the data to forecast how university management and faculty could adapt to changes to improve their education and thereby the ranking of their universities in the upcoming years. The Microsoft SQL Server Data Mining Add-ins for Excel were employed as the mining tool for predicting the trend in university ranking. The paper concentrates on predictive analysis of university ranking using forecasting based on data mining techniques.

Keywords: Big Data analytics, Mining Big Data, Education.

1. INTRODUCTION

As an emerging field within education, a number of scholars have contended that the Big Data framework is well positioned to address some of the key challenges currently facing higher education [1]. Global trends are affecting education; in addition, political and social changes have put pressure on institutions of higher education to respond to these rapid changes effectively and on time.
In the context of the strategic planning of higher education, Big Data Analytics is relevant because both regular and distance education generate new data that can support decision making [2]. The sheer volume of data generated makes decision making difficult; however, higher education institutions that track their data can adapt better [3]. Knowledge discovery and data mining approaches have been used to make sense of unstructured data. Several techniques and algorithms help extract the characteristics of the data and build patterns from them. Big data has found its place in education and is predicted to be implemented extensively in institutions of higher education.
Analytics can be defined as the process of determining, assessing and interpreting meaning from volumes of data. It is commonly divided into three categories: descriptive, predictive and prescriptive. Predictive analysis can serve many segments of society because it can reveal hidden relationships that may not be apparent with descriptive modeling.
Advances in analytics play an important role in higher education planning. Data analytics not only helps in analyzing the points discussed below but can also support predictive modeling for faculty, administrative and student groups who are looking for reliable results about university rankings on which to base their decisions. Using the dataset “Academic Ranking of World Universities, 2003-2017”, we studied and analyzed the data to forecast how university management and faculty could adapt to changes to improve their education and thereby the ranking of their universities in the upcoming years.
The Microsoft SQL Server Data Mining Add-ins for Excel were employed as the mining tool for predicting the trend in university ranking. The paper concentrates on predictive analysis of university ranking using forecasting based on data mining techniques.

1.1 The Contribution of this paper

The paper was guided by the following observations and objectives:
• There is limited research into big data in higher education, despite growing interest in exploring and unlocking the value of the increasing data within the higher education environment.
• This paper contributes to the conceptual and theoretical understanding of Big Data and Analytics within higher education.
• It introduces the notion of Big Data and outlines its relevance to higher education.
• It describes the opportunities this growing research area brings to higher education as well as the major challenges associated with its exploration and implementation.
2. Big Data and analytics in higher education

Big Data describes data that is fundamentally too big and moves too fast, thus exceeding the processing capacity of conventional database systems [4]. Big data has several key properties, among them Volume, Velocity, Veracity and Variety. In addition to these properties, the stages required to unlock the value of data are data collection, data analysis, visualization and application. Common analysis techniques include classification, clustering and regression. Big Data is a knowledge system that is already changing the objects of knowledge and social theory in many fields while also having the potential to transform management decision-making theory (Boyd & Crawford, 2012). Big Data incorporates the emergent research field of learning analytics (Long & Siemens, 2011), which is already a growing area in education.
However, research in learning analytics has largely been limited to examining indicators of individual student and class performance. Big Data brings new opportunities and challenges for institutions of higher education. Long and Siemens (2011) indicated that Big Data presents the most dramatic framework for efficiently utilizing the vast array of data and ultimately shaping the future of higher education. The application of Big Data in higher education was also echoed by Wagner and Ice (2012), who noted that technological developments have certainly served as catalysts for the move towards the growth of analytics in higher education.

In the context of higher education, Big Data connotes the interpretation of a wide range of administrative and operational data, gathered through processes aimed at assessing institutional performance and progress, in order to predict future performance and identify potential issues related to academic programming, research, teaching and learning (Hrabowski, Suess & Fritz, 2011a, 2011b; Picciano, 2012). Others indicated that, to meet the demands of improved productivity, higher education has to bring the tools of analytics into the system. As an emerging field within education, a number of scholars have contended that the Big Data framework is well positioned to address some of the key challenges currently facing higher education (see, e.g., Siemens, 2011).

At this early stage, much of the work on analytics within higher education is coming from interdisciplinary research, spanning the fields of Educational Technology, Statistics, Mathematics, Computer Science and Information Science.
A core element of the current work on analytics in education is centered on data mining.

Big Data in higher education also covers database systems that store large quantities of longitudinal data on students, right down to very specific transactions and activities on learning and teaching. When students interact with learning technologies, they leave behind data trails that can reveal their sentiments, social connections, intentions and goals. Researchers can use such data to examine patterns of student performance over time, from one semester to another or from one year to another.

On a higher level, it could be argued that the added value of Big Data is the ability to identify useful data and turn it into usable information by identifying patterns and deviations from patterns. Long and Siemens (2011) indicated that Big Data is now well positioned to start addressing some of the key challenges currently facing higher education. An OECD (2013) report suggested that it may be the foundation on which higher education can both reinvent its business model and bring together the evidence to help make decisions about educational outcomes.

From an organizational learning perspective, it is well understood that institutional effectiveness and adaptation to change rely on the analysis of appropriate data (Rowley, 1998) and that today’s technologies enable institutions to gain insights from data with previously unachievable levels of sophistication, speed and accuracy (Jacqueline, 2012).
As technologies continue to penetrate all facets of higher education, valuable information is being generated by students, computer applications and systems (Hrabowski & Suess, 2010). Furthermore, Big Data Analytics could be applied to examine student entries on course assessments, discussion boards, blogs or wiki activity, which could generate thousands of transactions per student per course. These data would be collected in real or near real time as they are transacted and then analyzed to suggest courses of action. As Siemens (2011) indicated, “learning analytics are a foundational tool for informed change in education” and provide evidence on which to form understanding and make informed (rather than instinctive) decisions.

Big Data can also address the challenges associated with finding information at the right time when data are dispersed across several unlinked data systems in institutions. By identifying ways of aggregating data across systems, Big Data can help improve decision-making capability.

3. OPPORTUNITIES

With large volumes of student information, including enrollment, academic and disciplinary records, institutions of higher education have the data sets needed to benefit from targeted analytics. Big Data and analytics in higher education can be transformative, altering the existing processes of administration, teaching, learning and academic work (Baer & Campbell, 2011), contributing to policy and practice outcomes and helping address contemporary challenges facing higher education. Big Data can provide institutions of higher education with the predictive tools they need to improve learning outcomes for individual students, as well as ways of ensuring that academic programmes are of a high-quality standard. By designing programmes that collect data at every step of the students’ learning processes, universities can address student needs with customized modules, assignments, feedback and learning trees in the curriculum that will promote better and richer learning.

One of the ways higher education can utilize Big Data tools is to analyze the performance and skill level of individual students and create personalized learning experiences that meet their specific learning pathways. When used effectively, Big Data can help institutions enhance the learning experience and improve student performance across the board, reduce dropout rates and increase graduation numbers (Figure 1).
The key contribution of Big Data will depend on the application of three data models (descriptive, relational and predictive) and the utility of each to guide better decision making (Figure 2).

Fig 1: Key Big Data opportunities for three end-users in higher education

Fig 2: Three Big Data analytical models in higher education

4. Related work

A literature review of academic research associated with data analytics and descriptive modeling in the educational sector reveals the following:

§ Competition for Admissions: The advent of ranking systems has given students and society more data to evaluate the quality of an educational institute.
Unlike in the past, when people had little knowledge about the quality of education imparted by an educational institute, the extensive amount of data available in this age of information has given rise to many organizations that rank universities and help college-goers choose the institute that best fits their requirements [11]. However, there has been little evidence that high competition has had positive effects on what students learn.

§ Student Performance - Predictive Analysis: Research papers also pointed towards a few factors that predicted the probability of success of a student [12]. These were:
· Past Performance: If a student has a past record of scoring good grades, this is a strong indicator of the future performance of the student.
· Demographic Outlook: Multiple research articles and surveys also reported that married students performed better in their studies than single students. The research papers also mentioned that the older the student, the higher the chances of a better GPA.
· Subject Choice: Various studies found that students who chose math and honors courses in high school were more likely to succeed in undergraduate and graduate studies than students who chose other subjects.
· Other Factors: The research noted some other factors that proved to be strong indicators of students’ success. These included the performance of a student in online classes and the ratio of credits attempted to credits completed.

§ Academics & Business Intelligence: The studies reviewed found that business intelligence was hardly used in the educational sector [13]. However, it has tremendous potential and can be used by educational institutes to increase enrollment numbers as well as to sift through student applications.

§ Machine Learning: Another angle on data analytics in educational institutes explored in the research literature concerns machine learning algorithms. The C4.5 algorithm, which is essentially a decision tree algorithm, can be used to design effective predictive models from the student data that has been accumulated over the years [14, 15].
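
To make the decision tree idea concrete, the following is a minimal sketch of the gain ratio, the criterion C4.5 uses to choose which attribute to split on. It covers categorical attributes only and omits the algorithm's handling of continuous attributes and pruning. The student records, attribute names (`past_grades`, `took_math`) and target (`passed`) are hypothetical and are not taken from the cited studies.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on a categorical attribute."""
    labels = [row[target] for row in rows]
    # Group the target labels by each value of the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy of the target after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical student records, for illustration only.
students = [
    {"past_grades": "good", "took_math": "yes", "passed": "yes"},
    {"past_grades": "good", "took_math": "no", "passed": "yes"},
    {"past_grades": "poor", "took_math": "yes", "passed": "no"},
    {"past_grades": "poor", "took_math": "no", "passed": "no"},
]

# The attribute with the highest gain ratio becomes the root split.
best = max(["past_grades", "took_math"],
           key=lambda a: gain_ratio(students, a, "passed"))
```

In this toy data, `past_grades` separates the outcomes perfectly while `took_math` carries no information, so the sketch selects `past_grades` as the first split.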
5. METHODOLOGY

This paper focuses on the Microsoft SQL Server Data Mining Add-ins for Excel. A sample dataset, “Academic Ranking of World Universities, 2003-2017”, was extracted to undergo the lifecycle of a data mining process, which includes formulating and refining data and evaluating and analyzing mining models, thereby predicting results with the use of a spreadsheet.
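
The forecasting idea behind this lifecycle can be illustrated outside Excel. The sketch below fits a plain least-squares linear trend to yearly scores and extrapolates one year ahead. It is a deliberately simple stand-in and not the time-series algorithm implemented by the add-in. The function names and the yearly scores are hypothetical and are not values from the dataset.

```python
def fit_linear_trend(years, scores):
    """Ordinary least-squares fit of score = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(years, scores, target_year):
    """Extrapolate the fitted linear trend to `target_year`."""
    slope, intercept = fit_linear_trend(years, scores)
    return slope * target_year + intercept


# Hypothetical yearly scores for one university, for illustration only.
years = [2013, 2014, 2015, 2016, 2017]
scores = [40.0, 42.0, 44.0, 46.0, 48.0]

predicted_2018 = forecast(years, scores, 2018)
```

A linear trend is the simplest possible forecaster. It ignores seasonality and autocorrelation, so it is only a rough baseline against which a dedicated time-series model can be compared.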
For this process, the user must have Microsoft Excel installed for the Table Analysis Tools and Data Mining Client add-ins. Since the approach was based on the Table Analysis Tools, we had to convert our raw data into a table format supported by Excel.

The steps involved in the process were: Data Preparation, Data Modeling, Accuracy and Validation, and Model Usage.
In the Data Preparation step, picking the correct attributes from the source (exploring data), removing outliers (cleaning data) and splitting the data set into samples (partitioning data) were the common preparation needs.

Several data models are supported by the add-in, such as Clustering, Decision Tree, Time-Series, Pie Chart, Neural Networks, Sequence Clustering and Histogram.

Accuracy and Validation generates estimation models that are evaluated against the test data. The Classification Matrix, Accuracy Chart and Profit Chart are a few of the evaluation tools.
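
The cleaning and partitioning steps described above can be sketched in a few lines. The example assumes an interquartile-range rule for outliers and a random 70/30 split. Both are common defaults chosen here for illustration and are not necessarily what the add-in does internally. The rows, column name `pcp` and helper names are hypothetical.

```python
import random
import statistics


def remove_outliers(rows, column, k=1.5):
    """Drop rows whose `column` value lies outside the k * IQR fences."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row for row in rows if low <= row[column] <= high]


def partition(rows, test_fraction=0.3, seed=0):
    """Shuffle reproducibly, then split into (training, held-out) samples."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical rows, for illustration only; 95 is an obvious outlier.
rows = [{"university": f"U{i}", "pcp": score}
        for i, score in enumerate([20, 22, 23, 24, 25, 26, 27, 28, 30, 95])]

cleaned = remove_outliers(rows, "pcp")
train, held_out = partition(cleaned)
```

Fixing the seed keeps the split reproducible, which matters when the held-out sample is later used to compare models.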
In Model Usage, there are two phases. In the browse phase, we explore the patterns in the output. In the second phase, we query the model to make predictions from new data.

Our dataset, “Academic Ranking of World Universities, 2003-2017”, had various factors on which the descriptive and predictive modeling was done. Some of the factors were:
a) Alumni (around 10% of the total): the number of alumni who have won Nobel Prizes and other medals.
b) Award (around 20% of the total): the total number of staff who have won Nobel Prizes.
c) Highly Cited, HiCi (20%): the number of Highly Cited Researchers in twenty-one different subject categories.
d) Publication, PUB (20%): the count of papers indexed in the Science Citation Index and Social Science Citation Index in 2017.
e) Per Capita Performance, PCP (10%): the weighted scores of the other five indicators divided by the count of full-time equivalent academic staff.

Based on these parameters, the ranking of a university changes. For example, in Figure 3, on the basis of PCP in 2017, a higher score corresponds to a higher (better) ranking.
In 2017, Harvard Institute of Technology had the lowest PCP score, so its ranking was the best (6). This analysis is helpful for universities, which can focus on improving their PCP score, which depends on the indicators stated above. More publications, more HiCi researchers and more awards can help them achieve a better ranking.

Fig 3: Ranking of universities in USA based on PCP score.

Fig 4: Ranking of universities in world

Similar models were generated using the data mining add-in to analyze the factors and their influence on improving or worsening the ranking of universities. In addition, our research found that criteria such as cultural, economic and historical stature cannot be the basis on which universities are ranked. Such ranking barriers may mislead students in choosing a university for their future.
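
How the indicator weights combine into a ranking can be sketched as a weighted sum. The weights for Alumni, Award, HiCi, PUB and PCP are those listed above. The sixth indicator, `ns` (papers published in Nature and Science, 20%), does not appear in the list above and is added here as an assumption from the published ARWU methodology so that the weights sum to 100%. The two universities and their indicator scores are hypothetical.

```python
# Indicator weights; "ns" is an assumption not listed in the text above.
WEIGHTS = {
    "alumni": 0.10,
    "award": 0.20,
    "hici": 0.20,
    "ns": 0.20,
    "pub": 0.20,
    "pcp": 0.10,
}


def total_score(indicators):
    """Weighted sum of the indicator scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * indicators[name] for name in WEIGHTS)


def rank(universities):
    """Map each university to its rank position (1 = highest total score)."""
    ordered = sorted(universities,
                     key=lambda name: total_score(universities[name]),
                     reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical indicator scores; the two rows differ only in PCP.
universities = {
    "University A": {"alumni": 50, "award": 60, "hici": 70,
                     "ns": 65, "pub": 80, "pcp": 40},
    "University B": {"alumni": 50, "award": 60, "hici": 70,
                     "ns": 65, "pub": 80, "pcp": 60},
}

ranking = rank(universities)
```

Because the two hypothetical universities differ only in PCP, the sketch isolates the effect of that one indicator: the university with the higher PCP score receives the better rank position.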
6. Challenges of implementation

Some of the issues faced while implementing the data mining process for analyzing the trend in university ranking were:
a. Data fog, accuracy, multiple truths and extraction of data.
b. Finding the correct and related data set for the research.
c. Cleaning and refining the data set according to the requirements of the software.
d. Lack of data governance.
e. Understanding the algorithms provided by the data-mining add-in.

7. Conclusion

Through the proper use of big data analytics, revolutionary development in the education sector could be achieved.
Despite some inherent challenges, big data analytics can provide customized learning environments for learners, reduce potential dropouts and failures, and help develop long-term learning plans. All of these are possible through the effective development and use of big data analytics in educational institutions. The tool used, the Microsoft SQL Server Data Mining Add-ins for Excel, could provide meaningful predictions on which universities can take corrective measures to enhance the quality of the education system and improve their faculty's contribution to society. Further, descriptive modeling can help evaluate the teaching staff and their excellence in imparting education. This study provided vital information on which universities can formulate new policies.
They can design strategies according to the parameters on which they are falling behind. However, incorporating data mining techniques into their current systems will not be an easy endeavor for universities. Bringing changes to the existing setup would require enormous transformation in terms of cost, resources and tools.

REFERENCES

1. Siemens, G., How data and analytics can improve education, July 2011. Retrieved on August 8, 2011.
2. Amorim, J.A., et al. Big Data Analytics in the Public Sector: Improving the Strategic Planning in World Class Universities. in Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2013 International Conference on. 2013.
3. Daniel, B., Big Data and analytics in higher education: Opportunities and challenges. British Journal of Educational Technology, 2015. 46(5): p. 904-920.
4. Manyika, J., et al., Big data: The next frontier for innovation, competition, and productivity. 2011.
5. Reyes, J., The skinny on big data in education: Learning analytics simplified. TechTrends: Linking Research & Practice to Improve Learning, 2015. 59(2): p. 75-80.
6. Bichsel, J., Analytics in higher education: Benefits, barriers, progress, and recommendations. 2012.
7. Demchenko, Y., E. Gruengard, and S. Klous. Instructional Model for Building Effective Big Data Curricula for Online and Campus Education. in Cloud Computing Technology and Science (CloudCom), 2014 IEEE 6th International Conference on. 2014.
8. Michalik, P., J. Stofa, and I. Zolotova. Concept definition for Big Data architecture in the education system. in Applied Machine Intelligence and Informatics (SAMI), 2014 IEEE 12th International Symposium on. 2014.
9. Elias, T., Learning Analytics: The Definitions, the Processes, and the Potential. 2011.
10. Kantardzic, M., Data mining: concepts, models, methods, and algorithms. 2011: John Wiley & Sons.
11. M’Hammed, A., H. Wu, and Y. Cherng-Jyh, Using Data Mining for Predicting Relationships between Online Question Theme and Final Grade. Journal of Educational Technology & Society, 2012. 15(3): p. 77-88.
12. Ramesh, V., P. Parkavi, and K. Ramar, Predicting student performance: a statistical and data mining approach. International Journal of Computer Applications, 2013. 63(8): p. 35-39.
13. Pal, A.K. and S. Pal, Analysis and Mining of Educational Data for Predicting the Performance of Students. 2013.
14. Bound, J., B. Hershbein, and B.T. Long, Playing the Admissions Game: Student Reactions to Increasing College Competition. The Journal of Economic Perspectives, 2009. 23(4): p. 119-146.
15. Guster, D. and C. Brown, The application of business intelligence to higher education: Technical and managerial perspectives. J. of Information Technology Management, 2012. 23(2).
16. Marsh, O., L. Maurovich-Horvat, and O. Stevenson, Big Data and Education: What’s the Big Idea. Big Data and Education conference, UCL, 2014.
17. Hervatis, V., A. Loe, L. Barman, J. O’Donoghue, and N. Zary, A Conceptual Analytics Model for an Outcome-Driven Quality Management Framework as Part of Professional Healthcare Education. JMIR Medical Education, 2015. 1(2).