Abstract— The prototype data mining system that I designed and implemented in this project is an application for macroeconomic analysis. It addresses one of the major issues facing developing countries. The main purpose of the application is to classify and predict certain economic situations at the country level. One of these situations is the so-called middle income trap.
The risks of falling into the middle income trap have increasingly become a focus of discussions on the long-term economic and social development prospects of developing countries. These risks, and how to minimize them, are being debated at the highest levels of policy making in some of the fastest growing emerging economies, even while these countries remain a source of envy to the rest of the world.
I. INSTRUCTIONS TO COMPILE AND RUN THE PROGRAM
I built a classification model. The program consists of several stages.
Each stage uses different data. Gross Domestic Product (GDP) is a monetary measure of the market value of all final goods and services produced in a period of time. Expressed per capita, it is well suited to identifying countries in the middle income trap. I also use population and export data sets to identify middle income trap countries. First, I determined the GDP per capita values that correspond to middle income countries (MIC). Second, I set the interval within which the GDP per capita of a middle income country must lie.
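As a rough illustration of the interval filter above, and of the export test and final merge described in the following paragraph, the stages could be sketched in base R as shown here. This is a minimal sketch under stated assumptions, not the program's actual listing: the countries, the values, the bounds `mic_lower` and `mic_upper`, and the reading of "decreasing" as "falling in every year from 2009 to 2012" are all invented for the example.

```r
# Toy data: hypothetical countries and made-up values, for illustration only.
gdp_pc <- data.frame(
  country = c("A", "B", "C", "D"),
  gdp_per_capita = c(800, 4500, 9000, 30000)  # USD, invented
)
export_growth <- data.frame(
  country = c("A", "B", "C", "D"),
  y2009 = c(5.0, 9.0, 4.0, 3.0),
  y2010 = c(6.0, 7.5, 5.0, 2.5),
  y2011 = c(7.0, 6.0, 3.0, 2.0),
  y2012 = c(8.0, 4.0, 6.0, 1.0)  # percent per year, invented
)

# Assumed income interval; the paper does not state the bounds it used.
mic_lower <- 1000
mic_upper <- 12000

# Stage 1: keep countries whose GDP per capita lies inside the interval.
middle_income <- gdp_pc[gdp_pc$gdp_per_capita >= mic_lower &
                          gdp_pc$gdp_per_capita <= mic_upper, ]

# Stage 2: flag countries whose export growth fell in every year 2009-2012.
year_cols <- c("y2009", "y2010", "y2011", "y2012")
growth <- as.matrix(export_growth[, year_cols])
export_growth$slowing <- apply(growth, 1, function(x) all(diff(x) < 0))

# Final stage: merge the two results on the country key.
candidates <- merge(
  middle_income,
  export_growth[export_growth$slowing, c("country", "slowing")],
  by = "country"
)
print(candidates$country)
```

With these toy values, only the country that is both inside the interval and shows steadily falling export growth survives the merge. A looser slowdown test (for example, comparing only 2009 with 2012) would be an equally plausible reading of the description.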
Then, to refine the filter, I select the countries whose exports are slowing down while they remain in the middle income range. To do so, I identified the countries whose export growth decreased over the period from 2009 to 2012. In addition, I identified the countries whose GDP per capita behaves in the same manner as their exports and which are also on the middle income list. I also combined the export data with the population data to increase the accuracy of the model. The last stage merges all results together.
II. DOCUMENTED PROGRAM LISTINGS
The project was carried out on Google Cloud Platform.
Google Cloud Platform, offered by Google, is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search and YouTube. Alongside a set of management tools, it provides a series of modular cloud services including computing, data storage, data analytics and machine learning. Google Cloud Platform lets you build and host applications and websites, store data, and analyze data on Google’s scalable infrastructure.
I use the R programming language to implement the program. R is a language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modeling, statistical tests, time series analysis, classification, clustering, etc. I worked with several packages for handling data. The readxl package makes it easy to get data out of Excel and into R.
Compared to many of the existing packages (e.g. gdata, xlsx, xlsReadWrite), readxl has no external dependencies, so it’s easy to install and use on all operating systems. It is designed to work with tabular data.
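To make the data-loading step concrete, the following sketch shows one way a spreadsheet could be read with `readxl::read_excel` and reshaped for the later stages. It is an illustration only: the file name `gdp_per_capita.xlsx`, the column name `country`, the wide one-column-per-year layout, and the helper `wide_to_long` are assumptions made for this example and are not taken from the project. When the package or the file is unavailable, the sketch falls back to a small invented table so that it still runs.

```r
# Reshape a wide indicator table (one row per country, one column per year)
# into long form with columns country, year, value. Base R only.
wide_to_long <- function(wide, id_col = "country") {
  year_cols <- setdiff(names(wide), id_col)
  long <- data.frame(
    country = rep(as.character(wide[[id_col]]), times = length(year_cols)),
    year    = rep(as.integer(year_cols), each = nrow(wide)),
    value   = unlist(wide[year_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  long[order(long$country, long$year), ]
}

# Hypothetical file name; read it only if readxl and the file are available.
path <- "gdp_per_capita.xlsx"
if (requireNamespace("readxl", quietly = TRUE) && file.exists(path)) {
  wide <- as.data.frame(readxl::read_excel(path, sheet = 1))
} else {
  # Invented stand-in values so the sketch runs without the spreadsheet.
  wide <- data.frame(country = c("A", "B"),
                     `2011` = c(4100, 9800),
                     `2012` = c(4300, 9900),
                     check.names = FALSE)
}

long <- wide_to_long(wide)
print(long)
```

The long layout keeps one observation per row, which makes it straightforward to subset a range of years or to join several indicators on country and year.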
III. THE FINAL DESIGN OF THE DATA MINING SYSTEM
A. The algorithm
A classification model attempts to draw some conclusion from observed values. Given one or more inputs, a classification model tries to predict the value of one or more outcomes.
Outcomes are labels that can be applied to a dataset. In this project, I find the countries in the middle income trap and examine the possible reasons why they fell into it, e.g., a decline in exports. Because the majority of the data values are numeric, I use classification algorithms that handle numeric attributes better than nominal ones.
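The classifier adopted in this section is OneR ("One Rule"), which builds one rule per predictor and keeps the rule with the fewest errors. A minimal base R sketch of that idea is given here to show how a numeric attribute can be handled. It is not the project's code: the function `one_r`, the equal-width binning through `cut`, the choice of three bins, and the toy data are assumptions made for illustration, and published OneR implementations discretize numeric attributes more carefully.

```r
# Minimal OneR sketch in base R. Numeric predictors are cut into equal-width
# bins, which is a simplification chosen for this example.
one_r <- function(data, target, bins = 3) {
  best <- NULL
  for (p in setdiff(names(data), target)) {
    x <- data[[p]]
    if (is.numeric(x)) x <- cut(x, breaks = bins)
    # Contingency table: predictor value (rows) by class label (columns).
    tab <- table(x, data[[target]])
    # Majority class per predictor value; errors are the remaining counts.
    rule <- colnames(tab)[apply(tab, 1, which.max)]
    names(rule) <- rownames(tab)
    errors <- sum(tab) - sum(apply(tab, 1, max))
    # Keep the predictor whose rule makes the fewest training errors.
    if (is.null(best) || errors < best$errors) {
      best <- list(predictor = p, rule = rule, errors = errors)
    }
  }
  best
}

# Invented toy data: "trap" is the label, the other columns are predictors.
toy <- data.frame(
  export_growth = c(-3, -2, -1, -4, 2, 3, 4, 5),
  population    = c(10, 50, 20, 60, 15, 55, 25, 65),
  trap          = c("yes", "yes", "yes", "yes", "no", "no", "no", "no"),
  stringsAsFactors = FALSE
)

model <- one_r(toy, target = "trap")
print(model$predictor)
print(model$rule)
```

In this toy set the export growth column separates the two labels without error while the population column does not, so the sketch selects export growth as its single rule.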
Preferring algorithms suited to numeric attributes increases the accuracy of my results. I chose the “One Rule” classification algorithm. OneR, short for “One Rule”, is a simple, yet accurate, classification algorithm that generates one rule for each predictor in the data, then selects the rule with the smallest total error as its “one rule”. OneR induces classification rules based on the value of a single predictor.
B. The functionality
My model should find the countries in the middle income trap using features such as population, imports, exports, etc. In addition, the model should be able to predict, that is, to forecast the economic growth of a country. As a result, by looking at the model’s prediction, economists can apply their known patterns and solutions in advance to avoid the middle income trap.
C. The output
I obtained the results from the GDP, population, and export data sets. The model classified all major players among the MICs, such as China, Brazil, South Africa, Malaysia and others. The list of middle income trap countries is in Figure 4. However, the program also flags a small number of high income countries.
This happens because those high income countries exhibit clear middle income trap characteristics. I strongly believe and it’s highly possible