
Hamza Haruna MOHAMMED¹, Tomiya SAID AHMED ZARBEGA¹

¹Çankaya University, Ankara, Turkey

ABSTRACT

The need to integrate and reason over data on the Web at large scale brought us to the topic of Linked Data. Linked Data lies at the heart of what the Semantic Web is all about. To this end, the rapid increase in the amount of available RDF data has created a strong need for, and research interest in, efficient and scalable distributed SPARQL query processing schemes.

In this paper, we extend previously proposed approaches to query RDF data with greater scalability and reasoning capability by applying parallel and distributed processing techniques to large data sets. In our experiment, we build a distributed RDF datastore on Apache Spark, a MapReduce-like data-parallel framework for large-scale data processing that runs on top of the JVM. We then use the Hadoop Distributed File System (HDFS), a popular distributed file system, to handle the distribution of the data across a cluster and its replication.

1.  INTRODUCTION

The need to integrate and reason over data on the Web at large scale brought us to the topic of Linked Data. Linked Data lies at the heart of what the Semantic Web is all about.


To this end, the rapid increase in the amount of available RDF data has created a strong need for, and research interest in, efficient and scalable distributed SPARQL query processing schemes. Semantic Web technologies support the analysis of large-scale distributed data sets through the well-known Resource Description Framework (RDF) model, which represents data as a graph. As a consequence, RDF graphs are the most prominent kind of graph data in the Semantic Web initiative.

2.  RELATED WORK

3.  METHOD
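The setup summarized in the abstract (RDF data stored across a cluster and queried with data-parallel operations) can be illustrated with a small, single-machine sketch. It is not the implementation used in this paper. It assumes a vertically partitioned layout (one group of subject-object pairs per predicate), which is only one possible way to organize an RDF store, and it evaluates a basic graph pattern by matching each triple pattern and hash-joining the results. All identifiers (the `ex:` terms, `partition_by_predicate`, `match_pattern`, `join`, `evaluate_bgp`) are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples. All terms are made up.
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksAt", "ex:uni"),
    ("ex:carol", "ex:worksAt", "ex:uni"),
]


def partition_by_predicate(triples):
    """Group triples by predicate, keeping only (subject, object) pairs.

    Assumption: a vertically partitioned layout. On a real cluster each
    group could be stored as a separate file on a distributed file system.
    """
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[p].append((s, o))
    return parts


def is_var(term):
    """SPARQL-style variables start with '?'."""
    return term.startswith("?")


def match_pattern(parts, pattern):
    """Return one binding dict per triple matching the pattern.

    Assumes the predicate of the pattern is a constant.
    """
    s, p, o = pattern
    bindings = []
    for subj, obj in parts.get(p, []):
        row, ok = {}, True
        for term, value in ((s, subj), (o, obj)):
            if is_var(term):
                # The same variable used twice must bind to the same value.
                if row.get(term, value) != value:
                    ok = False
                row[term] = value
            elif term != value:
                ok = False
        if ok:
            bindings.append(row)
    return bindings


def join(left, right):
    """Hash join of two binding lists on the variables they share."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    index = defaultdict(list)
    for row in right:
        index[tuple(row[v] for v in shared)].append(row)
    out = []
    for row in left:
        for match in index.get(tuple(row[v] for v in shared), []):
            out.append({**row, **match})
    return out


def evaluate_bgp(parts, patterns):
    """Evaluate a basic graph pattern by joining its triple patterns in order."""
    result = match_pattern(parts, patterns[0])
    for pattern in patterns[1:]:
        result = join(result, match_pattern(parts, pattern))
    return result


# Roughly: SELECT * WHERE { ?a ex:knows ?b . ?b ex:knows ?c . ?c ex:worksAt ex:uni }
QUERY = [
    ("?a", "ex:knows", "?b"),
    ("?b", "ex:knows", "?c"),
    ("?c", "ex:worksAt", "ex:uni"),
]

PARTS = partition_by_predicate(TRIPLES)
RESULT = evaluate_bgp(PARTS, QUERY)
print(RESULT)
```

In a data-parallel framework such as Spark, the per-predicate matching would typically correspond to filter operations over distributed collections and the binding merge to a distributed join. The sketch keeps everything in memory only to make the data flow visible.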

4.  RESULT

What were the results obtained?

5.  DISCUSSION AND CONCLUSION

6.  REFERENCES

[1] Abdelaziz, I., Harbi, R., Khayyat, Z., & Kalnis, P. (2017). A survey and experimental comparison of distributed SPARQL engines for very large RDF data. Proceedings of the VLDB Endowment, 10(13), 2049-2060. doi:10.14778/3151106.3151109

[2] Graux, D., Jachiet, L., Genevès, P., & Layaïda, N. (2016). SPARQLGX: Efficient Distributed Evaluation of SPARQL with Apache Spark. Lecture Notes in Computer Science, The Semantic Web – ISWC 2016, 80-87. doi:10.1007/978-3-319-46547-0_9

[3] Peng, P., et al. (2016). Processing SPARQL queries over distributed RDF graphs. The VLDB Journal, 25(2), 243-268.

[4] S. (2017, July 19). Semagrow/semagrow. Retrieved January 08, 2018, from https://github.com/semagrow/semagrow

[5] W. (2017, November 09). WukongGPU/WukongGPU. Retrieved January 08, 2018, from https://github.com/WukongGPU/WukongGPU

[6] T. (2018, January 03). Tyrex-team/sparqlgx. Retrieved January 08, 2018, from https://github.com/tyrex-team/sparqlgx