An Effective Path-aware Approach for Keyword Search over Data Graphs
Subject Areas : Software Engineering and Information SystemsAsieh Ghanbarpour 1 , Hassan Naderi 2 , Soheil ZareMotlagh 3
1 - Department of Computer Engineering, University of Sistan and Baluchestan, Zahedan, Iran
2 - Computer Engineering Department, Iran University of Science and Technology (IUST), Tehran, Iran
3 - University of Sistan and Baluchestan, Zahedan, Iran
Keywords: Relevant answers, Information Retrieval, Keyword search, Information spreading, Database,
Abstract :
Abstract—Keyword Search is known as a user-friendly alternative for structured languages to retrieve information from graph-structured data. Efficient retrieving of relevant answers to a keyword query and effective ranking of these answers according to their relevance are two main challenges in the keyword search over graph-structured data. In this paper, a novel scoring function is proposed, which utilizes both the textual and structural features of answers in order to produce a more accurate order of answers. In addition, a query processing algorithm is developed based on information spreading technique to enumerate answers in approximate order. This algorithm is further improved by allowing a skewed development toward more promising paths and enables a more efficient processing of keyword queries. Performance evaluation through extensive experiments on a standard benchmark of three real-world datasets shows the effectiveness and efficiency of the proposed algorithms.Index Terms—Information retrieval, Database, Keyword search, Relevant answers, Information spreading.
[1] J. Coffman, A.C. Weaver, An Empirical Performance Evaluation of Relational Keyword Search Techniques, IEEE Transactions on Knowledge and Data Engineering, 26 (2014) 30-42.
[2] J. Park, S.-g. Lee, Keyword search in relational databases, Knowl Inf Syst, 26 (2011) 175-193.
[3] A. Ghanbarpour, H. Naderi, Survey on Ranking Functions in Keyword Search over Graph-Structured Data, Journal of Universal Computer Science, 25 (2019) 361-389.
[4] B. Ding, J.X. Yu, S. Wang, L. Qin, X. Zhang, X. Lin, Finding Top-k Min-Cost Connected Trees in Databases, in: 23rd International Conference on Data Engineering, IEEE, 2007, pp. 836-845.
[5] H. He, H. Wang, J. Yang, P.S. Yu, BLINKS: ranked keyword searches on graphs, in: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, ACM, Beijing, China, 2007, pp. 305-316.
[6] M. Kargar, A. An, X. Yu, Efficient Duplication Free and Minimal Keyword Search in Graphs, IEEE Transactions on Knowledge and Data Engineering, 26 (2013) 1657 - 1669
[7] M. Kargar, A. An, Keyword search in graphs: finding r-cliques, Proceedings of the VLDB Endowment, 4 (2011) 681-692.
[8] F. Liu, C. Yu, W. Meng, A. Chowdhury, Effective keyword search in relational databases, in: Proceedings of the 2006 ACM SIGMOD international conference on Management of data, ACM, Chicago, USA, 2006, pp. 563-574.
[9] M. Miao, J. Wang, S. Wen, J. Ma, Publicly verifiable database scheme with efficient keyword search, Information Sciences, 475 (2019) 18-28.
[10] V. Hristidis, L. Gravano, Y. Papakonstantinou, Efficient IR-style keyword search over relational databases, in: Proceedings of the 29th international conference on Very large data bases, VLDB Endowment, Berlin, Germany, 2003, pp. 850-861.
[11] C.-S. Park, S. Lim, Efficient processing of keyword queries over graph databases for finding effective answers, Information Processing & Management, 51 (2015) 42-57.
[12] X. Yu, Z. Yu, Y. Liu, H. Shi, CI-Rank: Collective importance ranking for keyword search in databases, Information Sciences, 384 (2017) 1-20.
[13] A. Hulgeri, C. Nakhe, Keyword Searching and Browsing in Databases using BANKS, in: Proceedings of the 18th International Conference on Data Engineering, IEEE Computer Society, 2002, pp. 431-440.
[14] V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, H. Karambelkar, Bidirectional expansion for keyword search on graph databases, in: Proceedings of the 31st international conference on Very large data bases, VLDB Endowment, Trondheim, Norway, 2005, pp. 505-516.
[15] J. Vivekavardhan, A.S. Chakravarthy, P. Ramesh, Search Engines and Meta Search Engines Great Search for Knowledge: A Frame Work on Keyword Search for Information Retrieval, in: S.C. Satapathy, K.S. Raju, K. Shyamala, D.R. Krishna, M.N. Favorskaya (Eds.) Advances in Decision Sciences, Image Processing, Security and Computer Vision, Springer International Publishing, Cham, 2020, pp. 435-443.
[16] Y. Mass, Y. Sagiv, Language models for keyword search over data graphs, in: Proceedings of the 5th ACM international conference on Web search and data mining, ACM, Seattle, Washington, USA, 2012, pp. 363-372.
[17] J.I. Lopez-Veyna, V.J. Sosa-Sosa, I. Lopez-Arevalo, A low redundancy strategy for keyword search in structured and semi-structured data, Information Sciences, 288 (2014) 135-152.
[18] S. Najafi, F. Soleimanian Gharehchopogh, A New Hybrid Method for Web Pages Ranking in Search Engines, Journal of Advances in Computer Engineering and Technology, 5 (2019) 233-244.
[19] A. Sela, I. Ben-Gal, Information spread in the age of the internet, in: 2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), 2014, pp. 1-4.
[20] C. Liu, Z.-K. Zhang, Information spreading on dynamic social networks, Communications in Nonlinear Science and Numerical Simulation, 19 (2014) 896-904.
[21] S. Bergamaschi, F. Guerra, M. Interlandi, R. Trillo-Lado, Y. Velegrakis, QUEST: a keyword search system for relational data based on semantic and machine learning techniques, Proceedings of the VLDB Endowment, 6 (2013) 1222-1225.
[22] S. Bergamaschi, F. Guerra, M. Interlandi, R. Trillo-Lado, Y. Velegrakis, Combining user and database perspective for solving keyword queries over relational databases, Information Systems, 55 (2016) 1-19.
[23] T.N. Le, Z. Bao, T.W. Ling, Exploiting semantics for XML keyword search, Data & Knowledge Engineering, 99 (2015) 105-125.
[24] P. Dayananda, C.N. Sowmyarani, Review on Keyword Search and Ranking Techniques for Semi-Structured Data, in: D.L. Miltiadis, D. Linda, V. Anna (Eds.) Knowledge-Intensive Economies and Opportunities for Social, Organizational, and Technological Growth, IGI Global, Hershey, PA, USA, 2019, pp. 248-270.
[25] M. Rihany, Z. Kedad, S. Lopes, A Keyword Search Approach for Semantic Web Data, in: E. Métais, F. Meziane, S. Vadera, V. Sugumaran, M. Saraee (Eds.) Natural Language Processing and Information Systems, Springer International Publishing, Cham, 2019, pp. 131-143.
[26] D. Wang, L. Zou, D. Zhao, Top-k queries on RDF graphs, Information Sciences, 316 (2015) 201-217.
[27] S.B. Sinha, X. Lu, D. Theodoratos, Personalized Keyword Search on Large RDF Graphs based on Pattern Graph Similarity, in: Proceedings of the 22nd International Database Engineering & Applications Symposium, ACM, Villa San Giovanni, Italy, 2018, pp. 12-21.
[28] Y. Yuan, G. Wang, L. Chen, H. Wang, Efficient Keyword Search on Uncertain Graph Data, IEEE Transactions on Knowledge and Data Engineering, 25 (2013) 2767-2779.
[29] S. Kim, W. Lee, N.R. Arora, T.-C. Jo, S.-H. Kang, Retrieving keyworded subgraphs with graph ranking score, Expert Systems with Applications, 39 (2012) 4647-4656.
[30] Y. Yang, D. Agrawal, H.V. Jagadish, A.K.H. Tung, S. Wu, An Efficient Parallel Keyword Search Engine on Knowledge Graphs, in: 2019 IEEE 35th International Conference on Data Engineering (ICDE), 2019, pp. 338-349.
[31] G. Li, B.C. Ooi, J. Feng, J. Wang, L. Zhou, EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data, in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, ACM, Vancouver, Canada, 2008, pp. 903-914.
[32] R.D. Virgilio, P. Cappellari, M. Miscione, Cluster-Based Exploration for Effective Keyword Search over Semantic Datasets, in: A.F. Laender, S. Castano, U. Dayal, F. Casati, J. de Oliveira (Eds.) Conceptual Modeling - ER 2009, Springer Berlin Heidelberg, 2009, pp. 205-218.
[33] K. Nguyen, J. Cao, Top-K data source selection for keyword queries over multiple XML data sources, Journal of Information Science, 38 (2012) 156-175.
[34] Y. Mass, Y. Sagiv, Virtual Documents and Answer Priors in Keyword Search over Data Graphs, in: EDBT/ICDT Workshops, 2016.
[35] Y. Pan, Y. Wu, ROU: advanced keyword search on graph, in: Proceedings of the 22nd ACM international Conference on information and knowledge management, ACM, San Francisco, California, USA, 2013, pp. 1625-1630.
[36] B. Kimelfeld, Y. Sagiv, Finding and approximating top-k answers in keyword proximity search, in: Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, ACM, Chicago, IL, USA, 2006, pp. 173-182.
[37] Y. Hao, H. Cao, Y. Qi, C. Hu, S. Brahma, J. Han, Efficient keyword search on graphs using mapreduce, in: Big Data (Big Data), 2015 IEEE International Conference on, IEEE, 2015, pp. 2871-2873.
[38] J. Coffman, A.C. Weaver, A framework for evaluating database keyword search strategies, in: Proceedings of the 19th ACM international conference on Information and knowledge management, ACM, Toronto, ON, Canada, 2010, pp. 729-738.