International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169
Volume: 3 Issue: 3 15...

Preserving Data Confidentiality and Query Privacy Using KNN-R Approach

Citation/Export
MLA: Shruthi.K, “Preserving Data Confidentiality and Query Privacy Using KNN-R Approach”, March 15 Volume 3 Issue 3, International Journal on Recent and Innovation Trends in Computing and Communication (IJRITCC), ISSN: 2321-8169, PP: 1510 - 1512, DOI: 10.17762/ijritcc2321-8169.1503133
APA: Shruthi.K, March 15 Volume 3 Issue 3, “Preserving Data Confidentiality and Query Privacy Using KNN-R Approach”, International Journal on Recent and Innovation Trends in Computing and Communication (IJRITCC), ISSN: 2321-8169, PP: 1510 - 1512, DOI: 10.17762/ijritcc2321-8169.1503133
Published on: Mar 4, 2016
Published in: Engineering      
Source: www.slideshare.net


Transcripts - Preserving Data Confidentiality and Query Privacy Using KNN-R Approach

International Journal on Recent and Innovation Trends in Computing and Communication, ISSN: 2321-8169, Volume: 3, Issue: 3, 1510 - 1512
IJRITCC | March 2015, Available @ http://www.ijritcc.org

Preserving Data Confidentiality and Query Privacy Using KNN-R Approach
Shruthi.K
k.shruthireddy2811@gmail.com

Abstract: Cloud computing is a well-known technique for processing data queries efficiently. Because the cloud maintains a huge amount of resources, its privacy and security are a concern. Cloud service providers are not trustworthy, so the data has to be secured. Data sent to the cloud is encrypted to protect sensitive information, so that query privacy and data confidentiality are assured. Cloud computing reduces in-house resources, but this does not mean that query processing should be slow. The RASP approach is designed to ensure query privacy and data confidentiality. The RASP perturbation technique combines order preserving encryption, dimensionality expansion, random noise injection and random projection to provide strong protection for the perturbed data and queries. RASP makes use of the kNN algorithm to process queries efficiently. The kNN approach uses the minimum square range to process the query: it transfers the data to a multidimensional space, where it uses an indexing approach to process the points in the minimum square range.

1. Introduction
Cloud computing refers to services that are accessed over the internet. It is based on a pay-as-you-go model.
The goal of cloud computing is to provide high performance computing or supercomputing power. With cloud computing technologies, a large pool of resources can be connected using a public or private network. Because the cloud maintains a huge amount of resources, security and privacy are the two main properties to be preserved [15].

There are three different types of cloud computing. Infrastructure as a service gives access over the internet to hardware such as servers or storage. Software as a service gives access to a complete application running on someone else's computer; web based email and Google documents are well known examples, offering many online applications. Platform as a service means that applications are developed using web based tools so that they run on the provider's system software and hardware; Force.com and the Google map application are examples.

Parallel computing of query services in the cloud is very popular because of the advantages of scalability and cost saving. Using the cloud infrastructure, service owners can conveniently scale up and down and pay only for what they use. Cloud service providers are not trustworthy, and hence data confidentiality and query privacy should be preserved. A new approach is needed to preserve the privacy of data resources. To enjoy the benefits of cloud computing, it is not meaningful to provide slow query services because of security and privacy issues. The main purpose of the cloud is to reduce the amount of in-house resources [4]. Data privacy, query privacy and efficient use of the cloud have to be coordinated. This is referred to as the DQEL criteria: data privacy, query privacy, efficient query processing and low processing cost [9].

2. Definition of RASP
In this section, the definition and properties of RASP (Random Space Perturbation) are introduced.
In Random Space Perturbation, the data set is securely transformed. Its order preserving encryption step keeps the order of values but changes their distribution and domain [3], so that an attacker cannot effectively recover the original data while the derived properties are preserved. RASP is multidimensional and uses techniques such as geometric perturbation and random noise injection.

2.1 Properties of RASP
RASP has several important features. It is convexity preserving: it transforms a range into another polyhedron. The complete perturbation does not preserve the order of dimensional values, and the proof is straightforward [5]. It does not preserve the distance between two records. The original query can be transformed into the RASP-perturbed data space. A hyper-cube is transformed into a polyhedron by the RASP perturbation.

3. Meaning of data perturbation
Data perturbation is a popular technique for preserving privacy in data processing [11]. A major challenge in data perturbation is to balance privacy protection and data utility, which are a pair of conflicting factors. There are two types of data perturbation, namely the probability distribution approach and the value distortion approach [7].

4. RASP and the kNN algorithm
Because RASP does not preserve the distance between records, a kNN query cannot be processed directly on the RASP-perturbed data.
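The following is a minimal sketch of why distances are not preserved, not the author's implementation. It assumes the perturbation has the form A * (OPE(x), 1, v)^T, where A is a secret invertible matrix and v is the injected noise; this form is inferred from the components named in the abstract. The matrix `SECRET_MATRIX`, the function `toy_ope` (a monotone stand-in that is not a secure order preserving encryption) and the fixed noise value are all hypothetical choices made for illustration.

```python
import math

# Hypothetical 4x4 secret matrix A for 2-dimensional records (k + 2 = 4).
# It is invertible (nonzero determinant); a real key would be random and secret.
SECRET_MATRIX = [
    [2, 1, 0, 1],
    [1, 3, 1, 0],
    [0, 1, 2, 1],
    [1, 0, 1, 2],
]


def toy_ope(value):
    """Monotone stand-in for order preserving encryption (NOT secure)."""
    return 2.0 * value + 1.0


def rasp_perturb(record, matrix, noise):
    """Return A * (OPE(x_1), ..., OPE(x_k), 1, noise)^T for one record.

    `noise` is the injected positive random value; it is passed in
    explicitly here so that the example is reproducible.
    """
    extended = [toy_ope(x) for x in record] + [1.0, noise]
    return [sum(a * e for a, e in zip(row, extended)) for row in matrix]


query = (0.0, 0.0)
near = (0.0, 1.0)  # Euclidean distance 1.0 from the query
far = (1.2, 0.0)   # Euclidean distance 1.2 from the query

p_query = rasp_perturb(query, SECRET_MATRIX, noise=1.0)
p_near = rasp_perturb(near, SECRET_MATRIX, noise=1.0)
p_far = rasp_perturb(far, SECRET_MATRIX, noise=1.0)

# Is `near` closer to the query than `far`?
print(math.dist(query, near) < math.dist(query, far))          # True
print(math.dist(p_query, p_near) < math.dist(p_query, p_far))  # False
```

In this toy setting the nearer record becomes the farther one after perturbation, even with identical noise for every record, which is why a distance-based kNN search cannot simply be run on the perturbed data. With per-record random noise the distortion would be larger still.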
The kNN algorithm is based on range queries and uses the index in range query processing, so range queries are processed quickly.

4.1 Processing of the kNN algorithm
The main goal of the kNN algorithm is to find the k nearest points in a spherical range centred at the query point. It uses a square range instead of the spherical range. However, it has to address a few issues: whether data privacy and query privacy are preserved, whether the service workload increases [13], and how to find the minimum square range that exactly contains the k nearest points.

The kNN algorithm consists of three rounds of interaction between the service and the client. First, the client sends the server an upper-bound range, which contains more than k points, and a lower-bound range, which contains fewer than k points. The server finds the inner range and returns it to the client. Second, the client computes the outer range from the inner range and sends it to the server. The server finds the records in the outer range and returns them to the client. Third, the client decrypts the records and selects the first k points as the final result.

5. System architecture
The purpose of the architecture is to extend the database server to the public cloud and the private cloud. The system architecture has two groups: the trusted group and the honest group. The trusted group includes the data owner, who can store data in the cloud, and the authorized users, who can query the data.
The honest group includes the cloud provider, who hosts the database and responds to the query services [8]. D' is the data stored in the cloud. H(q', D') is the query processing over the encrypted data at the service. G(R') is the encrypted result provided by the cloud server to the authorized user.

5.1 Threat model
Security analysis is an important feature, so some assumptions are made. The database is accessed only by authorised users. The communication between the client and the server is properly protected, so there is no leakage of the data or the query [2]. The adversary can see the query processing, the perturbed database and the access pattern, but nothing more. The adversary may have complete knowledge of the database, such as its attributes and application.

6. Related work
6.1 Existing system
Cloud computing is an important technique because of its scalability and cost saving. Service providers are not trustworthy, and hence preserving the privacy of sensitive data is a major problem. Data is not sent to the cloud unless data confidentiality and query privacy are assured [12]. It is not meaningful to provide slow query processing because of these issues, since the aim is to reduce in-house resources.
Disadvantages: The service provider may make a copy of the database or may corrupt the user query; as a result, efficient query processing has to be provided.

6.2 Proposed system
The proposed system uses the kNN algorithm for processing range queries in the perturbation space. It helps in parallel processing of the data.
Advantages: It satisfies all aspects of the DQEL criteria: data privacy, query privacy, efficient query processing and low in-house processing cost. The utility of processing range queries is preserved. It uses indexing to support finding the minimum square range.
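The search for the minimum square range and the three rounds of Section 4.1 can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it works on plaintext points, whereas in the proposed system the ranges would be transformed into the perturbed space and the records returned encrypted. The function names, the binary search with a tolerance, and the choice of an outer range that is the inner half-width multiplied by sqrt(d) are all assumptions; the source does not say how the client derives the outer range.

```python
import math


def in_square(point, center, half_width):
    """True if `point` lies in the axis-aligned square range around `center`."""
    return all(abs(x - c) <= half_width for x, c in zip(point, center))


def count_in_square(points, center, half_width):
    """Server side: a range query that only reports how many points match."""
    return sum(1 for p in points if in_square(p, center, half_width))


def find_inner_range(points, center, k, low, high, tolerance=1e-6):
    """Server side: binary search between the client's bounds.

    Precondition: the square of half-width `low` holds fewer than k points
    and the square of half-width `high` holds at least k points.
    """
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if count_in_square(points, center, mid) >= k:
            high = mid  # still enough points, try a smaller square
        else:
            low = mid   # too few points, the square must grow
    return high


def knn_r(points, query, k, low, high):
    # Round 1: client sends the bounds, server returns the inner range.
    inner = find_inner_range(points, query, k, low, high)
    # Round 2: client derives the outer range (assumption: the square that
    # encloses the sphere through the corners of the inner square), and the
    # server returns every record inside it.
    outer = inner * math.sqrt(len(query))
    candidates = [p for p in points if in_square(p, query, outer)]
    # Round 3: client ranks the candidates by true distance, keeps the top k.
    candidates.sort(key=lambda p: math.dist(p, query))
    return candidates[:k]


POINTS = [(1.0, 0.0), (0.0, 1.2), (3.0, 3.0), (-1.0, -1.0), (5.0, 0.0), (0.5, 0.5)]
QUERY = (0.0, 0.0)
print(knn_r(POINTS, QUERY, 3, low=0.0, high=8.0))
# [(0.5, 0.5), (1.0, 0.0), (0.0, 1.2)]
```

In this data the inner square of half-width 1.0 holds three points but misses (0.0, 1.2), which is closer to the query than the corner point (-1.0, -1.0). The enlarged outer range picks it up, which illustrates why a second round is needed when a square replaces the spherical range.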
The proposed system processes range queries efficiently, so that range query processing is fast. The user, upon obtaining the result, decrypts it and uses it for the intended purpose.

7. Possible outcomes
Cloud computing helps in the parallel processing of queries, since the cloud hosts a large amount of data resources. Highly sensitive data should be kept confidential, and hence it should be protected. Data owners do not move data into the cloud until security and privacy are assured [4]. A curious service provider may copy the database. Processing queries over such huge data resources would be time consuming and may return mostly unwanted information. Hence the RASP approach is defined.
RASP makes use of the kNN algorithm to process the query: it decides the range and allots the indexes in the minimum square range. The RASP perturbation method addresses the problems described above.

8. Conclusion
The RASP perturbation approach helps process queries very efficiently. It satisfies all the requirements of the DQEL criteria and combines the features of order preserving encryption, random noise injection and random projection. The main benefit of using cloud computing is to reduce the in-house workload. The approach uses kNN to process the query by deciding the range in the perturbed space and by allotting indexes to the queries, so that the query can be processed very quickly.

References
[1] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order preserving encryption for numeric data,” in Proceedings of ACM SIGMOD Conference, 2004.
[2] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “Above the clouds: A Berkeley view of cloud computing,” Technical Report, University of California, Berkeley, 2009.
[3] J. Bau and J. C. Mitchell, “Security modeling and analysis,” IEEE Security and Privacy, vol. 9, no. 3, pp. 18–25, 2011.
[4] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[5] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “Privacy-preserving multi-keyword ranked search over encrypted cloud data,” in INFOCOM, 2011.
[6] K. Chen, R. Kavuluru, and S. Guo, “RASP: Efficient multidimensional range query on attack-resilient encrypted databases,” in ACM Conference on Data and Application Security and Privacy, 2011, pp. 249–260.
[7] K. Chen and L. Liu, “Geometric data perturbation for outsourced data mining,” Knowledge and Information Systems, 2011.
[8] K. Chen, L. Liu, and G. Sun, “Towards attack-resilient geometric data perturbation,” in SIAM Data Mining Conference, 2007.
[9] B. Chor, E. Kushilevitz, O. Goldreich, and M. Sudan, “Private information retrieval,” Journal of the ACM, vol. 45, no. 6, pp. 965–981, 1998.
[10] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved definitions and efficient constructions,” in Proceedings of the 13th ACM Conference on Computer and Communications Security. New York, NY, USA: ACM, 2006, pp. 79–88.
[11] N. R. Draper and H. Smith, Applied Regression Analysis. Wiley, 1998.
[12] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, “Executing SQL over encrypted data in the database-service-provider model,” in Proceedings of ACM SIGMOD Conference, 2002.
[13] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer-Verlag, 2001.
[14] B. Hore, S. Mehrotra, and G. Tsudik, “A privacy-preserving index for range queries,” in Proceedings of Very Large Databases Conference (VLDB), 2004.
[15] H. Hu, J. Xu, C. Ren, and B. Choi, “Processing private queries over untrusted data cloud through privacy homomorphism,” in Proceedings of IEEE International Conference on Data Engineering (ICDE), pp. 601–612, 2011.
[16] Z. Huang, W. Du, and B. Chen, “Deriving private information from randomized data,” in Proceedings of ACM SIGMOD Conference, 2005.
