
CHAPTER 1

INTRODUCTION

1.1 OVERVIEW OF THE PROJECT

Cloud computing has been envisioned as the next-generation architecture of the IT enterprise due to its long list of unprecedented advantages in IT: on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid resource elasticity, usage-based pricing, and transference of risk. One fundamental aspect of this new computing model is that data is being centralized or outsourced into the cloud. From the data owner's perspective, including both individuals and IT enterprises, storing data remotely in a cloud in a flexible, on-demand manner brings appealing benefits: relief from the burden of storage management, universal data access independent of geographical location, and avoidance of capital expenditure on hardware, software, personnel maintenance, and so on.

While cloud computing makes these advantages more appealing than ever, it also brings new and challenging security threats to the outsourced data. Since cloud service providers (CSPs) are separate administrative entities, data outsourcing actually relinquishes the owner's ultimate control over the fate of their data. As a result, the correctness of the data in the cloud is put at risk for the following reasons. First of all, although the infrastructures under the cloud are much more powerful and reliable than personal computing devices, they still face a broad range of both internal and external threats to data integrity; outages and security breaches of noteworthy cloud services appear from time to time. Amazon S3's recent downtime, Gmail's mass email deletion incident, and Apple MobileMe's post-launch downtime are all such examples. Second, for benefits of their own, there are various motivations for CSPs to behave unfaithfully toward cloud customers regarding the status of their outsourced data.

Examples include CSPs, for monetary reasons, reclaiming storage by discarding data that is rarely or never accessed, or even hiding data loss incidents to maintain a reputation. In short, although outsourcing data into the cloud is economically attractive given the cost and complexity of long-term, large-scale data storage, it does not offer any guarantee on data integrity and availability. This problem, if not properly addressed, may impede successful deployment of the cloud architecture.

1.2 SYSTEM ANALYSIS & FEASIBILITY STUDY

1.2.1 SYSTEM ANALYSIS

System analysis is the detailed study of the various operations performed by a system and their relationships within and outside the system. Analysis is the process of breaking something into its parts so that the whole may be understood. System analysis is concerned with becoming aware of the problem, identifying the relevant and most decisional variables, analyzing and synthesizing the various factors, and determining an optimal, or at least a satisfactory, solution. During this phase a problem is identified, alternative system solutions are studied, and recommendations are made about committing the resources to the system.

1.2.2 FEASIBILITY STUDY

A feasibility analysis usually involves a thorough assessment of the operational (need), financial and technical aspects of a proposal. A feasibility study is a test made to identify whether the user needs may be satisfied using the current software and hardware technologies, whether the system will be cost effective from a business point of view, and whether it can be developed within the given budgetary constraints. A feasibility study should be relatively cheap and done at the earliest possible time. Depending on the study, the decision is made whether to go ahead with a more detailed analysis.

When a new project is proposed, it normally goes through a feasibility assessment. The feasibility study is carried out to determine whether the proposed system can be developed with the available resources and what the cost considerations should be. The factors considered in the feasibility analysis were:

- Technical feasibility
- Economic feasibility
- Behavioral feasibility

1.2.2.1 Technical Feasibility

Technical feasibility concerns whether the technology required for development is available in the market. The assessment of technical feasibility must be based on an outline design of system requirements in terms of input, output, files, programs and procedures. This can be quantified in terms of volumes of data, trends, frequency of updating, cycles of activity, etc., in order to give an indication of the technical system. Our project is technically feasible; its emphasis on a more strategic decision-making process is fast gaining ground as a popular outsourced function.

1.2.2.2 Economic Feasibility

This feasibility study weighs the tangible and intangible benefits of the project against the development and operational costs. The technique of cost-benefit analysis is often used as a basis for assessing economic feasibility. This system needs somewhat more initial investment than the existing system, but it is justifiable in that it will improve the quality of service. Thus the feasibility study should center on the following points:

- Improvement over the existing method in terms of accuracy and timeliness.
- Cost comparison.
- Estimate of the life expectancy of the hardware.
- Overall objective.

Our project is economically feasible. It does not require much cost in the overall process. The overall objective is to ease the recruitment processes.

1.2.2.3 Behavioral / Operational Feasibility

This analysis considers how the system will work when it is installed, and assesses the political and managerial environment in which it is implemented. People are inherently resistant to change, and computers have been known to facilitate change. The new proposed system is very useful to its users and will therefore be accepted by a broad audience from around the world.

1.3 SYSTEM DESIGN

The most creative and challenging phase of system development is system design. It provides the understanding and procedural details necessary for the logical and physical stages of development. In designing a new system, the designer must have a clear understanding of the objectives the design is aiming to fulfill. The first step is to determine how the output is to be designed to meet the requirements of the proposed output. The operational phases are handled through program construction and testing. Design of a system can be defined as the process of applying various techniques and principles for the purpose of defining a device, a process or a system in sufficient detail to permit its physical realization. Thus, system design is a solution to the question of how to approach the creation of a new system. This important phase provides the understanding and the procedural details necessary for implementing the system recommended in the feasibility study.

The design step produces a data design, an architectural design, and a procedural design.

1.4 OVERVIEW OF LANGUAGE USED

1.4.1 JAVA

Java is a small, simple, safe, object-oriented, interpreted or dynamically optimized, byte-coded, architecture-neutral, garbage-collected, multithreaded programming language with strongly typed exception handling, for writing distributed and dynamically extensible programs. Java is a high-level, third-generation language like C, FORTRAN, Smalltalk, Perl and many others. You can use Java to write computer applications that crunch numbers, process words, play games, store data or do any of the thousands of other things computer software can do. Special programs called applets can be downloaded from the Internet and run safely within a web browser. Java supports such applications, and the following features make it one of the best programming languages:

- It is simple and object oriented.
- It helps to create user-friendly interfaces.
- It is very dynamic.
- It supports multithreading.
- It is highly secure and robust.
- It supports Internet programming.

Java is a programming language originally developed by Sun Microsystems and released in 1995 as a core component of Sun's Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities.

Java applications are typically compiled to byte code, which can run on any Java virtual machine (JVM) regardless of computer architecture. The original and reference implementations of the Java compiler, virtual machine, and class libraries were developed by Sun from 1995. As of May 2007, in compliance with the specifications of the Java Community Process, Sun had made available most of its Java technologies as free software under the GNU General Public License. Others have also developed alternative implementations of these Sun technologies, such as the GNU Compiler for Java and GNU Classpath.

The Java platform is the name for a bundle of related programs, or platform, from Sun which allows for developing and running programs written in the Java programming language. The platform is not specific to any one processor or operating system; rather, it comprises an execution engine (called a virtual machine) and a compiler with a set of standard libraries which are implemented for various hardware and operating systems so that Java programs can run identically on all of them. Different editions of the platform are available, including:

- Java ME (Micro Edition): specifies several different sets of libraries (known as profiles) for devices which are sufficiently limited that supplying the full set of Java libraries would take up unacceptably large amounts of storage.
- Java SE (Standard Edition): for general-purpose use on desktop PCs, servers and similar devices.
- Java EE (Enterprise Edition): Java SE plus various APIs useful for multi-tier client-server enterprise applications.

Java began as a client-side, platform-independent programming language that enabled stand-alone Java applications and applets. The numerous benefits of Java resulted in an explosion of Java usage in back-end, server-side enterprise systems. The Java Development Kit (JDK), which was the original standard platform defined by Sun, was soon supplemented by a collection of enterprise APIs. The proliferation of enterprise APIs, often developed by several different groups, resulted in divergence of APIs and caused concern in the Java developer community. Java byte code can execute on the server instead of, or in addition to, the client, enabling you to build traditional client/server applications and modern thin-client Web applications. Two key server-side Java technologies are servlets and Java Server Pages. Servlets are protocol- and platform-independent server-side components which extend the functionality of a Web server. Java Server Pages (JSPs) extend the functionality of servlets by allowing Java servlet code to be embedded in an HTML file.
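As a brief illustration of the servlet model just described, here is a minimal sketch; the class name and the page it emits are hypothetical, and a real deployment would also need a URL mapping in web.xml. The container calls doGet for each HTTP GET request:

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Minimal servlet: extends the functionality of the Web server by
    // generating an HTML page in response to each GET request.
    public class HelloServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println("<html><body><p>Hello from a servlet</p></body></html>");
        }
    }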

Features of Java

Platform Independence: The Write-Once-Run-Anywhere ideal has not been fully achieved (tuning for different platforms is usually required), but Java comes closer to it than other languages.

Object Oriented: Object oriented throughout - no coding outside of class definitions, including main(). An extensive class library is available in the core language packages.

Compiler/Interpreter Combo: Code is compiled to byte codes that are interpreted by Java virtual machines (JVM).

Robust: Exception handling is built in, type checking is strong (all data must be declared an explicit type), and local variables must be initialized. Several dangerous features of C and C++ are eliminated: no memory pointers, no preprocessor.

Automatic Memory Management: Automatic garbage collection - memory management is handled by the JVM.

Security: No memory pointers; programs run inside the virtual machine sandbox; array index limit checking; code pathologies are reduced by the byte code verifier, which checks classes after loading.

Dynamic Binding: The linking of data and methods to where they are located is done at runtime. New classes can be loaded while a program is running; linking is done on the fly.

Threading: Lightweight processes, called threads, can easily be spun off to perform multiprocessing (see the sketch after this list).

Great multimedia displays.
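As promised in the Threading item above, a minimal sketch of spinning off a thread in the JDK 1.5 style (the class name is illustrative):

    public class ThreadDemo {
        public static void main(String[] args) throws InterruptedException {
            // The Runnable is the unit of work handed to the new thread
            Thread worker = new Thread(new Runnable() {
                public void run() {
                    System.out.println("working in " + Thread.currentThread().getName());
                }
            });
            worker.start(); // runs concurrently with the main thread
            worker.join();  // wait for the worker to finish
            System.out.println("main done");
        }
    }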

1.4.2 TECHNOLOGY SPECIFICATION

JDK 1.5

The Java 2 Platform Standard Edition Development Kit (JDK) is a development environment for building applications, applets, and components using the Java programming language. The JDK includes tools useful for developing and testing programs written in the Java programming language and running on the Java platform. These tools are designed to be used from the command line; except for the applet viewer, they do not provide a graphical user interface.

JDK DOCUMENTATION

The on-line Java 2 Platform Standard Edition documentation contains API specifications, feature descriptions, developer guides, reference pages for JDK tools and utilities, demos, and links to related information. This documentation is also available in a download bundle which you can install on your machine.

CONTENTS OF THE JDK

The following table summarizes the files and directories in the JDK.

Development Tools (in the bin subdirectory): Tools and utilities that help you develop, execute, debug, and document programs written in the Java programming language. For further information, see the tool documentation.

Runtime Environment (in the jre subdirectory): An implementation of the J2SE runtime environment for use by the JDK. The runtime environment includes a Java virtual machine, class libraries, and other files that support the execution of programs written in the Java programming language.

Additional Libraries (in the lib subdirectory): Additional class libraries and support files required by the development tools.

C Header Files (in the include subdirectory): Header files that support native-code programming using the Java Native Interface, the JVM Tool Interface, and other functionality of the Java 2 Platform.

Table 1.1: Contents of the JDK

Source Code (in src.zip): Java programming language source files for all classes that make up the Java 2 core API (that is, source files for the java.*, javax.* and some org.* packages, but not for com.sun.* packages). This source code is provided for informational purposes only, to help developers learn and use the Java programming language. These files do not include platform-specific implementation code and cannot be used to rebuild the class libraries. To extract these files, use any common zip utility, or use the jar utility in the JDK's bin directory: jar xvf src.zip.


The Java programming language is a general-purpose, concurrent, strongly typed, class-based, object-oriented language. It is normally compiled to the byte code instruction set and binary format defined in the Java Virtual Machine Specification.

ENHANCEMENTS IN JDK 5

- Generics: this long-awaited enhancement to the type system allows a type or method to operate on objects of various types while providing compile-time type safety. It adds compile-time type safety to the Collections Framework and eliminates the drudgery of casting.
- Enhanced for loop: this new language construct eliminates the drudgery and error-proneness of iterators and index variables when iterating over collections and arrays.
- Autoboxing/unboxing: this facility eliminates the drudgery of manual conversion between primitive types (such as int) and wrapper types (such as Integer).
- Typesafe enums: this flexible object-oriented enumerated type facility allows you to create enumerated types with arbitrary methods and fields. It provides all the benefits of the Typesafe Enum pattern.
- Varargs: this facility eliminates the need for manually boxing up argument lists into an array when invoking methods that accept variable-length argument lists.
- Static import: this facility lets you avoid qualifying static members with class names, without the shortcomings of the "Constant Interface" antipattern.
- Annotations (metadata): this language feature lets you avoid writing boilerplate code under many circumstances by enabling tools to generate it from annotations in the source code. This leads to a "declarative" programming style where the programmer says what should be done and tools emit the code to do it. It also eliminates the need for maintaining "side files" that must be kept up to date with changes in source files; instead, the information can be maintained in the source file.
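For illustration, a short sketch pulling several of these JDK 5 features together (the class and method names are our own):

    import static java.lang.Math.max; // static import
    import java.util.ArrayList;
    import java.util.List;

    public class Jdk5Demo {
        enum Status { ACTIVE, SUSPENDED } // typesafe enum

        // Varargs: callers pass any number of ints without building an array
        static int largest(int... values) {
            int result = Integer.MIN_VALUE;
            for (int v : values) result = max(result, v); // enhanced for + static import
            return result;
        }

        public static void main(String[] args) {
            List<Integer> nums = new ArrayList<Integer>(); // generics: no casting needed
            nums.add(42);                                  // autoboxing: int -> Integer
            for (int n : nums) System.out.println(n);      // auto-unboxing in the loop
            System.out.println(largest(3, 7, 5));          // prints 7
            System.out.println(Status.ACTIVE);             // prints ACTIVE
        }
    }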


JAVA SERVER PAGES

Java Server Pages (JSP) is a technology based on the Java language that enables the development of dynamic web sites. JSP was developed by Sun Microsystems to allow server-side development. JSP files are HTML files with special tags containing Java source code that provides the dynamic content. In a typical setup, different clients connect via the Internet to a Web server; for example, the Web server may run on UNIX and be the very popular Apache Web server.

At first, only static web pages were displayed. Typically these were people's first experience with making web pages, so they consisted of "My Home Page" sites and company marketing information. Afterwards, Perl and C were used on the web server to provide dynamic content. Soon most languages, including Visual Basic, Delphi, C and Java, could be used to write applications that provided dynamic content using data from text files or database requests; these were known as CGI server-side applications. ASP was developed by Microsoft to allow HTML developers to easily provide dynamic content, supported as standard by Microsoft's free Web server, Internet Information Server (IIS). JSP is the equivalent from Sun Microsystems; a comparison of ASP and JSP is presented in the following section.

JSP source code runs on the web server in the JSP Servlet engine. The JSP Servlet engine dynamically generates the HTML and sends the HTML output to the client's web browser.


MAIN REASONS TO USE JSP

- Multi-platform: you can take one JSP file and move it to another platform, web server, or JSP Servlet engine. This means you are never locked into one vendor or platform.
- Component reuse by using JavaBeans and EJB.
- The advantages of Java.

HTML and graphics displayed on the web browser are classed as the presentation layer, while the Java code (JSP) on the server is classed as the implementation. By having a separation of presentation and implementation, web designers work only on the presentation and the Java developers concentrate on implementing the application.

JSP ARCHITECTURE

JSPs are built on top of Sun Microsystems' servlet technology. A JSP is essentially an HTML page with special JSP tags embedded; these tags can contain Java code. The JSP file extension is .jsp rather than .htm or .html. The JSP engine parses the .jsp file and creates a Java servlet source file, then compiles that source file into a class file. This is done the first time the page is requested, which is why a JSP is slower on first access. On every subsequent request the already-compiled servlet is executed, so responses return faster.
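To make this concrete, here is a minimal .jsp page (the file name and markup are illustrative). On the first request the engine translates it into a servlet and compiles it; subsequent requests execute the compiled class directly:

    <%-- hello.jsp: HTML with embedded Java providing the dynamic content --%>
    <%@ page language="java" contentType="text/html" %>
    <html>
      <body>
        <% String visitor = request.getParameter("name"); %>
        <p>Hello, <%= (visitor == null) ? "guest" : visitor %>!</p>
        <p>Served at: <%= new java.util.Date() %></p>
      </body>
    </html>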
APACHE TOMCAT

Apache Tomcat (or simply Tomcat, formerly also Jakarta Tomcat) is an open source web server and Servlet container developed by the Apache Software Foundation (ASF). Tomcat implements the Java Servlet and the Java Server Pages (JSP) specifications from Oracle Corporation, and provides a "pure Java" HTTP web server environment for Java code to run in.

Tomcat should not be confused with the Apache web server, which is a C implementation of an HTTP web server; these two web servers are not bundled together, although they are frequently used together as part of a server application stack. Apache Tomcat includes tools for configuration and management, but can also be configured by editing XML configuration files.

Components

Tomcat 5.x was released with Catalina (a Servlet container), Coyote (an HTTP connector) and Jasper (a JSP engine).

Catalina

Catalina is Tomcat's Servlet container. Catalina implements Sun Microsystems' specifications for Servlets and Java Server Pages (JSP). In Tomcat, a Realm element represents a "database" of usernames, passwords, and roles (similar to Unix groups) assigned to those users. Different implementations of Realm allow Catalina to be integrated into environments where such authentication information is already being created and maintained, and then to use that information to implement Container Managed Security as described in the Servlet Specification.
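For illustration, the fragment below shows how a Realm might be declared in Tomcat's server.xml using the stock UserDatabaseRealm, with a sample user in tomcat-users.xml; the user, password and role values are placeholders:

    <!-- server.xml: authenticate against the default user database -->
    <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
           resourceName="UserDatabase"/>

    <!-- tomcat-users.xml: a sample role and user (placeholder values) -->
    <role rolename="manager"/>
    <user username="admin" password="secret" roles="manager"/>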


Coyote

Coyote is Tomcat's HTTP Connector component, supporting the HTTP 1.1 protocol for the web server or application container. Coyote listens for incoming connections on a specific TCP port on the server and forwards each request to the Tomcat Engine, which processes it and sends back a response to the requesting client.

Jasper

Jasper is Tomcat's JSP Engine. Tomcat 5.x uses Jasper 2, an implementation of Sun Microsystems' JavaServer Pages 2.0 specification. Jasper parses JSP files and compiles them into Java servlets (which can be handled by Catalina). At runtime, Jasper detects changes to JSP files and recompiles them.

Jasper 2

From Jasper to Jasper 2, important features were added:

- JSP Tag library pooling: each tag markup in a JSP file is handled by a tag handler class. Tag handler class objects can be pooled and reused in the whole JSP Servlet.
- Background JSP compilation: while modified JSP Java code is being recompiled, the older version remains available to serve requests. The older JSP Servlet is deleted once the new JSP Servlet has finished recompiling.
- Recompile JSP when an included page changes: pages can be inserted and included into a JSP at runtime. The JSP is recompiled not only when the JSP file itself changes but also when an included page changes.

CASCADING STYLE SHEETS

Cascading Style Sheets (CSS) is a style sheet language used for describing the presentation semantics (the look and formatting) of a document written in a markup language. Its most common application is to style web pages written in HTML and XHTML, but the language can also be applied to any kind of XML document, including plain XML, SVG and XUL. CSS is designed primarily to enable the separation of document content (written in HTML or a similar markup language) from document presentation, including elements such as the layout, colors, and fonts. This separation can improve content accessibility, provide more flexibility and control in the specification of presentation characteristics, enable multiple pages to share formatting, and reduce complexity and repetition in the structural content (such as by allowing for tableless web design). CSS can also allow the same markup page to be presented in different styles for different rendering methods, such as on-screen, in print, by voice (when read out by a speech-based browser or screen reader) and on Braille-based tactile devices. It can also be used to allow the web page to display differently depending on the screen size or device on which it is being viewed. While the author of a document typically links that document to a CSS style sheet, readers can use a different style sheet, perhaps one on their own computer, to override the one the author has specified. CSS specifies a priority scheme to determine which style rules apply if more than one rule matches against a particular element. In this so-called cascade, priorities or weights are calculated and assigned to rules, so that the results are predictable.


The CSS specifications are maintained by the World Wide Web Consortium (W3C). Internet media type (MIME type) text/css is registered for use with CSS by RFC 2318 (March 1998).

XML

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards. The design goals of XML emphasize simplicity, generality, and usability over the Internet. It is a textual data format with strong support via Unicode for the languages of the world. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.

1.5 SYSTEM SPECIFICATION

HARDWARE SPECIFICATION

Processor Type : Pentium IV
Speed          : 2.5 GHz
RAM            : 128 MB
Hard Disk      : 40 GB

Table 1.2: Hardware Specification

SOFTWARE SPECIFICATION

Operating System    : Windows XP
Tool                : MyEclipse Enterprise Workbench 5.1
Programming Package : JDK 1.5.0
Database            : MySQL
Server              : Tomcat 5.5
Browser             : Google Chrome

Table 1.3: Software Specification


1.6 FUTURE ENHANCEMENT

We have described some suggested requirements for public auditing services and the state of the art that fulfills them. However, this is still not enough for a publicly auditable secure cloud data storage system, and further challenging issues remain to be supported and resolved. The three macroeconomic trends seen as fuelling the growth of this industry are:

- Shorter employment tenures
- Shrinking labor pools
- Need for technology workers

In the wake of these new and related trends, it is imperative to upgrade a company's software or web applications frequently, to make it easier for clients and employees to address new business needs.

1.7 ORGANISATION OF THE REPORT

Chapter 2 : Literature overview
Chapter 3 : Development environment
Chapter 4 : Design architecture
Chapter 5 : Implementation
Chapter 6 : Testing
Chapter 7 : Conclusion and future work

Table 1.4: Organization of Report

CHAPTER 2
LITERATURE REVIEW

2.1 EXISTING SYSTEM

Cloud computing has been envisioned as the next-generation architecture of the IT enterprise due to its long list of unprecedented advantages in IT: on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid resource elasticity, usage-based pricing, and transference of risk. Preserving user information is important for the company as well as for the users. User information is maintained only by the admin, but in a cloud environment user information must be preserved more carefully than elsewhere. In the existing system, user data is not monitored by anyone.

LIMITATIONS OF THE PRESENT SYSTEM

- Byzantine error occurrence.
- Integration of the database across many servers.
- Pre-requisition for client manipulation.
- Replication-based file distribution system.

2.2 PROPOSED SYSTEM

We propose to secure user information in the cloud environment. We do so by auditing each user's log and behavior in the cloud. A third party who comes to crack the site does so through a user account, so we trace each and every user and what that user is doing on the site. By monitoring users keenly, we can easily catch a cracker who misbehaves. We place an auditor common to both the users and the company whose information must be secured. The auditor monitors the information in the cloud and traces each and every transaction, update, insertion and deletion of data.

ADVANTAGES

- The servers are updated frequently.
- An algorithm finds out the misbehaving servers.
- A prior request must be made to the CSP for any manipulation.
- An agent is allocated for supervising.
- A unique token is provided for each client within the cloud.
- Unknown users are handled under supervision.

2.3 COLLECTED INFORMATION

2.3.1 MODULES WITH MODULE DESCRIPTION

Modules:

1. Desirable Properties for Public Auditing
2. Support Batch Auditing
3. Utilizing Homomorphic Authenticators
4. Handling Multiple Concurrent Tasks

Problem Definition:


In this project we faced some serious problems: it is not easy to trace the behavior of users while keeping the auditor common to all parties, the auditor must be trustable for both owners and users, and making the auditor's authentication secure is challenging.

Desirable Properties for Public Auditing:

Our goal is to enable public auditing for cloud data storage to become a reality. Thus, the whole service architecture design should not only be cryptographically strong but, more importantly, be practical from a systematic point of view. We briefly elaborate below a set of suggested desirable properties that satisfy such a design principle; the in-depth analysis is discussed in the next section. Note that these requirements are ideal goals: they are not necessarily complete yet, or even fully achievable at the current stage.

[Diagram: the Auditor sits between the User and the cloud.]

Figure 2.1: Public Auditing

Support Batch Auditing:

The prevalence of large-scale cloud storage services further demands auditing efficiency. When receiving multiple auditing tasks from different owners' delegations, a TPA should still be able to handle them in a fast yet cost-effective fashion. This property could essentially enable the scalability of a public auditing service even under a storage cloud with a large number of data owners.
[Diagram: a single Auditor serving auditing tasks delegated by several Users.]

Figure 2.2: Support Batch Auditing

Utilizing Homomorphic Authenticators:

To significantly reduce the arbitrarily large communication overhead of public auditability without introducing any online burden on the data owner, we resort to the homomorphic authenticator technique. Homomorphic authenticators are unforgeable metadata generated from individual data blocks, which can be securely aggregated in such a way as to assure a verifier that a linear combination of data blocks has been correctly computed, by verifying only the aggregated authenticator.
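As a toy illustration of this aggregation idea, the sketch below implements a simple algebraic MAC over a prime field. It supports private verification only and is a stand-in for, not the actual scheme of, the cited works; the class name and parameter sizes are our own assumptions. Each block m_i receives a tag sigma_i = alpha*m_i + f_k(i) mod p, and the auditor then checks one aggregated pair (mu, sigma) instead of the individual blocks:

    import java.math.BigInteger;
    import java.security.SecureRandom;
    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;

    public class HomomorphicMacDemo {
        static final SecureRandom RND = new SecureRandom();
        static final BigInteger P = BigInteger.probablePrime(256, RND); // public modulus
        static final BigInteger ALPHA = new BigInteger(255, RND);       // verifier's secret
        static final byte[] PRF_KEY = new byte[32];
        static { RND.nextBytes(PRF_KEY); }

        // PRF f_k(i): HMAC-SHA256 of the block index, reduced mod p
        static BigInteger prf(long i) throws Exception {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(PRF_KEY, "HmacSHA256"));
            return new BigInteger(1, mac.doFinal(BigInteger.valueOf(i).toByteArray())).mod(P);
        }

        // Owner tags block m_i before outsourcing: sigma_i = alpha*m_i + f_k(i) mod p
        static BigInteger tag(BigInteger m, long i) throws Exception {
            return ALPHA.multiply(m).add(prf(i)).mod(P);
        }

        public static void main(String[] args) throws Exception {
            BigInteger[] blocks = { BigInteger.valueOf(17), BigInteger.valueOf(99), BigInteger.valueOf(4) };
            BigInteger[] tags = new BigInteger[blocks.length];
            for (int i = 0; i < blocks.length; i++) tags[i] = tag(blocks[i], i);

            // Auditor challenges with random coefficients c_i
            BigInteger[] c = new BigInteger[blocks.length];
            for (int i = 0; i < c.length; i++) c[i] = new BigInteger(64, RND);

            // Server answers with ONE aggregated pair, not the blocks themselves
            BigInteger mu = BigInteger.ZERO, sigma = BigInteger.ZERO;
            for (int i = 0; i < blocks.length; i++) {
                mu = mu.add(c[i].multiply(blocks[i])).mod(P);
                sigma = sigma.add(c[i].multiply(tags[i])).mod(P);
            }

            // Verifier: sigma should equal alpha*mu + sum(c_i * f_k(i)) mod p
            BigInteger expected = ALPHA.multiply(mu).mod(P);
            for (int i = 0; i < blocks.length; i++)
                expected = expected.add(c[i].multiply(prf(i))).mod(P);
            System.out.println("verified: " + expected.equals(sigma));
        }
    }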

[Diagram: Login -> Secure Authentication -> Sign In.]

Figure 2.3: Homomorphic authenticator technique

Handling Multiple Concurrent Tasks:

Keeping this natural demand in mind, we note that two previous works can be directly extended to provide batch auditing functionality by exploiting the technique of bilinear aggregate signatures. Such a technique supports the aggregation of multiple signatures by distinct signers on distinct messages into a single signature, and thus allows efficient verification of the authenticity of all messages. Basically, with batch auditing the K verification equations (for K auditing tasks) corresponding to the K responses {σ, μ} from a cloud server can be aggregated into a single one, so that a considerable amount of auditing time is expected to be saved. A very recent work gives the first study of batch auditing and presents mathematical details as well as security reasoning.


2.3.2 G. Ateniese et al., "Provable Data Possession at Untrusted Stores," Proc. ACM CCS '07, Oct. 2007, pp. 598-609.

Verifying the authenticity of data has emerged as a critical issue in storing data on untrusted servers. It arises in peer-to-peer storage systems, network file systems, long-term archives, web-service object stores, and database systems. Such systems prevent storage servers from misrepresenting or modifying data by providing authenticity checks when accessing data. However, archival storage requires guarantees about the authenticity of data on storage, namely that storage servers possess the data. It is insufficient to detect that data have been modified or deleted when accessing them, because it may be too late to recover lost or damaged data. Archival storage servers retain tremendous amounts of data, little of which are accessed. They also hold data for long periods of time, during which there may be exposure to data loss from administration errors as the physical implementation of storage evolves, e.g., backup and restore, data migration to new systems, and changing memberships in peer-to-peer systems.

Archival network storage presents unique performance demands. Given that file data are large and are stored at remote sites, accessing an entire file is expensive in I/O costs to the storage server and in transmitting the file across a network. Reading an entire archive, even periodically, greatly limits the scalability of network stores. (The growth in storage capacity has far outstripped the growth in storage access times and bandwidth.) Furthermore, the I/O incurred to establish data possession interferes with on-demand bandwidth to store and retrieve data. We conclude that clients need to be able to verify that a server has retained file data without retrieving the data from the server and without having the server access the entire file. Previous solutions do not meet these requirements for proving data possession.


Some schemes provide a weaker guarantee by enforcing storage complexity: the server has to store an amount of data at least as large as the client's data, but not necessarily the same exact data. Moreover, all previous techniques require the server to access the entire file, which is not feasible when dealing with large amounts of data.

2.3.3 G. Ateniese et al., "Scalable and Efficient Provable Data Possession," Proc. SecureComm '08, Sept. 2008.

In recent years, the concept of third-party data warehousing and, more generally, data outsourcing has become quite popular. Outsourcing of data essentially means that the data owner (client) moves its data to a third-party provider (server) which is supposed, presumably for a fee, to faithfully store the data and make it available to the owner (and perhaps others) on demand. Appealing features of outsourcing include reduced costs from savings in storage, maintenance and personnel, as well as increased availability and transparent upkeep of data. A number of security-related research issues in data outsourcing have been studied in the past decade. Early work concentrated on data authentication and integrity, i.e., how to efficiently and securely ensure that the server returns correct and complete results in response to its clients' queries. Later research focused on outsourcing encrypted data (placing even less trust in the server) and the associated difficult problems, mainly having to do with efficient querying over the encrypted domain. More recently, however, the problem of Provable Data Possession (PDP), also sometimes referred to as Proof of Retrievability (POR), has appeared in the research literature. The central goal in PDP is to allow a client to efficiently, frequently and securely verify that a server which purportedly stores the client's potentially very large amount of data is not cheating the client.


In this context, cheating means that the server might delete some of the data, or might not store all data in fast storage, e.g., placing it on CDs or other tertiary off-line media. It is important to note that a storage server might not be malicious; instead, it might simply be unreliable and lose or inadvertently corrupt hosted data. An effective PDP technique must be equally applicable to malicious and unreliable servers. The problem is further complicated by the fact that the client might be a small device (e.g., a PDA or a cell phone) with limited CPU, battery power and communication facilities; hence the need to minimize bandwidth and local computation overhead for the client in performing each verification. Two recent results, PDP and POR, have highlighted the importance of the problem and suggested two very different approaches. The first is a public-key-based technique allowing any verifier (not just the client) to query the server and obtain an interactive proof of data possession; this property is called public verifiability. The interaction can be repeated any number of times, each time resulting in a fresh proof. The POR scheme uses special blocks (called sentinels) hidden among the other blocks in the data. During the verification phase, the client asks for randomly picked sentinels and checks whether they are intact. If the server modifies or deletes parts of the data, then sentinels would also be affected with a certain probability. However, sentinels must be indistinguishable from regular blocks; this implies that the blocks must be encrypted. Thus, unlike the PDP scheme, POR cannot be used for public databases such as libraries, repositories, or archives; in other words, its use is limited to confidential data. In addition, the number of queries is limited and fixed a priori, because sentinels, and their positions within the database, must be revealed to the server at each query, and a revealed sentinel cannot be reused.

2.3.4 H. Shacham and B. Waters, "Compact Proofs of Retrievability," Proc. AsiaCrypt '08, LNCS, vol. 5350, Dec. 2008, pp. 90-107.


In this paper, we give proof-of-retrievability schemes with full proofs of security against arbitrary adversaries in the Juels-Kaliski model. Our first scheme has the shortest query and response of any proof-of-retrievability scheme with public verifiability and is secure in the random oracle model. Our second scheme has the shortest response of any proof-of-retrievability scheme with private verifiability (but a longer query), and is secure in the standard model.

Proofs of storage: recent visions of "cloud computing" and "software as a service" call for data, both personal and business, to be stored by third parties, but deployment has lagged. Users of outsourced storage are at the mercy of their storage providers for the continued availability of their data; even Amazon's S3, the best-known storage service, has recently experienced significant downtime. In an attempt to aid the deployment of outsourced storage, cryptographers have designed systems that would allow users to verify that their data is still available and ready for retrieval if needed: those of Deswarte, Quisquater, and Saidane; Gazzoni Filho and Barreto; and Schwarz and Miller. In these systems, the client and server engage in a protocol; the client seeks to be convinced by the protocol interaction that his data is being stored. Such a capability can be important to storage providers as well: users may be reluctant to entrust their data to an unknown startup, and an auditing mechanism can reassure them that their data is indeed still available.

2.3.5 K. D. Bowers, A. Juels, and A. Oprea, "HAIL: A High-Availability and Integrity Layer for Cloud Storage," Proc. ACM CCS '09, Nov. 2009, pp. 187-198.

Cloud storage denotes a family of increasingly popular on-line services for archiving, backup, and even primary storage of files; Amazon S3 is a well-known example. Cloud-storage providers offer users clean and simple file-system interfaces, abstracting away the complexities of direct hardware management.


At the same time, though, such services eliminate the direct oversight of component reliability and security that enterprises and other users with high service-level requirements have traditionally expected. To restore security assurances eroded by cloud environments, researchers have proposed two basic approaches to client verification of file availability and integrity.

CHAPTER 3
DEVELOPMENT ENVIRONMENT

3.1 MYECLIPSE

MyEclipse is a commercially available enterprise Java and AJAX IDE created and maintained by the company Genuitec, a founding member of the Eclipse Foundation. MyEclipse is built upon the Eclipse platform and integrates both proprietary and open source solutions into the development environment. MyEclipse has two primary versions: a professional and a standard edition. The standard edition adds database tools, a visual web designer, persistence tools, Spring tools, Struts and JSF tooling, and a number of other features to the basic Eclipse Java Developer profile. It competes with the Web Tools Project, which is a part of Eclipse itself, but MyEclipse is an entirely separate project and offers a different feature set. Most recently, MyEclipse has been made available via Pulse, a provisioning tool that maintains Eclipse software profiles, including those that use MyEclipse.


3.2 MySQL

MySQL was developed by a consulting firm in Sweden called TcX. They were in need of a database system that was extremely fast and flexible but could not find anything on the market that could do what they wanted, so they created MySQL, which is loosely based on another database management system called mSQL. The product they created was fast, reliable, and extremely flexible. It is used in many places throughout the world, and lately it has begun to permeate the business world as a reliable and fast database system.

MySQL is often confused with SQL, the structured query language developed by IBM. MySQL is not a form of this language but a database system that uses SQL to manipulate, create, and show data. MySQL is a program that manages databases, much as Microsoft's Excel manages spreadsheets; SQL is a programming language that is used by MySQL to accomplish tasks within a database, just as Excel uses VBA (Visual Basic for Applications) to handle tasks with spreadsheets and workbooks. A database is a series of structured files on a computer that are organized in a highly efficient manner. These files can store tons of information that can be manipulated and called on when needed. A database is organized hierarchically, from the top down: you start with a database that contains a number of tables, and each table is made up of a series of columns in which data is stored.
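For illustration, a minimal JDBC sketch of how a Java application talks to MySQL. The driver class is the Connector/J driver of that era; the database name, table and credentials are placeholders, not the project's actual schema:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class MySqlDemo {
        public static void main(String[] args) throws Exception {
            Class.forName("com.mysql.jdbc.Driver"); // load the Connector/J driver
            Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/bankdb", "user", "password");
            // SQL is the language; MySQL is the system executing it
            PreparedStatement ps = con.prepareStatement(
                    "SELECT balance FROM accounts WHERE account_no = ?");
            ps.setString(1, "SB-1001");
            ResultSet rs = ps.executeQuery();
            if (rs.next()) System.out.println("Balance: " + rs.getBigDecimal(1));
            rs.close(); ps.close(); con.close();
        }
    }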


3.3 J2EE

Today more and more developers want to write distributed transactional applications for the enterprise and leverage the speed, security and reliability of server-side technology. J2EE is a platform-independent, Java-centric environment from Sun for developing, building and deploying web-based enterprise applications online. The J2EE platform consists of a set of services, APIs and protocols that provide functionality for developing multi-tiered web-based applications. At the client tier, J2EE supports pure HTML as well as Java applets or applications. It relies on JSP and Servlet code to create HTML or other formatted data for the client. EJBs provide another layer, where the platform's logic is stored. To reduce costs and fast-track enterprise application design and development, the Java 2 Platform, Enterprise Edition (J2EE) technology provides a component-based approach to design.

CHAPTER 4
DESIGN ARCHITECTURE

This project, "Towards Secure and Dependable Storage Services in Cloud Computing," is implemented as online banking software: an online web application through which we provide an additional level of security for the user data being stored. Security is enforced by generating a homomorphic token ID which can be used only once, after which it expires (a sketch of such a one-time token appears at the end of this chapter).

Modules in this project:

- Administrator as TPA
- Client
- Banker

4.1 Administrator


Here the administrator himself serves as the TPA. He has the sole privilege to modify or make any change in the database. He can access the client data as well as the employee details, and he authenticates all the processes carried out in the application.

4.2 Client

Anyone who enjoys any service in the application is referred to here as a client. The client can log in as either a current account user or a savings account user. There are facilities for transferring funds from one account to another and for checking the status of the account, which includes balance enquiry, user login details, and so on. A client can also apply for a loan online through this application if he has not been blacklisted for malpractice or any other deal.
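Below is a minimal sketch of how the single-use token ID described above might be issued and invalidated, assuming a simple in-memory store; the class and method names (TokenStore, issue, redeem) are illustrative, not the project's actual API:

    import java.security.SecureRandom;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Set;

    public class TokenStore {
        private static final SecureRandom RND = new SecureRandom();
        private final Set<String> live = Collections.synchronizedSet(new HashSet<String>());

        // Issue a fresh random token for one audit login
        public String issue() {
            byte[] raw = new byte[16];
            RND.nextBytes(raw);
            StringBuilder sb = new StringBuilder();
            for (byte b : raw) sb.append(String.format("%02x", b));
            String token = sb.toString();
            live.add(token);
            return token;
        }

        // Redeem a token; it is removed so a second use fails
        public boolean redeem(String token) {
            return live.remove(token);
        }
    }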

CHAPTER 5
IMPLEMENTATION

5.1 DATA FLOW DIAGRAM

Level 1:
[Data flow: User -> Authentication and user account -> Education Loan / Vehicle Loan / Money Transfer -> Logout. User information affects the database in the cloud; this section is keenly monitored by the Auditor.]

Figure 5.1.1: Client Services

The diagram explains how a client accesses the services.

Level 2:

[Data flow: Auditor -> Authentication -> Auditing view, with a reference number generated for each attempt -> Choose user to retrieve monitored data -> Result -> Logout.]

Figure 5.1.2: Third Party Auditing

The diagram shows the workings of the TPA.

5.2 UML DIAGRAMS

5.2.1 Use-case Diagram

[Use cases between the Administrator and Client actors: Login, Validation, Monitoring, Request, Response.]

Figure 5.2.1.1: Client Login

The above diagram represents the client login.


[Use cases for the Client actor against the <<Cloud database system>>: Login, Money Transfer, Education Loan, Vehicle Loan, View Status.]

Figure 5.2.1.2: Client Services

This diagram explains the client services and how the client accesses the cloud database.


5.2.2 ACTIVITY DIAGRAM

[Activity flow: Start -> Home -> Admin login or Client login. The admin page leads to monitoring of client data; the client page leads to Money Transfer, Education Loan and Vehicle Loan. Both paths operate on the cloud database system -> End.]

Figure 5.2.2: Activity Diagram

The above diagram shows the admin and client activities in the cloud storage.


5.2.3 CLASS DIAGRAM

Figure 5.2.3: Class Diagram

The class diagram contains Login as the main class, with the remaining classes as subclasses, and represents how each class inherits from the main class.


5.2.4 SEQUENCE DIAGRAM

Figure 5.2.4: Sequence Diagram

This diagram explains the sequential flow of instructions and how requests and responses are processed.


5.3 ARCHITECTURE DIAGRAM

[Architecture: the Owner delegates data auditing to the Third Party Auditor, which performs public data auditing against the Cloud Server; the Owner issues file access credentials to the User, who accesses files on the Cloud Server.]

Figure 5.3: Architecture Diagram

The above diagram shows the cloud storage architecture.


CHAPTER 6
SYSTEM TESTING

Testing is vital to the success of the system. Testing is usually carried out to check the reliability of the system; the aim is to create a bug-free, reliable and secure system. Inadequate testing, or no testing at all, leads to errors that may not appear until months later. The objective of testing is to discover the errors in the system. The main stages of testing are:

- Module testing
- Integration testing
- Validation testing
- Unit testing
- Black box testing
- White box testing


6.1 MODULE TESTING

Each individual program module was tested for any possible errors. The modules were also tested against their specifications, i.e., to see whether they perform what the program is supposed to do and how it should behave under various conditions.

6.2 INTEGRATION TESTING

Testing a collection of modules is known as integration testing. A module is a collection of dependent classes, such as an object class, an abstract data type, or some looser collection of procedures and functions. A module encapsulates related components and can be tested without other system modules. In this kind of testing, data can be lost across interfaces, one module can have an inadvertent adverse effect on another, and sub-functions, when combined, may not produce the designed major functions.

6.2.1 TOP DOWN INTEGRATION TESTING

This is an incremental approach to the construction of a program structure. Modules are integrated by moving downwards through the control hierarchy, beginning with the main control module.

6.2.2 BOTTOM UP INTEGRATION TESTING

This begins with the construction and testing of atomic modules. Because components are integrated from the bottom up, the processing required for components subordinate to a given level is always available, and the need for stubs is eliminated.

6.3 VALIDATION TESTING

Validation is the process of checking whether something satisfies a certain criterion. Examples include checking whether a statement is true, whether an appliance works as intended, whether a system is secure, or whether computer data are compliant with


an open standard. Validation implies one is able to document that a solution or process is correct or is suited for its intended use.

6.4 UNIT TESTING

Unit testing deals with testing the individual units or programs in the system. Its purpose is to test the availability of the main functions and check whether they are working, for example (a JUnit sketch follows this list):

- To test whether all hyperlinks are working properly.
- To test the administrator facilities, such as the authentication check, password change, insertion and modification of administrative details, and database manipulation from the client-side administrative login.
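As an illustration, here is the minimal JUnit 4 sketch promised above (assuming JUnit 4 on the classpath), exercising the hypothetical TokenStore sketched in Chapter 4:

    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    public class TokenStoreTest {
        @Test
        public void tokenIsValidExactlyOnce() {
            TokenStore store = new TokenStore();
            String token = store.issue();
            assertTrue("first use must succeed", store.redeem(token));
            assertFalse("second use must fail", store.redeem(token));
        }
    }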

6.5 BLACK BOX TESTING

Black box testing is a technique in which the internal architecture of the item being tested is not necessarily known to the tester. The tester never examines the program code. This type of testing should be performed not by the developer of the program but by a separate tester, who should know the inputs and the expected outcomes.

6.6 WHITE BOX TESTING

White box testing is also known as glass box, structural, or open box testing. It uses knowledge of the programming code to examine outputs. The tester should read and use the source code of the application being tested, and must apply various inputs in order to test branches, conditions, loops and the logical sequence of the statements being executed.


CHAPTER 7
CONCLUSION

7.1 CONCLUSION

In this project, we investigated the problem of data security in cloud data storage, which is essentially a distributed storage system. To achieve assurances of cloud data integrity and availability and to enforce the quality of dependable cloud storage service for users, we propose an effective and flexible distributed scheme with explicit dynamic data support, including block update, delete, and append. We rely on erasure-correcting code in the file distribution preparation to provide redundancy parity vectors and guarantee data dependability. By utilizing the homomorphic token with distributed verification of erasure-coded data, our scheme achieves the integration of storage correctness insurance and data error localization; that is, whenever data corruption is detected during the storage correctness verification across the distributed servers, we can almost guarantee the simultaneous identification of the misbehaving server(s). Considering the time, computation resources, and related online burden of users, we also extend the proposed main scheme to support third-party auditing, where users can safely delegate the integrity checking tasks to third-party auditors and be worry-free in using the cloud storage services. Through detailed security analysis and extensive experimental results, we show that our scheme is highly efficient and resilient to Byzantine failure, malicious data modification attack, and even server colluding attacks.


APPENDIX 1

SCREENSHOTS

HOME PAGE


ADMIN LOGIN PAGE


AUDIT LOGIN-1


TOKEN ID GENERATION


AUDIT LOGIN WITH TOKEN ID

AUTOMATIC RESET OF TOKEN ID AFTER AUDIT LOGIN


ADMIN HOME

AUDIT RECORD


AUDIT DETAILS

USER DETAILS



EMPLOYEE DETAILS

CLIENT SERVICES


ONLINE USER REGISTRATION FORM


USER LOGIN PAGE-1

USER LOGIN PAGE-2


USER ACCOUNT DETAILS


ONLINE TRANSACTION PAGE


MONEY TRANSFER


LOAN PAGE

VEHICLE LOAN PAGE


ONLINE LOAN APPLICATION


LOAN STATUS CHECK

EDUCATION LOAN DETAILS


CONTACT US


APPENDIX 2

REFERENCES

1. Amazon.com, "Amazon S3 Availability Event: July 20, 2008," July 2008, http://status.aws.amazon.com/s3-20080720.html.
2. M. Arrington, "Gmail Disaster: Reports of Mass Email Deletions," Dec. 2006, http://www.techcrunch.com/2006/12/28/gmail-disasterreports-of-massemail-deletions/.
3. M. Armbrust et al., "Above the Clouds: A Berkeley View of Cloud Computing," Univ. California, Berkeley, Tech. Rep. UCB/EECS-2009-28, Feb. 2009.
4. A. Juels and J. Burton S. Kaliski, "PORs: Proofs of Retrievability for Large Files," Proc. ACM CCS '07, Oct. 2007, pp. 584-597.
5. M. Krigsman, "Apple's MobileMe Experiences Post-Launch Pain," July 2008, http://blogs.zdnet.com/projectfailures/?p=908.
6. P. Mell and T. Grance, "Draft NIST Working Definition of Cloud Computing," 2009, http://csrc.nist.gov/groups/SNS/cloudcomputing/index.html.
