
Volume 1 & 2

The Proceedings
Of 2nd National Conference on Innovation and Entrepreneurship in Information and Communication Technology May 14-15, 2011

Editor-in-Chief

Dr. Anil Kumar Pandey, Convener, SIG-IEICT


Editors
1. Dr. Saba Hilal, GNIT-MCA Institute
2. Mr. Pradeep Agrawal, GGIT
3. Dr. S.K. Pandey, GGIT
4. Dr. Shikha Jalota, GGIT
5. Mr. Ankit Shrivastava, GGIT
6. Ms. Monika, GNIT-MCA Institute
7. Ms. Jyoti Guglani, GGIT
8. Mr. Amit Kumar, GGIT

Copyright 2011. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means without the written permission of the Special Interest Group on Innovation and Entrepreneurship in ICT (SIG-IEICT).

EDITORIAL
India is one of the youngest nations in the world, with millions of people in the employable age group, and conventional methods alone will not be adequate to provide them with employment. Innovation and entrepreneurship are key to job creation and national competitiveness. Technological advancement is proceeding at a rapid pace, and developed economies are deriving economic dividends from it to create wealth and improve the efficiency of public services and processes. Throughout the developing world, innovative entrepreneurs are working to establish businesses that are ICT enabled. India is at the threshold of a new takeoff. The Indian IT industry is likely to clock revenues of over USD 70 billion by the end of this year and USD 220 billion by 2020, which is why innovation, entrepreneurship, and the ability to combine the two in the domain of ICT are of immense importance.

ICT has impacted our society in all walks of life, from education, employment, healthcare, communication, governance, business and banking to defense and disaster management. This has been possible due to innovative entrepreneurial ventures that have come up using ICT and ICT-enabled services. However, in countries like India, taking ICT to rural areas and integrating it with agriculture, sanitation and village-level micro enterprises is yet to be developed and commercialised. University- and institute-based innovators routinely produce breakthrough technologies that, if commercialized by industry, have the power to sustain economic growth. However, this is not happening. In its absence, the business world is witnessing the rise of student entrepreneurs who start their entrepreneurial journey at a very early stage, while still pursuing their education. In order to help innovators and entrepreneurs create ventures, academic institutions including schools, government, businesses and investors must work together. A clear knowledge of incubation support for taking ideas to the start-up stage would help economic growth as well as job opportunities for the millions of youths of this country.

This conference has been organized with the objective of bringing students, faculty members, IT professionals, government agencies, venture capital agencies, innovation foundations and social entrepreneurs onto a common platform to elicit and explore the sources of innovation that exist among the upcoming and informal groups of the population, who can be converted into the entrepreneurs of tomorrow through the possible interventions generated here. It has the following aims:
To provide a platform for students to meet and interact with innovators, successful entrepreneurs, government agencies and venture capitalists.
To develop insight for integrating entrepreneurial experiences into the formal education process.
To empower and sensitize students and faculty towards wealth creation through innovation and entrepreneurship.
To give the basic knowledge and tools for setting up a technology-driven enterprise.
To invoke the drive and motivation to convert Intellectual Property Rights into enterprise through innovative management.

The Special Interest Group (SIG) on Innovation and Entrepreneurship in ICT, the CSI Ghaziabad Chapter and Mahamaya Technical University have joined hands to organize this conference. It is a matter of great satisfaction that the response, in terms of research papers as well as participation, has been overwhelming and nationwide. Distinguished academicians, scholars from universities, government departments and entrepreneurs are participating to make it meaningful.
We present here all the papers, PPTs and abstracts submitted by the authors. In the future, we intend to have them reviewed by a third party and published.

Dr. Anil Kumar Pandey Editor-In-Chief


Contents
Volume 1
New Technology Acceptance Model to Predict Adoption of Wireless Technology in Healthcare - Manisha Yadav, Gaurav Singh, Shivani Rastogi ... 9
Solutions to Security and Privacy Issues in Mobile Social Networking - Nikhat Parveen, Danish Usmani ... 15
Wireless Monitoring of the Green House Using ATmega Based Monitoring System: WSN Approach - Miss Vrushali R. Deore, Prof. V.M. Umale ... 21
Fuzzy C-Mean Algorithm Using Different Variants - Vikas Chaudhary, Kanika Garg, Arun Kr. Sharma ... 27
Security Issues in Data Mining - Rajeev Kumar, Pushpendra Kumar Singh, Arvind Kannaujia ... 38
Web Caching Proxy Services: Security and Privacy Issues - Mr. Anoop Singh, Mr. Rohit Singh, Ms. Sushma Sharma ... 42
A Comparative Study to Solve Job Shop Scheduling Problem Using Genetic Algorithms and Neural Network - Vikas Chaudhary, Kanika Garg, Balbir Singh ... 48
Innovation & Entrepreneurship in Information and Communication Technology - Deepak Sharma, Nirankar Sharma, Nimisha Srivastava ... 55
Insider Threat: A Potential Challenge for the Information Security Domain - Abhishek Krishna, Santosh Kumar Smmarwar, Jai Kumar Meena, Monark Bag, Vrijendra Singh ... 57
Search Engine: Factors Influencing the Page Rank - Prashant Ahlawat, Hitesh Kumar Sharma ... 63
Are the CMMI Process Areas Met by Lean Software Development? - Jyoti Yadav ... 68
Password Protected File Splitter and Merger (With Encryption and Decryption) - Mrs. Shikha Saxena, Mr. Rupesh Kumar Sharma ... 73
Security Solution in Wireless Sensor Network - Pawan Kumar Goel, Bhawnesh Kumar, Vinit Kumar Sharma ... 77
Vertical Perimeter Based Enhancement of Streaming Application - P. Manikandan, R. Kathiresan, Marie Stanislas Ashok ... 82
Orthogonal Frequency Division Multiplexing for Wireless Communications - Meena G. Shende ... 87
A Comprehensive Study of Adaptive Resonance Theory - Vikas Chaudhary, Avinash Dwivedi, Sandeep Kumar, Monika Bhati ... 91
Is Wireless Network Purely Secure? - Mrs. Shikha Saxena, Mrs. Neetika Sharma, Ms. Rachana Singh ... 98
An Innovative Digital Watermarking Process - A Critical Analysis - Sangeeta Shukla ... 104
Design of a Reconfigurable SDR Transceiver Using LabVIEW - Sapna Suri, Vikram Verma, Rajni Raghuvanshi and Pooja Pathak ... 111
A Modified Zero Knowledge Identification Scheme Using ECC - Kanika Garg, Dr. R. Radhakrishan, Vikas Chaudhary, Ankit Panwar ... 116
Security and Privacy of Conserving Data in Information Technology - Suresh Kumar Kashvap, Pooja Agrawal, Minakshi Agrawal, Vikas Chandra Pandey ... 120
Barriers to Entrepreneurship - An Analysis of Management Students - Dr. Pawan Kumar Dhiman ... 126

A Novel Standby Leakage Power Reduction Method Using Reverse Body Biasing Technique for Nanoscale VLSI Systems - James Appollo A.R., Tamijselvan D. ... 131
Survey on Decision Tree Algorithm - Jyoti Shukla, Shweta Rana ... 136
Distributed Security Using Onion Routing - Ashish T. Bhol, Savita H. Lambole ... 141
Virtualization Implementation in an Enterprise - Rohit Goyal ... 146
Design of Data Link Layer Using WiFi MAC Protocols - K. Srinivas (M.Tech) ... 149
Leveraging Innovation for Successful Entrepreneurship - Dr. Sandeep Kumar, Sweta Bakshi, Ankita Pratap ... 153
Performance Evaluation of Cache Replacement Algorithms for Cluster Based Cross Layer Design for Cooperative Caching (CBCC) in Mobile Ad Hoc Networks - Madhavarao Boddu, Suresh Joseph K. ... 165
Empowerment and Total Quality Management for Innovation and Success in Organisations - Ms. Shamsi Sukumaran K., Ms. Bableen Kaur ... 178
A New Multiple Snapshot Algorithm for Direction of Arrival Estimation Using Smart Antenna - Lokesh L., Sandesha Karanth, Vinay T., Roopesh, Aaquib Nawaz ... 185
Quality Metrics for TTCN-3 and Mobile Web Application - Anu Saxena, Kapil Saxena ... 190
A Unique Pattern Matching Algorithm Using the Prime Number Approach - Nishtha Kesswani, Bhawani Shankar Gurjar ... 194
Study and Implementation of Power Control in Ad Hoc Networks - Animesh Srivastava, Vanya Garg, Vivekta Singh ... 197
Improving the Performance of Web Log Mining by Using K-Means Clustering with Neural Network - Vinita Srivastava ... 203
Higher Education Through Entrepreneurship Development in India - Mrs. Vijay ... 208
Concepts, Techniques, Limitations and Applications of Data Mining - S.C. Pandey, P.K. Singh, D. Dubey ... 210
ICT for Energy Efficiency, Conservation & Reducing Carbon Emissions - Aakash Mittal ... 212
Study of Ant Colony Optimization for Proficient Routing in Solid Waste Management - Aashdeep Singh, Arun Kumar, Gurpreet Singh ... 213
Survey on Decision Tree Algorithm - Jaya Bhushan, Shewta Rana, Indu ... 215

Volume 2
Analysis of Multidimensional Modeling Related to Conceptual Level - Udayan Ghosh, Sushil Kumar ... 222
Wireless Sensor Networks Using Clustering Protocol - Gurpreet Singh, Shivani Kang ... 229
Performance Evaluation of Route Optimization Schemes Using NS2 Simulation - Manoj Mathur, Sunita Malik, Vikas ... 235
IT-Specific SCM Practices in Indian Industries: An Investigation - Sanjay Jharkharia ... 239
Cloud Computing - Parveen Sharma, Manav Bharti University, Solan, Himachal Pradesh ... 259
Comparing the Effect of Hard and Soft Threshold Techniques on Speech Compression Using Wavelet Transform - Sucheta Dhir ... 263
A Hybrid Filter for Image Enhancement - Vinod Kumar, Kaushal Kishore and Dr. Priyanka ... 269
Comprehensive Study of Finger Print Detection Technique - Vivekta Singh, Vanya Garg ... 273
Study of Component Based Software Engineering Using Machine Learning Techniques - Vivekta Singh ... 281
Efficient Location-Based Spatial Query (LBSQ) Processing in Wireless Broadcast Environments - K. Madhavi, Dr. Narasimham Challa ... 285
A 3D Face Recognition Using Histograms - Sarbjeet Singh, Meenakshi Sharma, Dr. N. Suresh Rao, Dr. Zahid Ali ... 291
An Application of Eigen Vector in Back Propagation Neural Network for Face Expression Identification - Ahsan Hussain ... 295
Next Generation Cloud Computing Architecture - Ahmad Talha Siddiqui, Shahla Tarannum, Tehseen Fatma ... 299
Virtualization of Operating System Using Xen Technology - Annu Dhankhar, Siddharth Rana ... 304
Quality Metrics for TTCN-3 and Mobile-Web Applications - Anu Saxena, Kapil Saxena ... 308
Future of ICT Enabled Services for Inclusive Growth in Rural Unprivileged Masses - Bikash Chandra Sahana, Lalu Ram ... 312
Conversion of Sequential Code to Parallel - An Overview of Various Conversion Methods - Danish Ather, Prof. Raghuraj Singh ... 314
Innovation and Entrepreneurship in Information and Communication Technology - Deepak Sharma, Nirankar Sharma, Nimisha Shrivastava ... 320
Fuzzy Classification on Customer Relationship Management - Mohd. Faisal Muqtida, Ashi Attrey, Diwakar Upadhyay ... 322
New Technology Acceptance Model to Predict Adoption of Wireless Technology in Healthcare - Gaurav Singh, Manisha Yadav, Shivani Rastogi ... 330
Entrepreneurship Through ICT for Disadvantaged Communities - Ms. Geetu Sodhi, Mr. Vijay Gupta ... 335
Efficient Location-Based Spatial Query (LBSQ) Processing in Wireless Broadcast Environments - K. Madhavi, Dr. Narasimham Challa ... 342
K-Means Clustering Algorithm with High Performance Using Large Data - Vikas Chaudhary, Vikas Mishra, Kapil ... 350
Performance Evaluation of Route Optimization Schemes Using NS2 Simulation - Manoj Mathur, Sunita Malik, Vikas ... 355
Image Tracking and Activity Recognition - Navneet Sharma, Divya Dixit, Ankur Saxena ... 358

An Innovative Digital Watermarking Process - A Critical Analysis - Sangeeta Shukla, Preeti Pandey, Jitendra Singh ... 361
Survey on Decision Tree Algorithm - Shweta Rana ... 368
Comparing the Effect of Hard and Soft Threshold Techniques on Speech Compression Using Wavelet Transform - Sucheta Dhir ... 374
Improving the Performance of Web Log Mining by Using K-Means Clustering with Neural Network - Vinita Shrivastava ... 379
A Hybrid Filter for Image Enhancement - Vinod Kumar, Kaushal Kishore, Dr. Priyanka ... 385
Trends in ICT Track: Software Development & Deployment (Agile Methodologies) - Shubh, Priyanka Gandh, Manju Arora ... 390
Vulnerabilities in WEP Security and Their Countermeasures - Akhilesh Arora ... 400
Implementation of Ethernet Protocol and DDS in Virtex-5 FPGA for Radar Applications - Garima Chaturvedi, Dr. Preeta Sharan, Peeyush Sahay ... 408
CCK Coding Implementation in IEEE 802.11b Standard - Mohd. Imran Ali ... 413
Cognitive Radio and Management of Spectrum - Prof. Rinkoo Bhatia, Narendra Singh Thakur, Prateek Bhadauria, Nishant Dev ... 416
Impact of MNCs on Entrepreneurship - Ms. Sonia ... 428
Multilayered Intelligent Approach - A Hybrid Intelligent System - Neeta Verma, Swapna Singh ... 437
Green ICT: A Next Generation Entrepreneurial Revolution - Pooja Tripathi ... 441

ABSTRACTS AND PPTS


An Innovation Framework for Practice-Predominant Engineering Education - Om Vikas ... 450
Mobile Ad-hoc Network - Apoorv Agarwal, Apeksha Aggarwal ... 460
Fuzzy C-Mean Clustering Algorithm - Arun Kumar Sharma ... 470
A Comparative Study of Web Security Protocols - Hanish Kumar ... 475
E-Village - A New Mantra for Rural Development - Mr. S.K. Mourya ... 481
Green ICT: A Next Generation Entrepreneurial Revolution - Prof. Pooja Tripathi ... 484
Role of the 21st Century: ICT Need of the Day - Saurabh Choudhry ... 486
Reusability of Software Components Using Clustering - Meenakshi Sharma, Priyanka Kakkar, Dr. Parvinder Sandhu, Sonia Manhas ... 487

PART - 1

New Technology acceptance model to predict adoption of wireless technology in healthcare

Abstract - The adoption of new technologies has been researched in the Information Systems (IS) literature for the past two decades, starting with the adoption of desktop computer technology and extending to the adoption of electronic commerce technology. Issues that have been researched include how users handle the various options available in a software environment, their perceived opinions, the barriers and challenges to adopting a new technology, and the IS development procedures that directly impact adoption, including interface designs and elements of human issues. However, the literature indicates that models proposed in the IS literature, such as the Technology Acceptance Model (TAM), are not suitable for specific settings when predicting the adoption of technology. Studies in the past few years have strongly concluded that TAM is not suitable in healthcare settings because it does not consider the myriad factors influencing technology adoption in healthcare. This paper discusses the problems in healthcare caused by poor information systems development and the factors that need to be considered while developing healthcare applications, as these are complex and different from traditional MIS applications, and derives a model that can be tested for the adoption of new technology in healthcare settings. The contribution of this paper is in building theory that is not yet available in the combined areas of Information Systems and healthcare.

Index Terms - healthcare, Information Systems, adoption factors.

I. INTRODUCTION

The Institute of Medicine (IOM) in the United States has recognized that frontier technologies such as wireless technology would improve access to information in order to achieve quality health care. A
report released by the IOM in 2003 outlined a set of recommendations to improve Patient safety and reduce errors using reporting systems that are based on Information Systems (IS). While it is widely accepted that IS assists health related outcomes, how this can be efficiently achieved is an under researched area. Therefore, conflicting outcomes are reported in healthcare studies as to the successful role of IS. In essence, research is needed to investigate the role, and perhaps the use of, frontier technologies in improving information management, communication, cost and access to improve quality healthcare. In healthcare, specific issues relating to the failures of Information Management are being addressed using frontier technologies such as RF Tags and Wireless Handheld Devices. The main focus in using these technologies is to collect patient related information in an automated manner, at the point of entry, so as to reduce any manual procedures needed to capture data. While no other discipline relies more heavily on human interactions than health care, it is in healthcare that technology in the form of wireless devices has the means to increase not decrease the benefits derived from the important function of human interaction. Essential to this is the acceptance of this wireless handheld technology as this technology enables to collect data at the point of entry, with minimal manual intervention, with a higher degree of accuracy and precision. When it comes to the Management of Information Systems, development and implementation of a hospital Information System is different from traditional Information Systems due to the life critical environment in hospitals. Patient lives are dependent
upon the information collected and managed in hospitals, and hence the smart use of information is crucial for many aspects of healthcare. Therefore, any investigation conducted should be multi-dimensional and should cover many aspects beyond the technical feasibility and functionality dictated by traditional systems. Successful implementation of health information systems includes addressing clinical processes so that they are efficient, effective, manageable and well integrated with other systems. While traditional Information Systems address issues of integration with other systems, this is even more important in hospital systems because of the profound impact these systems have on the short- and long-term care of patients. Reasons for failure in Information Systems developed for healthcare include the lack of attention paid to the social and professional cultures of healthcare professionals, underestimation of complex clinical routines, dissonance between the various stakeholders of health information, long implementation time cycles, reluctance to support projects financially once they are delivered, and failure to learn from past mistakes. Therefore, any new technology should address these reasons in order to be accepted in the healthcare setting.

II. UNSUITABILITY OF CURRENT TECHNOLOGY ACCEPTANCE MODELS TO HEALTHCARE:

The acceptance of new technologies has long been an area of inquiry in the MIS literature. The acceptance of personal computer applications, telemedicine, e-mail, workstations, and the WWW are some examples of technologies that have been investigated in the MIS literature. User technology acceptance is a critical success factor for IT adoption, and many studies have predicted it, to some extent accurately, using the Technology Acceptance Model (TAM) by means of a host of factors categorized into characteristics of the individuals, characteristics of the technology and characteristics of the organizational context. The Technology Acceptance Model specifically measures the determinants of computer usage in terms of perceived usefulness and perceived ease of use. While perceived usefulness has emerged as a consistently important determinant of attitude formation, studies have found perceived ease of use to be inconsistent and less significant. The literature suggests that a plausible explanation could be users' continued, prolonged exposure to technology leading to familiarity, and hence ease in using the system. Therefore, users could have interpreted perceived ease of use as insignificant while determining their intention to use a technology. The strength of TAM lies in the fact that it has been tested in IS with various sample sizes and characteristics. Results of these tests suggest that it is capable of providing adequate explanation as well as predicting user acceptance of IT. Strong support can be found for the Technology Acceptance Model (TAM) being robust in predicting user acceptance. However, some studies criticize TAM for examining the model's validity with students who have limited computing exposure, and with administrative and clerical staff, who do not use all the IT functions found in software applications. Studies also indicate that the applicability of TAM to specific disciplines such as medicine is not yet fully established. Further, the validity and reliability of TAM in certain professional contexts, such as medicine and law, have been questioned. Only limited information is found in the healthcare-related literature as to the suitability of TAM.
Similarly, in the literature related to the legal field, especially where IT is referred to, limited information can be found on TAM. Therefore, it appears that the model has not been fully tested with various other professionals in their own professional contexts. It can therefore be argued that, when it comes to an emerging technology such as wireless handheld devices, TAM may not be sufficient to predict the acceptance of technology because the context becomes quite different. It should be noted that the current context in healthcare-related Information Systems is not only the physical environment but also the ICT environment, as wireless technology is markedly different from desktop technology. A major notable change is the way in which information is accessed using wireless technology: information is pushed to the users, as opposed to users pulling the information from desktop computers. In desktop technology, users have the freedom to choose what they want to access, and usage behavior is dependent upon their choice. On the other hand, with wireless devices it is possible for information, whether needed or not, to reach the device. These devices assume significant importance because of the setting in which they are used. For example, in an operation theatre patient lives assume importance, and information needs must reflect this. If wireless handheld devices do not support data management that is closely linked with clinical procedures, owing to device restrictions such as screen size and memory, then despite their attractions users will discard these devices. Therefore,
applications developed for these devices must address complex clinical procedures that can be supported by the devices. Another major consideration in the domain of wireless technology is connectivity. While this is assumed to be always available in a wired network environment, it cannot be guaranteed with wireless technology because of mobility and the nature of the network connection. As users carry the device and roam, the signal strength may change from strong to weak, and this may interrupt user operations. Therefore, to accomplish smart information management, certain technical aspects must also be addressed. Current users of wireless technology are concerned with the security and privacy aspects associated with using this technology, because they need to reveal their identity in order to receive information. While privacy is concerned with the information that users provide to others, security threats fall under the categories of physical threat and data threat. Due to their infancy and hardware restrictions, handheld devices are not able to implement these features to the level found on desktop computers. In a healthcare setting, any leak of private information would have a potential adverse impact on the stakeholders. Further, because other devices used in providing healthcare to patients may be using radio frequency or infra-red frequency, there may be practical restrictions on the usage of wireless devices for ICT. Our own experience in providing wireless technology solutions to a private healthcare provider in Western Australia yielded mixed responses. The wireless technology developed and implemented for the Emergency Department was successful in terms of software development and deployment, and the project was well accepted by the users. However, the wireless solution provided to address problems encountered in the Operation Theatre Management System was not well received by the users, despite its superiority in design, functionality and connectivity. Users were reluctant to use the application because of hardware and database connectivity restrictions, despite scoring a high level of acceptance for usefulness and ease of use. Now, let us assume that TAM is correct in claiming that the intention to use a particular system is a very important factor in determining whether users will actually use it. Let us also assume that the wireless systems developed for the private healthcare provider in Western Australia exhibited clear intentions to use the system. However, despite a positive effect on perceived usefulness and perceived ease of use, the wireless system was not accepted by users. It should be noted that the new system mimicked the current traditional system, and yet did not yield any interest in terms of user behavior. While searching for reasons for this hard-to-explain phenomenon, we noted the argument, made after studying TAM, that perceived usefulness should also include near-term and long-term usefulness in order to study behavioral intentions. Other studies that have examined the utilization of Internet technology have also supported this view. This has given us the feeling that TAM may not be sufficient to predict the acceptance of wireless technology in a specific healthcare setting. A brief review of prior studies in healthcare indicated that a number of issues associated with the lack of acceptance of wireless handheld devices are highlighted but not researched to the full extent that they warrant.
For example, drawbacks of these devices in healthcare included the perceived fear of new learning among doctors, the time investment needed for such learning, the cost involved in setting up the wireless networks, and the cost implications associated with integrating existing systems with the new wireless system. A vast majority of these studies concur that wireless handheld devices would be able to provide solutions to the information management problems encountered in healthcare. While these studies unanimously agree that information management would be smarter using wireless technology and handheld devices, they seldom provide details of the factors that enable the acceptance of wireless technology specific to the healthcare setting. MIS journals appear to be lagging behind in this area. Therefore, it is safe to assume that current models that predict the acceptance of technology based on behavioral intentions are insufficient. This necessitates a radically new model in order to predict the acceptance of wireless handheld technology in specific professional settings.
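For reference, the baseline TAM formulation that the preceding discussion argues is insufficient can be operationalized very simply: intention to use is predicted from perceived usefulness and perceived ease of use. The Python sketch below is only an illustration of that operationalization on hypothetical 7-point survey scores; the data, variable names and weights are invented and are not taken from this paper.

```python
# Minimal sketch (not the authors' analysis): predicting intention to use
# from TAM's two constructs on hypothetical 7-point Likert survey scores.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
pu = rng.integers(1, 8, n)      # perceived usefulness, 1..7 (hypothetical)
peou = rng.integers(1, 8, n)    # perceived ease of use, 1..7 (hypothetical)
# Synthetic ground truth: usefulness weighted more heavily than ease of use,
# mirroring the common finding reported in the text above.
intention = (0.8 * pu + 0.3 * peou + rng.normal(0, 1, n) > 5.5).astype(int)

X = np.column_stack([pu, peou])
model = LogisticRegression().fit(X, intention)
print("coefficients (PU, PEOU):", model.coef_[0])
print("P(intend to use | PU=6, PEOU=3):", model.predict_proba([[6, 3]])[0, 1])
```

A model of this form captures only the two classic constructs; the sections that follow argue that hardware, software and telecommunication factors must be added for healthcare settings.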

III. INGREDIENTS FOR A NEW MODEL TO PREDICT ACCEPTANCE OF NEW TECHNOLOGY:

Some of the previous models measured actual use through the intention to use, and the inputs to these models were perceived usefulness, perceived ease of use, attitude, subjective norm, perceived behavioral control, near-term use, short-term use, experience, facilitating conditions and so on. In recent years, factors impacting technology acceptance have
included job relevance, output quality and result demonstrability. In the fields of electronic commerce and mobile commerce, factors such as security and trust are considered factors in the adoption of these technologies. In end-user computing, factors such as user friendliness and maintainability appear to influence the applications. Therefore, any new model to determine the acceptance of wireless technology would include some of the above factors. In addition to these, when it comes to wireless technology, any acceptance factors should hinge on two dominant concepts, the hardware (or device) and the applications that run on the hardware, as the battle continues to accommodate more applications on a device that is diminishing in size but improving in power. Further, mobile telephones and PDAs appear to be accepted based on their attractiveness, hardware design, the type of keypad that they provide, screen color and resolution, the ability to be carried around, and so on. In effect, the hardware component appears to be an equally dominant factor in the adoption of wireless technology. Once the hardware and software applications are accepted, the third dominant factor in the acceptance of wireless technology appears to be the telecommunication factor. This factor involves the various services provided by telecommunication companies, the cost involved in such services, the type of connectivity, roaming facilities, the ability to access the Internet, provision for Short Messaging Services (SMS), the ability to play games using the mobile devices, and so on. These factors are common to both mobile telephones and emerging PDAs. Some common features that users would like to see appear to be alarm services, a calendar, a scheduler, and the ability to access digital messages, both text and voice. Therefore, studies that investigate the adoption of wireless technology should aim to categorize factors based on hardware, applications and telecommunication, as these appear to be the building blocks of any adoption of this technology. Specific factors for applications could involve portability across various hardware, reliability of code, performance, ease of use, module cohesion across different common applications, clarity of code, etc. In terms of hardware, the size of the device, memory size, keypad, resolution of the screen, various voice tones, portability, attractiveness, brand names such as Nokia, and capabilities such as alarms would be some of the factors of adoption or acceptance. In terms of service provision, plan types, costs, access, free time zones, SMS provision, the cost of local calls, the cost of accessing the Internet, and provision to share information stored between devices appear to be dominant factors. Factors such as security form a common theme, as all three dominant categories need to ensure this factor. The factors mentioned above are crucial in determining the development aspects of Wireless Information Systems (WIS) for healthcare, as these factors dictate the development methodology, the choice of software language, user interface design, etc. Further, the factors of adoption, in conjunction with the methodology, would determine integration aspects such as coupling the new system with existing systems. This would then determine the implementation plans. In essence, an initial model that can determine the acceptance of wireless technology in healthcare can be portrayed as follows:

Diagram 1: Proposed Model for Technology Adoption in Healthcare Settings

In the above model, the three boxes in dark borders show the relationship between the various factors that influence the acceptance of technology. The box on the left indicates the various factors influencing wireless technology in any given setting. The three categories of factors (hardware, software and telecommunication) affect the way in which wireless technology is implemented. The factors portrayed in the box are generic, and their role in a specific healthcare setting varies depending upon the level of implementation. Once the technology is implemented, it is expected to be used. In healthcare settings, it appears that usage, relevance and need are the three most important influencing factors for the continued usage of a new technology. When the correct balance is established, users exhibit positive perceptions about using a new technology such as wireless handheld devices for data management purposes. This, in turn, brings about a positive attitude towards using the system, both short-term
and long-term usage. The positive usage would then determine the intention to use, resulting in usage behavior. The usage behavior then determines the factors that influence the adoption of new technology in a given setting. This is shown by the arrow that flows from right to left. Based on the propositions made in the earlier paragraphs, it is suggested that any testing done to predict the acceptance of new technology in healthcare should test the following hypotheses (a small illustrative sketch of these hypothesized paths follows the list):
1. Hardware factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management.
2. Software factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management.
3. Telecommunication factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management.
4. Factors influencing wireless technology in a healthcare setting have a direct positive effect on usage, relevance and need.
5. User perception of a new technology is directly affected by usage, relevance and need.
6. User perception of a new technology has a direct effect on user attitude towards using such technology.
7. User attitude has a direct effect on the intention to use a new technology.
8. Usage behavior is determined by the intention to use a new technology.
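To make the structure of the proposed model easier to follow, the short Python sketch below encodes the hypothesized paths (H1-H8) as a directed graph and traces the chain from the three factor categories through usage, perception, attitude and intention to usage behavior. The construct names are paraphrased from the text; the encoding itself is only an illustration, not part of the authors' method.

```python
# Illustrative encoding of the hypothesized paths (H1-H8) as a directed graph.
# Construct names are paraphrased from the paper; edge labels give the hypothesis.
model_paths = {
    ("hardware factors", "wireless technology implementation"): "H1",
    ("software factors", "wireless technology implementation"): "H2",
    ("telecommunication factors", "wireless technology implementation"): "H3",
    ("wireless technology implementation", "usage / relevance / need"): "H4",
    ("usage / relevance / need", "user perception"): "H5",
    ("user perception", "user attitude"): "H6",
    ("user attitude", "intention to use"): "H7",
    ("intention to use", "usage behavior"): "H8",
}

def downstream(construct):
    """Return every construct reachable from `construct` along hypothesized paths."""
    reached, frontier = set(), [construct]
    while frontier:
        node = frontier.pop()
        for (src, dst), _label in model_paths.items():
            if src == node and dst not in reached:
                reached.add(dst)
                frontier.append(dst)
    return reached

# Example: everything that hardware factors are hypothesized to influence.
print(sorted(downstream("hardware factors")))
```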

IV. INSTRUMENTS:

The instruments would typically constitute two broad categories of questions. The first category of questions would relate to the adoption and usage of wireless applications in healthcare for data collection purposes. The second category would consist of demographic variables, as these variables determine the granularity of the setting. Open-ended questions can be included in the instrument to obtain unbiased and non-leading information. Prior to administering the questions, a complete peer review and a pilot study are insisted upon in order to ascertain the validity of the instrument. A two-stage approach can be used in administering the instrument, where the first stage would gather information about the key factors influencing users' decisions to use wireless applications and the second stage about the importance of those key factors. This approach would complement the open-ended questions so as to determine the importance of the individual factors determining the adoption and usage of wireless devices and applications.

V. DATA COLLECTION:

In order to perform validity and reliability tests, a minimum of 250 samples is required. Any study to test the model should consider the randomness of the samples to avoid any collective bias. Similarly, about 50 samples may be required to undergo the interview process, with each interview lasting 60 minutes. Any instruments developed for testing the model should be able to elicit responses of 'how' and 'why'. This is essential in order to discern differences between the adoption and usage decisions for wireless handheld applications. In addition, comparing responses to the question about adoption and questions about use would provide evidence that respondents were reporting their adoption drivers and not simply their current behavior. The interview questions should be semi-structured or partially structured to guide the research. There are variations in qualitative interviewing techniques, such as informal, standardized and guided. Structured interviews and partially structured interviews can be subjected to validity checks similar to those done in quantitative studies. Samples could be asked about their usage of wireless devices, including mobile telephones and other hospital systems, during the initial stages of the interview. They could be interviewed further so as to identify factors that would lead to the continued usage of these devices and any emerging challenges that they foresee, such as training. The interviews can be recorded on a digital recording system with provision to convert automatically to a PC to avoid any transcription errors. This approach would also minimize transcription time and cost. The interview questions should be developed in such a way that both determinant and challenge factors can be identified. This then increases or enhances the research results, which are free of errors or bias.

VI. DATA ANALYSIS: Data should be coded by two individuals into a computer file prior to analysis and a file comparator technique should be used to resolve any data entry errors. A coding scheme should also be developed based on the instrument developed. The coders
should be given sufficient instructions on the codes, anticipated responses and any other detail needed to conduct the data entry. Coders should also be given a start-list that will include definitions from prior research for the categories of the construct. Some of the categories would include utilitarian outcomes such as applications for personal use and barriers such as cost and knowledge. Data should be analyzed using statistical software applications using both quantitative and qualitative analyses. Initially a descriptive analysis needs to be conducted, including a frequency breakdown. This should then be followed by a detailed cross sectional analysis of the determinants of behavior. A factor analysis should also be conducted to identify factors of adoption. Once this is completed, tests for significance can be performed between various factors.
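The analysis pipeline described above (frequency breakdown, cross-sectional analysis, factor analysis and significance tests) can be sketched in a few lines of Python; the column names and the synthetic responses below are hypothetical stand-ins for the survey data the paper proposes to collect, so this is only an illustration of the steps, not the authors' actual analysis.

```python
# Hedged sketch of the proposed analysis pipeline on hypothetical survey data:
# descriptive frequencies, a chi-square significance test, and a factor analysis.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n = 250  # minimum sample size suggested in the text
survey = pd.DataFrame({
    "role": rng.choice(["doctor", "nurse", "admin"], n),
    "adopts_wireless": rng.choice(["yes", "no"], n),
    # Hypothetical 7-point items covering hardware, application and telecom factors.
    "hw_screen": rng.integers(1, 8, n),
    "hw_battery": rng.integers(1, 8, n),
    "app_ease": rng.integers(1, 8, n),
    "app_reliability": rng.integers(1, 8, n),
    "telecom_cost": rng.integers(1, 8, n),
})

# 1. Descriptive frequency breakdown.
print(survey["adopts_wireless"].value_counts())

# 2. Cross-sectional view: adoption by role, with a chi-square significance test.
table = pd.crosstab(survey["role"], survey["adopts_wireless"])
chi2, p, _, _ = chi2_contingency(table)
print(table, f"\nchi2={chi2:.2f}, p={p:.3f}")

# 3. Factor analysis to group the adoption items into latent factors.
items = survey[["hw_screen", "hw_battery", "app_ease", "app_reliability", "telecom_cost"]]
fa = FactorAnalysis(n_components=3, random_state=0).fit(items)
print(pd.DataFrame(fa.components_, columns=items.columns))
```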
VII. CONCLUSION:

We saw in this case study that there is a necessity for a new model to accurately predict the adoption of new technologies in a specific healthcare setting, because the current models available in the Information Systems domain are yet to fulfill this need. Based on our experience and the available literature, we identified some initial factors that can influence and determine the acceptance of technology. We also proposed a theoretical model that can be tested using these initial factors. In order to be complete, we suggested a methodology for testing the model.

VIII. REFERENCES:

[1] Davis, F. D., Bagozzi, R. P., & Warshaw, P. R. (1989). User acceptance of computer technology: A comparison of two theoretical models. Communications of the ACM, 35(8), 982-1003.
[2] Davis, G. B. (1985). A typology of management information systems users and its implication for user information satisfaction research. Paper presented at the 21st Computer Personnel Research Conference, Minneapolis.
[3] Dyer, O. (2003). Patients will be reminded of appointments by text messages. British Medical Journal, 326(402), 281.
[4] Freeman, E. H. (2003). Privacy Notices under the Gramm-Leach-Bliley Act. Legally Speaking (May/June), 5-9.
[5] Goh, E. (2001). Wireless Services: China (Operational Management Report No. DPRO-94111): Gartner.
[6] Hu, P. J., Chau, P. Y. K., & Liu Sheng, O. R. (2002). Adoption of telemedicine technology by health care organizations: An exploratory study. Journal of Organizational Computing and Electronic Commerce, 12(3), 197-222.
[7] Hu, P. J., Chau, P. Y. K., Sheng, O. R. L., & Tam, K. Y. (1999). Examining the technology acceptance model using physician acceptance of telemedicine technology. Journal of Management Information Systems, 16(2), 91-112.
[8] Kwon, T. J., & Zmud, R. W. (Eds.). (1987). Unifying the fragmented models of information systems implementation. New York: John Wiley.
[9] Oritz, E., & Clancy, C. M. (2003). Use of information technology to improve the quality of health care in the United States. Health Services Research, 38(2), 11-22.
[10] Remenyi, D., Williams, B., Money, A., & Swartz, E. (1998). Doing Research in Business and Management. London: SAGE Publications Ltd.
[12] Rogers, E. M. (1995). Diffusion of Innovation (4th ed.). New York: Free Press.
[13] Rozwell, C., Harris, K., & Caldwell, F. (2002). Survey of Innovative Management Technology (Research Notes No. M-15-1388): Gartner Research.
[14] The nature and determinants of IT acceptance, routinization, and infusion, 67-86 (1994).
[15] Sausser, G. D. (2003). Thin is in: web-based systems enhance security, clinical quality. Healthcare Financial Management, 57(7), 86-88.
[16] Simpson, R. L. (2003). The patient's point of view -- IT matters. Nursing Administration Quarterly, 27(3), 254-256.
[17] Smith, D., & Andrews, W. (2001). Exploring Instant Messaging: Gartner Research and Advisory Services.
[18] Sparks, K., Faragher, B., & Cooper, C. L. (2001). Well-Being and Occupational Health in the 21st Century Workplace. Journal of Occupational and Organizational Psychology, 74(4), 481-510.
[19] Tyndale, P. (2002). Taxonomy of Knowledge Management Software Tools: Origins and Applications, 2002, from www.sciencedirect.com
[20] Wiebusch, B. (2002). First response gets reengineered: Will a new sensor and the power of wireless communication make us better prepared to deal with biological attacks? Design News, 57(11), 63-68.
[21] Wisnicki, H. J. (2002). Wireless networking transforms healthcare: physician's practices better able to handle workflow, increase productivity (The human connection). Ophthalmology Times, 27(21), 38-41.
[22] Yampel, T., & Eskenazi, S. (2001). New GUI tools reduce time to migrate healthcare applications to wireless. Healthcare Review, 14(3), 15-16.

Solutions to Security and Privacy Issues in Mobile Social Networking

Abstract - Social network information is now being used in ways for which it may not have been originally intended. In particular, the increased use of smartphones capable of running applications which access social network information enables applications to be aware of a user's location and preferences. However, current models for the exchange of this information require users to compromise their privacy and security. We present several of these privacy and security issues, along with our design and implementation of solutions for them. Our work allows location-based services to query local mobile devices for users' social network information without disclosing user identity or compromising users' privacy and security. We contend that it is important that such solutions be accepted as mobile social networks continue to grow exponentially.

IX. INTRODUCTION

Our focus is on security and privacy in location-aware mobile social network (LAMSN) systems. Online social networks are now used by hundreds of millions of people and have become a major platform for communication and interaction between users. This has brought a wealth of information to application developers who develop on top of these networks. Social relation and preference information allows for a unique breed of application that did not previously exist. Furthermore, social network information is now being correlated with users' physical locations, allowing information about users' preferences and social relationships to interact in real time with their physical environment. This fusion of online social networks with real-world mobile computing has created a fast-growing set of applications that have unique requirements and unique implications that are not yet fully understood. LAMSN systems such as WhozThat [1] and Serendipity [2] provide the
infrastructure to leverage social networking context within a local physical proximity using mobile smartphones. However, such systems pay little heed to the security and privacy concerns associated with revealing one's personal social networking preferences and friendship information to the ubiquitous computing environment. We present significant security and privacy problems that are present in most existing mobile social network systems. Because these systems have not been designed with security and privacy in mind, these issues are unsurprising. Our assertion is that these security and privacy issues lead to unacceptable risks for users of mobile social network systems. We make three main contributions in this paper.
a) We identify three classes of privacy and security problems associated with mobile social network systems: (1) direct anonymity issues, (2) indirect or K-anonymity issues, and (3) eavesdropping, spoofing, replay, and wormhole attacks. While these problems have been examined before in other contexts, we discuss how they present unique challenges in the context of mobile social network systems, and we motivate the need for solutions to these problems.
b) We present a design for a system, called the identity server, that provides solutions for these security and privacy problems. The identity server adapts established privacy and security technologies to provide novel solutions to these problems within the context of mobile social network systems.
c) We describe our implementation of the identity server.

X. BACKGROUND

In this section we provide the reader with a short introduction to work in the area of mobile social
networking and the technologies that have made it possible.

2.1 MOBILE COMPUTING

Smartphones now allow millions of people to be connected to the Internet all the time and support mature development environments for third-party application developers. Recently there has been a dramatic rise in the usage of smartphones, that is, phones capable of Internet access, wireless communication, and supporting the development of third-party applications. This rise has been due largely to the iPhone and iPod Touch.

2.2 SOCIAL NETWORKS

The growth of social networks has exploded over the last year. In particular, usage of Facebook has spread internationally and to users of a wide age range. According to Facebook.com's statistics page, the site has over 200 million active users [4] [5], of which over 100 million log on every day. Comparing this with comScore's global Internet usage statistics [6], this would imply that nearly 1 in 10 of all Internet users log on to Facebook every day and that the active Facebook population is larger than any single country's Internet population (China is the largest, with 179.7 million Internet users [6]).

2.3 PRIVACY AND SECURITY

The work described in this paper draws on some previous privacy research in both location-based services and social networks [12] [13]. This prior work does not approach the same problem as addressed in this paper; however, the mechanisms used in these papers may provide certain functions necessary to associate user preferences anonymously with user location for use in third-party applications. Our work differs in that it seeks to hide the user's identity while distributing certain personal information obtained from existing online social networks.

XI. SECURITY AND PRIVACY PROBLEMS

Peer-to-peer mobile social network systems, like WhozThat and Social Aware, exchange users' social network identifiers between devices using short-range wireless technology such as Bluetooth. In contrast to these systems, a mobile device in client-server mobile social network systems, such as Brightkite and Loopt, notifies a centralized server about the current location of the device (available via GPS, cell-tower identification, or other mechanisms). By querying the server, mobile devices in these client-server systems can find nearby users, information about these nearby users, and other items of interest.

3.1 Direct Anonymity Issues

The information exchange model of the mobile social network systems discussed previously provides little protection for the user's privacy. These systems require the user to allow access to his or her social network profile information and, at the same time, associate that information with the user's identity. For instance, Facebook applications generally require the user to agree to give the application access to his or her information through Facebook's API, intrinsically tying such information to the user's identity. In a peer-to-peer context-aware mobile social network system such as Social Aware, we can track a user by logging the date and time that each mobile or stationary device detects the user's social network ID. By collecting such logs, we can construct a history of the locations that a user has visited and the times of each visit, compromising the user's privacy. Finally, given access to a user's social network ID, someone else could access that user's public information in a way that the user may not have intended, by simply viewing that user's public profile on a social network Web site.
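To make this tracking risk concrete, the hedged Python sketch below shows how an observer that logs clear-text social network IDs, as a peer-to-peer system of the kind just described exposes them, can assemble a per-user location history. The device IDs, timestamps and places are invented purely for illustration.

```python
# Illustration of the direct-anonymity problem: an observer logging
# (social_network_id, time, place) tuples can reconstruct a user's movements.
# All identifiers and sightings below are hypothetical.
from collections import defaultdict
from datetime import datetime

sightings = [
    ("fb:1001", "2011-05-14 09:02", "hospital cafe"),
    ("fb:2002", "2011-05-14 09:10", "hospital cafe"),
    ("fb:1001", "2011-05-14 12:45", "campus library"),
    ("fb:1001", "2011-05-14 18:30", "train station"),
]

history = defaultdict(list)
for sn_id, ts, place in sightings:
    history[sn_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), place))

# The clear-text ID links every sighting of the same user together.
for sn_id, visits in history.items():
    trail = " -> ".join(place for _, place in sorted(visits))
    print(f"{sn_id}: {trail}")
```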
We conclude that the clear-text exchange of social networking IDs in systems such as WhozThat and Social Aware leads to unacceptable security and privacy risks, and allows the user's anonymity to be easily compromised. We call such problems, which directly compromise a user's anonymity, direct anonymity attacks. Direct anonymity attacks are also possible in client-server mobile social network systems. While users' social network IDs are generally not directly exchanged between mobile devices in such systems, mobile or stationary devices can still track a user by logging the date and time that each device finds the user nearby. Since each device in these systems can find the social network user names, and often the full names, of nearby users, the privacy of these users can be compromised. Thus, we have a direct anonymity issue: the exposure of user names and locations in client-server systems allows the user's anonymity to be compromised.

3.2 The Indirect or K-Anonymity Problem

The indirect anonymity problem exists when a piece of information indirectly compromises a user's identity. An example of this is when a piece of information unique to a user is given out, such as a list of the user's favorite movies; this information might then be easily mapped back to the user. The K-anonymity problem occurs when n pieces of information or n sets of related information can be
used together to uniquely map back to a user's identity. Furthermore, if a set of information can only be mapped to a set of k or fewer users, the user's anonymity is still compromised to a degree related to k. The challenge is to design an algorithm that can decide what information should and should not be given out in order to guarantee the anonymity of the associated users. This problem is similar to previous K-anonymity problems related to the release of voter or hospital information to the public. However, it has been shown that by correlating a few data sets a high percentage of records can be re-identified. A paper by Sweeney shows how this re-identification process is done using voter records and hospital records [17]. The K-anonymity problem in this paper is unique in that standard K-anonymity guarantees that released information cannot distinguish an individual from at least k-1 other individuals associated with the released information. The problem discussed here, however, does not involve the release of personal records but rather sets of aggregated information that may relate to sets of individuals who may or may not be associated with the released information. Therefore, the K-anonymity guarantee for our problem refers to the minimal number of indistinguishable unique sets that are sufficient to account for all released information. More precisely, there must be no more than k unique sets that are not subsets of each other, and all other sufficient sets must be supersets of some of the minimal sets. This paper presents this K-anonymity problem informally and proposes a solution that is currently being explored and implemented by the authors; however, it does not formally solve this problem, which is proposed as an important open problem in the area of mobile social network privacy. We argue that this problem is important because solving it would allow users to take advantage of new mobile social network applications without compromising their privacy. The K-anonymity problem applies to both peer-to-peer and client-server mobile social network systems, since both involve sharing a user's social network profile data with other users of these systems.

3.3 Eavesdropping, Spoofing, Replay, and Wormhole Attacks

Once a user's social network ID has been intercepted in a peer-to-peer mobile social network system, it can be used to mount a replay and spoofing attack. In a spoofing attack, a malicious user can masquerade as the user whose ID was intercepted (the compromised user) by simply sending (replaying) the intercepted ID to mobile or stationary devices that request the user's social network ID. Thus, the replay attack, where the compromised user's ID is maliciously repeated, is used to perform the spoofing attack. Another specific type of replay attack is known as a wormhole attack [18], where wireless transmissions are captured on one end of the network and replayed at another end. These attacks could be used for a variety of nefarious purposes. For example, a malicious user could masquerade as the compromised user at a specific time and place while committing a crime. Clearly, spoofing attacks in mobile social networking systems present serious security risks. In addition to intercepting a user's social network ID via eavesdropping on the wireless network, a malicious user could eavesdrop on information transmitted when a device requests a user's social network profile information from a social network server.
For example, if a mobile device in a peer-to-peer system uses HTTP (RFC 2616) to connect to the Facebook API REST server [19] instead of HTTPS (RFC 2818), all user profile information requested from the Facebook API server is transmitted in clear text and can be intercepted. Interception of such data allows a malicious user to circumvent Facebook's privacy controls and access private user profile information that the user had no intention of sharing. Eavesdropping, spoofing, replay, and wormhole attacks are generally not major threats to client-server mobile social network systems. These attacks can be defended against with the appropriate use of a robust security protocol such as HTTPS, in conjunction with client authentication using user names and passwords or client certificates. If a user's social network login credentials (user name and password, or certificate) have not been stolen by a malicious user, and the user has chosen an appropriately strong password, then it is nearly impossible for the malicious user to masquerade as that user.
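The K-anonymity condition sketched in subsection 3.2 can be illustrated with a small Python check that releases a set of profile attributes only if at least k users in the candidate population are consistent with it. The profiles, attribute names and threshold below are hypothetical, and this is only an illustration of the idea, not the algorithm the authors are developing.

```python
# Hedged illustration of a K-anonymity style release check (not the paper's algorithm):
# only release an attribute set if it could describe at least k distinct users.

# Hypothetical candidate population and the attributes known about each user.
profiles = {
    "user_a": {"likes:jazz", "likes:cricket", "group:alumni_2009"},
    "user_b": {"likes:jazz", "likes:cricket"},
    "user_c": {"likes:jazz", "group:alumni_2009"},
    "user_d": {"likes:rock"},
}

def ok_to_release(attributes, population, k):
    """True if at least k users' profiles contain every released attribute."""
    matches = [u for u, attrs in population.items() if attributes <= attrs]
    return len(matches) >= k

print(ok_to_release({"likes:jazz"}, profiles, k=2))                     # True: 3 matches
print(ok_to_release({"likes:jazz", "group:alumni_2009"}, profiles, 2))  # True: 2 matches
print(ok_to_release({"likes:jazz", "likes:cricket",
                     "group:alumni_2009"}, profiles, 2))                # False: identifies user_a
```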

XII. SECURITY AND PRIVACY SOLUTIONS

We have designed and implemented a system, called the identity server, to address the security and privacy problems described previously. Our system assumes that each participating mobile device has reasonably reliable Internet access through a wireless wide area network (WWAN) cell data connection or through a WiFi connection. Mobile devices that lack such an Internet connection will not be able to participate in our system. Furthermore, we assume that each participating mobile device has a short-range wireless network interface, such as Bluetooth or WiFi,
for ad-hoc communication with nearby mobile and/or stationary devices. We describe the design and implementation of the identity server in this section.

4.1 Design of the Identity Server and Anonymous Identifier

As discussed in subsections III-A and III-C, the clear-text exchange of a user's social network ID presents significant privacy and security risks [20]. To address these risks, we propose the use of an anonymous identifier, or AID. The AID is a nonce that is generated by a trusted server, called the identity server (IS). Before a user's mobile device advertises the user's presence to other nearby mobile and stationary devices, it securely contacts the IS to obtain an AID. The IS associates the newly generated AID with the mobile device that requested it, and then returns the new AID to the mobile device. The user's mobile device then proceeds to share this AID with a nearby mobile and/or stationary device by launching a Bluetooth AID sharing service. After a nearby mobile or stationary device (device B) discovers this AID sharing service on the user's mobile device (device A), device B establishes a connection to the user's mobile device to obtain the shared AID. After the AID has been obtained by device B, device A requests another AID from the IS. This new AID will be shared with the next mobile or stationary device that connects to the AID sharing service on device A. While our design and implementation uses Bluetooth for AID sharing, we could also implement AID sharing using WiFi. After device B obtains the shared AID from device A, device B proceeds to query the IS for the social network profile information of the user associated with this AID. Figure 1 shows the role of the IS in generating AIDs and processing requests for a user's social network information. Once the social network information for an AID has been retrieved via the IS, the IS removes this AID from the list of AIDs associated with the mobile user. Before the user's mobile device next advertises the user's presence using the Bluetooth AID sharing service, it will obtain a new AID from the IS as described above. We permit multiple AIDs to be associated with a mobile user, which allows multiple nearby mobile or stationary devices to obtain information about the user. To improve efficiency, the user's mobile device may submit one request for multiple AIDs to the IS, and then proceed to share each AID one at a time with other nearby devices. The IS sets

The IS sets a timeout value for each AID when the AID is created and provided to a user's mobile device. An AID times out if it is not consumed within the timeout period, that is, if the IS has not received a query for social network profile information for the user associated with this AID within the timeout period. Upon timeout of an AID, the IS removes the AID from the list of AIDs associated with the user. We use AID timeouts to prevent the list of AIDs associated with a user from growing without bound. The use of AIDs in our system provides important privacy features for mobile users. Since the mobile device shares only AIDs with other devices, a malicious user who has intercepted these AIDs cannot connect them to a particular user's social network identity. Furthermore, the IS does not support the retrieval of certain personally identifiable information from a user's social network profile, such as the user's full name, email address, phone number, etc. Since the IS does not support the retrieval of personally identifiable information, a device that retrieves social network information for the user associated with an AID is unable to connect the AID to the user's social network identity. Thus, only by compromising the IS can a malicious user tie an AID to a user's social network ID. We assume that the IS is a secure and trusted system, and that compromising such a system would prove to be a formidable task. The use of the IS and AIDs as we have described solves the direct anonymity problem. As the reader will see in subsection IV-C, the IS also addresses the indirect anonymity problem by providing a K-anonymity guarantee for information returned from users' social network profiles.
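To make the AID bookkeeping concrete, the following is a minimal sketch, assuming an in-memory store and a fixed timeout; the class and method names (IdentityServer, request_aid, resolve_aid) are illustrative, not taken from the paper's implementation.

# Sketch of the AID lifecycle described above: generate a nonce, bind it to a
# user, expire it on timeout, and consume it on the first (and only) query.
import secrets
import time
from typing import Optional

AID_TIMEOUT_SECONDS = 300  # assumed timeout value

class IdentityServer:
    def __init__(self):
        # AID -> (user_id, creation_time); several AIDs may map to one user
        self._aids = {}

    def request_aid(self, user_id: str) -> str:
        """Called (over HTTPS) by a user's device before advertising presence."""
        aid = secrets.token_urlsafe(16)           # random nonce, unlinkable to the user
        self._aids[aid] = (user_id, time.time())
        return aid

    def resolve_aid(self, aid: str) -> Optional[str]:
        """Called by device B after it obtained an AID from device A.
        Returns an internal user key and removes the AID (one query per AID)."""
        self._expire_old_aids()
        entry = self._aids.pop(aid, None)
        return entry[0] if entry else None

    def _expire_old_aids(self):
        now = time.time()
        self._aids = {a: (u, t) for a, (u, t) in self._aids.items()
                      if now - t < AID_TIMEOUT_SECONDS}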

4.2 Implementation of the Identity Server

All IS services accessed by mobile and/or stationary devices are exposed as web services conforming to the REST architecture [21]. We used the open source Restlet framework [22] for Java to develop the IS. We expose each resource on the IS, including a mobile user's AID, a mobile user's current location, and the Facebook profile information for a mobile user, as separate URL-accessible resources supporting HTTP GET, POST, and PUT methods as appropriate. Figure 2 shows the web-accessible resources exposed on the IS, along with the HTTP methods supported by each resource. The body of each HTTP request is encoded using JSON (RFC 4627). All web service network traffic between the IS and other mobile/stationary devices is encrypted using HTTPS, and access to all resources is authenticated using HTTP basic access authentication (RFC 2617). Each mobile user must sign up for a user account on the IS prior to participation in our system. During the signup process, the user provides his/her Facebook user ID (we can obtain this using Facebook Connect [23]), and chooses a user name and password. The user's user name and password are securely stored on the user's mobile device, and are used to authenticate with the IS and obtain access to the guarded web resources on the IS for the device's current location, the user's AID, and the user's Facebook profile information. Access to the web resources for the user's AID and current location is available only to the user herself/himself, and to no other entity save for the logic implemented on the IS. Access to the web resource for the user's Facebook profile information (we call this user user A) is provided to any authenticated user with a user account on the IS, provided that the authenticated user's device is within an acceptable range of user A's mobile device. See below for more information on location-based access control for a user's Facebook profile.
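The paper's identity server was built with the Restlet framework for Java; purely as an illustration of the interface style described above (HTTPS transport, HTTP basic authentication, JSON bodies, one URL per resource), here is a minimal Python/Flask sketch. The routes, field names and demo credentials are assumptions, not the paper's actual URLs.

# Illustrative Flask sketch of a guarded, JSON-based REST resource.
from flask import Flask, abort, jsonify, request

app = Flask(__name__)
USERS = {"alice": "secret"}          # username -> password (demo only)
LOCATIONS = {}                       # username -> last reported location

def authed_user():
    auth = request.authorization     # HTTP basic access authentication
    if not auth or USERS.get(auth.username) != auth.password:
        abort(401)
    return auth.username

@app.route("/users/<name>/location", methods=["PUT"])
def put_location(name):
    if authed_user() != name:        # only the user may update her own location
        abort(403)
    LOCATIONS[name] = request.get_json(force=True)
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    # ssl_context enables HTTPS (the "adhoc" option needs pyopenssl installed);
    # a real deployment would use proper certificates.
    app.run(ssl_context="adhoc")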

4.3 Trust Networks and Onion Routing

One way to support privacy in social network applications is to transfer information using a trusted peer-to-peer network [29]. Such a network would require a trust network much like that used by Katz and Golbeck [30], in which social networks provided trust for default actions. Moreover, in a mobile social network application, nodes could not only share their information directly but could also give permission to their trusted network to share their information. This approach was used in the OneSwarm [31] system to allow peer-to-peer file sharing with privacy settings that let the user share data publicly, just with friends, or even with a chosen subset of those friends. However, such a model has obvious problems if any nodes are compromised, since information is easily associated with its source. These peer-to-peer networks could be made anonymous through the use of onion routing [32]. The Tor network [33] uses onion routing to allow nodes to send data anonymously. Through the use of layers of encryption that are decrypted at selected routers along a virtual route, routing nodes cannot directly relate the information at the destination to its source. If data were shared in this manner it would not be easy to identify the source of the information, protecting the direct anonymity of the user. We are currently exploring the use of trust networks and onion routing as a more decentralized approach to protecting user anonymity that does not require trust of the social network (such as Facebook) itself [29].

CONCLUSION

We have identified several important privacy and security issues associated with LAMSN systems, along with our work on novel solutions for these issues. Our solutions support anonymous exchange of social network information with real-world location-based systems, enabling context-aware systems that do not compromise users' security and privacy. We hope that our work will convince users and developers that it is possible to move forward with creative mobile social network applications without further compromising user security and privacy.

REFERENCES
[1] N. Eagle and A. Pentland, Social serendipity: Mobilizing social software.
[2] Global internet use reaches 1 billion, http://www.comscore.com/press/release.asp?press=2698.
[3] C. M. Gartrell, Social aware: Context-aware multimedia presentation via mobile social networks, Master's thesis, University of Colorado at Boulder, December 2008.
[4] E. Miluzzo, N. D. Lane, S. B. Eisenman, and A. T. Campbell, CenceMe - injecting sensing presence into social networking applications, in Proceedings of the 2nd European Conference on Smart Sensing and Context (EuroSSC 2007), October 2007.
[5] Brightkite, http://brightkite.com.
[6] Loopt, http://www.loopt.com.
[7] A. Tootoonchian, K. K. Gollu, S. Saroiu, Y. Ganjali, and A. Wolman, Lockr: social access control for web 2.0, in WOSP 08: Proceedings of the first


Wireless Monitoring Of The Green House Using ATMEGA Based Monitoring System: WSN Approach

Abstract: In the present paper, the authors place emphasis on the WSN approach for green house monitoring and control. A control system is developed and tested using a recent ATmega microcontroller. Farmers in developing countries can easily use the designed system for maximising yield. ATmega microcontrollers are preferred over other microcontrollers due to several important features, including a 10-bit ADC, sleep modes, a wide input voltage range and higher memory capacity.

Index Terms: WSN, AVR, microcontrollers, green house, precision agriculture, RF 2.4.

I. INTRODUCTION

A recent survey of the advances in wireless sensor network applications has reviewed a wide range of applications for these networks and identified agriculture as a potential area of deployment, together with a review of the factors influencing the design of sensor networks for this application. A WSN is a collection of sensor and actuator nodes linked by a wireless medium to perform distributed sensing and acting tasks. The sensor nodes collect data and communicate over a network environment to a computer system, called a base station. Based on the information collected, the base station takes decisions and the actuator nodes then perform appropriate actions upon the environment. This process allows users to sense and control the environment from anywhere. There are many situations in which the application of a WSN is preferred, for instance environment monitoring, product quality monitoring, and others where supervision of large areas is necessary. Wireless sensor networks form a useful part of the automation system architecture in modern greenhouses. Wireless communication can be used to collect the measurements and to communicate between the centralized control and the actuators located in different parts of the greenhouse. In advanced WSN solutions, some parts of the control system itself can also be implemented in a distributed manner in the network, such that local control loops can be formed. Compared to cabled systems, the installation of a WSN is fast, cheap and easy. Moreover, it is easy to relocate the measurement points when needed by just moving sensor nodes from one location to another within the communication range of the coordinator device. If the greenhouse flora is high and dense, the small and lightweight nodes can even be hung on the plants' branches. WSN maintenance is also relatively cheap and easy: the only additional costs occur when the sensor nodes run out of batteries and the batteries need to be charged or replaced, but the lifespan of a battery can be several years if an efficient power-saving algorithm is applied. The research on the use of WSNs in agriculture is focused primarily on proof-of-concept applications that demonstrate the efficiency and efficacy of using sensor networks to monitor and control agriculture management strategies. The authors attempt to show the effective utilization of this concept in day-to-day monitoring of the green house for higher yield.

II. RF COMMUNICATION AND MONITORING OF THE GREEN HOUSE PARAMETERS

RF is the wireless transmission of data by digital radio signals at a particular frequency. RF communication works by creating electromagnetic waves at a source and sending them to a particular destination. These electromagnetic waves travel through the air at near the speed of light. The advantage of RF communication is its wireless nature, so the user need not lay cable all over the green house; cable is expensive, less flexible than RF coverage and prone to damage. RF communication provides extensive hardware support for packet handling, data buffering, burst transmissions, clear channel assessment and link quality.

A. FEATURES
a) Low power consumption.
b) High sensitivity (typically -104 dBm).
c) Programmable output power, -20 dBm to 1 dBm.
d) Operation temperature range -40 to +85 deg C.
e) Operation voltage: 1.8 to 3.6 Volts.
f) Available frequency: 2.4 to 2.483 GHz.

B. APPLICATIONS
a) Wireless alarm and security systems.
b) AMR - Automatic Meter Reading.
c) Wireless game controllers.
d) Wireless audio/keyboard/mouse.

C. PROPOSED RF COMMUNICATION BASED GREEN HOUSE PARAMETER MONITORING HARDWARE

In the proposed hardware, there are two sections, master and slave. The slave part contains the temperature and humidity sensors, which are connected to an AVR microcontroller. An RF transceiver connected to this AVR microcontroller wirelessly sends the data to the master part. The master part contains an RF transceiver which receives the data and passes it to its microcontroller, and the readings are displayed on a graphics LCD. The motor and DC fan are also connected to the master board and are controlled according to the measured temperature and humidity conditions. The major components of the proposed hardware, as seen in Fig. 1, are: Microcontroller - AVR ATmega16/ATmega32; Compiler - AVR Studio; Range - 150 meters; Master and slave communication - up to 247 slaves; Sensors - LM35 (temperature) and a humidity sensor.

III.

WHY TO USE ATMEGA MICROCONTROLLER?

There are several features of the ATmega microcontroller, given below, which make it an ideal choice for green house parameter monitoring.
A. FEATURES
a) High-performance, low-power AVR 8-bit microcontroller.
b) Advanced RISC architecture.
c) High-endurance non-volatile memory segments: 16K bytes of in-system self-programmable Flash program memory, 512 bytes EEPROM, 1K byte internal SRAM.
d) Peripheral features: two 8-bit Timer/Counters with separate prescalers and compare modes, one 16-bit Timer/Counter, and an 8-channel 10-bit ADC.
e) Special microcontroller features: power-on reset and programmable brown-out detection, internal calibrated RC oscillator, external and internal interrupt sources.
B. ARCHITECTURAL DESCRIPTION
The ATmega16 provides the following features: 16K bytes of In-System Programmable Flash program memory with Read-While-Write capabilities, 512 bytes EEPROM, 1K byte SRAM, 32 general purpose I/O lines, 32 general purpose working registers, a JTAG interface for boundary scan, on-chip debugging support and programming, and three flexible

Timer/Counters with compare modes, internal and external interrupts, a serial programmable USART, a byte-oriented Two-wire Serial Interface, an 8-channel 10-bit ADC with optional differential input stage with programmable gain (TQFP package only), a programmable Watchdog Timer with internal oscillator, an SPI serial port, and six software-selectable power-saving modes. The Idle mode stops the CPU while allowing the USART, Two-wire Interface, A/D Converter, SRAM, Timer/Counters, SPI port, and interrupt system to continue functioning. The Power-down mode saves the register contents but freezes the oscillator, disabling all other chip functions until the next external interrupt or hardware reset. In Power-save mode, the asynchronous timer continues to run, allowing the user to maintain a timer base while the rest of the device is sleeping. The ADC Noise Reduction mode stops the CPU and all I/O modules except the asynchronous timer and ADC, to minimize switching noise during ADC conversions. In Standby mode, the crystal/resonator oscillator runs while the rest of the device is sleeping, allowing very fast start-up combined with low power consumption. In Extended Standby mode, both the main oscillator and the asynchronous timer continue to run. The device is manufactured using Atmel's high-density non-volatile memory technology. The on-chip ISP Flash allows the program memory to be reprogrammed in-system through an SPI serial interface, by a conventional non-volatile memory programmer, or by an on-chip boot program running on the AVR core. The boot program can use any interface to download the application program into the Application Flash memory. Software in the Boot Flash section will continue to run while the Application Flash section is updated, providing true Read-While-Write operation. By combining an 8-bit RISC CPU with In-System Self-Programmable Flash on a monolithic chip, the Atmel ATmega16 is a powerful microcontroller that provides a highly flexible and cost-effective solution to many embedded control applications.

IV. WHY RF 2.4?

The important features given below in Table I and Table II make RF 2.4 an ideal choice for green house parameter monitoring.

Table I. Part number, status, device type, frequency range and sensitivity

Table II. Part number, minimum and maximum frequency range, operating voltage and description (Abbreviations: SOC: System-on-Chip, NP: Network Processor, TXRX: Transceiver)

In a nutshell, the advantages of RF 2.4 are:
a) Low power consumption.
b) Integrated data filters.
c) High sensitivity.
d) Operation temperature range -40 to +85 deg C.
e) Available frequency 2.4 to 2.483 GHz, with no certification required from the government.

V. DETAILS OF THE SENSORS USED

A. TEMPERATURE SENSOR

The LM35 series, shown in Fig. 3, are precision integrated-circuit temperature sensors whose output voltage is linearly proportional to the Celsius (Centigrade) temperature. The LM35 thus has an advantage over linear temperature sensors calibrated in Kelvin, as the user is not required to subtract a large constant voltage from its output to obtain convenient Centigrade scaling. The LM35 does not require any external calibration or trimming to provide typical accuracies of ±1/4 °C at room temperature and ±3/4 °C over the full -55 °C to +150 °C temperature range. Low cost is assured by trimming and calibration at the wafer level. The LM35's low output impedance, linear output, and precise inherent calibration make interfacing to readout or control circuitry especially easy. It can be used with single power supplies, or with plus and minus supplies. As it draws only 60 µA from its supply, it has very low self-heating, less than 0.1 °C in still air. The LM35 is rated to operate over a -55 °C to +150 °C temperature range, while the LM35C is rated for a -40 °C to +110 °C range (-10 °C with improved accuracy). The LM35 series is available packaged in hermetic TO-46 transistor packages, while the LM35C, LM35CA, and LM35D are also available in the plastic TO-92 transistor package. Fig. 4 shows the typical use of the IC temperature sensor in the green house control system using the AVR microcontroller.
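Since the LM35 output is 10 mV per degree Celsius, the conversion from an ADC count to a temperature is a single scaling step. The sketch below is an illustration only; the reference voltage and the example reading are assumptions, not values taken from the paper's hardware.

# Sketch of the LM35 reading path on a 10-bit ADC (ATmega16/32 class device).
ADC_RESOLUTION = 1024      # 10-bit converter
V_REF = 5.0                # assumed ADC reference voltage in volts

def adc_to_celsius(adc_count: int) -> float:
    volts = (adc_count / ADC_RESOLUTION) * V_REF
    return volts * 100.0   # LM35: 10 mV/degree C, so degrees = volts / 0.010

# Example: an ADC reading of 62 corresponds to roughly 30 degrees C.
print(round(adc_to_celsius(62), 1))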

Fig. 2. Typical use of IC temperature sensor

B. HUMIDITY SENSOR (HIH-3610 SERIES)

The following are the features of the humidity sensor selected for this design:
a) Linear voltage output vs %RH.
b) Chemically resistant (the output is not disturbed by the presence of chemicals in the air).
c) The HIH-3610 Series humidity sensor is designed specifically for high-volume OEM (Original Equipment Manufacturer) users.
d) Direct input to a controller due to the sensor's linear voltage output.
Table III shows the humidity sensors available for the green house application.

Table III. Available humidity sensors for green house applications

VI. DESIGN OBJECTIVES

The horticulturists near the Nasik region felt the need for an automatic controller for their green houses, where they grow export-quality roses. The atmosphere in India changes greatly with the season; hence, the quality of the roses does not remain the same because of the large changes in temperature and humidity. Roses of inferior quality give less income, and the loss of income due to inferior quality roses is of the order of 2 to 3 lakhs per acre per season. For roses, ideally, the green house should provide good light throughout the year, a temperature range between 15 and 28 °C, a night temperature between 15 and 18 °C, and a day temperature that does not exceed 30 °C in any case. Growth slows down when the temperature falls below 15 °C. If the temperature rises above 28 °C, humidity must be kept high. A night temperature above 15 °C hastens flower development, while a lower temperature around 13.5 °C delays it. Depending on the temperature inside the greenhouse, the moisture should be kept in line for the best results; for example, if the temperature is 24 degrees, 60% humidity is suitable. Hence, variable temperature and humidity control for different crops, using a wireless technique in a WSN environment at low cost, was the main objective. Low power consumption was another objective; hence the selection of the sensors and, most importantly, the microcontroller was very important, keeping power consumption at remote places in view. To bring the temperature within the control limit, exhaust fans were switched ON automatically, and for humidity control the water pump was switched ON and OFF.
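The control rule just described is a simple ON-OFF (bang-bang) decision on two thresholds. The sketch below only illustrates that decision logic; the threshold values follow the field settings reported later in the paper (fan above 30 °C, pump below 50% RH), while the function and variable names are made up for the example.

# Minimal sketch of the ON-OFF control rule: exhaust fan for temperature,
# water pump for humidity.
TEMP_FAN_ON_C = 30.0
HUMIDITY_PUMP_ON_PCT = 50.0

def control_actions(temp_c: float, rel_humidity_pct: float) -> dict:
    return {
        "exhaust_fan_on": temp_c > TEMP_FAN_ON_C,
        "water_pump_on": rel_humidity_pct < HUMIDITY_PUMP_ON_PCT,
    }

# Example: a hot, dry reading switches both actuators on.
print(control_actions(32.5, 41.0))  # {'exhaust_fan_on': True, 'water_pump_on': True}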

VII. PROGRAMMING

Embedded C is used for the programming. Fig. 3 shows the programming window of the AVR Studio software used during programming.

Fig. 3. AVR Studio window during programming

The following are some important features of AVR Studio:
A) Integrated Development Environment (write, compile and debug)
B) Fully symbolic source-level debugger
C) Extensive program flow control options
D) Language support: C, Pascal, BASIC, and Assembly

VIII. FIELD OBSERVATIONS

Readings were taken for 15 days. The ON-OFF action of the hardware was tested and satisfactory results were achieved. Fig. 6, 7, 8 and 9 show photographs of the green house structures used to take the readings.

IX. RESULTS

The results are found to be satisfactory. Two areas, A and B (each measuring 10 meters x 10 meters), were selected. Area A was used to take readings without temperature and humidity control, while readings in Area B were taken after suitable automatic control action with the help of the AVR based green house controller. It is found that the designed hardware has shown consistently faithful readings and has also proved to be accurate in the humid atmosphere of the green house. The following readings and graphs show some of the readings in Area A and the corresponding readings after corrective action in Area B.

Table IV. Readings taken in the green house near Nasik before and after control action. The fan is automatically ON when the temperature in the area is more than 30 °C, and the motor is ON when the relative humidity is less than 50%.

Fig. 4: Green house near Nasik growing roses: complete sections of the green house are seen. Vents are used to regulate the temperature naturally.

Fig. 5: Green house parameters in AREA A (without control action)


Fig. 6: Green house parameters in AREA B (with control action). Humidity values are increased and temperature values are decreased due to automatic control action of AVR based wireless green house controller.

X.

CONCLUSION

A. Low-cost and maintenance-free sensors are used to monitor the environment. The system has several advantages in terms of its compact size, low cost and high accuracy.
B. The green house system considers design optimization and functional improvement of the system.
C. The same system can also be used to monitor industrial parameters.
D. The system developed has shown consistency, accuracy and precise control action over a period of 15 days and did not fail even once during testing.
E. The quality of roses in Area B was found to be better than in Area A.
F. The owner of the green house stated that good quality roses are sold at a 1.5 times higher rate than medium quality roses; hence the system, if implemented, can increase the profit margin.
G. The cost of the system is less than Rs. 2500/- if produced in volume.
H. For a one-acre green house, only 5 sets of AVR based green house controllers are needed.
I. The projected increase in profit is in the range of 4-5 lakhs per acre.


Fuzzy C-Mean Algorithm Using Different Variants

Abstract: Clustering can be considered the most important unsupervised learning problem; as with every other problem of this kind, it deals with finding a structure in a collection of unlabeled data. A loose definition of clustering could be the process of organizing objects into groups whose members are similar in some way; a cluster is therefore a collection of objects which are similar to one another and dissimilar to the objects belonging to other clusters, or, informally, a group of the same or similar elements gathered or occurring closely together. Clustering of data is a method by which large sets of data are grouped into clusters of smaller sets of similar data, so the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method is frequently used in pattern recognition and is based on minimization of an objective function.

Index Terms: clustering analysis, fuzzy clustering, fuzzy c-mean, genetic algorithm.

XIII. INTRODUCTION

Clustering of data is a method by which large sets of data are grouped into clusters of smaller sets of similar data. So, the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But how do we decide what constitutes a good clustering? It can be shown that there is no absolute best criterion which would be independent of the final aim of the clustering. Consequently, it is the user who must supply this criterion, in such a way that the result of the clustering will suit their needs. For instance, we could be interested in finding representatives of homogeneous groups (data reduction), in finding natural clusters and describing their unknown properties (natural data types), in finding useful and suitable groupings (useful data classes), or in finding unusual data objects (outlier detection).

The main requirements that a clustering algorithm should satisfy are: scalability; dealing with different types of attributes; discovering clusters with arbitrary shape; minimal requirements for domain knowledge to determine input parameters; ability to deal with noise and outliers; insensitivity to the order of input records; high dimensionality; and interpretability and usability. There are a number of problems with clustering. Among them: current clustering techniques do not address all the requirements adequately (and concurrently); dealing with a large number of dimensions and a large number of data items can be problematic because of time complexity; the effectiveness of the method depends on the definition of distance (for distance-based clustering); if an obvious distance measure does not exist we must define it, which is not always easy, especially in multi-dimensional spaces; and the result of the clustering algorithm (which in many cases can be arbitrary itself) can be interpreted in different ways. Clustering algorithms may be classified as: exclusive clustering, overlapping clustering, hierarchical clustering and probabilistic clustering.

In the first case data are grouped in an exclusive way, so that if a certain datum belongs to a definite cluster it cannot be included in another cluster. A simple example is shown in the figure below, where the separation of points is achieved by a straight line on a two-dimensional plane. The second type, overlapping clustering, instead uses fuzzy sets to cluster data, so that each

point may belong to two or more clusters with different degrees of membership. In this case, data will be associated with an appropriate membership value. A hierarchical clustering algorithm, instead, is based on the union of the two nearest clusters: the starting condition is realized by setting every datum as a cluster, and after a few iterations the final clusters are reached. Finally, the last kind of clustering uses a completely probabilistic approach. Clustering algorithms can be applied in many fields, for instance: marketing (finding groups of customers with similar behaviour given a large database of customer data containing their properties and past buying records); biology (classification of plants and animals given their features); libraries (book ordering); insurance (identifying groups of motor insurance policy holders with a high average claim cost, and identifying frauds); city planning (identifying groups of houses according to their house type, value and geographical location); earthquake studies (clustering observed earthquake epicentres to identify dangerous zones); and the WWW (document classification, and clustering weblog data to discover groups of similar access patterns).

XIV. FUZZY C-MEANS CLUSTERING ALGORITHM

Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method (developed by Dunn in 1973 and improved by Bezdek in 1981) is frequently used in pattern recognition. It is based on minimization of the following objective function:

J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \| x_i - c_j \|^2    (1)

where m is any real number greater than 1, u_{ij} is the degree of membership of x_i in cluster j, x_i is the i-th of the d-dimensional measured data, c_j is the d-dimensional center of the cluster, and ||·|| is any norm expressing the similarity between any measured data point and the center. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the membership u_{ij} and the cluster centers c_j updated by:

u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \| x_i - c_j \| / \| x_i - c_k \| \right)^{2/(m-1)}}    (2)

c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}    (3)

The iteration stops when \max_{ij} | u_{ij}^{(k+1)} - u_{ij}^{(k)} | < \varepsilon, where \varepsilon is a termination criterion between 0 and 1 and k is the iteration step. This procedure converges to a local minimum or a saddle point of J_m. As already noted, data are bound to each cluster by means of a membership function, which represents the fuzzy behaviour of this algorithm; these memberships form a matrix U whose entries are numbers between 0 and 1 and represent the degree of membership between the data and the cluster centers. The algorithm is composed of the following steps:

1. Initialize the matrix U = [u_{ij}], U(0).
2. At step k, calculate the center vectors C(k) = [c_j] with U(k) using (3).
3. Update U(k) to U(k+1) using (2).
4. If ||U(k+1) - U(k)|| < \varepsilon, then STOP; otherwise return to step 2.

Advantage: (i) gives the best result for overlapped data sets and is comparatively better than the k-means algorithm.
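The following is a compact NumPy sketch of the iteration given by equations (1)-(3) and the four steps above; it is written only as an illustration, not as any of the reviewed authors' code, and the example data are synthetic.

# Sketch of standard FCM (equations (1)-(3)).
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, eps=1e-5, max_iter=300, seed=0):
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((N, n_clusters))
    U /= U.sum(axis=1, keepdims=True)                 # step 1: random fuzzy partition

    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # step 2: equation (3)

        # distances ||x_i - c_j||; small epsilon avoids division by zero
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12

        # step 3: membership update, equation (2)
        ratio = d[:, :, None] / d[:, None, :]
        U_new = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)

        if np.linalg.norm(U_new - U) < eps:           # step 4: stopping rule
            U = U_new
            break
        U = U_new
    return centers, U

# Example: two well-separated synthetic blobs.
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centers, U = fuzzy_c_means(X, n_clusters=2)
print(np.round(centers, 2))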

XV. VARIANTS IN FUZZY C-MEAN

The most widely used clustering algorithm implementing the fuzzy philosophy is Fuzzy C-Means (FCM), initially developed by Dunn and later generalized by Bezdek, who proposed a generalization by means of a family of objective functions. Although this algorithm proved to be less accurate than others, its fuzzy nature and ease of implementation made it very attractive to many researchers, who proposed various improvements and applications. Usually FCM is applied to unsupervised clustering problems.

i) Optimizing the Fuzzy C-Means Clustering Algorithm Using a Genetic Algorithm (GA)

Pattern recognition is a field concerned with machine recognition of meaningful regularities in noisy or complex environments; in simpler words, pattern recognition is the search for structures in data. In pattern recognition, a group of data is called a cluster [1]. Fuzzy C-Means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters; the method was developed by Dunn [2] in 1973, improved by Bezdek [3] in 1981, and is frequently used in pattern recognition. What we want from the optimization is to improve the performance toward some optimal point or points [4]. Luus [5] identifies three main types of search methods: calculus-based, enumerative and random. Hall, Ozyurt and Bezdek [6] describe a genetically guided approach for optimizing the hard and fuzzy c-means functionals used in cluster analysis. Their experiments show that a genetic algorithm ameliorates the difficulty of choosing an initialization for the c-means clustering algorithms. The experiments use six data sets, including the Iris data, magnetic resonance and colour images. The genetic algorithm approach is generally able to find the lowest values of the objective function J, and on data sets with several local extrema the GA approach always avoids the less desirable solutions. Degenerate partitions are always avoided by the GA approach, which provides an effective method for optimizing clustering models whose objective function can be represented in terms of cluster centers. The time cost of genetically guided clustering is shown to make it an effective competitor, for many clustering domains, to a series of random initializations of fuzzy/hard c-means in which the partition associated with the lowest J value is chosen. The subtractive clustering method assumes each data point is a potential cluster center and calculates a measure of the likelihood that each data point would define the cluster center, based on the density of surrounding data points.

Here m is any real number greater than 1 (it was set to 2.00 by Bezdek), u_ij is the degree of membership of x_i in cluster j, x_i is the i-th of the d-dimensional measured data, c_j is the d-dimensional center of the cluster, and ||·|| is any norm expressing the similarity between any measured data point and the center.


The genetic algorithm is a stochastic global search method that mimics the metaphor of natural biological evolution. GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce (hopefully) better and better approximations to a solution [9, 10]; this process leads to populations of individuals that are better suited to their environment than the individuals that they were created from, just as in natural adaptation. The algorithm is composed of the following steps:
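The specific step list given by the original authors is not reproduced here; the following is only a generic sketch of a GA-style loop of this kind, applied to the FCM weighting exponent m. The objective function model_error is a hypothetical stand-in for the least-squares error of the fuzzy model on test data, and the population size, mutation scale and bounds are arbitrary illustrative choices.

# Toy GA loop (selection, crossover, mutation) over a single scalar parameter m.
import numpy as np

def model_error(m: float) -> float:
    # placeholder objective with a minimum away from m = 2.0
    return (m - 1.6) ** 2 + 0.05 * np.sin(5 * m)

def ga_optimize(objective, bounds=(1.1, 4.0), pop_size=20, generations=40, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, pop_size)
    for _ in range(generations):
        fitness = np.array([objective(m) for m in pop])
        order = np.argsort(fitness)               # lower error = fitter
        parents = pop[order[: pop_size // 2]]     # selection
        kids = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.choice(parents, 2)         # crossover: blend two parents
            child = 0.5 * (a + b) + rng.normal(0.0, 0.1)  # mutation: Gaussian step
            kids.append(np.clip(child, lo, hi))
        pop = np.concatenate([parents, kids])
    return pop[np.argmin([objective(m) for m in pop])]

print(round(ga_optimize(model_error), 3))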

A complete program was developed in the MATLAB programming language to find the optimal value of the weighting exponent. It starts by performing subtractive clustering on the input-output data, builds the fuzzy model using subtractive clustering, and optimizes the parameters by minimizing the least-square error between the output of the fuzzy model and the output of the original function for a set of test data; this optimization is carried out by iteration. After that, the genetic algorithm optimizes the weighting exponent of FCM: in the same way, the fuzzy model is built using FCM and the weighting exponent m is then optimized by minimizing the least-square error between the output of the fuzzy model and the output of the original function on the same test data. Fig. 1 in Appendix A shows the flow chart of the program.

A. Example 1 - Modeling a Two-Input Nonlinear Function

In this example, a nonlinear function was proposed:

z = sin(x) * sin(y)    (7)

The range X [-10.5, 10.5] and Y [-10.5, 10.5] is the input space of the above equation, and 200 data pairs were obtained randomly. First, the best least-square error is obtained for FCM with weighting exponent m = 2.00. Next, the least-square error of subtractive clustering is obtained by iteration, stopping when the error falls below a predefined value; the resulting number of clusters is then passed to the FCM algorithm, and the error is obtained with 24 clusters and the weighting exponent m.

Fig. 2. Random data points of equation (7); blue circles for the data to be clustered and red stars for the testing data

B. Example 2 - Modeling a One-Input Nonlinear Function

In this example, a nonlinear function with one variable x was proposed:

y = sin(x)    (8)

The range X [-20.5, 20.5] is the input space of the above equation; 200 data pairs were obtained randomly and are shown in Fig. 3.

Fig. 3. Random data points of equation (8); blue circles for the data to be clustered and red stars for the testing data

Advantage: the genetic algorithm provides higher resolution capability. The time needed to reach an optimum through the genetic algorithm is less than the time needed by the iterative approach, and the genetic algorithm gives better performance with less approximation error in less time. The subtractive clustering parameters (the radius, squash factor, accept ratio and reject ratio) are optimized using the GA. The original FCM proposed by Bezdek is optimized using the GA, and values of the exponent other than m = 2 give less approximation error.

ii) A New Feature Weighted Fuzzy C-Means Clustering Algorithm

The goal of cluster analysis is to assign data points with similar properties to the same groups and dissimilar data points to different groups [3]. Generally, there are two main clustering approaches, i.e. crisp clustering and fuzzy clustering. In the crisp clustering method the boundary between clusters is clearly defined. However, in many real cases the boundaries between clusters cannot be clearly defined, and some objects may belong to more than one cluster; in such cases, the fuzzy clustering method provides a better and more useful approach [2]. Fuzzy c-means (FCM), proposed by [5] and extended by [4], is one of the most well-known methodologies in clustering analysis. Basically, FCM clustering is dependent on the measure of distance between samples. In a clustering based on PL and PW, one can see that there is much more crossover between the star class and the point class, and it is difficult to discriminate the star class from the point class. On the other hand, it is easy to see that Fig. 2 is crisper than Fig. 3. This illustrates that, for the classification of the Iris database, features PL and PW are more important than SL and SW; here we can consider the weight assignment (0, 0, 1, 1) to be better than (1, 1, 0, 0) for Iris database classification. With respect to FCM clustering, it is sensitive to the selection of the distance metric. Zhao [12] stated that the Euclidean distance gives good results when all clusters are spheroids of the same size or when all clusters are well separated. In [13, 10], the authors proposed the GK algorithm, which uses the well-known Mahalanobis distance as the metric in FCM, and reported that the GK algorithm is better than Euclidean-distance-based algorithms when the shape of the data is taken into account. In [11], the authors proposed a new robust metric, distinguished from the Euclidean distance, to improve the robustness of FCM. Since FCM's performance depends on the selected metric, it will depend on the feature-weights that must be incorporated into the Euclidean distance. Each feature should have an importance degree, called its feature-weight. Feature-weight assignment is an extension of feature selection [17]: the latter has only 0-weight or 1-weight values, while the former can have weight values in the interval [0, 1]. Distance measures are studied and a new one is proposed to handle the different feature-weights, and in section 5 the new FCM for clustering data objects with different feature-weights is proposed.

Modified Distance Measure for the New FCM Algorithm

Two distance measures are widely used in FCM in the literature: the Euclidean and the Mahalanobis distance. Suppose x and y are two pattern vectors (pattern vectors were introduced in section 3). The Euclidean distance between x and y is:

d^2(x, y) = (x - y)^T (x - y)    (14)

The Mahalanobis distance between x and a center t (taking into account the variability and correlation of the data) is:

d^2(x, t, C) = (x - t)^T C^{-1} (x - t)    (15)

In Mahalanobis distance measure C is the co-variance matrix. Using co-variance matrix in Mahalanobis distance measure takes into account the variability and correlation of the data. To take into account the weight of the features in calculation of distance between two data points we suggest the use

of (x - y)_m (a modified (x - y)) instead of (x - y) in the distance measure, whether it is Euclidean or Mahalanobis. (x - y)_m is a vector whose i-th element is obtained by multiplying the i-th element of the vector (x - y) by the i-th element of the FWA vector. With this modification, equations (14) and (15) become:

d^2_m(x, y) = (x - y)_m^T (x - y)_m    (16)

d^2_m(x, t, C) = (x - t)_m^T C^{-1} (x - t)_m    (17)

(x - y)_m(i) = (x - y)(i) * FWA(i)    (18)
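A short NumPy sketch of equations (16)-(18) follows: each coordinate difference is scaled by its feature weight before the Euclidean or Mahalanobis form is applied. The example vectors, weights and covariance matrix are made up for illustration.

# Feature-weighted distances of equations (16)-(18).
import numpy as np

def weighted_diff(x, y, fwa):
    return (x - y) * fwa                          # eq. (18), element-wise

def d2m_euclidean(x, y, fwa):
    d = weighted_diff(x, y, fwa)
    return float(d @ d)                           # eq. (16)

def d2m_mahalanobis(x, t, C, fwa):
    d = weighted_diff(x, t, fwa)
    return float(d @ np.linalg.inv(C) @ d)        # eq. (17)

x = np.array([5.1, 3.5, 1.4, 0.2])
t = np.array([5.9, 3.0, 4.4, 1.4])
fwa = np.array([0.0, 0.0, 1.0, 1.0])              # e.g. weight only PL and PW
C = np.cov(np.random.default_rng(0).normal(size=(50, 4)), rowvar=False)
print(d2m_euclidean(x, t, fwa), d2m_mahalanobis(x, t, C, fwa))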

We will use this modified distance measure in our algorithm for clustering data sets with different feature-weights. Generally speaking, a feature selection method cannot be used as a feature-weight learning technique, but the inverse is possible. To be able to deal with such cases, we propose a new FCM algorithm that takes into account the weight of each feature in the data set to be clustered. After a brief review of FCM in section 2, a number of feature ranking methods are described in section 3; these methods will be used to determine the FWA (feature weight assignment) of each feature. In section 4 distance measures are studied and a new one is proposed to handle the different feature-weights, and in section 5 the new FCM for clustering data objects with different feature-weights is proposed.

New Feature Weighted FCM Algorithm

In this section we propose the new clustering algorithm, which is based on FCM, extends the method proposed in [15] for determining the FWA of the features and, moreover, uses the modified Mahalanobis measure of distance, which takes into account the FWA of the features in addition to the variability of the data. As mentioned before, unlike FCM, this algorithm clusters the data set based on the weights of the features. In the first step of this algorithm we calculate the FWA vector using the method proposed in [15]. To do so, we need some clusters over the data set to be able to calculate the required parameters (having these parameters in hand, we can easily calculate the feature estimation index for each feature; see section 3). To obtain these clusters we apply the FCM algorithm with Euclidean distance on the data set; the created clusters help us to calculate the FWA vector. This step is, in fact, a pre-computing step. In the next and final step, we apply our feature-weighted FCM algorithm on the data set, but here we use the modified Mahalanobis distance in the FCM algorithm.

The result will be clusters which have two major differences from the clusters obtained in the first step. The first difference is that the Mahalanobis distance is used, which means that the variability and correlation of the data are taken into account in calculating the clusters. The second difference, which is the main contribution of this investigation, is that the feature-weight index plays a great role in shaping the clusters.

Advantage: the values are transformed into the FWA vector, whose elements lie in the interval [0, 1] and each of which shows the relative significance of its peer feature, giving a clustering of the data set in which the weight of each feature plays a significant role in forming the shape of the clusters.

iii) A Modified Fuzzy C-Mean Clustering Algorithm

Fuzzy C-Mean Clustering algorithm is widely used in data mining technology at present, and also is the

representative of fuzzy clustering algorithms. The FCM clustering algorithm has a well-developed theory and a profound mathematical basis [1]. The common FCM clustering algorithm, however, also has some disadvantages: it generally adopts the Euclidean distance to measure the dissimilarity between objects, so it can easily discover only clusters with convex shapes [2]. On some data sets the clustering cannot obtain a good result when the data are expressed in a Euclidean space, where the dissimilarity is

D_ij(m) = (x_i - x_j)^T (x_i - x_j)    (1)

In order to adapt to more data modes, the algorithm is modified as described below.

In accordance with the data type, data can be divided into numeric data and character data; the dissimilarity calculation above is usually suitable only for numeric attributes [4].

Design of the Modified FCM Clustering Algorithm

In the common FCM clustering algorithm, the membership of a data object in a particular cluster is determined by the distance from the object to that cluster center, and the membership of each data object indicates the degree to which the data object belongs to the cluster. By using the Mahalanobis distance and matriculated input vectors to modify the FCM algorithm, we obtain the main study object of the paper. The objective function is as follows:

J(U, C, X) = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \, d^2(x_j, c_i)    (2)

where d^2(x_j, c_i) = (x_j - c_i)^T F_i^{-1} (x_j - c_i)    (3)

Here we define the dissimilarity between objects by using the Mahalanobis distance, where F_i describes the internal tightness of cluster i; the matrix F_i is defined as:

F_i = \frac{\sum_{j=1}^{N} u_{ij}^{m} (x_j - c_i)(x_j - c_i)^T}{\sum_{j=1}^{N} u_{ij}^{m}}    (4)

Firstly, select a set of sample data sets, Balance and Artificial, where Balance Scale is a standard data set from the UCI repository and Artificial is an artificial data set. Secondly, implement the process with the standard FCM algorithm, the MatFCM-for-vectors algorithm and the MatFCM-for-matrices algorithm respectively. Then, test the clustering according to the different clusters. Because the data are invisible, we can only calculate the result of each algorithm, Rec (reception) and Rej (rejection), with the FRC (Fuzzy Relational Classifier) classification algorithm based on clustering, which can determine whether the clustering effect is good or bad; half of the data is used as the training data set and the other half as the test data set. Finally, compare and analyze the results of the test to determine the effectiveness of the modified FCM clustering algorithm.

Advantage: two modified programs for the FCM clustering algorithm are proposed: one brings the Mahalanobis distance into the FCM clustering algorithm as the measure space; the other adapts matriculated input vectors in the FCM clustering algorithm. A modified algorithm is put forward on this basis, i.e. the algorithm can adapt to more data modes by matriculating the data objects; at the same time, tests and comparisons on different data sets show that the modified FCM clustering algorithm gives a much better clustering result.

iv) An Efficient Fuzzy C-Mean Clustering Algorithm

There are many fuzzy clustering methods in the literature [1]. The fuzzy C-Mean (FCM) algorithm is widely used. It is based on the concept of the fuzzy C-partition, which was introduced by Ruspini [2], developed by Dunn [3], and generalized by Bezdek [4, 5]. The FCM algorithm and its derivatives have been used very successfully in many applications, such as pattern recognition [6], classification [7], data mining [8] and image segmentation [9, 10]; it has also been used for data analysis and modeling [11, 12]. Normally, the FCM algorithm consists of several execution steps. In the first step, the algorithm selects C initial cluster centers from the original dataset randomly; then, after several iterations of the algorithm, the final result converges to the actual cluster centers. Therefore, if a good set of initial cluster centers is chosen, the algorithm may take fewer iterations to find the actual cluster centers. In [13] the multistage random sampling FCM algorithm is proposed; it is based on the assumption that a small subset of a dataset of feature vectors can be used to approximate the cluster centers of the complete data set, and under this assumption FCM is used to compute the cluster centers of an appropriately sized subset of the original database. The efficient algorithm for improving FCM described here is called the partition simplification FCM (psFCM). It is divided into two phases. In phase 1 we first partition the data set into small blocks (cells) using the k-d tree method [14] and reduce the original dataset into a simplified dataset with unit blocks, as described in our work [15]. The FCM algorithm measures the quality of the partitioning by comparing the distance from

pattern x_i to the current candidate cluster center w_j with the distance from pattern x_i to the other candidate cluster centers. The objective function is an optimization function that calculates the weighted within-group sum of squared errors, as follows [15].

Phase 1: Refine initial prototypes for fuzzy c-means clustering
Step 1: Partition the dataset into unit blocks using the k-d tree method. The splitting priority depends on the scattered degree of the data values in each dimension: if one dimension has a higher scattered degree, it has a higher priority to be split. The scattered degree is defined by the distribution range and the standard deviation of the feature.
Step 2: After splitting the data set into unit blocks, calculate the centroid of each unit block that contains some sample patterns. The centroid represents all sample patterns in its unit block, so the original patterns are represented by all of the computed centroids. In addition, each centroid carries statistical information about the patterns in its unit block, including the number of patterns in the unit block (WUB) and the linear sum of all patterns in the unit block. When the database is scanned a second time, the statistics of each dimension are also found; these statistics are used when the algorithm calculates new candidate cluster centers, which improves the system performance.
Step 3: Initialize the cluster center matrix using a random generator over the dataset, record the cluster centers and set t = 0.
Step 4: Initialize the membership matrix U(0) for the simplified data set.
Step 5: Increase t (i.e. t = t + 1); compute a new candidate cluster center matrix W(t).
Step 6: Compute the new membership matrix on the simplified dataset and then go to phase 2.

Phase 2: Find the actual cluster centers for the dataset
Step 1: Initialize the fuzzy partition matrix U(0), using the result W(t) from steps (5) and (6) of phase 1, for the full dataset X.
Step 2: Follow steps 3 to 5 of the FCM algorithm discussed in Section 2, using the stopping condition.

From the experimental results described in the last section, the proposed psFCM algorithm has approximately the same speedup for patterns of normal distribution as for uniform distribution; in general the method works well for most kinds of datasets. In phase 1 of the psFCM algorithm the cluster centers found using the simplified dataset are very close to the actual cluster centers, and phase 2 converges quickly if these centers are used as its initial cluster centers. From the experiments, in most cases phase 2 converges in only a few iterations, even when the stopping condition is smaller. Moreover, the number of patterns used in phase 1 of the proposed algorithm is much smaller than in the FCM algorithm, because N_ps << N.
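As an illustration of the phase-1 simplification just described, the sketch below recursively splits the data along the most spread-out dimension (a k-d-tree-style partition) and keeps one weighted centroid per unit block; the depth limit and the synthetic data are arbitrary choices for the example, not the paper's parameters.

# Sketch of k-d-tree-style data simplification (phase 1 of a psFCM-like scheme).
import numpy as np

def simplify(X, depth=4):
    """Return (centroids, weights) of the unit blocks, one per leaf."""
    if depth == 0 or len(X) <= 1:
        return [X.mean(axis=0)], [len(X)]
    dim = np.argmax(X.max(axis=0) - X.min(axis=0))    # most scattered dimension
    median = np.median(X[:, dim])
    left, right = X[X[:, dim] <= median], X[X[:, dim] > median]
    if len(left) == 0 or len(right) == 0:             # degenerate split: stop here
        return [X.mean(axis=0)], [len(X)]
    cl, wl = simplify(left, depth - 1)
    cr, wr = simplify(right, depth - 1)
    return cl + cr, wl + wr

X = np.vstack([np.random.randn(400, 2), np.random.randn(400, 2) + 6])
centroids, weights = simplify(X)
# Phase 1 would now run FCM on the centroids (optionally weighted); phase 2 runs
# ordinary FCM on X, initialized with the centers found in phase 1.
print(len(centroids), "unit-block centroids replace", len(X), "patterns")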
The psFCM algorithm divides a dataset into several unit blocks. The centroids of the unit blocks replace the patterns and form a new, simplified dataset. As mentioned in Section 4, the simplified dataset decreases the complexity of computing the membership matrix in every iteration. For a fair comparison, the initialization in phase 1 of the psFCM algorithm and in the FCM algorithm is determined randomly. We have also found in the psFCM algorithm that an initial cluster center selected from a unit block with a higher density is closer to the actual cluster center; this is a feature that cannot be obtained with the FCM algorithm and its derivatives, and in future work we will study this feature more thoroughly.

Advantage: an efficient clustering that is better than the FCM algorithm; the computation cost is reduced and the performance improved by finding a good set of initial cluster centers. The index is defined based on the aggregated measure of separation between the classes in terms of class membership functions, and the index value decreases with the increase in both the compactness of individual classes and the separation between the classes. To calculate the feature estimation index, a pre-computing step is carried out, which is a fuzzy clustering using FCM with Euclidean distance. Fuzzy C-Mean is one of the clustering algorithms based on optimizing an objective function and is sensitive to initial conditions.

v) The Global Fuzzy C-Mean Clustering Algorithm

There are many fuzzy clustering methods in the literature [2]. The Fuzzy C-Mean clustering algorithm is one of the most important and popular fuzzy clustering algorithms. At present the FCM algorithm is extensively used in feature analysis, pattern recognition, image processing, classifier design, etc. ([3], [4]). However, the FCM clustering algorithm is sensitive to the initialization and easily falls into a local minimum or a saddle point when iterating. To solve this problem, several other techniques based on global optimization methods (e.g. genetic algorithms, simulated annealing) have been developed [5-7]. However, in many practical applications the clustering method used is FCM with multiple restarts, to escape from the sensitivity to the initial value [8]. In the following we describe the proposed global Fuzzy C-Mean algorithm. The FCM algorithm depends on its starting conditions and always converges to a local minimum. To solve this problem we employ the FCM algorithm as a local search, and FCM runs are scheduled which differ only in the initial positions of the cluster centers. Based on the k-means algorithm, Ref. [9] proposed that if the k-1 centers are placed at the optimal positions for the (k-1)-clustering problem and the remaining k-th center is placed at an appropriate position to be discovered, an optimal clustering solution with k clusters can be obtained through local search. Based on this assumption we propose the global Fuzzy C-Means clustering algorithm. Instead of randomly selecting initial values for all cluster centers, as most global clustering algorithms do, the proposed technique proceeds in an incremental way, attempting to optimally add one new cluster center at each stage. More specifically, we start with the fuzzy 1-partition and find its optimal position, which corresponds to the centroid of the data set X. For the fuzzy 2-partition problem, the first initial cluster center is placed at the optimal position of the fuzzy 1-partition, while the second initial center at execution n is placed at the position of the data point x_n (n = 1, ..., N). We then perform the FCM algorithm from each of these initial positions to obtain the best solution for the fuzzy 2-partition. In general, once we have found the solution of the fuzzy (C-1)-partition problem, we perform the FCM algorithm with C clusters from each of these initial states.

Comparison of the global Fuzzy C-Means algorithm with FCM and the global k-means algorithm: to validate the insensitivity to the initial value and the accuracy of the proposed algorithm, we conducted several experiments on two artificial data sets and three real survey data sets. The three algorithms (FCM, GKM and GFCM) are compared on five data sets: (i) a synthetic two-dimensional data set consisting of ten clusters, each of 30 points; (ii) a synthetic two-dimensional data set consisting of fifteen clusters, each of 20 points, depicted in Fig. 1(b); (iii) 63 data points from the vowel data set, which consists of nine clusters and is 10-dimensional; (iv) 600 data points from the sat-image dataset, which consists of six clusters and is 36-dimensional; and (v) a fifth data set.

The main advantage of the algorithm is that it does not depend on any initial conditions and improves the accuracy of clustering. The algorithm is briefly summarized as follows:
Step 1: Perform the FCM algorithm to find the optimal clustering center v(1) of the fuzzy 1-partition problem and let obj_1 be the corresponding value of the objective function found by (1).
Step 2: Perform N runs of the FCM algorithm with c+1 clusters, where each run n starts from the initial state (v1*, ..., vc*, x_n), and obtain the corresponding values of the objective functions and clustering centers.
Step 3: Find the minimal value of the objective function, obj_(c+1), and let its corresponding clustering centers V(c+1) be the clustering centers for the fuzzy (c+1)-partition.
Step 4: If c+1 equals the desired number of clusters, stop; otherwise set c = c+1 and go to step 2.

Advantage: the global Fuzzy C-Mean algorithm is not sensitive to the initial value, its clustering errors and clustering accuracy are stable, and its experimental results are better than those of FCM. The global Fuzzy C-Mean clustering algorithm (GFCM) is a global clustering algorithm for the minimization of the clustering error. It is an incremental approach to clustering: an optimal solution for the fuzzy C-partition is obtained through a series of local searches (FCM). At each local search, the optimal cluster centers of the fuzzy (c-1)-partition problem are used as the (c-1) initial positions and an appropriate position within the data space is used as the remaining c-th initial position.
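The incremental strategy summarized above can be sketched in a few lines. The sketch assumes a helper fcm(X, init_centers) that runs standard FCM from the given initial centers and returns (centers, objective_value); that helper is not shown here and is an assumption of the example, not part of the original paper.

# Sketch of the incremental (global) FCM strategy.
import numpy as np

def global_fcm(X, n_clusters, fcm):
    centers = [X.mean(axis=0)]                     # optimal fuzzy 1-partition
    for c in range(2, n_clusters + 1):
        best = None
        for x in X:                                # try each point as the new center
            init = np.vstack(centers + [x])
            cand_centers, cand_obj = fcm(X, init)  # local search from this state
            if best is None or cand_obj < best[1]:
                best = (cand_centers, cand_obj)
        centers = list(best[0])                    # keep the best (c)-partition
    return np.array(centers)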

For each of the above presented data sets we executed the three algorithms 10 times each, and the results are summarized in Table I. The experimental results suggest that the global Fuzzy C-Means algorithm and the global k-means algorithm are not sensitive to initial values, their clustering errors and clustering accuracy are stable, and the experimental results of the global Fuzzy C-Means algorithm are better than those of the global k-means algorithm and FCM. To address the slow convergence of the global algorithm, we propose the fast global Fuzzy C-Means clustering algorithm, which significantly improves the convergence speed of the global Fuzzy C-Means clustering algorithm. Regarding the fast global Fuzzy C-Means algorithm, it is very encouraging that, although it executes significantly faster, it still produces good clustering results. We put forward a modified algorithm on this basis; tests and comparisons on different data sets show that the modified FCM clustering algorithm has a much better clustering result and yields efficient clustering that is better than the FCM algorithm. We reduce the computation cost and improve the performance by finding a good set of initial cluster centers. The index is defined based on the aggregated measure of separation between the classes in terms of class membership functions. The index value decreases with the increase in both the compactness of individual classes and the separation between the classes. The global Fuzzy C-Means algorithm is not sensitive to initial values, its clustering errors and clustering accuracy are stable, and its experimental results are better than those of the FCM algorithm. The global Fuzzy C-Means clustering algorithm (GFCM) is a global clustering algorithm for the minimization of the clustering error; it is an incremental approach to clustering, and we can obtain an optimal solution for the fuzzy C-partition through a series of local searches.

XVI. COMPARISON Fuzzy C-Means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method is frequently used in pattern recognition. It is based on minimization of the following objective function. The genetic algorithm is a stochastic global search method that mimics the metaphor of natural biological evolution. GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce (hopefully) better and better approximations to a solution. The genetic algorithm provides higher resolution capability. The time needed to reach an optimum through the genetic algorithm is less than the time needed by the iterative approach; the genetic algorithm gives better performance and less approximation error in less time, and can be used to tune the subtractive clustering parameters. In fuzzy weighted c-means we transform the feature values into the FWA vector, whose elements lie in the interval [0,1]; each element shows the relative significance of its corresponding feature. This allows clustering on data sets in which the weight of each feature plays a significant role in forming the shape of the clusters. We propose two modified programs for the FCM clustering algorithm: one brings the Mahalanobis distance into the FCM clustering algorithm as the measure space; the other adapts matrix-form input vectors in the FCM clustering algorithm. We put forward a modified algorithm on this basis.
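The objective function referred to above is not reproduced in the source; the standard FCM criterion that the algorithm minimizes is (in LaTeX notation)

J_m(U, V) = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - v_j \rVert^{2}, \qquad \text{subject to } \sum_{j=1}^{C} u_{ij} = 1 \text{ for each } i,

where u_{ij} is the membership degree of data point x_i in cluster j, v_j is the j-th cluster center, and m > 1 is the fuzziness exponent.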

XVII. CONCLUSION Fuzzy C-Means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method is frequently used in pattern recognition and is based on minimization of the objective function given above. We first studied fuzzy c-means, then the different variants of fuzzy c-means, and finally the comparison between them. The genetic algorithm is a stochastic global search method that mimics the metaphor of natural biological evolution. GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce (hopefully) better and better approximations to a solution. The genetic algorithm provides higher resolution capability; the time needed to reach an optimum through the genetic algorithm is less than the time needed by the iterative approach, and the genetic algorithm gives better performance and less approximation error in less time, and can be used to tune the subtractive clustering parameters. In fuzzy weighted c-means we transform the feature values into the FWA vector, whose elements lie in the interval [0,1]; each element shows the relative significance of its corresponding feature. This allows clustering on data sets in which the weight of each feature plays a significant role in forming the shape of the clusters. We propose two
modified programs for the FCM clustering algorithm: one brings the Mahalanobis distance into the FCM clustering algorithm as the measure space; the other adapts matrix-form input vectors in the FCM clustering algorithm. We put forward a modified algorithm on this basis; tests and comparisons on different data sets show that the modified FCM clustering algorithm has a much better clustering result and yields efficient clustering that is better than the FCM algorithm. We reduce the computation cost and improve the performance by finding a good set of initial cluster centers. The index is defined based on the aggregated measure of separation between the classes in terms of class membership functions. The index value decreases with the increase in both the compactness of individual classes and the separation between the classes. The global Fuzzy C-Means algorithm is not sensitive to initial values, its clustering errors and clustering accuracy are stable, and its experimental results are better than those of the FCM algorithm. The global Fuzzy C-Means clustering algorithm (GFCM) is a global clustering algorithm for the minimization of the clustering error; it is an incremental approach to clustering, and we can obtain an optimal solution for the fuzzy C-partition through a series of local searches.
XVIII. REFERENCES
[1] T. Kwok, R. Smith, S. Lozano, and D. Taniar, Parallel fuzzy c-means clustering for large data sets.
[2] F. Hoppner, F. Klawonn, R. Kruse and T. Runkler, Fuzzy Cluster Analysis.
[3] M. C. Clark, L. O. Hall, MRI segmentation using fuzzy clustering techniques integrating knowledge.
[4] Y. W. Lim, S. U. Lee, On the color image segmentation algorithm based on the thresholding and the fuzzy c-means techniques.
[5] Li-Xin Wang, A Course in Fuzzy Systems and Control, Prentice Hall, Upper Saddle River, NJ 07458, 1997, pp. 342-353.
[6] J. C. Dunn, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics, 3, 1973, pp. 32-57.
[7] Bezdek, J. C., Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, NY, 1981.
[8] Beightler, C. S., Phillips, D. J., and Wild, D. J., Foundations of Optimization (2nd ed.), Prentice-Hall, Englewood Cliffs, NJ, 1979.
[9] Luus, R. and Jaakola, T. H. I., Optimization by direct search and systematic reduction of the size of search region, AIChE Journal, 19(4), 1973, pp. 760-766.
[10] Hall, L. O., Bensaid, A. M., Clarke, L. P., et al., A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain, IEEE Trans. Neural Networks, 3, 1992.
[11] Hung, M., Yang, D., An efficient fuzzy c-means clustering algorithm, in Proc. 2001 IEEE International Conference on Data Mining, 2001.
[12] Han, J., Kamber, M., Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, San Francisco, 2001.
[13] Pal, S. K. and Pal, A. (Eds.), Pattern Recognition: From Classical to Modern Approaches, World Scientific, Singapore, 2002.
[14] de Oliveira, J. V., Pedrycz, W., Advances in Fuzzy Clustering and its Applications, John Wiley & Sons, 2007.
[15] X. Wang, Y. Wang and L. Wang, Improving fuzzy c-means clustering based on feature-weight learning, 2004.
[16] Zhang, J. S., Leung, Y. W., Improved possibilistic c-means clustering algorithm, IEEE Trans. on Fuzzy Systems, 12(2), 2004, pp. 209-227.
[17] Zahid, N., Abouelala, O., Limouri, M., Essaid, A., Fuzzy clustering based on K-nearest-neighbours rule, Fuzzy Sets and Systems, 120(1), 2001, pp. 239-247.
[18] Xin-Bo Gao, The Analysis and Application of Fuzzy Clustering, XiDian University Press, 2004.
[19] Jia-Wei Han, Micheline Kamber, Data Mining: Concepts and Techniques, Mechanical Industry Press, 2002.
[20] F. Hoppner, F. Klawonn, R. Kruse, and T. Runkler, Fuzzy Cluster Analysis, Wiley Press, New York, 1999.
[21] M. R. Reazaee, P. M. J. Zwet, B. P. E. Lelieveldt, R. J. Geest and J. H. C. Reiber, A multiresolution image segmentation technique based on pyramidal segmentation and fuzzy clustering.
[23] P. Teppola, S. P. Mujunen, and P. Minkkinen, Adaptive fuzzy c-means clustering in process monitoring, Chemometrics and Intelligent Laboratory Systems.
[24] X. Chang, W. Li and J. Farrell, A c-means clustering based fuzzy modeling method.


Security Issues In Data Mining

Abstract In this article we discuss our research in developing general and systematic methods for intrusion detection. The key ideas are to use data mining techniques to discover consistent and useful patterns of system features that describe program and user behavior, and to use the set of relevant system features to compute (inductively learned) classifiers that can recognize anomalies and known intrusions. The paper also discusses the current level of computer security development in Tanzania, with particular interest in IDS application, given that the approach is easy to implement with little added complexity to the computer system architecture, has less dependence on the operating environment (as compared with other security-based systems), and can detect abuse of user privileges easily. The findings are geared towards developing security infrastructure and providing ICT services. Index Terms: computer security; data mining; intrusion detection; ICT

XIX. INTRODUCTION

In the last decade there have been great advances in data mining research, and many data mining methods have been applied to everyday business, such as market basket analysis, direct marketing and fraud detection. If we want to find something unusual about a system, we need to know something about the expected behavior of the system, the behavior of its functionalities, and whether additional, unwanted functionalities have been introduced. In Data Mining: Challenges and Opportunities for Data Mining during the Next Decade, R. L. Grossman defines data mining as being concerned with uncovering patterns, associations, changes, anomalies, and statistically significant structures and events in data. Simply put, it is the ability to take data and pull from it patterns or deviations which may not be easily seen by the naked eye. The recent rapid development in data mining has made available a wide variety of algorithms, drawn from the fields of statistics, pattern recognition, machine learning, and databases. Several types of algorithms are particularly relevant to our research. We first provide a general overview of what intrusion detection is, for better understanding, before we implement our approach.

The data mining techniques can be used to compute the intra- and inter-packet record patterns, which are essential in describing program or user behavior. The discovered patterns can guide the audit data gathering process and facilitate feature selection.
A. Why Data Mining for Security? Applications of data mining in computer security concentrate heavily on the use of data mining in the area of intrusion detection. The reason for this is twofold. First, the volume of data dealing with both network and host activity is so large that it makes it an ideal candidate for applying data mining techniques. Second, intrusion detection is an extremely critical activity. An ideal application in intrusion detection would be to gather sufficient normal and abnormal audit data for a user or a program, then apply a classification algorithm to learn a classifier that will determine whether (future) audit data belongs to the normal class or the abnormal class.
B. Sequence Analysis: Approaches that model sequential patterns have been applied in our work. These algorithms can help us understand what (time-based) sequences of audit events are frequently encountered together. These frequent event patterns are important elements of the behavior profile of a user or program. We are developing a systematic framework for designing, developing and evaluating intrusion detection systems. Specifically, the framework consists of a set of environment-independent guidelines and programs that can assist a system administrator or security officer to a) select appropriate system features from audit data to build models for intrusion detection; b) architect a hierarchical detector system from component detectors; c) update and deploy new detection systems as needed; and d) understand behavior patterns of in-norm behavior, which are unique for each system, which
makes the IDS itself also unique. Uniqueness in turn makes the intrusion detection system more resistant to attacks.
XX. METHODOLOGY Prior work has shown the need for better security tools to detect malicious activity in networks and systems. These studies also propose the need for more usable tools that work in real contexts [3, 4]. To date, however, there has been little focus on the preprocessing steps of intrusion detection. We designed our study to fill this gap, as well as to further the understanding of IDS usability and utility, particularly as the IDS is installed and configured in an organization. Consequently, our research questions were: a. What do security practitioners expect from an IDS or security mechanism? b. What are the difficulties that security practitioners face when installing and configuring an IDS and security mechanism? c. How can the usability of IDS security mechanisms be improved? We used a qualitative approach to answer these questions, relying on empirical data from security practitioners who have experience with IDSs in real environments. Below we detail our data sources and analysis techniques.
A. Data Collection We collected data from two different sources. First, we conducted semi-structured interviews with security practitioners. Second, we used participatory observation, an ethnographic method [5], to both observe and work with two senior security specialists who wanted to implement an IDS in their organization. These two sources of data allowed us to triangulate our findings; the descriptions from interviewees about the usability of IDSs were complemented by the richer data from the participatory observation.
B. Data Analysis The data from the interviews and participatory observation were analyzed using qualitative description with constant comparison and inductive analysis. We first identified instances in the interviews when participants described IDSs and security activities in the context of the activities they had to perform. We next contrasted these descriptions with our analysis of the participatory observation notes. These notes were coded iteratively, starting with open coding and continuing with axial and theoretical coding [7].
XXI. MOTIVATING APPLICATION A file server can utilize associations discovered between requests to its stored files for predictive flowing. For instance, if it discovers that a file A is usually requested after a file B of the same size, then flowing of packet A after a request for B could reduce the latency, especially if this pairing holds in a large percentage of file requests. A detection mechanism would like to know which packets of what size are usually being transmitted. It could be beneficial to know, for example, the flow sequence and size of the packets, to detect unusual flow behavior. The detection filter can reduce its response time by allowing only authorized traffic (from and to the internal network). Results were then organized by the challenges that the participants faced when deploying and maintaining an IDS system.
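As a minimal sketch of the association idea described above (a file A usually requested after a file B), the following Python fragment counts how often one request immediately follows another in an audit trail and reports the pairs whose confidence exceeds a threshold; the event names and threshold are illustrative assumptions, not taken from the study.

    from collections import Counter

    def frequent_followups(events, min_confidence=0.6):
        """events: ordered list of request identifiers from an audit log.
        Returns pairs (a, b) where b immediately follows a in at least
        min_confidence of the occurrences of a."""
        pair_counts = Counter(zip(events, events[1:]))   # adjacent (a, b) pairs
        first_counts = Counter(events[:-1])              # occurrences of each antecedent
        rules = {}
        for (a, b), n in pair_counts.items():
            confidence = n / first_counts[a]
            if confidence >= min_confidence:
                rules[(a, b)] = confidence
        return rules

    # Example: 'A' usually follows 'B', so the pair ('B', 'A') is reported.
    trail = ["B", "A", "C", "B", "A", "B", "A", "D"]
    print(frequent_followups(trail))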

For traffic crossing the mechanism from outside (inter) and inside (intra) the network, two default decisions are possible: a. Default = discard: that which is not expressly permitted is prohibited. b. Default = forward: that which is not expressly prohibited is permitted.
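As a minimal illustration of these two defaults, the sketch below shows how anything not matched by an explicit rule falls back to the chosen default policy; the rule-table format is a hypothetical choice for illustration only.

    def decide(flow, rules, default="discard"):
        """rules maps a flow tuple (src, dst, port) to 'forward' or 'discard'.
        Anything not explicitly listed falls back to the default policy."""
        return rules.get(flow, default)

    rules = {("10.0.0.5", "10.0.0.9", 80): "forward"}
    print(decide(("10.0.0.5", "10.0.0.9", 80), rules))              # forward (explicitly permitted)
    print(decide(("10.0.0.5", "10.0.0.9", 23), rules))              # discard (default = discard)
    print(decide(("10.0.0.5", "10.0.0.9", 23), rules, "forward"))   # forward (default = forward)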

XXII. SECURITY SUPPORT AND EVALUATION Despite the fact that the security specialists had tried to simplify the deployment of some security techniques like IDS by limiting their purpose, the IDS integration proved to be a challenging task due to a number of organizational constraints. For example, to connect to the IDS in place, the specialists needed to have available ports (at least two in the case of the
IDS used during participatory observation). We looked for security techniques that were not only easy to use, but also gave relevant information about the security of the organization's systems. Consequently, the ideal situation would have been to install the IDS and use antiviruses in the most critical network domains of different organizations to generate meaningful reports about the security level of the networks, with a minimal use of resources. However, we found that this did not occur; as discussed, organizational factors like the distribution of IT responsibilities affected the decision not to involve critical networks, due to the corresponding overhead of involving multiple administrators. A key concept that appears in both the authentication and confidentiality mechanisms for transmitted packets is the security association (SA). An association is a one-way relationship for the traffic; if a peer relationship is needed for a two-way secure exchange, then two security associations are required. A security association is uniquely identified by the following parameters: 1. Sequence counter overflow: a flag indicating whether overflow of the sequence number counter should generate an auditable event and prevent further transmission of packets on this security association (SA). 2. Anti-replay window (number of packets): used to determine whether an inbound packet is a replay. 3. Lifetime of this security association: a time interval or packet count after which an SA must be replaced with a new SA or terminated, plus an indication of which of these actions should occur. 4. Path: any observed path maximum transmission unit (maximum size of a packet that can be transmitted without fragmentation) and aging variables. Because packet delivery is a connectionless, unreliable service, the protocol does not guarantee that packets will be delivered in order, nor that all packets will be delivered, as we observed when we tested our approach. Therefore the receiver should implement a window of packets, with a default size of NP = 60. The right edge of the window represents the highest sequence number, N, so far received for a valid packet. For any packet with a sequence number in the range from N - NP + 1 to N that has been correctly received (i.e. properly authenticated), we authorize the traffic. The findings are presented experimentally in Figure 2 below.
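Before turning to the experimental results, the sliding-window anti-replay check described above can be sketched as follows; this is a simplified illustration, not a complete IPsec-style implementation, with NP = 60 following the default stated in the text.

    NP = 60  # default window size (number of packets)

    class ReplayWindow:
        def __init__(self, size=NP):
            self.size = size
            self.right_edge = 0   # highest sequence number seen so far
            self.seen = set()     # sequence numbers already accepted
                                  # (a real implementation would prune entries
                                  #  that fall out of the window)

        def accept(self, seq, authenticated=True):
            """Return True if the packet should be authorized."""
            if not authenticated:
                return False
            if seq > self.right_edge:                 # new packet: advance the window
                self.right_edge = seq
            elif seq <= self.right_edge - self.size:  # too old: outside the window
                return False
            if seq in self.seen:                      # replayed packet
                return False
            self.seen.add(seq)
            return True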

Effects of Number of Packets on Misclassification Rates: To understand the effects of the time intervals on the misclassification rates, we ran the experiments using various time intervals: 5s, 10s, 30s, 60s, and 90s. The effects on the out-going and inter-LAN traffic were very small. However, as Figure 2 shows, for the in-coming traffic the misclassification rates on the intrusion data increase dramatically as the time interval goes from 5s to 30s, then stabilize or taper off afterwards.
XXIII. CONCLUSION In this paper we have proposed an intrusion detection method that employs data mining techniques. We believe that a periodic comprehensive evaluation of the IDS presented could be valuable for acquisition managers, security analysts and R&D program managers. The Internet does not recognize administrative borders, which makes it an attractive option for people with criminal intent. The accuracy of the proposed detection model depends on sufficient training data and the right feature set. We suggested that association rules and frequent-packet detection can be used to compute consistent patterns from audit data. Given the infancy of security technology in place, the intrusion detection approach presented could be the place to begin. Preliminary experiments using the presented techniques on LAN data in the region have shown promising results.

XXIV. REFERENCES

[1] E. Kandogan and E. M. Haber, Security administration tools and practices, in Security and Usability: Designing Secure Systems that People Can Use, chapter 18, pp. 357-378, O'Reilly Media, Inc., Sebastopol, 2005.
[2] D. Botta, R. Werlinger, A. Gagne, K. Beznosov, L. Iverson, S. Fels, and B. Fisher, Towards understanding IT security professionals and their tools, in Proc. of ACM Symposium on Usable Privacy and Security (SOUPS), pp. 100-111, Pittsburgh, Pennsylvania, July 18-20, 2007.
[3] D. M. Fetterman, Ethnography: Step by Step, Sage Publications Inc., 1998.
[4] M. Sandelowski, Whatever happened to qualitative description?, Research in Nursing & Health, 23(4):334-340, 2000.
[5] K. Charmaz, Constructing Grounded Theory, SAGE Publications, 2006.
[6] J. Yu, Z. Chong, H. Lu and A. Zhou, False positive or false negative: Mining frequent itemsets from high speed transactional data streams, in Proceedings of the 30th ACM VLDB International Conference on Very Large Data Bases, pp. 204-215, 2004.
[7] Wu et al., Top 10 algorithms in data mining, Springer-Verlag London, 2007.
[8] Yin, Scalable mining and link analysis across multiple database relations, Volume 10, Issue 2, SIGKDD, 2008.
[9] Z. Jiuhua, Intrusion detection system based on data mining, Knowledge Discovery and Data Mining, WKDD 2008, First International Workshop, 23-24 Jan. 2008, pp. 402-405.


Web Caching Proxy Services: Security and Privacy issues

Abstract A Web proxy server is a server that handles HTTP requests from clients. If the clients belong to a common organization or domain, or exhibit similarity in browsing behavior, the proxy can effectively cache requested documents. Caching, which migrates documents across the network closer to the users, reduces network traffic, reduces the load on popular Web servers and reduces the time that end users wait for documents to load. A proxy server accepts requests from clients. When possible and desired, it generates replies based upon documents stored in its local cache; otherwise, it forwards the requests, transfers the replies to the clients and caches them when possible. WWW clients perceive better response time, improved performance and speed when responses to requested pages are served from the cache of a proxy server, resulting in faster response times after the first document fetch.

XXV. OVERVIEW: A Web proxy works as an intermediary tool between a browser and the Web itself [2]. A browser configured to use a proxy sends all of its URL requests to the proxy instead of sending them directly to the target Web server. In addition to funneling requests, proxies can also implement a caching mechanism, in which the proxy returns a cached version of the requested document for increased speed [3], [4]. Many companies, governmental agencies, and universities sometimes make their proxies mandatory by blocking direct access to the Internet. Proxies typically handle both clear-text and SSL-encrypted traffic, and users can configure their browser for either or both situations (caching works only with clear-text traffic). Web proxies are sometimes coupled with Web accelerators, which attempt to further reduce the time

required to display a requested Web page by predicting what page the user will request next [5]. These accelerators work on the server side, on the proxy, or on the client side, but in all three cases the aim is to send the Web page to the client before it is requested. Server-side accelerators push pages to clients, whereas proxy and client-side accelerators pull pages from the server. Client-side accelerators require a local component (the client) to be installed on the user's machine. In this article, we are interested in client-side accelerators for which HTTP traffic goes from the browser to the local client and then to the proxy, which either returns a cached copy of the requested document or fetches it from the site to which it belongs. Some Web accelerators add a local cache on the client (so that pages are cached at three different levels: the browser, the client, and the proxy), some prefetch the pages they anticipate will be requested next, and some perform compression between the client and the proxy server. An example of a freeware Web accelerator is Fasterfox (http://fasterfox.mozdev.org). Security and privacy issues are common on the Internet, and many of the threats this article identifies are not unique to Web proxies. Network architecture, the widespread use of clear-text messages, the accumulation of sensitive data by various actors, flawed software, gullible and imprudent users, and unclear yet constantly changing and globally inconsistent legal protections are just some of the well-known sources of Internet security problems [1]. The security and privacy implications of installing a Web proxy are an excellent illustration of this general problem, but ideally this article will increase awareness about the issue beyond this specific example. In today's fully connected working environment, individuals and companies often choose to use new software and services without anticipating all the possible side effects. Employees, and indeed entire corporations, often add tools to their environments or use services that promise various
benefits by tapping into a vast offering of software freely available on the Internet from both unknown and well-known companies. Inevitably, this leads to problems: someone in an organization might use a free Web-based email system, for example, thereby exposing confidential documents over the Internet, or an employee might install a file-searching tool without realizing that it sends results outside the company's firewall. From a security and privacy viewpoint, the ramifications of such decisions are often overlooked; some are easy to anticipate, but others are more difficult to foresee and stem from a combination of factors. In this article, I review the security and privacy implications of one such decision: installing an externally run, accelerated Web proxy. I assume a worst-case scenario in which this decision is widespread (that is, the Web proxy has a large user base). I also consider three different levels of impact: on a proxy user, on an organization whose employees use a proxy, and on a Web site owner whose site is accessed through a proxy.

Figure 1 shows users accessing a Web page with and without a proxy. Requests without a proxy (in pink) go directly to the target Web server (pink path), but proxy requests go first to the proxy and then to the target Web server (red paths). If the proxy has a cached copy of the requested page, it might return that copy to avoid extra network activities.

XXVI. WEB CACHING (PROXY SERVER) The proxy server's main goal is to satisfy the client's request without involving the original Web server. It is a server acting like a buffer between the client's Web browser and the Web server. It accepts requests from the user and responds to them if it contains the requested page. If it does not have the requested page, it forwards the request to the original Web server and returns the response to the client. It uses a cache to store the pages. If the requested Web page is in the cache, it fulfills the request speedily.
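A minimal sketch of the behaviour just described, using Python's urllib and an in-memory dictionary as the cache (both illustrative choices, not part of any particular proxy product):

    from urllib.request import urlopen

    cache = {}  # url -> page body

    def handle_request(url):
        """Return the page for url, serving it from the cache when possible."""
        if url in cache:
            return cache[url]          # cache hit: no contact with the origin server
        body = urlopen(url).read()     # cache miss: forward the request to the origin
        cache[url] = body              # store the reply for future requests
        return body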

Figure 1. Internet users accessing a target Web site. Their requests are routed through their ISP, from node to node (black circles) to the target. The users with a Web proxy have their requests routed to the Web proxy first (red paths), whereas the users who don't have a proxy access the Web site directly (in pink).

The two main purposes of a proxy server are: 1. Improve performance: it saves the result for a particular period of time. If the same result is requested again and it is present in the cache of the proxy server, then the request can be fulfilled in less time, so it drastically improves performance. The major online search services have arrays of proxy servers to serve a large number of Web users. 2. Filter requests: proxy servers can also be used to filter requests. Suppose a company wants to restrict users from accessing a specific set of sites; this can be done with the help of proxy servers.

XXVII. WEB PROXY SECURITY A Web proxy is externally run if the service is controlled entirely by a third party with no connection to the end user, the end user's employer, or the end user's ISP. This situation is typical of Web-based services, which external corporations often provide freely and without contractual obligations.

XXVIII. WEB PROXIES AND USERS To evaluate the security implications of using an externally run Web proxy, let's first look at the consequences from the end user's viewpoint. Most casual users don't have a clear understanding of the system's workings or the implications of using it, even if the end user license agreement they accepted when installing or configuring the product mentions privacy considerations. Consequently, these users end up putting their own privacy and security at risk.

XXIX. PRIVACY A proxy owner can store and analyze any unencrypted traffic that goes through its Web proxy (pages read, links followed, forms filled out, and so on) [9], a situation that is especially unavoidable when that owner is also the connectivity provider (such as a company, a university, or an ISP). This situation is similar to a node mining unencrypted plaintext traffic flowing over the Internet between a source and a destination, but the difference is that information in a Web proxy is visible to a third party that otherwise wouldn't have access to it. Although the proxy owner can gather only a limited amount of information over an encrypted channel, users aren't protected from the accumulation of personal data during unencrypted browsing activities or when accessing Web sites that should use encryption but don't. Moreover, users can't directly detect such information-gathering activities.

XXX. DECEPTION Next, let's consider what proxies can send to users instead of what users can send through proxies. The data a proxy sends is completely under the proxy's control, which means it can modify the pages viewed without any obvious way for the user to detect those changes. For example, a proxy could claim that a particular site is temporarily unavailable, send back an older version of the page, or return a version that never actually existed. Ultimately, the user's view of the Web is under the proxy owner's control, so the user can be deceived or manipulated at will.
XXXI. CONTENT DOWNLOADS Another concern for users is the storage of unwanted content on their machines, which is most commonly found in accelerated Web proxy client-server architectures. (Classic Web proxies usually don't have a client program running on the user's machine, so this concern doesn't apply to them.) As part of prefetching, the proxy service can download pages to the user's machine that he or she never requested, which puts that person in the difficult position of having downloaded potentially offensive or inappropriate documents. This concern is often mitigated by the fact that the accelerator's client handles the prefetched pages, not the browser, so someone competent enough to know the difference might sort out the problem for the user. However, this doesn't help if the user's activity is directly monitored over the network, which is common practice in many companies. It is still possible to find out if the pages were, in fact, automatically prefetched, but doing so requires specific technical skills and understanding of the accelerated Web proxy's inner workings. In the best cases, the HTTP headers that the accelerator sends contain something specific, in which case a network administrator can capture and scan request content to identify prefetching requests. If requests from the accelerator are identical to requests from the client's Web browser, a detailed analysis of the client's system log will distinguish prefetched from requested pages. Note that, conversely, users can also intentionally get some offensive pages prefetched so that they can read them freely while claiming not to have downloaded those pages themselves.
XXXII. WEB PROXIES AND ORGANIZATIONS The use of Web proxies can also affect entire organizations. When employees start using an externally run proxy without the organization's knowledge or without understanding the consequences of their actions, they can expose some of the organization's confidential activities or bypass some of its rules.
XXXIII. USER ACTIVITIES If enough users in an organization use an externally run proxy, then that proxy's owner accumulates an enormous amount of information about the activities within the organization. Obviously, much of it might be inconsequential, but targeted data mining can reveal important information about an organization's technical choices, current prospects, and so forth.

Conversely, the company controlling a Web proxy might withhold, delay, or alter information sent back to users, thereby sending them down the wrong path or driving them away from important opportunities. Again, this situation isn't unique: an organization's ISP can do the same thing. However, if users operate a proxy without organizational consent, the organization hasn't explicitly chosen to use that proxy and didn't enter into a contractual agreement with the company offering the service.
XXXIV. ORGANIZATIONAL CONTROL An externally run Web proxy also has implications for organizational control because a proxy can bypass or defeat system checks. If the system in place is meant to block access to a list of forbidden sites by scanning the destination at the IP level, for example, it won't be able to trigger any hits, because outgoing requests go to the proxy, not the target URL. Thus, the controlling tool must be modified to analyze the targets in HTTP headers, because they contain the actual end destination. If the system in place scans incoming documents for forbidden patterns, it can also fail if the accelerated Web proxy modifies the encoding in some unexpected way, for example by compressing messages. Such modifications circumvent an organization's attempts to control user activities for productivity or other purposes; public libraries and schools might have to control content, but a proxy can forestall compliance plans. Of course, control is still possible (the site running the proxy can itself be banned), but the mechanisms in place must be adapted to handle this new situation, which comes at a price and requires a level of technical expertise that might be difficult to find in any given organization. Using proxies to circumvent checks is hardly a new way to escape controlling techniques: technically savvy users routinely use similar approaches to bypass filtering. However, casual users can also employ evasive techniques without intending to do so.
XXXV. INTERNAL INFORMATION DISCLOSURE An additional concern is the possibility of leaking internal network information to the company running the Web proxy. If user requests go to a proxy first before being processed locally, then information about document names and internal network topology can leak out. (This doesn't necessarily give the proxy owner access to those documents, though.) A similar concern is disclosure of information that is secret merely because nothing betrays its existence. Obviously, this practice is extremely insecure in the first place, but the problem worsens with an externally run Web proxy. If a user releases a document to another user by dropping it on the organization's Web site, for example, then using an externally run Web proxy discloses that document to the company running it (in this case, access to the document is possible).
XXXVI. WEB PROXIES AND WEB SITE OWNERS Centrally controlled Web proxies can have a substantial impact on Web site owners and application developers as well. They blur the picture of a site's actual usage and can interfere with the site's applications and its content delivery.
Content Control: If a large portion of a site's visitors use the same Web proxy provider, then the Web content provider should be concerned about the proxy provider's ability to alter or withhold some or all of the content being published.
(This is technically no different than what any host between a Web site and its users can already do, but in this case, the proxy provider is artificially included in a path with a potentially large number of users.) This behavior is considered malicious if the proxy owner (or a hacker) is intentionally interfering with any Web pages, but it can also be a technical side effect; for example, an excessive caching time at the proxy level might jeopardize the Web site owner's efforts to provide rapidly reactive content. Connectivity problems at the proxy level also affect Web site providers.
Site-Usage Tracking: Detailed site-usage statistics are crucial to most content providers: how many visitors viewed what information, what patterns did these visitors follow, and when and from where did they access the Web site? This information is important for maintaining and improving a site, but it can also be a source of additional revenue in the form of advertising. If a large portion of Internet users goes through a Web proxy, it jeopardizes the ability to track usage. The first problem is request origin, which mostly stems from a single source, the Web proxy. The second problem is page caching: when a popular Web proxy service delivers a cached version of a document, the traffic on the original site drops significantly. This differs from the situation in which a multitude of small proxies scattered across the Internet cache various pages, where a good ratio of actual accesses to the site would be maintained (instead of one proxy serving a very large user base). The third issue is prefetching: with automatically prefetched documents, it is difficult for Web site owners to make sense of a visit.
Web Applications:
Page prefetching [10] also sometimes interferes with Web-based applications because it involves automatically following links in a currently displayed page. If a link triggers an action in the application (such as logout or delete), then page prefetching can automatically trigger these actions and thus break the application. However, it is arguably safe to prefetch GET-based links for applications compliant with the HTTP 1.1 specification because, according to the specification, GET methods should not have the significance of taking an action other than retrieval, and thus shouldn't trigger any action in the application.
A Real-World Example: Google launched its Google Web Accelerator (GWA) [7], a freely distributed accelerated Web proxy. GWA is a good example of an externally run, accelerated Web proxy controlled entirely by a third party with no connection to end users, their employers, or their ISPs. HTTP proxies and caches are common, but they're scattered across the Internet, each managing relatively few users. A GWA of Google-sized proportions changes the situation because many users have their traffic proxied by this one source; GWA is thus an excellent real-world test bed because of its potential scale. Of course, I don't claim that GWA poses all the threats discussed in this article, and I certainly don't suggest that Google's aim was to create any of these issues; I merely use GWA to illustrate the fact that the problem outlined here could materialize. If we analyze GWA with respect to the threats outlined earlier, we should first note that even if the service were pre-installed on a user's machine, the current version advertises itself very clearly, so the user would likely notice a normal installation. Moreover, the user can configure prefetching and local caching behavior for the GWA. In the versions I tested (googlewebaccclient version 0.2.62.80-pintail.a, googlewebaccwarden.exe version 0.1.62.80-dogcatcher.a, and googlewebacctoolbar.dll version 0.1.62.80-ingot.a, in Firefox 1.5 on Windows XP SP2), prefetching [12] is enabled and local caching always checks for a newer version of the page by default; there is no obvious way to disable local caching entirely, although the user can delete the cache content. Citing security reasons, GWA doesn't handle SSL-encrypted traffic at all. Google makes no secret of storing the data sent through GWA (www.google.com/privacypolicy.html). Google also says that it temporarily caches cookies to improve performance, but it doesn't provide any details about how long cached data is kept (http://webaccelerator.google.com/support.html). GWA does compress data between the server and the client, which means that some GWA users could bypass some of the filters put in place by their organizations. On the positive side, the version we tested doesn't route internal information (that is, access to a machine by name or via a nonroutable IP), so the content disclosure concern we discussed earlier doesn't apply here. GWA also deals with caching issues by letting content providers specifically request that their pages not be cached. However, this assumes that GWA indeed doesn't cache the page, which the user or content provider can't control; it also means that site providers must disable page caching entirely, regardless of the caching source. Google's connectivity to the Internet is obviously outstanding, so in practice it is likely that having connectivity through GWA is actually an improvement over the vast majority of content providers.
(That said, because of its high exposure, Google is more likely than most content providers to be the target of various network-level attacks, as we'll discuss later.) Google has taken steps to prevent the problems of activity tracking to some degree; for example, it adds the real origin as part of its request, so a modified log analyzer can retrieve the real sources of hits. However, this doesn't help if GWA returns a cached version of the page (although Web site owners can get around this by requesting that the page not be cached at all). GWA also adds a special HTTP header, x-moz: prefetch, to its prefetching requests so that Web masters can configure the server to return a 403 HTTP response code (access forbidden) if they want to deny such a request. However, this places a large burden on technical teams, who now have to modify their Web servers and log analyzers to adapt to the decisions made by a single company. Moreover, they must follow GWA's evolution to maintain their tools and Web sites.
XXXVII. POTENTIAL WEAKNESS From a reliability viewpoint, one potential weakness of using a popular Web proxy service is that it creates a single point of failure. If numerous Internet users subscribe to the same service, an attacker might see the Web proxy as a target of choice because it has a formidable global impact. This problem is common to any popular service, but it is exacerbated by the level of control over users' Web sessions that Web proxies provide. Although many Web sites might be affected, only the proxy site's owner can assess such a crucial site's overall security. Other situations present similar threats not directly linked to Web proxies as such, but as consequences of the way Web proxies work.
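As an illustration of the header-based denial described above, the sketch below answers 403 to any request carrying the x-moz: prefetch header. The use of Python's http.server is purely illustrative; real sites would configure this behaviour in their own Web server and log analyzer.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class DenyPrefetchHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Deny requests marked as prefetches by the accelerator.
            if self.headers.get("x-moz", "").lower() == "prefetch":
                self.send_error(403, "Prefetch requests are not allowed")
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Regular request served.\n")

    if __name__ == "__main__":
        HTTPServer(("", 8080), DenyPrefetchHandler).serve_forever()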

XXXVIII. CONCLUSION Some Solutions and Assessing the Threat: So what can be done to protect users, organizations, and content providers against the threats we've outlined here? From a technical viewpoint, one approach is the systematic use of encryption and server authentication: it prevents both page modification and caching, and reduces activity monitoring to just the URL. This solution has some drawbacks, though: it has an overhead price, only the site provider can do it, and it amounts to making proxies essentially useless except as bridges. From the user and organizational viewpoint, the best approach seems to be a generic one, something that isn't specific to Web proxies but instead applies to using third-party services in general. Such services must be understood as ceding control over various things to another entity, so it is important to analyze and truly understand what you're giving up, how it could backfire, how likely it is to backfire, and then compare that with the real benefits you can expect from using the service. It is also important to understand how the service provider fits in with this: do you have a legally enforceable contract, and does this contract provide sufficient privacy protection? Why is the service offered? How long is the data kept? What is the long-term impact if the situation changes? In general, a company policy that prevents the modification of its computing environment seems like a sensible choice. However, this kind of policy is difficult to enforce in practice and should be coupled with extensive user education efforts as well as technical enforcement. We have seen that a third-party Web proxy (such as GWA) has the potential to create security and reliability problems because it can provide unwanted external control over Web activities, impacting end users, organizations, and content providers. It can also interfere with organizations' attempts to control their users' activities and potentially disclose important internal information. Furthermore, it can alter content providers' ability to accurately track activities on their sites and create a particularly dangerous single point of failure over which only the company running the service has control. These threats might never materialize if a trusted party manages the external Web proxy, as long as this trusted company isn't forced into providing information by governmental agencies. The threats outlined here haven't fully materialized in the real world yet, but several rogue players have enough incentives to at least attempt some of them, so organizations depending on this kind of architecture must be aware of the potential threats and do what they can to mitigate them.

REFERENCES:
[1] A. Luotonen and K. Altis, World-Wide Web Proxies, Computer Networks and ISDN Systems.
[2] G. Coulouris, Distributed Systems.
[3] Singhal and Shivaratri, Distributed Operating Systems.
[4] R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, Wiley, 2001.
[5] www.whatis.techtarget.com
[6] www.discovervirtualisation.com
[7] http://webaccelerator.google.com
[8] R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, Wiley, 2001.
[9] A. Luotonen and K. Altis, World-Wide Web Proxies, Computer Networks and ISDN Systems, vol. 27, no. 2, 1994, pp. 147-154.
[10] J. Wang, A Survey of Web Caching Schemes for the Internet, ACM SIGCOMM Computer Comm. Rev., vol. 29, no. 5, 1999, pp. 36-46.
[11] A. Rousskov and V. Soloviev, A Performance Study of the Squid Proxy on HTTP/1.0, World Wide Web, vol. 2, nos. 1-2, 1999, pp. 47-67.
[12] J. Domenech et al., The Impact of the Web Prefetching Architecture on the Limits of Reducing Users' Perceived Latency, Proc. IEEE/ACM Intl


A Comparative Study To Solve Job Shop Scheduling Problem Using Genetic Algorithms And Neural Network

This paper presents the difference between genetic algorithms and neural networks for solving the job shop scheduling problem. We describe the operations of the genetic algorithm and a genetic algorithm for solving the job shop scheduling problem. In this paper we compare the genetic algorithm and the neural network, and solve a fuzzy flexible job shop scheduling problem using a genetic algorithm and a neural network. In each generation, crossover and mutation are only applied to one part of the chromosome, and these populations are combined and updated by using half of the individuals. Index Terms: Genetic Algorithm, Neural Network, Job Shop Scheduling Problem.

XXXIX. INTRODUCTION

The job shop scheduling problem (JSSP) is one of the most well-known problems in both the field of production management and combinatorial optimization. The classical n-by-m JSSP studied in this paper can be described as follows: schedule n jobs on m machines with the objective of minimizing the completion time for processing all jobs. Each job consists of m operations with a predetermined processing sequence on specified machines, and each operation of the n jobs needs an uninterrupted processing time of given length. Operations of the same job cannot be processed concurrently, and each job must be processed on each machine exactly once. Efficient methods for solving the JSSP are important for increasing production efficiency, reducing cost and improving product quality. Moreover, the JSSP is acknowledged as one of the most challenging NP-hard problems, and no exact algorithm can be employed to solve the JSSP consistently even when the problem scale is small. It has therefore drawn the attention of researchers because of its theoretical, computational, and empirical significance since it

was introduced. Due to the complexity of the JSSP, exact techniques, such as branch and bound and dynamic programming, are only applicable to modest-scale problems. Most of them fail to obtain good solutions for large-scale problems because of the huge memory and lengthy computational time required. On the other hand, heuristic methods, including dispatching priority rules, the shifting bottleneck approach and Lagrangian relaxation, are attractive alternatives for large-scale problems. With the emergence of new techniques from the field of artificial intelligence, much attention has been devoted to meta-heuristics. One main class of meta-heuristics is the construction and improvement heuristic, such as tabu search and simulated annealing. Another main class of meta-heuristics is the population-based heuristic. Successful examples of population-based algorithms include the genetic algorithm (GA), particle swarm optimization (PSO), artificial immune systems and their hybrids, and so on. Among the above methods, the GA, proposed by John Holland, is regarded as a problem-independent approach and is well suited to dealing with hard combinatorial problems. GAs use the basic Darwinian mechanism of survival of the fittest and repeatedly utilize the information contained in the solution population to generate new solutions with better performance. Classical GAs use binary strings to represent potential solutions. One main problem in classical GAs is that binary strings are not naturally suited to the JSSP. Another problem in classical GAs is premature convergence. Although GAs have better performance than most conventional methods, they cannot guarantee to resist premature convergence when the individuals in the population are not well initialized. From the viewpoint of the JSSP itself, it is a hard combinatorial optimization problem with constraints. The goal of the scheduling methods is to
find a solution that satisfies the constraints. However, some of the infeasible solutions are similar to the feasible optimal solutions, and may provide useful information for generating the optimal solution.
XL. GENETIC ALGORITHM Genetic algorithms are one of the best ways to solve a problem for which little is known. They are a very general algorithm and so will work well in any search space. Genetic algorithms use the principles of selection and evolution to produce several solutions to a given problem. Genetic algorithms were developed in the USA in the 1970s [3]. A simple application of a genetic algorithm is maximizing f(x) = x^2 for x = 0, ..., 31. The most common type of genetic algorithm works like this: a population is created with a group of individuals created randomly. The individuals in the population are then evaluated. The evaluation function is provided by the programmer and gives the individuals a score based on how well they perform at the given task [2]. A) Individual: any possible solution. B) Population: group of all individuals. C) Search space: all possible solutions to the problem. D) Chromosome: blueprint for an individual. E) Trait: possible aspect of an individual. F) Allele: possible setting for a trait. G) Locus: the position of a gene on the chromosome. H) Genome: collection of all chromosomes for an individual.
Fig. 1 Simple problem using a genetic algorithm.
The Operations of a Genetic Algorithm 1. Selection (based on misfit): a filter to select the models that best fit the data is commonly used. You will need a method to calculate this fitness [4]. Using the space optimization example, the fitness method simply calculates the amount of free space each individual/solution offers. 2. Crossover (swapping information): exchange (swapping) of genes between parent models in order to produce child models (the next generation). It controls the degree of mixing and sharing of information, and is performed only by those individuals that pass the fitness test. Fig. 2 shows an example of how two parents cross over to make two children.
Fig. 2 Swapping numbers.
3. Mutation: making random changes in genes. It is done in order to maintain some degree of randomness in the population (which helps to avoid local minima). The probability of mutation should be kept low in order to prevent an excess of randomness. 4. Reproduction: a pair of "parent" solutions is selected to generate child solutions with many of the characteristics of their parents. Reproduction includes crossover (high probability) and mutation (low probability).
Advantages of Genetic Algorithms A) They efficiently search the model space, so they are more likely (than local optimization techniques) to converge toward a global minimum. B) There is no need to linearize the problem. C) There is no need to compute partial derivatives. D) They can quickly scan a vast solution set. E) They are very useful for complex or loosely defined problems.
Disadvantages of Genetic Algorithms A) They show a very fast initial convergence followed by progressively slower improvements (sometimes it is good to combine them with a local optimization method). B) In the presence of lots of noise, convergence is difficult and the local optimization technique might be useless. C) Models with many parameters are computationally expensive. D) Sometimes models that are not particularly good are better than the rest of the population and cause premature convergence to a local minimum. E) The fitness of all the models may be similar, so convergence is slow. F) It is too hard for the individuals to venture away from their current peak.
The flowchart in Fig. 3 illustrates the basic steps in a genetic algorithm.
Fig. 3 Flowchart of a genetic algorithm.
Fitness: a measure of the goodness of the organism, expressed as the probability that the organism will live another cycle (generation); it is the basis for the natural selection simulation. Offspring: common task.
Applications of Genetic Algorithms 1. Automotive design: using genetic algorithms to both design composite materials and aerodynamic shapes for race cars and regular means of transportation (including aviation) can return combinations of the best materials and best engineering to provide faster, lighter, more fuel-efficient and safer vehicles for all the things we use vehicles for. Rather than spending years in laboratories working with polymers, wind tunnels and balsa wood shapes, the process can be done much more quickly and efficiently by computer modeling, using GA searches to return a range of options that human designers can then put together however they please. 2. Computer gaming: GAs have been programmed to incorporate the most successful strategies from previous games into the programs. 3. Trip, traffic and shipment routing: new applications of the GA known as the "Traveling Salesman Problem" (TSP) can be used to plan the most efficient routes and scheduling for travel planners, traffic routers and even shipping companies: the shortest routes for traveling, the timing to avoid traffic tie-ups and rush hours, and the most efficient use of transport for shipping, even including pickup loads and deliveries along the way. The program can be modeling all this in the background while the human agents do other things, improving productivity as well. Chances are increasing steadily that when you get that trip plan packet from the travel agency, a GA contributed more to it than the agent did. 4. Business: using genetic algorithms we can solve business problems; first we analyze and evaluate the problem, then obtain an optimal solution. A minimal worked example for the simple f(x) = x^2 problem is sketched below.
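The sketch below is a minimal Python illustration of the simple problem mentioned above: maximizing f(x) = x^2 for x = 0, ..., 31 with 5-bit chromosomes, fitness-proportional selection, single-point crossover and low-probability mutation. The population size, rates and generation count are illustrative choices, not values from the paper.

    import random

    def fitness(bits):                 # decode the 5-bit chromosome and score it
        return int(bits, 2) ** 2

    def select(pop):                   # fitness-proportional (roulette-wheel) selection
        return random.choices(pop, weights=[fitness(c) + 1 for c in pop], k=2)

    def crossover(a, b):               # single-point crossover
        point = random.randint(1, len(a) - 1)
        return a[:point] + b[point:], b[:point] + a[point:]

    def mutate(bits, rate=0.05):       # flip each gene with a small probability
        return "".join(b if random.random() > rate else "10"[int(b)] for b in bits)

    pop = ["".join(random.choice("01") for _ in range(5)) for _ in range(6)]
    for generation in range(30):
        children = []
        while len(children) < len(pop):
            p1, p2 = select(pop)
            c1, c2 = crossover(p1, p2)
            children += [mutate(c1), mutate(c2)]
        # combine parents and children and keep the best half
        pop = sorted(pop + children, key=fitness, reverse=True)[:len(pop)]

    best = max(pop, key=fitness)
    print(best, int(best, 2), fitness(best))   # converges toward 11111 (x = 31)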

XLI. NEURAL NETWORK

A neural network is a network of many very simple processors ("units"), each possibly having a small amount of local memory [10]. The units are connected by unidirectional communication channels ("connections") [6], which carry numeric (as opposed to symbolic) data. The units operate only on their local data and on the inputs they receive via the connections; a minimal sketch of a single unit's computation appears after the list of applications below. The design motivation is what distinguishes neural networks from other mathematical techniques: a neural network is a processing device, either an algorithm or actual hardware, whose design was motivated by the design and functioning of the human brain and its components. There are many different types of neural networks, each of which has different strengths particular to its applications. The abilities of different networks can be related to their structure, dynamics and learning methods. Neural networks offer improved performance over conventional technologies in areas that include machine vision, robust pattern detection, signal filtering, virtual reality, data segmentation, data compression, data mining, text mining, artificial life, adaptive control, optimization and scheduling, complex mapping and more.
Advantages of Neural Networks
A) Systems which combine disparate technologies.
B) Systems which capitalize on the synergy of multiple technologies.
C) Systems which implement multiple levels or facets of activities from different perspectives. Meta-strategies such as simulated annealing (SA), tabu search (TS) and genetic algorithms (GAs) guide a local heuristic, embedded within their structure, through the search domain and hence are able to provide a superior method (cf. the results of Vaessens et al. 1996).
Disadvantages of Neural Networks
A) HOPFIELD: For complex problems the Hopfield model is unable to converge to the global optimum, as it has a tendency to

become trapped within local minima solutions; hence there is no guarantee of achieving good solutions.
B) THE BEP NEURAL MODEL: Although the BEP model is able to perform classification effectively, it exhibits limited success in dealing with optimisation problems because of the inherent lack of generic patterns between inputs and outputs in such problems.
Applications of Neural Networks
1. Character Recognition: The idea of character recognition has become very important as handheld devices like the Palm Pilot have become increasingly popular. Neural networks can be used to recognize handwritten characters.
2. Image Compression: Neural networks can receive and process vast amounts of information at once, making them useful in image compression. With the Internet explosion and more sites using more images, using neural networks for image compression is worth a look.
3. Stock Market Prediction: The day-to-day business of the stock market is extremely complicated, and many factors weigh in whether a given stock will go up or down on any given day. Since neural networks can examine a lot of information quickly and sort it all out, they can be used to predict stock prices.
4. Traveling Salesman Problem: Interestingly enough, neural networks can solve the traveling salesman problem, but only to a certain degree of approximation.
5. Medicine, Electronic Nose, Security, and Loan Applications: These are some applications that are in their proof-of-concept stage, with the exception of a neural network that decides whether or not to grant a loan, something that has already been used more successfully than many humans.
6. Miscellaneous Applications: These are some very interesting (albeit at times a little absurd) applications of neural networks.
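As a concrete illustration of the description above, a unit receives numeric inputs over weighted, unidirectional connections, combines them with its local data (a bias), and passes the result through an activation function. The sketch below is a minimal example; the weights, inputs and step activation are illustrative assumptions only.

# Minimal sketch of a single neural-network unit: it receives numeric inputs
# over weighted connections, sums them with a bias, and applies an activation.
def unit_output(inputs, weights, bias, threshold=0.0):
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 if net_input > threshold else 0.0   # simple step activation

# Hypothetical example: three incoming connections.
print(unit_output(inputs=[1.0, 0.0, 1.0], weights=[0.5, -0.3, 0.8], bias=-1.0))  # 1.0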

XLII. SOLUTION OF JOB SHOP SCHEDULING PROBLEM USING GENETIC ALGORITHM
The job shop scheduling problem (JSSP) is one of the most well-known problems in both production management and combinatorial optimization. The classical n-by-m JSSP studied in this paper can be described as follows: schedule n jobs on m machines with the objective of minimizing the completion time for processing all jobs [7]. Each job consists of m operations with a predetermined processing sequence on specified machines, and each operation of the n jobs needs an uninterrupted processing time of given length. Operations of the same job cannot be processed concurrently, and each job must be processed on each machine exactly once. Efficient methods for solving the JSSP are important for increasing production efficiency, reducing cost and improving product quality.

There are two important issues in applying a genetic algorithm to the job shop problem: (1) how to encode a solution of the problem into a chromosome so as to ensure that every chromosome corresponds to a feasible solution, and (2) how to enhance the performance of the genetic search by incorporating traditional heuristic methods [5]. The job shop problem with alternative machines is decomposed into two sub-problems: first, operations are allocated to specific machines; second, the sequence of the operations allocated to each machine is determined with respect to the specified operation-sequence constraints. Many approaches have been used to find solutions to job scheduling problems; dispatch rules, for example, have been used for this purpose (Baker, 1974). The operations of a given job have to be processed in a given order. The problem consists in finding a schedule of the operations on the machines, taking into account the precedence constraints, that minimizes the makespan (Cmax), that is, the finish time of the last operation completed in the schedule. Each operation uses one of the m machines for a fixed

duration. Each machine can process at most one operation at a time, and once an operation initiates processing on a given machine it must complete processing on that machine without interruption. The JSSP is acknowledged as one of the most challenging NP-hard combinatorial optimization problems (Lenstra and Rinnooy Kan, 1979), and no exact algorithm can be employed to solve it consistently even when the problem scale is small. In the genetic approach it is assumed that a potential solution to the problem may be represented as a set of parameters; during the reproductive phase, individuals are selected from the population and recombined, producing offspring which comprise the next generation. In the example with alternative machines there are four jobs, each with three different operations to be processed according to a given sequence; there are six different machines, and the alternative routings and processing times are shown.
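One common answer to encoding issue (1) above is an operation-based chromosome: a permutation with repetition of job indices in which the k-th occurrence of a job index means "schedule that job's k-th operation next". Such a chromosome always decodes to a feasible schedule. The sketch below shows this decoding and the resulting makespan; the 3-job, 3-machine data are invented for illustration and are not the instance used later in the paper.

# Operation-based encoding for the JSSP: a chromosome is a list in which each
# job index appears once per operation; the k-th occurrence of job j means
# "schedule the k-th operation of job j next".  Any permutation decodes to a
# feasible schedule, which addresses encoding issue (1) above.

# Hypothetical instance: jobs[j] = [(machine, processing_time), ...] in order.
jobs = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3), (0, 1)],
]

def makespan(chromosome):
    next_op = [0] * len(jobs)          # next unscheduled operation of each job
    job_ready = [0] * len(jobs)        # completion time of each job's last operation
    machine_ready = [0] * 3            # completion time on each of the 3 machines
    for job in chromosome:
        machine, duration = jobs[job][next_op[job]]
        start = max(job_ready[job], machine_ready[machine])
        finish = start + duration
        job_ready[job] = machine_ready[machine] = finish
        next_op[job] += 1
    return max(job_ready)

# Each job index appears 3 times because each job has 3 operations.
print(makespan([0, 1, 2, 1, 0, 2, 2, 1, 0]))

A genetic algorithm can then evolve such chromosomes using permutation-preserving crossover and mutation operators, with the makespan (or its inverse) as the fitness.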

XLIII. SOLUTION OF JOB SHOP SCHEDULING PROBLEM USING NEURAL NETWORK
The job-shop scheduling neural network should contain units that are capable of representing the starting times of the operations (S units) [9], whether sequence constraints are violated (SC units), whether resource constraints are violated (RC units), and the value of the Yipk indicator variables (Y units).
S units. The net input Ni of these units is calculated by adding the previous activation Ai(t-1) (thus simulating a capacitor) to the summed incoming weighted activation, Ni(t) = Ai(t-1) + Σj wij Aj(t-1). The selected unit for this purpose is of a deterministic linear threshold type with an activation Ai, a threshold x and a net input Ni: Ai = Ni if Ni > x, and Ai = x otherwise. Since an operation can never start before time 0, the starting time of the entire schedule, the threshold x can be set at 0, resulting in the implementation of a non-negativity constraint. The thresholds can also be determined in a problem-specific manner by calculating the earliest possible starting time of the operations. The earliest possible starting time of the first operation of a job is the starting time of the scheduled time-span, i.e. 0 in the example [6]; the second operation's (i(j+1)k) earliest possible starting time is 0 + tijk, and the earliest possible starting time of the third operation of job 1 is 0 + tijk + ti(j+1)k. In this manner the thresholds of the units representing the starting times are determined, thus reducing the search space and resulting in a more stable network [5].

SC and RC units. The net input of these units is calculated by adding a bias to the summed incoming weighted activation, Ni = Bi + Σj wij Aj. The bias Bi added to the incoming weighted activations of the connected units is the processing time of the operation, as formulated in the equation this unit represents. The constraint-representing units are of a deterministic negated linear threshold type, with x being the threshold and Ni the net input: Ai = -Ni if Ni < x, and Ai = 0 otherwise.

This activation function allows for an indication of the violation of the equation being represented; the bias represents the problem-specific operation processing times.
Y units. The net input of the Y units is calculated by summation of the incoming weighted activations, Ni = Σj wij Aj.

The activation of a Y unit is determined according to a deterministic step function: Ai = 1 if Ni > 0, and Ai = 0 otherwise. The activation of this unit represents whether job 1 precedes job 2 on machine k or job 2 precedes job 1.

The proposed structure consists of three layers: the bottom layer containing the S units, the middle layer containing the SC and RC units, and the top layer containing the Y units. As an example, a 2/3/J/Cmax scheduling problem is used, with its machine allocations and operation times presented in Tables 1 and 2. Before a dedicated neural network can be designed, an integer linear programming representation has to be created according to the method presented. For the sequence constraints this translation results in n(m-1) = 4 sequence constraints of the type Sijb - Si(j-1)a - ti(j-1)a >= 0:
1) S122 - S111 - 5 >= 0
2) S133 - S122 - 8 >= 0
3) S221 - S213 - 7 >= 0
4) S232 - S221 - 3 >= 0

For the resource constraints the value of the constant H must be determined. This constant should have a value large enough to ensure that one of the two disjunctive statements holds while the other is eliminated; here H is set to 35, which exceeds the sum of all processing times in the example. For the example problem there are mn(n-1) = 6 resource constraints of the type Spjk - Sijk + (-H · Yipk) + H - tijk >= 0 and Sijk - Spjk + (H · Yipk) - tpjk >= 0:
5) S221 - S111 + (-35 · Y121) + 35 - 5 >= 0
6) S111 - S221 + (35 · Y121) - 3 >= 0
7) S232 - S122 + (-35 · Y122) + 35 - 8 >= 0
8) S122 - S232 + (35 · Y122) - 9 >= 0
9) S213 - S133 + (-35 · Y123) + 35 - 2 >= 0
10) S133 - S213 + (35 · Y123) - 7 >= 0
In total there are n(nm-1) = 10 equations, nm = 6 starting-time (Sijk) variables and mn(n-1)/2 = 3 disjunction (Yipk) variables for the 2-job, 3-machine problem.

The first layer of the neural network designed for the example problem consists of 6 Sijk units, the second layer consists of 10 constraint units and the third layer consists of 3 Y units. The thresholds of the S units can be determined in a problem-specific manner; for instance, the threshold of the S unit representing the third operation of job 1 (S133) can be set at 13 (t111 + t122). The first unit of the second layer (representing the first sequence equation), for instance, collects negated information (connection weight -1) from the unit representing S111 and positive information (weight +1) from the unit representing S122. Together with the bias of this unit (-5), the violation of this constraint can be determined. If, for instance, S111 = 1 and S122 = 2, a constraint violation should be signalled, since the second operation starts before the first operation has ended: the net input of the first constraint unit, SC1 in the figure, will be 2 - 1 - 5 = -4, resulting in an activation of 4 and signalling a violation. This information has to be fed back to the S units to cause an appropriate change in starting times. For this reason the S units collect information from the SC units (see the figure); the corresponding S units receive this redirecting information, resulting in an inhibition (4 × -1) of the S111 unit and an excitation (4 × +1) of the S122 unit, thus working towards an acceptable solution. If these feedback weights are set correctly, feasible solutions will be generated without requiring explicit initialisation of the S units.

The resource constraints are implemented in the general structure presented in the figure. The RC units collect information from the appropriate S units and the Y units according to the resource equations. Suppose the starting time of operation 1 of job 1 on machine 1 (S111) is 1, and the starting time of operation 2 of job 2 on machine 1 (S221) is 2. In that case unit Y121 receives a net input of -1 + 2 = 1, resulting in an activation of 1 and signalling that S111 precedes operation S221. The RC unit representing equation 5 receives -1 from S111, 2 from unit S221 and -35 from Y121; this value added to the bias (30) of this RC unit results in a net input of -5, resulting in an activation of 5 and signalling a violation. The S111 and S221 units receive this activation through their weighted feedback connections, resulting in an advanced starting time of operation 111 and a delayed starting time of operation 221. The RC unit representing equation 6 receives 1 from S111, -2 from S221 and 35 from Y121; this value added to the bias (-3) of this RC unit results in a net input of 31 and hence an activation of 0.
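To make the unit computations just described easier to follow, the short sketch below reproduces the SC1 calculation using the activation functions defined earlier in this section. The weights (+1, -1) and the bias (-5) come from the sequence equation S122 - S111 - 5 >= 0; everything else is an illustrative assumption.

# Activation functions as described for the job-shop network.
def constraint_activation(net_input, threshold=0.0):
    # Deterministic negated linear threshold: the activation is the (positive)
    # amount by which the constraint is violated, and 0 when it is satisfied.
    return -net_input if net_input < threshold else 0.0

def step_activation(net_input):
    # Y units: 1 if the net input is positive, otherwise 0.
    return 1.0 if net_input > 0 else 0.0

# SC1 represents the sequence constraint S122 - S111 - 5 >= 0.
S111, S122 = 1.0, 2.0
sc1_net = (+1) * S122 + (-1) * S111 + (-5)      # weighted inputs plus bias
sc1_act = constraint_activation(sc1_net)
print(sc1_net, sc1_act)                          # -4.0 4.0 -> violation signalled

# The violation is fed back to the S units through weighted connections,
# inhibiting S111 (feedback weight -1) and exciting S122 (feedback weight +1).
print((-1) * sc1_act, (+1) * sc1_act)            # -4.0 4.0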

Fig. 4 Solving the job shop scheduling problem using a neural network.
XLIV. COMPARISON
A) Genetic algorithms (GAs) essentially started with the work of Holland (1975), who in effect tried to use Nature's genetically based evolutionary process to investigate unknown search spaces for optimal solutions.
B) Neural networks (NNs) are based on the early work of McCulloch and Pitts (1943), who built a first crude model of a biological neuron with the aim of simulating essential traits of biological information handling.
C) For a comprehensive survey of the use of evolutionary algorithms, and GAs in particular, in management applications, see Nissan (1995): in industry, production planning, operations scheduling, personnel scheduling, line balancing, grouping orders, sequencing and siting; in financial services, risk assessment and management, developing dealing rules, modeling trading behavior, portfolio selection and optimization, credit scoring, and time-series analysis.
D) The way a NN learns depends on the structure of the NN and cannot be examined separately from its design.
E) In GA theory it is usually necessary to put related parameters together in the genome; in our case, we keep the weights corresponding to the same hidden-layer neuron close together in the genome.
F) The name "neural networks" already describes what they try to do, i.e. to handle information the way biological neurons do, and thus to use the accumulated experience of nature, or evolution, in developing those obviously effective and viable tools for real-time prediction tasks.
G) For the two extreme cases of building-block scaling, uniform and exponential, genetic algorithms with perfect mixing have time complexities of O(m) and O(m^2) respectively.
H) Genetic algorithms are fast, accurate and easy to use, and in this respect better than neural networks.
XLV. CONCLUSION
In this paper we have presented a genetic algorithm and a neural network for solving the job shop scheduling problem. We have followed the evolution of genetic algorithms and neural networks and explained the benefits and applications of both. The genetic algorithm produced quite good results, apparently at least as good as those in the literature; with the help of a genetic algorithm we can solve business problems and obtain an optimum solution. We have worked through a numerical example of solving a job shop scheduling problem using both a genetic algorithm and a neural network. This research considered only nominal features; our work extends in a natural way to other varieties of features.
REFERENCES
[1] James F. Frenzel received a Ph.D. in Electrical Engineering from Duke University in 1989.
[2] Holland, J.H., Adaptation in Natural and Artificial Systems, Ann Arbor: The University of Michigan Press, 1975.
[3] Goldberg, D., Genetic Algorithms, Addison-Wesley, 1988.
[4] Baker, K.R. (1974), Introduction to Sequencing and Scheduling, Wiley, New York.
[5] Foo, Y.P.S. and Takefuji, Y. (1988a), Neural Networks for Solving Job-Shop Scheduling.
[6] Co-Evolution of Neural Networks for Control of Pursuit & Evasion.
[7] Lageweg, B.J., Lenstra, J.K. and Rinnooy Kan, A.H.G. (1977), Job shop scheduling by implicit enumeration, Management Science, Vol. 24, pp. 441-450.
[8] Garey, M., Johnson, D. and Sethi, R. (1976), The complexity of flowshop and jobshop scheduling, Mathematics of Operations Research, 1, pp. 117-129.
[9] Mahfoud, S.W. and Goldberg, D.E., "A genetic algorithm for parallel simulated annealing," in Parallel Problem Solving from Nature, 2, R. Manner and B. Manderick (eds.), Elsevier Science Publishers B.V., 1992.
[10] Werbos, P.J., An Overview of Neural Networks.



Innovation & Entrepreneurship In Information And Communication Technology

Abstract: This paper describes ICT (information and communication technology) and the new innovations related to it. ICT, as we all know, is a wider perspective on information technology: it deals with unified communications (UC) and the integration of telecommunications and audio-visual systems with modern IT. This paper illustrates new innovations by Cisco for both larger and smaller organizations. Unified communications are an important aspect of ICT: they are integrated to optimize business processes and combine real-time as well as non-real-time communication with business processes and requirements. ICT is a powerful tool for the development of various business and IT activities and addresses the economic issues of business in the digital era. Later in the paper the focus is on telecommunications: telecommunication links and media play a vital role in the smooth running of any business as well as in the success of ICT. Innovation and digital entrepreneurship provide better chances to rise in the ICT field; digital entrepreneurship describes the relationship between an entrepreneur and the digital world.
Index Terms: communication, technology, economy, unified, Cisco

XLVI. INTRODUCTION
The world of information and communications technology is constantly changing, without stopping anywhere. ICT covers all technical means used to handle information and aid communication, including computer and network hardware and communication links as well as the necessary software. In other words, ICT consists of IT as well as telephony, broadcast media, all types of audio and video processing and transmission, and network-based control and monitoring functions. The term ICT is now also used to refer to the merging of audio-visual and telephone networks with computer networks through a single cabling or link system. There are large economic incentives (huge cost savings due to the elimination of the telephone network) to merge the audio-visual, building-management and telephone networks with the computer network system using a single unified system of cabling, signal distribution and management.
XLVII. NEW INNOVATIONS IN ICT
An innovation starts as a concept that is refined and developed before application. Innovations may be inspired by reality. The innovation process, which leads to useful technology, requires:
A) Research
B) Development
C) Production
D) Marketing
E) Use
In this section we describe some of the most significant recent innovations in the fields of unified communications, telecommunications and audio-visual systems.
XLVIII. MICROSOFT'S LITE GREEN IT PROJECT
The Microsoft research labs in India have been working on a project called Lite Green, which is used to reduce energy bills and improve energy efficiency. It is a very important innovation in the field of ICT because desktops consume close to 100-220 watts when running at full capacity and 62-80 watts when running at close to zero percent CPU usage. The project is most effective during weekends and overnight, when the energy saving is close to 80 percent.
XLIX. SOFTWARE DEFINED RADIO
Software defined radio (SDR) is a radio communication system in which components that have typically been implemented in hardware (for example filters, amplifiers, modulators/demodulators, detectors, etc.) are instead implemented by means of software on a personal computer or embedded computing devices. This brings benefits to every actor involved in the telecommunication market: manufacturers, operators and

users. The advantage for users is the ability to roam their communications onto other cellular systems and take advantage of worldwide mobility and coverage. A basic SDR system may consist of a personal computer equipped with a sound card, or other analog-to-digital converter, preceded by some form of RF front end. Software radios have significant utility for the military and for cell-phone services, both of which must serve a wide variety of changing radio protocols in real time.
L. IPTV
Internet Protocol television (IPTV) is a system through which Internet television services are delivered using the architecture and networking methods of the Internet Protocol Suite over a packet-switched network infrastructure, e.g. the Internet and broadband Internet access networks, instead of being delivered through traditional radio-frequency broadcast, satellite signal or cable television (CATV) formats. Many regulations are emerging in the information and entertainment sector due to the changing technological scenario coupled with the digitization of the broadcasting industry. This changing environment has led to the growing popularity of IPTV at the international level, although the scope of IPTV in India is not yet highly recognized. IPTV services may be classified into three main groups:
A) Live television, with or without interactivity related to the current TV show.
B) Time-shifted programming: catch-up TV that replays a TV show that was broadcast hours or days ago, and start-over TV that replays the current TV show from its beginning.
C) Video on demand (VOD): browsing a catalog of videos not related to TV programming.
LI. MOBILE TV
Mobile TV is expected to see substantial growth in the Asia-Pacific region. Mobile TV has already arrived in India and its future is bright: it is a wireless device and service, and wireless services have been a large success in India, with adoption in both urban and rural areas rising steadily. IPTV, when introduced in the country, was considered to be the next big technology driver in the telecom industry; however, the service did not pick up as expected, due to factors including low broadband penetration and slow Internet access speeds.
LII. UNIFIED COMMUNICATIONS 300 SERIES
The Unified Communications 300 Series (UC300) is part of Cisco's Foundational Unified Communications (UC) offering, which provides basic or foundational UC features for small businesses: typically voice, data and wireless integration, plus some basic integrated messaging applications. The UC300 is positioned for businesses that require more basic UC features at an affordable price. Cisco's earlier Unified Communications 500 Series (UC500), also for smaller businesses, belongs to the Advanced UC offering, which has a more advanced feature set including video, enhanced security and mobility.
LIII. CISCO UNIFIED COMMUNICATIONS MANAGER BUSINESS EDITION 3000
The Cisco Unified Communications Manager Business Edition 3000 is an all-in-one solution specifically designed for mid-sized businesses with 75-300 users (400 total devices) and up to 10 sites (nine remote sites) with centralized call processing. Cisco Unified Communications Manager software, the Cisco Unity Connection messaging solution (12 ports) and Cisco Unified Mobility are pre-installed on a single Cisco Media Convergence Server.
LIV. DIGITAL ENTREPRENEURSHIP
Entrepreneurship is the act of being an entrepreneur, which can be defined as "one who undertakes innovations, finance and business activities in an effort to transform innovations into economic goods". The term digital entrepreneurship is introduced to describe an organization in digital terms: each and every detail of the enterprise is captured digitally.
LV. CONCLUSION
The aim of this paper has been to focus on emerging innovations in information and communication technology. We can conclude that ICT is succeeding not only in the area of wireless devices, for example mobile TV, but also in other fields of ICT such as audio-visual (and wired) systems.


Insider Threat: A Potential Challenge for the Information Security Domain

Abstract: The insider threat is ever expanding its proliferation in the information technology sector. Managing such a threat is one of the most exquisite challenges for information security professionals, and it is also one of the earnest duties of the members of the board and the executives of the company concerned. Insiders have the exceptional privilege of accessing the various vital information and information systems in their organizations, and they sometimes misuse this privilege for a variety of reasons. Our studies depict that such a threat can cause unbounded destruction to the business of the organization and create a highly exacerbated situation in which the organization struggles to achieve its objectives. In this paper we deliver the results of an empirical study showing the several reasons which tend to turn an insider of an organization hostile and the various methods used by insiders to carry out IT sabotage; we also survey various measures used to deter, detect and mitigate malicious insider threats.

Index Terms: Insider threat, contentious threat, disgruntled employee, mala fide intention, instigation.
LVI. INTRODUCTION
Ever since the expansion of computerized data entry in the field of information technology, we have been paving the way towards the digital era, and the potential security threats attached to it are growing in vigor accordingly, creating an encumbering situation that organizations must face and counter. One such contentious threat is the insider threat. Insiders are typically those legitimate users who have, or had, authorization to access the organization's critical information and information systems, such as trade secrets, account numbers, social security numbers, intellectual property and personal/health records, as Fig. 1 illustrates. Fig. 1 further shows that they furnish such critical information, which may be stored on the network or on shared drives of the information systems, to counterparts of the organization such as its market competitors, regulators, unauthorized internal users or the press and media, with mala fide intentions, so as to obliterate the confidentiality, integrity and availability of the organization's information or information systems. In short, it is the main cause of data leakage from organizations. This results in huge financial loss to such organizations, loss of assets and company defamation, and can even lead to the closure of the organization's business. These types of threat exist across many disciplines, including environmental and technological security. Placing the appropriate level of defense and security against such threats is a stiff challenge for each concern; it is a very time-consuming and at the same time costly practice. The magnitude of the threat posed by insiders is greater than that of outsider threats, mainly due to their enormous level of knowledge of the organization's vital and sensitive information. Various security measures have been taken to date, such as multilevel security policies and access control lists, but they are not up to the mark in mitigating the risk of insider threat.


LVII. METHODS OF ATTACK BY INSIDERS
It has been stated that most insiders were believed to be disgruntled and that most of them commit their crimes to take revenge. They take revenge mainly because of some negative incident that occurred in the organization, such as dissatisfaction with their salary package. It has been found that 95% of insiders steal data during working hours. 85% of insiders use their own credentials to perform IT sabotage, whereas 44% of them use unauthorized accounts which they created previously. The survey also reveals that 15% use technical methods and means to carry out their attacks, for example with the help of logic bombs, viruses and various kinds of spyware, which they plant inside the targeted computers or computer systems. Most insiders steal or alter vital information during the normal working hours of their duties. Roughly 8% of insiders use remote access from outside the organization to access the secret information of their employers. Findings reveal that most insiders are male, technical, and hold highly dignified designations in the organization; in the majority of cases they are former employees. Social engineering is also one of the biggest weapons in the hands of insiders for committing data-stealing activities. Fig. 2 illustrates ZDNet Asia's 2008/09 survey of the region's top IT security priorities, in which special emphasis is given to the insider threat. The survey reveals that, among the existing threats, protection against the insider threat is a priority for 52.8 percent, higher than the others.
LVIII. PROPOSED COUNTERMEASURES AGAINST INSIDER ATTACKS
The following countermeasures against insider attacks are proposed; their deployment will benefit the organization at large in combating insider attacks.

Fig. 2 Top IT security priorities. Source: ZDNet Asia IT Priorities Survey 2008/2009.

A) Conduct a proper and result-oriented risk assessment specifically for the insider threat.
Comprehensive risk-based security should be implemented by management across the whole organization so as to provide an appropriate level of protection to the vital assets of the organization. It is a fact that one cannot implement 100% protection against every

risk factor in the organization for every resource, but an adequate level of protection can be provided to the most vital and critical resources (Ciechanowicz, 1997). A real security goal of every organization is to protect its critical assets and resources against every possible internal and external threat. Many organizations conduct risk assessments, but they fail to give proper and special emphasis to the insider

threats of the organization, with the result that only partial protection is provided. During the assessment they must not overlook the insiders of the organization; it is imperative that both qualitative and quantitative risk assessments be carried out specifically for the insiders of the organization. Case study: An organization failed to secure its vital data and computer systems from one of its disgruntled employees. The organization ran a business of maintaining a database of phone numbers and addresses for emergency services. The insider deleted the whole database from the server of the organization's network operation centre, getting access to the server by defeating all its physical security through the illegal use of the network administrator's badge. The situation then turned even worse, as the organization had no backup mechanism left: the backup tapes for recovering the database, which resided inside the organization's network operation centre, were also stolen by the malicious insider. It was noted that no outsider could have caused such a disaster as easily as this insider did; the main reason for the calamity is that the insider was well acquainted with the security features of the organization, so exploiting that security was easier for the insider than for any outsider. Had the organization conducted a proper risk assessment prior to the incident to ascertain which vulnerabilities and threats existed, it could easily have overcome this risk factor.
B) Separate security policies and procedures should be stated in the organization for insiders.
In an organization where all sensitive information is in digitized form, it is highly recommended that, along with the general security policy of the organization, a separate security policy and procedure be drafted for all conduct and activities of its insiders, covering, for example, the implementation of strong password and account management (Roy Sarkar, 2011). This effort will result in closer and more stringent control over insiders' daily activities in the organization and prevent misunderstandings among the employees. The policies and procedures should specifically mention the constraints, privileges and responsibilities of the employees. There should also be flexibility to change such security policies and procedures; the reason is obvious: an organization is not a static entity, and changes in the working of the organization are inevitable. Case study: A disgruntled employee of an organization, a software developer, downloaded the password files from the organization's server onto his laptop with the purely malicious intention of breaking the organization's passwords. He cracked the passwords, including the root password of the organization's server, with the help of various password-cracking tools available on the Internet. Thereafter he began unauthorized network access to the organization and started bragging to the network administrator about dismantling the important database residing on that server. In this case the organization had no specific security policies overseeing and controlling the line of action of its employees; besides the general organizational security policies and procedures, there must be separate security policies and procedures that place stringent restrictions on the conduct of employees so that they do not exceed their limits beyond requirements. The organization later modified its policies and procedures and put a rigorous password-management policy in place for its insiders. The gist of the whole case is to strengthen the inner infrastructure of the organization so that all operational activities can proceed seamlessly.
C) Use advanced and stringent logical control technologies.
It is imperative for an organization to secure its sensitive data, which is in digital form, from malicious insiders using advanced and stringent logical control technologies such as, for instance, a data leakage prevention (DLP) system. To a large extent this tool is very useful in preventing insider attacks on the sensitive data of the organization: it is software which enforces policies to protect the sensitive and critical information of the organization. This software tries to capture the activities of users when they deviate from their legitimate duties; for instance, if a person tries to copy some sensitive data of the organization by inserting a USB storage drive into a computer, against the security policies, the software is set up to immediately trigger an alarm and restrict the action. In this way it defends the organization's critical data from being robbed by the malicious actions of its own disgruntled insiders. It is a preventive measure against the insider threat; in short, it is a watchdog inside the organization (Magklaras and Furnell, 2002). Likewise, another technology which helps in combating

the insider threat is network access control, which provides control over client and server activities.
D) Follow separation of duties and least privilege.
The separation of duties and the principle of least privilege must be followed by every business process in the organization so as to reduce the damage that may be caused by malicious insiders (Alghathbar, 2007). This requires dividing the duties of every employee according to their skill set, so as to reduce the possibility of a situation in which one employee may embezzle or steal sensitive or vital information, or commit IT sabotage, without the help of another employee. The separation of duties in an organization can be enforced both in technical and in non-technical ways. Role-based access control also plays a vital role in controlling the activities of insiders; such a least-privilege mechanism provides greater strength and reliability in the organization regarding the proper utilization of resources and limits the impact of the insider threat. Case study: An employee of an immigration-services organization exceeded the limits of his work and fraudulently modified client records related to United States immigration asylum decisions, which are highly sensitive data. The investigation later found that the modification was done with the help of the organization's computer system, which was supposed to be managed only by the organization's authorized employees, and that the fraudulent act was done in consideration of $50,000. To control such situations, the organization thereafter implemented separation of duties via a role-based access control methodology so as to limit the authorization of its employees, and in addition implemented least privilege to prevent officials of the organization from approving or modifying any immigration decision without having the authority to do so.
E) Effective leadership limits the insider security threat.
A survey reveals that many organizations feel uneasy in their efforts to combat the insider threat simply because they lack the practice of effective leadership over their staff and employees. Leadership plays a very vital role in the accomplishment of an organization's objectives: it provides team spirit among the organization's staff and members and integrates all their activities and interests (Williams, 2008). Leadership is a binding energy which directs the ideology of team members so that they focus on achieving the stated objectives of the organization; it motivates insiders and is a constant source of inspiration for its followers. It has rightly been said that good leadership helps to mitigate the insider threat to a great extent by maintaining proper and well-defined discipline in the organization. It is therefore recommended to establish the practice of effective leadership.
F) Discontinue the computer access of ex-employees who have left the organization.
It is suggested that whenever an employee is terminated or leaves the organization, once the employment has ended it is crucial to follow a stringent procedure which disables all access paths to the organization's network and computer systems for that terminated employee. If such a procedure is not duly followed, the organization's computer systems and network are vulnerable to access by unauthorized and illegitimate terminated employees, which can be disastrous for the organization's sustainability. It has been found that many former employees frequently use remote access to the network of the organization; it is therefore recommended that remote access or VPN (virtual private network) access be disabled, particularly for those ex-employees. In addition, the termination process must include the termination of physical access to the organization; this can be done by collecting back all keys, badges, parking permits etc. from the terminated employee (Hines, 2007). When an employee is fired, it is important to circulate the termination notice to all other working employees of the organization; this will considerably reduce the insider threat in the organization. Case study: A banking organization's system administrator was terminated abruptly, without a termination notice from his employer. After being terminated he started accessing the company's web server from his home for nefarious activities in order to retaliate against the company. It was found that although the administrator had been terminated, his access to the company's systems had not been removed, and because of that the terminated administrator was able to use the login password

of the web server, which had not been changed after his termination, to exploit the vulnerability of the web server through remote login, shutting the server down until it finally crashed. This incident cost the company heavily: it lost many of its ventures with clients on account of the failure of the web server, which carried critical information of the company. The essence of the case is that it is the responsibility of the company to thoroughly disable all access points of terminated employees once they are terminated.
G) Organize an emergency insider threat response plan.
The organization must have a well-defined and duly documented insider response plan to face the stiff challenge of the period when an insider attack occurs in the organization. This plan is the backup for combating insider threats. An insider threat response plan usually differs from the response plan for external attackers. These plans are usually drafted by the technical staff of the organization, who are designated as the insider threat response team for the organization (Cappelli, Moore, Trzeciak and Shimeall, 2009). The plan must contain the specific actions to be taken against malicious attackers, how to recover the organization from the impact, and the responsibilities of the response-team members. Last but not least, such a plan must have the full support of the top management of the organization. The details of the plan must not be shared with all employees of the organization; it should be handled only by confidential staff members of the organization.
H) Organize periodic security awareness programs for all employees.
It has been found that periodic security awareness programs for all employees greatly enhance the ability to identify malicious insiders inside the organization. These programs help the organization to measure the behavior of its employees: what kinds of behavior may amount to a threat to the organization, and whether employees' conduct is suspicious or not, can be ascertained only through such security awareness programs. It is also recommended that managers and top officials procure these awareness programs, as they help in ascertaining what types of social networking prevail among the employees inside the organization and whether insiders are engaged with external disgruntled employees to wage an attack against the organization, especially to steal critical information of the organization for financial gain (Colwill, 2009). Frequent security awareness programs will keep the security professionals of the organization up to date on the latest types of threat and help them to manage those threats efficiently and report to management about them. Social engineering is also one of the malevolent practices which help a perpetrator to gain physical or electronic access to a victim's network through account and password theft; social engineering is, for example, how malicious insiders insert hardware key loggers on computers to steal the vital information of the organization. Thus we can overcome these types of risk factor by instituting security awareness programs for the organization's employees so that they remain vigilant in advance to face these types of threat.
I) Perform auditing and monitoring of every online action of employees.
Auditing and monitoring are activities which can help in discovering the suspicious activities of malicious insiders in the organization before any adverse consequence happens (Peecher, Schwartz and Solomon, 2007). In the information technology domain, auditing and monitoring refer to the verification and investigation of various electronic systems, computer networks, log files etc., which helps greatly in tracing the root cause of an insider threat in the organization. In addition, auditing must verify the sanctity and integrity of all access-log files in the network of the organization (Moynihan, 2008). It is also imperative to conduct random auditing and monitoring in the organization; this will serve as a deterrent control against performing any malicious action against the organization. Various automated monitoring tools help in preventing and detecting emails written by malicious insiders to counterparts of the organization; they also help in monitoring and detecting documents copied to hard drives or flash media, and in preventing insiders from printing, copying or downloading critical data of the organization, thus providing protection of privacy.
J) Protect the physical environment of the organization.

Although the organization may have electronic security for its vital business assets and information,

it is also imperative to give emphasis to securing the physical environment of the organization from both internal and external threats. The organization must first protect its employees, who are among the critical assets of the organization; this can be achieved only by securing the office surroundings from various occupational hazards and from malicious outsiders (Magklaras and Furnell, 2002). Securing the physical environment of the organization tends to prevent terminated employees from regaining access with the help of legitimate current employees, as such physical security acts as an extra layer of defense. By maintaining this layer of physical security, an insider of the organization has less chance of turning hostile against the organization for financial gain at the instigation of a terminated employee. Therefore, physical security carries equal importance in the eradication of the insider threat from the organization.
IV. CONCLUSIONS
The insider threat is a long-standing security issue for every organization. We can only mitigate the threat; even with the use of highly sophisticated tools and techniques we cannot altogether eradicate it. Only the aforesaid precautionary measures, if followed meticulously, can help an organization to counter the insider threat to some extent. In this paper we have attempted to propose some specific guidelines for combating the insider threat in any organization. We do not claim that our guidelines are completely sufficient to face the aforesaid threat; they may differ from organization to organization according to their security requirements.
REFERENCES
[1] ZDNet Asia IT Priorities Survey 2008/2009, http://www.zdnetasia.com/asia-worried-about-insiderthreat-62047738.htm (accessed 22/02/2011).
[2] Ciechanowicz, Z. (1997), Risk analysis: requirements, conflicts and problems, Computers & Security, 16(3), 223-232.
[3] Roy Sarkar, K. (2011), Assessing insider threats to information security using technical, behavioural and organizational measures, Information Security Technical Report, article in press.
[4] Magklaras, G.B. and Furnell, S.M. (2002), Insider Threat Prediction Tool: Evaluating the probability of IT misuse, Computers & Security, 21(1), 62-73.
[5] Alghathbar, K. (2007), Validating the enforcement of access control policies and separation of duty principle in requirement engineering, Information and Software Technology, 49(2), 142-157.
[6] Williams, A.H. (2008), In a trusting environment, everyone is responsible for information security, Information Security Technical Report, 13(4), 207-215.
[7] Hines, M. (2007), Insider threats remain IT's biggest nightmare, InfoWorld, September 22.
[8] Cappelli, D., Moore, A., Trzeciak, R. and Shimeall, T. (2009), Common Sense Guide to Prevention and Detection of Insider Threats, 3rd Edition, Version 3.1, Carnegie Mellon University CyLab.
[9] Colwill, C. (2009), Human factors in information security: The insider threat - who can you trust these days?, Information Security Technical Report, 14(4), 186-196.
[10] Peecher, M., Schwartz, R. and Solomon, I. (2007), It's all about audit quality: Perspectives on strategic-systems auditing, Accounting, Organizations and Society, 32(4-5), 463-485.
[11] Moynihan, J. (2008), Managing the Insider Threat: Data Surveillance, Information Systems Audit and Control Association, www.isaca.org (accessed 11/02/2011).


Search Engine: Factors Influencing the Page Rank

Abstract: In today's world the Web is considered an ocean of data and information (text, videos, multimedia etc.) consisting of millions and millions of web pages, linked to each other like a tree. It is often argued, especially considering the dynamics of the Internet, that too much time has passed since the scientific work on PageRank for it still to be the basis of the ranking methods of the Google search engine. There is no doubt that within the past years many changes, adjustments and modifications regarding the ranking methods of Google have most likely taken place, but PageRank was absolutely crucial for Google's success, so at least the fundamental concept behind PageRank should still be constitutive. This paper describes the factors which affect the ranking of web pages and helps in calculating those factors; by adapting these factors, website developers can increase their site's PageRank. Within the PageRank concept, the rank of a document is given by the rank of those documents which link to it; their rank again is given by the rank of the documents which link to them. The PageRank of a document is thus always determined recursively by the PageRank of other documents.

LIX. INTRODUCTION
PageRank was developed by Google founders Larry Page and Sergey Brin at Stanford. At the time that Page and Brin met, search engines typically linked to pages that had the highest keyword density, which meant people could game the system by repeating the same phrase over and over to attract higher search-page results. The rapidly growing web graph contains several billion nodes, making graph-based computations very expensive; one of the best known web-graph computations is PageRank, an algorithm for determining the importance of web pages [7]. Page and Brin's theory is that the most important pages on the Internet are the pages with the most links leading to them [1]. PageRank thinks of links as votes, where a page linking to another page is casting a vote.
A) PageRank: PageRank is the algorithm used by the Google search engine, originally formulated by Sergey Brin and Larry Page in their paper "The Anatomy of a Large-Scale Hypertextual Web Search Engine". It is based on the premise, prevalent in the world of academia, that the importance of a research paper can be judged by the number of citations the paper has from other research papers. Brin and Page simply transferred this premise to its web equivalent: the importance of a web page can be judged by the number of hyperlinks pointing to it from other web pages [2]. The web graph now has huge dimensions and is subject to dramatic updates in terms of nodes and links; therefore a PageRank assignment tends to become obsolete very soon [4].
B) About the algorithm

The algorithm may look daunting to non-mathematicians, but it is in fact elegantly simple and is calculated as follows:
i) PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where PR(A) is the PageRank of page A, PR(T1) is the PageRank of page T1, C(T1) is the number of outgoing links from page T1, and d is a damping factor in the range 0 < d < 1, usually set to 0.85. The PageRank of a web page is therefore calculated as a sum of the PageRanks of all pages linking to it (its incoming links), each divided by the number of links on the corresponding page (its outgoing links). From a search engine marketer's point of view, this

means there are two ways in which PageRank can affect the position of your page on Google:
ii) The number of incoming links. Obviously, the more of these the better. But there is another thing the algorithm tells us: no incoming link can have a negative effect on the PageRank of the page it points at; at worst it can simply have no effect at all.
iii) The number of outgoing links on the page which points at your page. The fewer of these the better. This is interesting: it means that, given two pages of equal PageRank linking to you, one with 5 outgoing links and the other with 10, you will get twice the increase in PageRank from the page with only 5 outgoing links.
At this point we take a step back and ask ourselves just how important PageRank is to the position of your page in the Google search results. The next thing we can observe about the PageRank algorithm is that it has nothing whatsoever to do with relevance to the search terms queried. It is simply one single (admittedly important) part of the entire Google relevance-ranking algorithm. Perhaps a good way to look at PageRank is as a multiplying factor applied to the Google search results after all its other computations have been completed: the Google algorithm first calculates the relevance of pages in its index to the search terms and then multiplies this relevance by the PageRank to produce a final list. The higher your PageRank, therefore, the higher up the results you will be, but there are still many other factors related to the positioning of words on the page which must be considered first [2].
LX. THE EFFECT OF INBOUND LINKS
It has already been shown that each additional inbound link for a web page always increases that page's PageRank. Taking a look at the PageRank algorithm, which is given by PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), one may assume that an additional inbound link from page X increases the PageRank of page A by d × PR(X) / C(X), where PR(X) is the PageRank of page X and C(X) is the total number of its outbound links. But page A usually links to other pages itself; thus these pages get a PageRank benefit also. If these pages link back to page A, page A will have an even higher PageRank benefit from its additional inbound link. The single effects of additional inbound links shall be illustrated by an example.

We regard a website consisting of four pages A, B, C and D which are linked to each other in a circle. Without external inbound links to any of these pages, each of them obviously has a PageRank of 1. We now add a page X to our example, for which we presume a constant PageRank PR(X) of 10. Further, page X links to page A by its only outbound link. Setting the damping factor d to 0.5, we get the following equations for the PageRank values of the single pages of our site:
PR(A) = 0.5 + 0.5 (PR(X) + PR(D)) = 5.5 + 0.5 PR(D)
PR(B) = 0.5 + 0.5 PR(A)
PR(C) = 0.5 + 0.5 PR(B)
PR(D) = 0.5 + 0.5 PR(C)
Since the total number of outbound links of each page is one, the outbound links do not need to be considered in the equations. Solving them gives us the following PageRank values:
PR(A) = 19/3 = 6.33
PR(B) = 11/3 = 3.67
PR(C) = 7/3 = 2.33
PR(D) = 5/3 = 1.67
We see that the initial effect of the additional inbound link of page A, which was given by d × PR(X) / C(X) = 0.5 × 10 / 1 = 5, is passed on by the links on our site.
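The PageRank values above can be checked numerically. The sketch below simply iterates the four equations with PR(X) held fixed at 10 until they settle; the iteration count and the Gauss-Seidel-style update order are illustrative assumptions.

# Iteratively solve the PageRank equations for the ring A -> B -> C -> D -> A,
# with an external page X (constant PR of 10) linking only to A.
d = 0.5
PR_X, C_X = 10.0, 1
pr = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}   # start from the no-inbound-link values

for _ in range(100):
    pr["A"] = (1 - d) + d * (PR_X / C_X + pr["D"] / 1)
    pr["B"] = (1 - d) + d * (pr["A"] / 1)
    pr["C"] = (1 - d) + d * (pr["B"] / 1)
    pr["D"] = (1 - d) + d * (pr["C"] / 1)

print({p: round(v, 2) for p, v in pr.items()})
# {'A': 6.33, 'B': 3.67, 'C': 2.33, 'D': 1.67} -- matching 19/3, 11/3, 7/3, 5/3
print(round(sum(pr.values()), 2))  # accumulated PageRank of the site: 14.0

Changing d to 0.75 in the same sketch reproduces the values derived in the next subsection (419/35, 323/35, 251/35 and 197/35).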

A) The Influence of the Damping Factor
The degree of PageRank propagation from one page to another by a link is primarily determined by the damping factor d. If we set d to 0.75 we get the following equations for our above example:
PR(A) = 0.25 + 0.75 (PR(X) + PR(D)) = 7.75 + 0.75 PR(D)
PR(B) = 0.25 + 0.75 PR(A)
PR(C) = 0.25 + 0.75 PR(B)
PR(D) = 0.25 + 0.75 PR(C)
Solving these equations gives us the following PageRank values:
PR(A) = 419/35 = 11.97
PR(B) = 323/35 = 9.23
PR(C) = 251/35 = 7.17
PR(D) = 197/35 = 5.63
First of all, we see that there is a significantly higher initial effect of the additional inbound link for page A, which is given by d × PR(X) / C(X) = 0.75 × 10 / 1 = 7.5. (We remark in passing that the way one handles dangling nodes is crucial, since there can be a huge number of them: according to Kamvar et al. [Kamvar et al. 03b], a 2001 sample of the web containing 290 million pages had only 70 million non-dangling nodes. This large number of nodes without out-links includes both pages that do not point to any other page and pages whose existence is inferred from hyperlinks but which have not yet been reached by the crawler. Besides, a dangling node can represent a pdf, ps, txt or any other file format gathered by a crawler but with no hyperlinks pointing outside [4].) This initial effect is then propagated even more strongly by the links on our site. In this way, the PageRank of page A is almost twice as high at a damping factor of 0.75 as it is at a damping factor of 0.5. At a damping factor of 0.5 the PageRank of page A is almost four times the PageRank of page D, while at a damping factor of 0.75 it is only a little more than twice as high. So, the higher the damping factor, the larger the effect of an additional inbound link on the PageRank of the page that receives the link, and the more evenly PageRank is distributed over the other pages of a site.
B) The Actual Effect of Additional Inbound Links
At a damping factor of 0.5, the accumulated PageRank of all pages of our site is given by PR(A) + PR(B) + PR(C) + PR(D) = 14. Hence, by a page with a PageRank of 10 linking to one page of our example site by its only outbound link, the accumulated PageRank of all pages of the site is increased by 10. (Before the link was added, each page had a PageRank of 1.) At a damping factor of 0.75 the accumulated PageRank of all pages of the site is given by PR(A) + PR(B) + PR(C) + PR(D) = 34; this time the accumulated PageRank increases by 30. The accumulated PageRank of all pages of a site always increases by (d / (1-d)) × (PR(X) / C(X)), where X is a page additionally linking to one page of the site, PR(X) is its PageRank and C(X) its number of outbound links. The formula presented above is only valid if the additional link points to a page within a closed system of pages, for instance a website without outbound links to other sites. As far as the website has links pointing to external pages, the surplus for the site itself diminishes accordingly, because a part of the additional PageRank is propagated to external pages. The justification of the above formula is given by Raph Levien and is based on the Random Surfer Model: the walk length of the random surfer follows an exponential distribution with a mean of d/(1-d). When the random surfer follows a link into a closed system of web pages, he visits on average d/(1-d) pages within that closed system; so this much more PageRank of the linking page, weighted by the number of its outbound links, is distributed to the closed system. For the actual PageRank calculations at Google, Lawrence Page and Sergey Brin claim to usually set the damping factor d to 0.85. Thereby, the boost for a closed system of web pages from an additional link from page X is given by (0.85 / 0.15) × (PR(X) / C(X)) = 5.67 × (PR(X) / C(X)). So inbound links have a far larger effect than one may assume [2].
LXI.
LXI. THE EFFECT OF OUTBOUND LINKS
Since PageRank is based on the linking structure of the whole web, it is inescapable that if the inbound links of a page influence its PageRank, its outbound links also have some impact. To illustrate the effects of outbound links, we take a look at a simple example.

We regard a web consisting of two websites, each having two web pages. One site consists of pages A and B, the other consists of pages C and D. Initially, both pages of each site solely link to each other. It is

obvious that each page then has a PageRank of one. [6] Now we add a link which points from page A to page C. At a damping factor of 0.75, we therefore get the following equations for the single pages' PageRank values:
PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.75 PR(D) + 0.375 PR(A)
PR(D) = 0.25 + 0.75 PR(C)
Solving the equations gives us the following PageRank values for the first site:
PR(A) = 14/23
PR(B) = 11/23
We therefore get an accumulated PageRank of 25/23 for the first site. The PageRank values of the second site are given by
PR(C) = 35/23
PR(D) = 32/23
So, the accumulated PageRank of the second site is 67/23. The total PageRank for both sites is 92/23 = 4. Hence, adding a link has no effect on the total PageRank of the web. Additionally, the PageRank benefit for one site equals the PageRank loss of the other.

A) The Actual Effect of Outbound Links
As has already been shown, the PageRank benefit for a closed system of web pages from an additional inbound link is given by
(d / (1-d)) × (PR(X) / C(X)),
where X is the linking page, PR(X) is its PageRank and C(X) is the number of its outbound links. Hence, this value also represents the PageRank loss of a formerly closed system of web pages when a page X within this system now points by a link to an external page. The validity of the above formula requires that the page which receives the link from the formerly closed system of pages does not link back to that system, since it otherwise gains back some of the lost PageRank. Of course, this effect may also occur when it is not the page receiving the link from the formerly closed system that links back directly, but another page which has an inbound link from that page. Indeed, this effect may be disregarded because of the damping factor, if there are enough other web pages in between the link recursion. The validity of the formula also requires that the linking site has no other external outbound links. If it has other external outbound links, the loss of PageRank of the regarded site diminishes, and the pages already receiving a link from that page lose PageRank accordingly. [6] Even if the actual PageRank values for the pages of an existing web site were known, it would not be possible to calculate to what extent an added outbound link diminishes the PageRank of the site, since the above presented formula regards the status after adding the link.

B) Intuitive Justification of the Effect of Outbound Links
The intuitive justification for the loss of PageRank caused by an additional external outbound link, according to the Random Surfer Model, is that by adding an external outbound link to one page, the surfer becomes less likely to follow an internal link on that page. So, the probability of the surfer reaching other pages within the site diminishes. If those other pages of the site have links back to the page to which the external outbound link has been added, this page's PageRank will also deplete. We can conclude that external outbound links diminish the totalized PageRank of a site and probably also the PageRank of each single page of a site. But, since links between web sites are the foundation of PageRank and indispensable for its functioning, there is the possibility that outbound links have positive effects within other parts of Google's ranking criteria. Lastly, relevant outbound links do constitute the quality of a web page, and a webmaster who points to other pages integrates their content in some way into his own site.

C) Dangling Links
An important aspect of outbound links is the lack of them on web pages. When a web page has no outbound links, its PageRank cannot be distributed to other pages. Lawrence Page and Sergey Brin characterize links to those pages as dangling links.

The effect of dangling links shall be illustrated by a small example website. We take a look at a site consisting of three pages A, B and C. In our example, the pages A and B link to each other. Additionally,

page A links to page C. Page C itself has no outbound links to other pages. At a damping factor of 0.75, we get the following equations for the single pages' PageRank values:
PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.375 PR(A)
Solving the equations gives us the following PageRank values:
PR(A) = 14/23
PR(B) = 11/23
PR(C) = 11/23
So, the accumulated PageRank of all three pages is 36/23, which is just over half the value we could have expected if page A had linked to one of the other pages.
According to Page and Brin, the number of dangling links in Google's index is fairly high. A reason for this is that many linked pages are not indexed by Google, for example because indexing is disallowed by a robots.txt file. Additionally, Google meanwhile indexes several file types and not HTML only. PDF or Word files do not really have outbound links and, hence, dangling links could have major impacts on PageRank.
In order to protect PageRank from the negative effects of dangling links, pages without outbound links have to be removed from the database until the PageRank values are computed. According to Page and Brin, the number of outbound links on pages with dangling links is thereby normalized. As shown in our illustration, removing one page can cause new dangling links and, hence, removing pages has to be an iterative process. After the PageRank calculation is finished, PageRank can be assigned to the formerly removed pages based on the PageRank algorithm. Therefore, as many iterations are needed as were needed for removing the pages. Regarding our illustration, page C could be processed before page B. At that point, page B has no PageRank yet and, so, page C will not receive any either. Then, page B receives PageRank from page A, and during the second iteration page C also gets its PageRank.
Regarding our example website for dangling links, removing page C from the database results in pages A and B each having a PageRank of 1. After the calculations, page C is assigned a PageRank of 0.25 + 0.375 PR(A) = 0.625. So, the accumulated PageRank does not equal the number of pages, but at least the pages which have outbound links are not harmed by the dangling links problem. By removing dangling links from the database, they do not have any negative effects on the PageRank of the rest of the web. Since PDF files are dangling links, links to PDF files do not diminish the PageRank of the linking page or site. So, PDF files can be a good means of search engine optimization for Google.
The definition of PageRank above has another intuitive basis in random walks on graphs. The simplified version corresponds to the standing probability distribution of a random walk on the graph of the Web. Intuitively, this can be thought of as modeling the behavior of a random surfer. The random surfer simply keeps clicking on successive links at random. However, if a real Web surfer ever gets into a small loop of web pages, it is unlikely that the surfer will continue in the loop forever. Instead, the surfer will jump to some other page. [5]
LXII. CONCLUSION
We conclude that the main factors influencing the PageRank of a page are its inbound links and its outbound links, including dangling links. Future work could examine how the total number of pages of a web site affects its PageRank.
REFERENCES
[1] L. Page, S. Brin, R. Motwani and T. Winograd, "The PageRank Citation Ranking: Bringing Order to the Web," 1999.
[2] S. Brin and L. Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine," Proceedings of the 7th International Conference on World Wide Web (WWW), 1998.
[3] T. Haveliwala and S. Kamvar, "The Second Eigenvalue of the Google Matrix," Stanford University Technical Report 7056, March 2003.
[4] G. M. Del Corso, A. Gullì and F. Romani, "Fast PageRank Computation via a Sparse Linear System," 2005.
[5] S. Brin, R. Motwani, L. Page and T. Winograd, "What Can You Do with a Web in Your Pocket," 1998.
[6] J. Cho, H. Garcia-Molina and L. Page, "Efficient Crawling Through URL Ordering," 1998.
[7] L. Page, S. Brin, R. Motwani and T. Winograd, "The PageRank Citation Ranking: Bringing Order to the Web," Stanford Digital Libraries Working Paper, 1998.


Are The CMMI Process Areas Met By Lean Software Development?

Abstract Agile software development is a conceptual framework for undertaking software engineering projects. There are a number of agile software development methodologies such as Extreme Programming (XP), Lean Development, and Scrum. Along with these, many organizations demand CMMI compliance of projects where agile methods are employed. This paper analyzes to what extent the CMMI process areas can be covered by Lean Development and where adjustments have to be made.

Index Terms Agile, CMMI, Lean, Lean software development, process areas, maturity levels

LXIII. INTRODUCTION

Organizational maturity indicators like CMMI levels have become increasingly important for software development. In large organizations there are policies which require that all parts of the organization achieve certain maturity levels. At the same time, Lean software development is gaining increasing attention in the software development community. Just like the other agile methods, lean methods offer new approaches to the existing challenges in software development. In this paper we investigate the applicability and usefulness of the CMMI model suite in agile/lean development efforts.

LXIV. LEAN SOFTWARE DEVELOPMENT

The methodology of lean software development is the application of lean manufacturing principles to developing software. Bob Charette, the originator, writes that the measurable goal of Lean development is to build software with one-third the human effort, one-third the development hours and one-third the investment of what an SEI CMM Level 3 organization would achieve. The steps to be followed in developing a project using Lean thinking [1] are:
Step 1: Identify value
Step 2: Map the value stream
Step 3: Create flow
Step 4: Establish pull
Step 5: Seek perfection

LXV. PRINCIPLES OF LEAN THINKING
1. Eliminate Waste
2. Increase Learning/Feedback
3. Make Decisions as Late as Possible/Delay Commitment
4. Deliver as Quickly as Possible/Deliver Fast
5. Build Integrity In
6. Empower the Team
7. See the "Big Picture"/See the Whole

LXVI. CMMI

Capability Maturity Model Integration (CMMI) is a process improvement approach that helps an organization improve its performance. According to the Software Engineering Institute (SEI, 2008), CMMI helps "integrate traditionally separate organizational functions, set process improvement goals and priorities, provide guidance for quality processes, and provide a point of reference for appraising current processes." [2]

LXVII. ARE THE PROCESS AREAS OF CMMI MET BY LEAN SOFTWARE DEVELOPMENT? Process areas are the areas that will be covered by the organization's processes. The process areas have been listed below: A) Project Planning[4]

plans for iteration at a time. B) Project Monitoring and Control [4] SG 1: Monitor the Project against the Plan Table 4 Specific practices and their analysis Specific practices SP 1.1 Monitor Project Planning Parameters SP 1.2 Monitor Commitments SP 1.3 Monitor Project Risks SP 1.4 Monitor Data Management SP 1.5 Monitor Stakeholder Involvement SP 1.6 Conduct Progress Reviews SP 1.7 Conduct Milestone Reviews Covered by lean? Find out the wastes, check the progress, continuous review, continuous waste elimination, communication with stakeholders, pull kanban metrics [3], meeting commitments on daily basis. Priorities are set for each iteration.

SG1: Establish estimates Table 1 Specific practices and their analysis Specific practices SP 1.1 Estimate the Scope of the Project SP 1.2 Establish Estimates of Work Product and Task Attributes SP 1.3 Define Project Lifecycle Phases SP 1.4 Estimate Effort and Cost Covered by lean? Decide the iterations Value stream mapping is the mechanism used
[3]

To eliminate wastes

SG2: Develop a project plan Table 2 Specific practices and their analysis Specific practices SP 2.1 Establish the Budget and Schedule SP 2.2 Identify Project Risks SP 2.3 Plan Data Management SP 2.4 Plan the Project's Resources SP 2.5 Plan Needed Knowledge and Skills SP 2.6 Plan Stakeholder Involvement SP 2.7 Establish the Project Plan SG3: Obtain commitment to the plan Table 3 Specific practices and their analysis Specific practices SP 3.1 Review Plans that Affect the Project SP 3.2 Reconcile Work and Resource Levels SP 3.3 Obtain Plan Commitment Covered by lean? Plan for single iteration not the whole project at the beginning. Covered by lean? Decide the work flow, define the definition of done, option analysis [3], involve the stakeholders for feedback.

SG 2: Manage Corrective Action to Closure Table 5 Specific practices and their analysis Specific practices SP 2.1 Analyze Issues SP 2.2 Take Corrective Action SP 2.3 Manage Corrective Actions Covered by lean? Rework is done if iterations performance deviates from what was to be done. Plan is flexible so deviation from plan is allowed. Features are dropped for the next iteration rather than not meeting the current iterations deadline[3]

C)

Supplier agreement management [4]

SG 1 Establish Supplier Agreements Table 6 Specific practices and their analysis Specific practices SP 1.1 Determine Acquisition Type SP 1.2 Select Suppliers Covered by lean? Tight coupling of suppliers, fast response times[3]

SP 1.3 Establish Supplier Agreements SG 2 Satisfy Supplier Agreements Table 7 Specific practices and their analysis

CMMI covers the entire project in a single go, whereas Lean breaks the project into iterations and

Covered by lean? Quick delivery in short durations

Issues
This process area is relevant for entire project but in case of lean software development teams and the customer are more crucial. Therefore it might be replaced with a more relevant process area that is related to integration of team or customers.

Specific practices SP 2.1 Review COTS products SP 2.2 Execute the Supplier Agreement SP 2.3 Accept the Acquired Product SP 2.4 Ensure Transition of Products

E) Risk Management [4] SG 1 Prepare for Risk Management Table 10 Specific practices and their analysis

D) Integrated Project Management [4] SG 1 Use the Project's Defined Process Table 8 Specific practices and their analysis

Specific practices Specific practices Covered by lean? Define the steps for iteration, learn within a team and through organization, change quickly according to the need [3], contribute to team by equal participation.

Covered by lean? Find the wastes/ risks and eliminate them.

SP 1.1 Establish the Project's Defined Process SP 1.2 Use Organizational Process Assets for Planning Project Activities SP 1.3 Establish the Project's Work Environment SP 1.4 Integrate Plans SP 1.5 Manage the Project Using the Integrated Plans SP 1.6 Contribute to Organizational Process Assets

SP 1.1 Determine Risk Sources and Categories SP 1.2 Define Risk Parameters SP 1.3 Establish a Risk Management Strategy

SG 2 Identify and Analyze Risks Table 11 Specific practices and their analysis

Specific practices SP 2.1 Identify Risks SP 2.2 Evaluate, Categorize, and Prioritize Risks

Covered by lean? Find out the value added activities and separate wastes.

F) Quantitative Project Management [4] SG 2 Coordinate and Collaborate with Relevant Stakeholders Table 9 Specific practices and their analysis SG 1 Prepare for Quantitative Management Table 12 Specific practices and their analysis

Specific practices

Covered by lean? Close collaboration with stakeholders, frequent feedback from stakeholders [3].

Specific practices SP 1.1 Establish the Projects Objectives SP 1.2 Compose the Defined Processes

SP 2.1 Manage Stakeholder Involvement SP 2.2 Manage Dependencies SP 2.3 Resolve Coordination

Covered by lean? Subprocesses are the iterations in the value stream.

SP 1.3 Select Subprocesses and Attributes SP 1.4 Select Measures and Analytic Techniques SP 1.1 Elicit Needs SP 1.2 Transform Stakeholder Needs into Customer Requirements SG 2 Develop Product Requirements Table 16 Specific practices and their analysis SG 2 Quantitatively Manage the Project Table 13 Specific practices and their analysis Specific practices Specific practices
SP 2.1 Monitor the Performance of Selected Subprocesses SP 2.2 Manage Project Performance SP 2.3 Perform Root Cause Analysis

Rough idea of customer needs

Covered by lean?
The technique of 5 whys can be used for root cause analysis.

SP 2.1 Establish Product and Product Component Requirements SP 2.2 Allocate Product Component Requirements SP 2.3 Identify Interface Requirements

Covered by lean? Prioritize the requirements at iteration level


[3]

G) Requirements Management [4] SG 1 Manage Requirements Table 14 Specific practices and their analysis Specific practices SP 1.1 Understand Requirements SP 1.2 Obtain Commitment to Requirements SP 1.3 Manage Requirements Changes SP 1.4 Maintain Bidirectional Traceability of Requirements SP 1.5 Ensure Alignment Between Project Work and Requirements H) Requirements Development [4] SG 1 Develop Customer Requirements Table 15 Specific practices and their analysis Covered by lean? Understanding of requirements by meeting the customer [3] . Requirements evolve as project progresses. Work is more important than signing agreements. Initial requirements might not be useful as the project move to completion. SG 3 Analyze and Validate Requirements Table 17 Specific practices and their analysis

Specific practices SP 3.1 Establish Operational Concepts and Scenarios SP 3.2 Establish a Definition of Required Functionality and Quality Attributes SP 3.3 Analyze Requirements SP 3.4 Analyze Requirements to Achieve Balance SP 3.5 Validate Requirements
I)

Covered by lean? Functional analysis is limited to iteration. Requirements are met according to the definition of done.

Technical Solution [4]

Table 18 Specific goal, its specific practices and their analysis

Specific goal

Specific practices

Covered by lean?

Specific practices

Covered by lean?

SG 1 Select Product Component Solutions SP 1.1 Develop Alternative Solutions and Selection SP 1.2 Select Product Component Solutions SP 3.1 Implement the Design SP 3.2 Develop Product Support Documentation Q) Causal Analysis and Resolution [4] This process area is followed by lean software development. R) Organizational Process Focus [4] The focus here is on iterations rather than on the process of entire organization. S) Organizational Process Definition [4] The focus is on iterations rather than on the process of entire organization. T) Organizational Training [4] The focus of training is individuals. They are provided with the environment and tools and are trusted to develop the project. U) Organizational Process Performance [4] The process is fixed in lean software development so SP 1.2 [4] is not applicable. The focus of lean software development is to improve the process by eliminating the wastes. V) Organizational Performance Management [4] The practices of this process area fully support the lean practices. LXVIII. CONCLUSION CMMI and lean software development are approaches to continuous improvement. This paper concludes that CMMI tends to reduce risk in lean software development. These practices make good sense, and you could argue that it has always inherently been expected as part of your agile method. In general the CMMI model provides a good understanding what practices to consider but you will have to adopt it to your context, and find lean implementations for the practices. REFERENCES
[1] M. Poppendieck and T. Poppendieck, Lean Software Development: An Implementation Guide. Addison-Wesley, 2006.
[2] D. N. Card, "Integrating Lean, Six Sigma, and CMMI." http://www.daytonspin.org/Presentations/CardIntegrating%20Lean-Oct5.pdf
[3] J. L. Dutton and R. S. McCabe, "Agile/Lean Development and CMMI," Systems and Software Consortium.
[4] http://en.wikipedia.org/wiki/Capability_Maturity_Model_Integratin

Options are considered till last moment possible.

SG 3 Implement the Product Design

Design is not to be prepared for the whole project before coding begins.

J) Product Integration [4] This process area is not religiously followed in lean software development. K) Verification [4] Reviews are conducted whenever there are meetings by the whole team and the customer. Thus verification takes place more frequently exposing the defects. L) Validation [4] This process area conforms to the lean software development methodology. M) Measurement and Analysis [4] The focus here is collection of data in huge volumes. But lean software development focuses of storing only small and necessary data, so that the team can focus on what provides value to the customer. N) Process and Product Quality Assurance [4] The work products are evaluated by the team and the customer together. Any changes required are carried out quickly instead of waiting for documentation to be completed. O) Configuration Management [4] This process area is followed by lean software development but changes are made quickly rather than following long procedures. P) Decision Analysis and Resolution [4] This process area is followed by lean software development.





Password Protected File Splitter And Merger (With Encryption And Decryption)

Abstract A splitter is a program that is used to split a single file into a number of smaller files, and the merger is used to combine these small files back into a single file whose size is the sum of the sizes of the pieces. A splitter breaks a file into pieces to reduce the file size and length, and the merger combines the pieces of a file to make it readable and usable again by anyone who wants to work with the single large file. The individual pieces of a split file are not usable on their own, because they lose their identity and cannot be recognized by the operating system: they have no file extension and are only small chunks of the large file. The purpose of splitting files into smaller sizes is to store them in less memory space and to transfer them over the Internet, where large files are not allowed to be transferred. This project researches file splitters and mergers and tries to enhance the functionality of the existing splitters and mergers by adding extended functionality. We split the desired file into a number of pieces according to the size given by the user in MB (megabytes). If there is a 100 MB file and the user wants to split it into 10 MB pieces, the system will generate 10 different pieces of the 100 MB file, each of 10 MB size. So it depends on the user how large he/she wants each single piece to be. If a file named file is broken into 10 pieces, the pieces are indexed 0 to 9, i.e. file0 to file9. The file is then broken into 10 pieces, none of which has any meaning to anyone individually until they are merged again to recover the file into a human-understandable form.

LXIX. INTRODUCTION

What is a File Splitter? A file splitter is a software program that splits a particular file into pieces of a specified size. In this way we reduce the size of the file, which is desirable from a transfer point of view. A file splitter breaks a big file into a number of small files, each much smaller than the original, according to the size desired by the user. For example, you have a file which is 100 MB in size and you want to deliver this file to your friend over the

Internet. In that case you may not be allowed to send it at its original size, so you have to use a splitter program to break it into a size which is allowed by the service provider that carries your file. Mailing services such as Gmail, Yahoo Mail and others, which send and receive mail to and from remote locations via their servers, allow only a limited attachment size of 25 MB that you can deliver or receive. Any file greater than this size will not be sent or received by the remote server. So if you use the Internet and want to send or receive files from your friends, you really need a program to bring the file down to a size that your service provider allows. The program you use for splitting files should also have some kind of restriction so that no one other than you can use it. If you use a splitter program which is accessible to everyone, it can be used by anyone who can use your computer. So you should use a splitter program which at least guarantees that the software cannot be used by persons who are not authorized to do so, because if you are paying for something, no one else has the right to use it without your permission. Therefore, always use a program that provides password protection and cannot be used by anyone who is not authorized for it. If you pay for a program to split files and a friend copies your program and executes it on his or her local machine, it must not work; otherwise it is a waste of money, because another person is using the same thing for which you paid. What is a File Merger? Once you have broken your file into smaller files, you need a program which can reassemble them into the original file so that you can use it again. So you have to select a merger program which is used to merge the split files. The merger is the program which takes the small files as input and generates a file having the actual

file size, which can then be used by people who are not very aware of the underlying technology. So a merger is the program that you use to reassemble a file which you broke down earlier in order to send it through the Internet. The merger always uses the properties of the splitter in the sense that it follows the same process as the splitter; the difference is that it combines the split parts, while the splitter splits the combined file. The merger rebuilds the split files and gives a usable structure back to files that would otherwise be unreadable for people who do not know how to read a broken file.
What is Encryption? Encryption is the process of transforming information using an algorithm to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information. Encryption has long been used by militaries and governments to facilitate secret communication, and it is now commonly used to protect information within many kinds of civilian systems. Encryption is used to protect data in transit, for example data being transferred via networks (e.g. the Internet, e-commerce), mobile telephones, wireless microphones, wireless intercom systems, Bluetooth devices and bank automatic teller machines. There have been numerous reports of data in transit being intercepted in recent years. Encryption can protect the confidentiality of messages, but other techniques are still needed to protect the integrity and authenticity of a message (your files). So if you send a video file to your friend which is confidential to both of you, and it is intercepted by a hacker on the way, or your friend's mail account is hacked by someone who knows his/her password, it is no longer confidential and has been leaked. It is therefore necessary to have a system that can protect files which are very confidential and which you do not want to share with anyone; you have to use a program to protect your files from unauthorized access.
What is Decryption? Decryption is the inverse of the encryption process, in which an encrypted text is converted back into plain text or normal, human-understandable information. Decryption is only possible if the user who wants to decrypt the encrypted file knows how it was encrypted, or has a key that describes the encryption process and so can decrypt the particular file. That key may be any password or any kind of scheme that can convert an encrypted file back to the normal file, whatever it was.
LXX. SOFTWARE REQUIREMENT
A) Purpose
This document details the software requirements specification for the file Splitter and Merger with encryption and decryption open source project. It will later be used as a base for the extension of the existing software itself. This document follows the IEEE (Institute of Electrical and Electronics Engineers) standard for software requirements specification documents. A computer file is a block of arbitrary information, or a resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. Most computers have at least one file system, and some computers allow the use of several different file systems.
For instance, on newer MS Windows computers, the older FAT (File Allocation Table) file systems of MS-DOS and old versions of Windows are supported, in addition to NTFS, the normal file system for recent versions of Windows.
B) Document Conventions
File Splitter and Merger were created prior to this document, so all requirements stated here are already satisfied. It is very important to update this document with every future requirement and clarify its priority for consistency purposes, so that this document can remain useful. Because the file splitter and merger are already implemented, parts of this document have a style similar to a manual.
C) Project Scope
File Splitter and Merger is a tool that can split, merge and manipulate files. It provides a command line interface and a shell interface (console). It is available in two versions, basic and enhanced; both are open source. The command line interface provides the user with all the functionality needed to handle a file (or several files together). The functionality is distributed into modules. Each module performs a specific function and loads in the main menu. In the basic version, the software contains four modules:
A) Split
B) Merge
C) Encrypt
D) Decrypt

E) Password Manager
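As a rough illustration of how the Split and Merge modules can work, the following Python sketch (not the project's actual implementation; the chunk size, piece names and the fileN naming scheme follow the description given earlier) splits a file into fixed-size pieces and merges them back:

import os

def split_file(path, piece_size_mb):
    # Read the source file and write pieces named file0, file1, ... as described above.
    piece_size = piece_size_mb * 1024 * 1024
    pieces = []
    with open(path, "rb") as source:
        index = 0
        while True:
            chunk = source.read(piece_size)
            if not chunk:
                break
            piece_name = "%s%d" % (os.path.basename(path), index)
            with open(piece_name, "wb") as piece:
                piece.write(chunk)
            pieces.append(piece_name)
            index += 1
    return pieces

def merge_files(piece_names, output_path):
    # Concatenate the pieces in index order to rebuild the original file.
    with open(output_path, "wb") as output:
        for name in piece_names:
            with open(name, "rb") as piece:
                output.write(piece.read())

# Example: split a 100 MB file named "file" into 10 MB pieces and rebuild it.
# pieces = split_file("file", 10)
# merge_files(pieces, "file_merged")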

LXXI. USE CASES A) SPLITTER USE CASE

Fig 3.1: USE CASE for SPLITTER
In the splitter use case, the module first asks the user for the size into which he/she wants to break the file, that is, the size of the small pieces of the original file needed by the user. The size of the pieces plays an important role in splitting a particular file. Then the user sets the destination folder for keeping the different small pieces of the single large file, and after the destination folder is set, all the small pieces are placed there as the output of the module. In this case the user is any end user or customer of the intended program who wants to transfer a big file to another location.

B) MERGER USE CASE
In this use case, the user loads the files from the destination folder into the merger module and again sets a destination folder for the binding of all the files. That is, the user sets the destination for the file which will be an arrangement of all the small broken files.

Fig 3.2: USE CASE for MERGER

C) ENCRYPTION AND DECRYPTION USE CASE
In this use case the user has to set a password first to encrypt the file, because the encrypted file can only be decrypted again with the password set at the time of encryption. The password is the main part of the encrypted file: if the password is stolen by an unauthorized person, he/she can view your encrypted data by decrypting it again, so you have to be very careful about setting the password. Once you set your password, the file is encrypted, and it is decrypted again only by providing the same password to the decryption module. We merge the use cases for encryption and decryption because they both depend on each other; without using encryption, it is pointless to use the decryption module.

Fig 3.3: USE CASE for ENCRYPTION/DECRYPTION
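As an illustration of this password-based flow, the following Python sketch (not the project's actual cipher; it derives a keystream from a SHA-256 hash of the password and XORs it with the data, which is for demonstration only and is not cryptographically strong) shows how a piece can be encrypted and decrypted with the same password:

import hashlib

def keystream(password, length):
    # Derive a repeatable byte stream from the password by chained SHA-256 hashing.
    stream = b""
    block = hashlib.sha256(password.encode("utf-8")).digest()
    while len(stream) < length:
        stream += block
        block = hashlib.sha256(block).digest()
    return stream[:length]

def xor_with_password(data, password):
    # XOR is its own inverse, so the same call encrypts and decrypts.
    ks = keystream(password, len(data))
    return bytes(b ^ k for b, k in zip(data, ks))

secret = b"contents of the master piece"
encrypted = xor_with_password(secret, "my password")
decrypted = xor_with_password(encrypted, "my password")
assert decrypted == secret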

LXXII. CONCLUSION
A data splitter and merger can really work as the basis of a good file transfer system. The biggest problem with file transfer is large file size. People mostly use USB pen drives (flash drives) and other storage devices for file transfer, but these do not work when the file has to be transferred to a remote location or is larger than the space these devices can handle. Whenever the file size is more than the data size accepted on the Internet, you need a good file splitter to send the file to the remote location. The other option for file transfer is email, but email services restrict us to a finite file size for transferring files over the Internet; for example Yahoo gives 25 MB and Gmail gives 30 MB as the file size that you can send from your local computer to remote computers or other electronic devices where the mailing system can be accessed. This password protected file splitter and merger offers a better solution to all of the above problems.
In addition to these problems, another problem arises when there is a need for secure file transfer from one computer device to another over the network. Generally, files do not reveal any information after they are broken into smaller files, but the master part of the file (the first of the split files), which contains the Beginning of File (BOF) byte, still runs even after the original file is broken down into small pieces. This problem can be eliminated by using a secure file transfer system, that is, encrypted files, which are not easily understandable by a human because the normal text is converted into encrypted text. So our software provides a secure file transfer system along with the splitting and merging process, by which you can safely transfer your data over the Internet without sharing the confidential information with anyone, even if it is stolen. With the encryption process you set a password for the file which you are going to encrypt. After the file is received at the other end, you tell your friend the password which you applied to the intended file. Your friend, who also has our software, can then decrypt the same file with the set password, which the software will ask for at the time of decryption.
Accordingly, we have prepared a system which consists of four modules: (a) Splitter, (b) Merger, (c) Encryption and finally (d) Decryption. All of these modules have their own importance for transferring files from a local computer device to another device over the network. First you have to set your password, which guarantees that the software cannot be accessed by an unauthorized person. You always have to remember your password to use the software from the second time onwards. If you forget your password and enter a wrong password three times, the software will not work any further, that is, it will refuse to perform any work after that. The Splitter is used to split the file into the desired small pieces of the original file; Encryption is then used to encrypt the master file of the original file so that it is not readable by anyone. After that, the Internet is used to transfer the file pieces from one device to another device.
At the receiving end, the encrypted file is decrypted using the Decryption module to recover the master file, and then the Merger module is used to merge all the pieces of the file received through the Internet. REFERENCES
[1] Balagurusamy E. (2004), Programming in ANSI C, Tata McGraw-Hill.
[2] Booch Grady, Rumbaugh James, Jacobson Ivar (2004), The UML User Guide, Pearson Education.
[3] Jacobson Ivar (2004), Object Oriented Software Engineering, Pearson Education.
[4] Kaner, C. (2004), A Course in Black Box Software Testing. Available: http://www.testingeducation.org/BBST
[5] Kanetkar Yashwant (1999), Let Us C, 5th Edition, BPB Publications.
[6] Stallings William (2006), Cryptography and Network Security: Principles and Practices, 4th Edition, Pearson Prentice Hall.
[7] Aggarwal K.K. & Singh Yogesh (2005), Software Engineering, New Age International.


Security Solution in Wireless Sensor Network

Abstract Wireless sensor networking is an emerging technology that promises a wide range of potential applications in both the civilian and the military area. Wireless sensor networks (WSNs) have many potential civilian and military applications, e.g. environmental monitoring, battlefield surveillance, and homeland security. In many important military and commercial applications it is critical to protect a sensor network from malicious attacks, which creates a demand for security mechanisms in the network. A wireless sensor network (WSN) typically consists of a large number of low-cost, low-power, and multifunctional sensor nodes that are deployed in a region of interest. These sensor nodes are small in size but are equipped with sensors, embedded microprocessors, and radio transceivers. Therefore, they have not only sensing, but also data processing and communication capabilities. They communicate over short distances via a wireless medium and collaborate to accomplish a common task, for example environment monitoring, military surveillance, and industrial process control. Wireless sensor networks are the result of developments in micro electro mechanical systems and wireless networks. These networks are made of tiny nodes and are becoming the future of many applications in which sensor networks are deployed in hostile environments. The nature of their deployment, where sensor nodes are prone to physical interaction with the environment, together with their resource limitations, raises serious questions about securing these nodes against adversaries. Traditional security measures are not enough to overcome these weaknesses. To address the special security needs of tiny sensor nodes and of sensor networks as a whole, we introduce a security model. In our model we emphasize three areas: (1) cluster formation, (2) a secure key management scheme, and (3) a secure routing algorithm. Our security analysis shows that the model presented in this paper meets the unique security needs of sensor networks. Index Terms Wireless sensor networks security, secure key management.

LXXIII. INTRODUCTION

Advancements in micro electro mechanical systems (MEMS) and wireless networks have made possible the advent of tiny sensor nodes called smart dust, which are low-cost, small devices with limited coverage, low power, smaller memory sizes and low bandwidth. Wireless sensor networks consist of large numbers of sensor nodes and are becoming a viable solution to many challenging domestic, commercial and military applications. Sensor networks collect and disseminate data from fields where ordinary networks are unreachable for various environmental and strategic reasons. In addition to common network threats, sensor networks are more vulnerable to security breaches because they are physically accessible by possible adversaries; consider sensitive sensor network applications in the military and in hospitals being compromised by adversaries. Many developments have been made in introducing countermeasures to potential threats in sensor networks; however, sensor network security remains a less addressed area. In this paper we present a security framework for wireless sensor networks to provide the desired security countermeasures against possible attacks. Our security framework consists of three interacting phases: cluster formation, secure key management and secure routing schemes. We make three contributions in this paper:
A) We discuss cluster formation and leader election in a multi-hop hierarchical cluster model.
B) We present a secure key management scheme.
C) We propose a secure routing mechanism which addresses potential threats in node to cluster leader and cluster leader to base station communication, and vice versa.
The rest of the paper is organized as follows. Section

II provides a summary of related work on key management and routing protocols in wireless sensor networks. Section III presents our security framework, discussing the cluster formation and leader election process, the secure key management scheme, secure routing and their algorithms. Section IV provides an analysis of our security model, and finally in Section V we conclude our paper, providing future research directions.
LXXV. RELATED WORK
Researchers have addressed many areas in sensor network security. Some of the related work is summarized in the following paragraphs. Eschenauer et al. [1] present a probabilistic key pre-distribution scheme where each sensor node receives a random subset of keys from a large key pool before deployment. To agree on a key for communication, two nodes find one common key within their subsets and use that key as their shared key. Chan et al. [2] extended the idea of Eschenauer et al. [14] and developed three key pre-distribution schemes: the q-composite, multipath reinforcement, and random-pairwise keys schemes. Pietro et al. [3] present a random key assignment probabilistic model and two protocols, direct and cooperative, to establish pairwise communication between sensors by assigning a small set of random keys to each sensor. This idea later converges to pseudo-random generation of keys, which is energy efficient compared to previous key management schemes. Liu et al. [4] propose pairwise key schemes based on polynomial pool-based and grid-based key pre-distribution, which have high resilience against node captures and low communication overhead. The pairwise key pre-distribution of Du et al. [5] is an effort to improve the resilience of the network by lowering the initial payoff of smaller scale network attacks, pushing the adversary to attack at a bigger scale to compromise the network. Du et al. [6] present a key scheme based on deployment knowledge. This key management scheme takes advantage of deployment knowledge where sensor positions are known prior to deployment. Because of the randomness of deployment it is not feasible to know the exact neighbor locations, but knowing the set of likely neighbors is realistic; this issue is addressed using the random key pre-distribution of Eschenauer et al. Adrian et al. [7] have introduced SPINS (Security Protocols for Sensor Networks). SPINS is a collection of security protocols (SNEP) and microTESLA. SNEP (Secure Network Encryption Protocol) provides data confidentiality and two-way data authentication with minimum overhead. MicroTESLA, a micro version of TESLA (Time Efficient Streamed Loss-tolerant Authentication), provides authenticated streaming broadcast. SPINS leaves some open questions, such as the security of compromised nodes, DoS issues and network traffic analysis issues. Furthermore, this protocol assumes a static network topology, ignoring the ad hoc and mobile nature of sensor nodes. Chen et al. [8] proposed two security protocols. First, base station to mote confidentiality and authentication, which states that an efficient shared-key algorithm like RC5 be used to guarantee the authenticity and privacy of information. Second, source authentication, by implementing a hash chain function similar to that used by TESLA (timed efficient stream loss-tolerant authentication) to achieve mote authentication. Jeffery et al. [9] proposed a lightweight security protocol that operates at the base station of sensor communication, where the base station can detect and remove an aberrant node if it is compromised.
This protocol does not specify any security measures in case of passive attacks on a node where an adversary is intercepting the communication.
LXXVI. THE SECURITY MODEL
Our security model consists of three interacting phases: cluster formation, secure key management and secure routing.
A) Cluster formation
As soon as sensor nodes are deployed, they broadcast their IDs, listen to their neighbors, add the neighbor IDs to their routing tables and count the number of neighbors they can listen to. These connected neighbors then become a cluster. Each cluster elects a sensor node as a leader. All inter-cluster communication is routed through cluster leaders. Cluster leaders also serve as fusion nodes to aggregate packets and send them to the base station. The cluster leader receives the highest number of messages, and this role changes after reaching an energy threshold, giving all the nodes the opportunity to become a cluster leader as nodes move around in a dynamic environment. The coverage of a cluster depends on the signal strength of the cluster leader. The cluster leader and its neighbor nodes form a parent-child relationship in a tree-based network topology. In this multi-hop cluster model, data is collected by

the sensor nodes, aggregated by the cluster leaders and forwarded to the next level of clusters, eventually reaching the base station. Figure 1 below shows a network of 210 sensor nodes forming 10 clusters.

Fig 1: Cluster formation

B) Secure key management scheme
Key management is critical to meet the security goals of confidentiality, integrity and authentication and to prevent sensor networks from being compromised by an adversary. Due to the ad-hoc nature and resource limitations of sensor networks, providing the right key management is challenging. Traditional key management schemes based on trusted third parties like a certification authority (CA) are impractical due to the unknown topology prior to deployment. A trusted CA is required to be present at all times to support public key revocation and renewal [10]. Trusting a single CA for key management is also more vulnerable: a compromised CA will risk the security of the entire sensor network. Fei et al. [10] decompose the key management problem into: key pre-distribution (installation of keys in each sensor node prior to deployment), neighbor discovery (discovering the neighbor nodes based on shared keys), end-to-end path key establishment (end-to-end communication with those nodes which are not directly connected), isolating aberrant nodes (identifying and isolating damaged nodes), re-keying (re-keying of expired keys), and key-establishment latency (reducing the latency resulting from communication and power consumption).
The core problem we identify in wireless sensor network security is to initialize secure communication between sensor nodes by setting up secret keys between communicating nodes. In general we call this key establishment. There are three types of key establishment techniques [5, 6]: the trusted-server scheme, the self-enforcing scheme, and the key pre-distribution scheme. The trusted-server scheme depends on a trusted server, e.g. Kerberos [11]. Since there is no trusted infrastructure in sensor networks, the trusted-server scheme is not suitable in this case. The self-enforcing scheme depends on asymmetric cryptography using public keys. However, limited computation resources in sensor nodes make this scheme less desirable. Public key algorithms such as Diffie-Hellman [12] and RSA [13], as pointed out in [6, 7], require high computation resources which tiny sensors do not provide. The key pre-distribution scheme, where key information is embedded in sensor nodes before the nodes are deployed, is the more desirable solution for resource-starved sensor nodes. A simple solution is to store a master secret key in all the nodes and obtain a new pairwise key from it; in this case the capture of one node will compromise the whole network. Storing the master key in tamper-resistant sensor nodes increases the cost and energy consumption of the sensors. Another key pre-distribution scheme [5] is to let each sensor carry N-1 secret pairwise keys, each of which is known only to this sensor and one of the other N-1 sensors (N is the total number of sensors). Extending the network makes this technique impractical, as existing nodes will not have the new nodes' keys.
In our security framework we introduce a secure hierarchical key management scheme in which we use three keys: two keys pre-deployed in all nodes and one in-network generated cluster key per cluster, to address the hierarchical nature of the sensor network.
Kn (network key): generated by the base station, pre-deployed in each sensor node, and shared by the entire sensor network. Nodes use this key to encrypt the data and pass it on to the next hop.
Ks (sensor key): generated by the base station, pre-deployed in each sensor node, and shared by the entire sensor network. The base station uses this key to decrypt and process the data, and a cluster leader uses this key to decrypt the data and send it to the base station.
Kc (cluster key): generated by the cluster leader and shared by the nodes in that particular cluster. Nodes of a cluster use this key to decrypt the data and forward it to the cluster leader.
By providing this key management scheme we make our security model resilient against possible attacks on the sensor network. In this key management scheme the base station uses Kn to encrypt and broadcast data. When a sensor node receives the message, it decrypts it using its Ks. In this key calculation, the base station uses Kn1..nn to broadcast the message. This process follows as:

The base station encrypts its own ID, a current time stamp TS and its Kn as a private key. The base station generates a random seed S and assumes itself to be at level 0. A sensor node decrypts the message received from the base station using Ks. When a node sends a message to the cluster leader, it constructs the message as follows:
{ID, Ks, TS, MAC, S (message)}
The cluster leader checks the ID in the packet; if the ID in the packet matches the ID it holds, it verifies the authentication and integrity of the packet through the MAC. Otherwise, the packet is dropped by the cluster leader. The node builds the message using these fields. The cluster leader aggregates the messages received from its nodes and forwards them to the next-level cluster leader or, if the cluster leader is one hop away from the base station, sends them directly to the base station. The receiving cluster leader checks its routing table and constructs the following packet to be sent to the next-level cluster leader or the base station. The cluster leader adds its own ID and its network and cluster keys to the incoming packet and rebuilds the packet as under:
{ID, Kn, Kc, [ID, Ks, TS, MAC, S (Aggr message)]}
Here ID is the ID of the receiving cluster leader, which aggregates and wraps the message and sends it to the next-hop cluster leader, or to the base station if directly connected. The next-hop cluster leader receives the packet and checks the ID; if the ID embedded in the packet is the same as the one it holds, it updates the ID for the next hop and broadcasts the packet, otherwise the packet is discarded. The base station receives the packet from its directly connected cluster leader; it checks the ID of the sending cluster leader and verifies the authentication and integrity of the packet through the MAC. The cluster leader directly connected with the base station adds its own ID to the packet received from the sending cluster leader. The packet then contains the following fields:
{ID [ID, Kn, Kc, [ID, Ks, TS, MAC, S (Aggr message)]]}
C) Secure Routing
In our secure routing mechanism, all the nodes have a unique ID#. Once the network is deployed, the base station builds a table containing the ID#s of all the nodes in the network. After the self-organizing process the base station knows the topology of the network. Using our secure key management scheme, nodes collect the data and pass it on to the cluster leader, which aggregates the data and sends it to the base station. We adapt the energy efficient secure data transmission algorithms of [15] and modify them with our secure key management scheme to make them more resilient against attacks in wireless sensor networks. The following two algorithms, the sensor node algorithm and the base station algorithm, are presented for secure data transfer from node to base station and for base station to node communication.
The node algorithm performs the following functions:
A) Sensor nodes use Kn to encrypt and transmit the data
B) Transmission of encrypted data from nodes to the cluster leader
C) Appending the ID# to the data and then forwarding it to the higher level of cluster leaders
D) The cluster leader uses Kc to decrypt and then uses its Kn to encrypt and send the data to the next level of cluster leaders, eventually reaching the base station
The base station algorithm is responsible for the following tasks:
A) Broadcasting of Ks and Kn by the base station
B) Decryption and authentication of data by the base station
Node algorithm
Step 1: If sensor node i wants to send data to its cluster leader, go to step 2, else exit the algorithm.
Step 2: Sensor node i requests the cluster leader to send the Kc.
Step 3: Sensor node i uses Kc and its own Kn to compute the encryption key Ki,cn.
Step 4: Sensor node i encrypts the data with Ki,cn, appends its ID# and the TS to the encrypted data, and then sends them to the cluster leader.
Step 5: The cluster leader receives the data, appends its own ID#, and then sends them to the higher-level cluster leader, or to the base station if directly connected. Go to Step 1.
Base Station Algorithm
Step 1: Check if there is any need to broadcast a message. If so, broadcast the message, encrypting it with Kn.
Step 2: If there is no need to broadcast a message, check if there is any incoming message from the cluster leaders. If there is no data being sent to the base station, go to step 1.
Step 3: If there is data coming to the base station, decrypt the data using Ks, the ID# of the node and the TS within the data.
Step 4: Check whether the decryption key Ks has decrypted the data perfectly. This includes checking the credibility of the TS and the ID#. If the decrypted data is not perfect, discard the data and go to step 6.

Step 5: Process the decrypted data and obtain the message sent by the sensor nodes.
Step 6: Decide whether to request all sensor nodes to retransmit the data. If not necessary, go back to step 1.
Step 7: If a request is necessary, send the request to the sensor nodes to retransmit the data. When this session is finished, go back to step 1.
This routing technique provides stronger resilience towards the spoofed routing information, selective forwarding, sinkhole attacks, Sybil attacks, wormholes and HELLO flood attacks presented in [16].
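As a rough, simplified sketch of the base station algorithm's decision flow above (this is not the paper's implementation; the shared-key decryption, MAC verification and radio transport are reduced to placeholder functions), the control could be organized as follows in Python:

# Simplified sketch of the base station algorithm (steps 1-7), illustration only.
def base_station_cycle(outgoing_message, incoming_packets, ks, known_ids):
    retransmit_requests = []
    if outgoing_message is not None:
        return ("encrypted with Kn", outgoing_message), retransmit_requests  # Step 1
    for node_id, ts, ciphertext in incoming_packets:       # Step 2: incoming data
        plaintext = decrypt_with(ks, ciphertext)            # Step 3: decrypt using Ks
        if plaintext is None or node_id not in known_ids:   # Step 4: credibility of ID# / TS
            retransmit_requests.append(node_id)             # Step 6/7: request retransmission
        else:
            process(plaintext)                               # Step 5: hand the message onward
    return None, retransmit_requests

def decrypt_with(key, ciphertext):
    # Placeholder standing in for shared-key decryption and MAC verification.
    return ciphertext

def process(message):
    print("delivered:", message)

# Example: one well-formed packet from node 7 is delivered, nothing is rebroadcast.
broadcast, retries = base_station_cycle(None, [(7, 1234, b"temp=21")], "Ks", {7})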
[1] L. Eschenauer and V. Gligor, A Key-management Scheme for Distributed Sensor Networks, Proceedings of the 9th ACM conference on Computer and Communication Security 2002, Washington DC, USA. P. Ganesan, R. Venugopalan, P. Peddabachagari, A. Dean, F Mueller, and M Sichitiu, Analyzing and Modeling Encryption Overhead for Sensor Network Nodes, WSNA03, September 19, 2003, San Diego, California,USA. R. Pietro, L. Mancini, and A. Mei, Random key-Assignment for Secure Wireless Sensor Networks, ACM,SANS,2003. D. Liu and P. Ning, Establishing Pairwise Keys in Distributed Sensor Networks, ACM CCS 2003. W. Du, J. Deng, Y. S. Han, and P. K. Varshney, A Pairwise Key Pre-Distribution Scheme for Wireless Sensor.Networks, W. Du, J. Deng, Y. S. Han, S. Chen, and P. K. Varshney, A Key Management Scheme for Wireless Sensor Networks Using Deployment Knowledge, IEEE InfoCom,2004. A. Perrig, R. Szewczyk, V. Wen, D. Culler, J. D. Tygar. SPINS: Security Protocols for Sensor Networks, in Wireless Networks Journal (WINE), September 2002. H. Chan, A. Perrig, and D. Song, Random Key Predistribution Schemes for Sensor Networks. In Proceedings of the IEEE Symposium on Security and Privacy,.Oakland,.California,USA J. Undercoffer, S. Avancha, A. Joshi, and J. Pinkston, Security for Sensor Networks 2002 CADIP Research_Symposium F. Hu, J. Ziobro, J. Tillett, and N. Sharma,Wireless Sensor Networks: Problems and Solutions Rochester Institute of Technology, Rochester, New York USA. B.C. Neuman and T. Tso., Kerberos: An authentication service for computer networks. IEEE communications 32(9):pgs33-38, 1994. W. Diffie and M. E. Hellman, New directions in cryptography. IEEE transactions on information theory22:644-654, 1976. R. L. Rivest, A. Shamir, and L. M. Adleman, A method for obtaining digital signatures and public keycryptosystems. Communications of the ACM, 21(2):120-126, 1978 T. Li, H. Wu and F. Bao, SenSec Design, Institute for Infocomm research, Singapore, 2004 H. Cam, S. Ozdemir, D. Muthuavinashiappan, and P. Nair, Energy Efficient Security Protocol for Wireless Sensor Networeks, 2003 IEEE C.Karlof and D. Wagner, Secure Routing in Wireless Sensor Networks: Attacks and Countermeasures, University of California at Berkeley, USA 2003. P. K. Goel and V.K. Sharma ,Wireless Sensor Network: Security Model, International Journal of Science Technology and Management, IJSTM Vol. 2, Issue 2, ISSN: 2229-6646 (online) , pp 100-107.


LXXVII. ANALYSIS OF PROPOSED SECURITY MODEL
This section presents an analysis of the features of our security model that make it feasible to implement. In our security model, the packet in a typical node-to-cluster-leader communication carries the encrypted data together with the node's ID# and the TS.
This gives us a 44-byte data packet to transmit. Taking into account the 128 KB program memory of the ATmega128L on the MICA2DOT, our model can best be implemented in a network of up to 3000 sensor nodes. Beyond this number we might need to trade off security against performance, which is unlikely to be required, because most applications so far do not deploy sensor nodes in such large quantities. Assuming the ongoing developments in enhancing program memory, this framework will be feasible in even larger and denser networks. The algorithms presented in this model take into consideration the nodes and cluster leaders that are not participating in sending and aggregating data. These nodes forward data packets without applying any further cryptographic operation, thus saving processing power and memory.


LXXVIII. CONCLUSION AND FUTURE WORK



In this paper we have presented a security framework for wireless sensor networks which is composed of three phases: cluster formation, a secure key management scheme and secure routing. The cluster formation process describes the topology formation and self-organization of sensor nodes, leader election and route selection towards the base station. We have presented a hierarchical secure key management scheme based on three levels of pre-deployed keys, and lastly we have presented a secure routing mechanism which provides stronger resilience against the attacks to which sensor networks are susceptible. We plan to implement this security framework on Berkeley motes, with confidence that this framework will provide added security in wireless sensor network communication.



Vertical Perimeter Based Enhancement of Streaming Application

Abstract - The explosion of the web and the increase in processing power have led to a large number of short-lived connections, making connection setup time equally important. With Fire Engine, the networking stack went through one more transition, in which the core pieces (i.e. the socket layer, TCP, UDP, IP, and the device driver) use an IP classifier and serialization queues to improve connection setup time, scalability, and packet processing cost. The Fire Engine approach is to merge all protocol layers into one fully multithreaded STREAMS module. Inside the merged module, instead of using per-data-structure locks, a per-CPU synchronization mechanism called the vertical perimeter is used. The vertical perimeter is implemented using a serialization queue abstraction called an squeue.


Keywords: threading, scheduling, dispatching, STREAMS, multicore, Fire Engine, vertical perimeter, CPU scheduling, task queue, IP multithreading.

I. INTRODUCTION
In recent years, since the clock rate of a single processor cannot be increased without overheating, manufacturers have developed multicore systems to increase performance. In order to run applications on multicore systems, many parallel algorithms have been developed, e.g. parallel H.264 applications. By exploiting parallelism, multicore systems compute more effectively. Multiple threads and processes are common and useful approaches to speed up a user task in some operating systems, e.g. Linux. In our observation, threads are sometimes not dispatched reasonably across processors. We redefine these anomalies formally from multiprocessing timing anomalies and focus on thread manipulation on multicore systems. For example, even if some cores are idle, the operating system does not dispatch any thread to the idle ones. Furthermore, even if users notice this situation, they still cannot directly dispatch these threads accordingly if the operating system does not provide the related system calls. Although there are some existing mechanisms, like procfs, to access system information, and system calls, like sched_setaffinity(), to dispatch threads to certain processors, the dispatching mechanism wraps these into a complete API. It not only provides a generic interface for future extension and high portability without kernel modification, but also performs much better than procfs.


A. Fire Engine
The Fire Engine [10] networking stack for the Solaris Operating System (OS) is currently under development by Sun. Enhanced network performance and a flexible architecture to meet future customer networking needs are the twin goals of Fire Engine development. Existing requirements, including increased performance and scalability, Dynamic Reconfiguration (DR), Secure Internet Protocol (IPsec), and IP Multipathing (IPMP), as well as future requirements, such as 10-gigabit per second (Gbps) networking, 100-Gbps networking, and TCP/IP Offload Engines (TOE), are given equal priority. Implemented in three phases, Fire Engine's development stages are structured to provide increased flexibility and a significant performance boost to overall network throughput. Phase 1 has already been completed and these goals have been realized in services using TCP/IP. Web-based benchmarks show a 30- to 45-percent improvement on both SPARC and Intel x86 architectures, while bulk data transfer benchmarks show improvements in the range of 20 to 40 percent. Phases 2 and 3 should deliver similar overall performance improvements. With increased flexibility and performance boosts of this magnitude, FireEngine is well on its way to reinforcing Sun's Solaris OS as the commercial standard for networking infrastructure.

B. Performance Barriers
The existing TCP/IP stack uses STREAMS perimeters and kernel adaptive mutexes for multithreading. The current STREAMS framework provides per-module, per-protocol-stack-layer, or horizontal perimeters. This can, and often does, lead to a packet being processed on more than one CPU and by more than one thread, leading to excessive context switching and poor CPU data locality.

C. Network Performance
FireEngine introduces a new, highly scalable packet classification architecture called Firehose. Each incoming packet is classified early on, then proceeds through an optimized list of functions (the Event List) that makes it easy to add protocols without impacting the network stack's complexity, performance, or scalability. FireEngine concentrates on improving the performance of key server workloads that have a significant networking component, considering both the impact of network performance on these workloads and the benchmarks that describe overall workload performance.

D. Performance metrics
Applications often use networking in two distinct ways: to perform transactions over the network, or to stream data over the network. Transactions are short-lived connections transferring a small amount of application data, while streaming is a transfer of large amounts of data over long-lived connections. In the transaction case, performance is determined by a combination of the time it takes to get the first byte (first-byte latency), connection set-up/tear-down time, plus network throughput (bits per second, or bps). In the streaming case, performance is dominated by overall network throughput. These parameters impact performance in various ways, depending on the amount of data transferred. For instance, when transferring one byte of data, only first-byte latency and connection set-up/tear-down count. When transferring very large amounts of data, only network throughput is relevant. Finally, there is the ability to sustain performance as the number of active simultaneous connections increases; this is often a requirement for web servers. A networking stack must take into account the host system's hardware characteristics. For low-end systems, it is important to make efficient use of the available hardware resources, such as memory and CPU. For higher-end systems, the stack must take into account the high variability in memory access time, as well as system resources that offload some functions to specialized hardware.

Fire Engine focuses on these network performance metrics [10]:
- Network throughput
- Connection set-up/tear-down
- First-byte latency
- Connection and CPU scalability
- Efficient resource usage

E. Vertical Perimeters
The Solaris 10 FireEngine project introduces the abstraction of a vertical perimeter, which is composed of a new kernel data structure, the squeue_t (serialization queue type), and a worker thread owned by the squeue_t, which is bound to a CPU. Vertical perimeters, or squeues, by themselves provide packet serialization and mutual exclusion for the data structures. FireEngine uses a per-CPU perimeter, with a single squeue instance serving each connection. When a packet is queued for processing, a pointer to the connection structure is stored inside the packet. The thread entering the squeue may either process the packet immediately or queue it for later processing. The choice depends on the squeue's entry point and its state. Immediate processing is possible only when no other thread has entered the same squeue. A connection instance is assigned to a single squeue_t, so it is processed only within that vertical perimeter. As an squeue_t is processed by a single thread at a time, all data structures used to process a given connection from within the perimeter can be accessed without additional locking. This improves both CPU and thread-context data locality of access for the connection metadata, packet metadata, and packet payload data, improving overall network performance. This approach also allows:
- The removal of per-device-driver worker thread schemes, which are often problematic in solving system-wide resource issues.
- Additional strategic algorithms to be implemented to best handle a given network interface, based on network interface throughput and system throughput (such as fanning out per-connection packet processing to a group of CPUs).
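The behaviour of an squeue (process a packet immediately when no other thread is inside the perimeter, otherwise queue it for the worker) can be sketched as follows. This is a simplified, user-level Python model for illustration only, not the Solaris squeue_t implementation.

```python
import queue
import threading

class Squeue:
    """Toy serialization queue: at most one thread processes packets at a time."""
    def __init__(self):
        self._pending = queue.Queue()
        self._owner = threading.Lock()   # held by the thread inside the perimeter

    def enter(self, packet, process):
        if self._owner.acquire(blocking=False):
            try:
                process(packet)          # immediate, in-line processing
                self._drain(process)     # then drain anything queued meanwhile
            finally:
                self._owner.release()
        else:
            self._pending.put(packet)    # another thread owns the squeue: defer

    def _drain(self, process):
        while True:
            try:
                process(self._pending.get_nowait())
            except queue.Empty:
                break
```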
II. RELATED WORKS

Many techniques have been developed to exploit parallelism. OpenMP [5] is a tool with which looped tasks can be partitioned into multiple independent tasks automatically. Affine partitioning [6], [7] is another method that can find the optimal partition, which maximizes parallelism with the minimum synchronization.

However, few works address how to allocate resources to threads on multicore systems. André C. Neto and Filippo Sartori [1] proposed MARTe, a framework built over a multiplatform library that allows the execution of the same code in different operating systems; the drawback is latency. François Trahay, Élisabeth Brunet and Alexandre Denis [2] present thread safety during processing through a locking mechanism; the drawback is the possibility of deadlock. Fengguang Song and Shirley Moore [3] propose an analytical model to estimate the cost of running an affinity-based thread schedule on multicore systems. Tang-Hsun Tu and Chih-Wen Hsueh [4] decompose threads with a User Dispatching Mechanism (UDispatch) that provides controllability in user space to improve application performance. Ana Sonia Leon [8] proposed a Chip Multi-Threading (CMT) architecture which maximizes overall throughput performance for commercial workloads; the drawback is low performance. Sunay Tripathi, Nicolas Droux and Thirumalai Srinivasan presented a new architecture which addresses Quality of Service (QoS) by creating unique flows for applications and services.

III. FIRE ENGINE ARCHITECTURE
The Solaris FireEngine networking performance improvement project adheres to these design principles [10]:
- Data locality: ensures that a connection is always processed by the same CPU whenever possible.
- CPU modelling: efficient use of available CPUs and the interrupt/worker thread model; allows the use of multiple CPUs for protocol processing.
- Code path locality: improves the performance and efficiency of TCP/IP interactions.
- TCP/IP interaction: switches from a message-passing-based interface to a function-call-based interface.
Because of the large number and dependent nature of the changes required to achieve the FireEngine goals, the development program is split into three phases:
- Solaris 10 Fire Engine phase 1 [10]: fundamental infrastructure implemented and a large performance boost realized. Application and STREAMS module developers see no changes other than better performance and scalability.
- Solaris 10U and SX Fire Engine phase 2 [10]: feature scalability, offloading, and the new Event List framework implemented.

1. Solaris 10 Fire Engine phase 1 architecture
A. IP Classifier-Based Fan-Out
When the Solaris IP receives a packet from a NIC, it classifies the packet and determines the connection structure and the vertical perimeter instance that will process that packet. New incoming connections are assigned to the vertical perimeter instance attached to the interrupted CPU; or, to avoid saturating an individual CPU, a fan-out across all CPUs is performed. A NIC always sends a packet to IP in interrupt context, so IP can optimize between interrupt and non-interrupt processing, avoiding CPU saturation by a fast NIC. There are multiple advantages to this approach:
- The NIC does minimal work, and complexity is hidden from independent NIC manufacturers.
- IP can decide whether the packet needs to be processed on the interrupted CPU or fanned out across all CPUs.
- Processing a packet on the interrupted CPU in interrupt context saves a context switch, compared to queuing the packet and letting a worker thread process it.
- IP can also control the amount of work done by the interrupt without incurring extra cost. Under low loads, processing is done in interrupt context. Under higher loads, IP dynamically changes between interrupt and polling while employing interrupt and worker threads for the most efficient processing.
In the case of a single high-bandwidth NIC (such as 10 Gbps), IP also fans the connections out to multiple CPUs. If multiple CPUs are applied, each connection is bound to one of the available CPUs servicing the NIC. Worker threads, their management, and special fan-out schemes can be coupled to the vertical perimeter with little code complexity. Since these functions reside in IP, this architecture benefits all NICs. The DR issues arising from binding a worker thread to a CPU can be effectively handled in IP.
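A connection-to-CPU fan-out of the kind described above can be illustrated by hashing the connection 4-tuple to one of the per-CPU squeues, so that every packet of a connection lands on the same CPU. The hash choice (CRC32) is only an example, not the classifier actually used by FireEngine.

```python
import zlib

def pick_squeue(src_ip: str, src_port: int, dst_ip: str, dst_port: int, n_cpus: int) -> int:
    """Map a connection 4-tuple to a per-CPU squeue index (data locality)."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_cpus

# Example: every packet of this connection is processed by the same squeue/CPU.
print(pick_squeue("10.0.0.1", 40000, "10.0.0.2", 80, n_cpus=8))
```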


Figure 3: Scheduling and process state transition
A. Fair share scheduling
The FSS scheduler has two levels of scheduling: process and user. Process-level scheduling is the same as in standard UNIX (priority and nice values act as a bias to the scheduler as it repositions processes in the run queue). The user-level scheduling relationship can be seen in the simplified sketch that follows this section. Whereas process-level scheduling still occurs 100 times a second, user-level scheduling adjustments (the usage parameter) occur once every 4 seconds. Also, once per second, the process-level priority adjustments that were made in the previous second begin to be forgotten; this is to avoid starving a process. FSS is all about making scheduling decisions based on process sets rather than on the basis of individual processes.
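Since the user-level pseudo-code referred to above is not reproduced here, the following simplified Python sketch illustrates the idea: per-user usage is accumulated during each window, decayed periodically so that old usage is gradually forgotten, and used relative to the user's shares as a bias on process priorities. The class, parameter names and decay value are illustrative, not the actual FSS implementation.

```python
class FairShareScheduler:
    """Toy fair-share model: usage per user decays over fairshare windows."""
    def __init__(self, shares, decay=0.5):
        self.shares = shares                      # user -> allocated shares
        self.usage = {u: 0.0 for u in shares}
        self.decay = decay                        # applied once per window

    def charge(self, user, cpu_time):
        # Called as the user's processes consume CPU in the current window.
        self.usage[user] += cpu_time

    def end_of_window(self):
        # Periodic adjustment: past usage is decayed so it is slowly forgotten.
        for u in self.usage:
            self.usage[u] *= self.decay

    def priority_bias(self, user):
        # More accumulated usage per share -> larger (worse) priority bias.
        return self.usage[user] / self.shares[user]

# Example: users 'a' (100 shares) and 'b' (50 shares) consume the same CPU time;
# 'a' ends up with the smaller bias, so its processes are favoured next window.
fss = FairShareScheduler({"a": 100, "b": 50})
fss.charge("a", 40); fss.charge("b", 40)
print(fss.priority_bias("a"), fss.priority_bias("b"))
```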

Figure 2: Packets flowing in TCP through the vertical perimeter
tcp_input - all inbound data packets and control messages
tcp_output - all outbound data packets and control messages
tcp_close_output - on user close
tcp_timewait_output - TIME-WAIT expiry
tcp_rsrv_input - flow-control relief on the read side
tcp_timer - all TCP timers

IV. ANALYSIS
1. Scheduling algorithm
Good cluster schedulers attempt to minimize job wait time while maximizing cluster utilization. The maximization of utilization and the minimization of wait time are subject to the policy set by the scheduler administrator. Types of scheduling [11]:
- Long-term scheduling: the decision to add to the pool of processes to be executed.
- Mid-term scheduling: the decision to add to the number of processes that are partially or fully in memory.
- Short-term scheduling: the decision as to which available process will be executed.
- I/O scheduling: the decision as to which process's pending request shall be handled by an available I/O device.

Figure 4: Scheduling process
fs_interval(): duration of each fairshare window.
fs_depth(): number of fairshare windows factored into the current fairshare utilization calculation.
fs_decay(): decay factor applied when weighting the contribution of each past fairshare window.

V. RESULTS
Comparison of single queue and multiple queues:

Here we examine that the execution time is lower when using multiple queues than when using a single queue; the comparison is made with respect to packet count and packet length in bytes. Execution time was measured in nanoseconds but is converted to seconds for easy reference. Packet length of 300 bytes:
Execution time (sec) versus packet count, 300-byte packets:
Packet count       10      100     500     1000
Single queue       0.4     0.9     5       10
Multiple queues    0.15    0.34    2       4

Packet length of 600 bytes:

Execution time (sec) versus packet count, 600-byte packets:
Packet count       10      100     500     1000
Single queue       0.3     0.9     5       9
Multiple queues    0.1     0.4     2       3

VI. CONCLUSION
In this paper, we presented the vertical perimeter for enhancing streaming applications. In a multicore environment, using multiple threads is a common and useful approach to improve application performance. Nevertheless, even in many simple applications, performance may degrade when the number of threads increases. In our observation, however, the more significant effect is the dispatching of threads.

In Solaris 10 the Fire Engine introduces the concept of the vertical perimeter. By using the vertical perimeter we can improve streaming applications, and we concentrated mainly on queue management, core assignment, thread allocation and time profiling. The allocation of processes requires a scheduling algorithm, and here we used fair share scheduling (FSS), which makes the scheduling process easier and more efficient. We have therefore analysed and implemented FSS in this paper.

VII. REFERENCES
[1] André C. Neto, Filippo Sartori, "MARTe: A Multiplatform Real-Time Framework," IEEE Transactions on Nuclear Science, vol. 57, no. 2, April 2010.
[2] François Trahay, Élisabeth Brunet, "An analysis of the impact of multi-threading on communication performance," IEEE, 2009.
[3] Fengguang Song, Shirley Moore, "Analytical Modeling and Optimization for Affinity-Based Thread Scheduling on Multicore Systems," IEEE, 2009.
[4] Tang-Hsun Tu, Chih-Wen Hsueh, Rong-Guey Chang, "A Portable and Efficient User Dispatching Mechanism for Multicore Systems," IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pp. 427-436, 2009.
[5] L. Dagum and R. Menon, "OpenMP: An Industry-Standard API for Shared-Memory Programming," IEEE Computational Science & Engineering, vol. 5, no. 1, pp. 46-55, Jan. 1998.
[6] W. Lim, G. I. Cheong, and M. S. Lam, "An Affine Partitioning Algorithm to Maximize Parallelism and Minimize Communication," Proceedings of the 13th International Conference on Supercomputing, pp. 228-237, 1999.
[7] W. Lim and M. S. Lam, "Maximizing Parallelism and Minimizing Synchronization with Affine Transforms," Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 201-214, 1997.
[8] Ana Sonia Leon, "A Power-Efficient High-Throughput 32-Thread SPARC Processor," IEEE Journal of Solid-State Circuits, vol. 42, no. 1, Jan. 2007.
[9] W.-Y. Cai and H.-B. Yang, "Cross-layer QoS optimization design for wireless sensor networks," in Wireless, Mobile and Sensor Networks, 2007.
[10] Tong Li, Alvin R. Lebeck, "Spin Detection Hardware for Improved Management of Multithreaded Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 6, June 2006.
[11] Sunay Tripathi, "FireEngine - A New Networking Architecture for the Solaris Operating System," www.sun.com/bigadmin/content/networkperf/FireEngine_WP.pdf, Nov. 2004.
[12] Prof. Navneet Goyal, Department of Computer Science & Information Systems, BITS Pilani, Operating System notes.


Orthogonal Frequency Division Multiplexing for Wireless Communications

Abstract - Orthogonal Frequency Division Multiplexing (OFDM) is a modulation technique whose increased symbol duration makes it robust against Inter-Symbol Interference (ISI). OFDM splits a high-rate data stream into a number of lower-rate streams that are transmitted simultaneously over a number of sub-carriers. The advanced transmission techniques of OFDM, applied in wireless LANs and in digital audio and video broadcasting, and CDMA, the foundation of 3G mobile communications, have been part of almost every communication system. In this paper we study the OFDM transmission and reception scheme, its working, advantages, disadvantages and applications.
Keywords: OFDM, ADSL, DVB-T

I INTRODUCTION
Orthogonal frequency division multiplexing (OFDM) is widely known as a promising communication technique for current broadband wireless mobile communication systems, due to its high spectral efficiency and robustness to multipath interference. Currently, OFDM has been adopted in the digital audio and video broadcasting (DAB/DVB) systems, high-speed wireless local area networks (WLANs) such as IEEE 802.11x, HIPERLAN/2 and multimedia mobile access communications (MMAC), ADSL, the digital multimedia broadcasting (DMB) system and the multi-band OFDM ultra-wideband (MB-OFDM UWB) system, among other multi-carrier OFDM systems. Besides being the basis for many high-data-rate wireless standards, the main advantages of OFDM are its high spectral efficiency and its ability to use the multipath channel to its advantage.

II PRINCIPLES OF OFDM

The basic principle of OFDM is to split a high-rate data stream into a number of lower-rate streams that are transmitted simultaneously over a number of sub-carriers. In OFDM, a rectangular pulse is used as the sub-carrier shape for transmission. This facilitates pulse forming and modulation, which can be implemented efficiently with a simple IDFT (inverse discrete Fourier transform), in practice computed with an IFFT (inverse fast Fourier transform). To reverse this operation at the receiver, an FFT (fast Fourier transform) is needed. According to the theorems of the Fourier transform, the rectangular pulse shape leads to a sin(x)/x type of spectrum for the sub-carriers (Fig. 1). The spectra of the sub-carriers are not separated but overlap; the information transmitted over the carriers can still be separated because of the orthogonality relation. By using an IFFT for modulation, the spacing of the sub-carriers is chosen in such a way that, at the frequency where a received sub-carrier is evaluated (indicated by the arrows in Fig. 1), all other signals are zero. For this orthogonality, the receiver and transmitter must be perfectly synchronized, i.e. both must assume the same modulation frequency and the same time scale for transmission. OFDM is a block transmission technique. In the baseband, complex-valued data symbols modulate a large number of tightly grouped carrier waveforms. The transmitted OFDM signal multiplexes several low-rate data streams, and each data stream is associated with a given sub-carrier. The main advantage of this concept in a radio environment is that each of the data streams experiences an almost flat fading channel. In slowly fading channels, the inter-symbol interference (ISI) and inter-carrier interference (ICI) within an OFDM symbol can be avoided with a small loss of transmission energy using the concept of a cyclic prefix.
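The IFFT/FFT view of OFDM with a cyclic prefix can be made concrete with a short NumPy sketch. The block size, cyclic-prefix length and QPSK mapping below are arbitrary illustrative choices, and the channel is idealised (no multipath or noise).

```python
import numpy as np

rng = np.random.default_rng(0)
n_sc, cp_len = 64, 16                          # sub-carriers and cyclic-prefix length

# Map random bits to QPSK symbols, one symbol per sub-carrier.
bits = rng.integers(0, 2, size=2 * n_sc)
symbols = ((1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])) / np.sqrt(2)

# OFDM modulation: IFFT, then prepend the cyclic prefix.
block = np.fft.ifft(symbols)
tx = np.concatenate([block[-cp_len:], block])

rx = tx                                        # idealised channel

# OFDM demodulation: strip the cyclic prefix and FFT back to the sub-carriers.
rx_symbols = np.fft.fft(rx[cp_len:])
rx_bits = np.empty_like(bits)
rx_bits[0::2] = (rx_symbols.real < 0).astype(int)
rx_bits[1::2] = (rx_symbols.imag < 0).astype(int)
assert np.array_equal(bits, rx_bits)           # perfect recovery on the ideal channel
```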


Fig. 1: OFDM and the orthogonality principle

III SYSTEM DESCRIPTION
1. Transmitter

Fig. 2: Transmitter of an OFDM system

An OFDM carrier signal is the sum of a number of orthogonal sub-carriers, with the baseband data on each sub-carrier being independently modulated, commonly using some type of quadrature amplitude modulation (QAM) or phase-shift keying (PSK). This composite baseband signal is typically used to modulate a main RF carrier. The input is a serial stream of binary digits. By inverse multiplexing, these are first demultiplexed into parallel streams, and each one is mapped to a (possibly complex) symbol stream using some modulation constellation (QAM, PSK, etc.). Note that the constellations may be different, so some streams may carry a higher bit rate than others. An inverse FFT is computed on each set of symbols, giving a set of complex time-domain samples. These samples are then quadrature-mixed to passband in the standard way: the real and imaginary components are first converted to the analogue domain using digital-to-analogue converters (DACs); the analogue signals are then used to modulate cosine and sine waves at the carrier frequency, respectively. These signals are then summed to give the transmission signal.

2. Receiver

Fig. 3: Receiver of an OFDM system

The receiver picks up the signal, which is then quadrature-mixed down to baseband using cosine and sine waves at the carrier frequency. This also creates signals centered on twice the carrier frequency, so low-pass filters are used to reject these. The baseband signals are then sampled and digitised using analogue-to-digital converters (ADCs), and a forward FFT is used to convert back to the frequency domain. This returns parallel streams, each of which is converted to a binary stream using an appropriate symbol detector. These streams are then recombined into a serial stream, which is an estimate of the original binary stream at the transmitter.

B. Data on OFDM
The data to be transmitted on an OFDM signal is spread across the carriers of the signal, each carrier taking part of the payload. This reduces the data rate taken by each carrier. The lower data rate has the advantage that interference from reflections is much less critical. This is achieved by adding a guard band time, or guard interval, into the system. This ensures that the data is only sampled when the signal is stable and no new delayed signals arrive that would alter the timing and phase of the signal.

Fig. 4: Data on OFDM

The distribution of the data across a large number of carriers in the OFDM signal has some further advantages. Nulls caused by multipath effects or interference on a given frequency only affect a small number of the carriers, the remaining ones being received correctly. By using error-coding techniques, which does mean adding further data to the transmitted signal, many or all of the corrupted data can be reconstructed within the receiver. This is possible because the error correction code is transmitted in a different part of the signal. In the OFDM system, orthogonally placed sub-carriers are used to carry the data from the transmitter end to the receiver end. The presence of the guard band in this system deals with the problem of ISI, and noise is minimized by the larger number of sub-carriers.

IV ADVANTAGES & DISADVANTAGES OF AN OFDM SYSTEM
Advantages:
- Due to the increase in symbol duration, there is a reduction in delay spread.
- The addition of a guard band almost removes the ISI and ICI in the system.
- Conversion of the channel into many narrowly spaced orthogonal sub-carriers renders it immune to frequency-selective fading.
- As is evident from the spectral pattern of an OFDM system, orthogonally placing the sub-carriers leads to high spectral efficiency.
- It can be efficiently implemented using the IFFT.
Disadvantages:
- These systems are highly sensitive to Doppler shifts, which affect the carrier frequency offsets, resulting in ICI.
- The presence of a large number of sub-carriers with varying amplitude results in a high Peak-to-Average Power Ratio (PAPR), which in turn hampers the efficiency of the RF amplifier.

PEAK TO AVERAGE POWER RATIO
The main drawback of OFDM is its high PAPR, which distorts the signal if the transmitter contains nonlinear components such as power amplifiers, and causes deficiencies such as intermodulation, spectral spreading and changes in the signal constellation. Minimising the PAPR allows a higher average power to be transmitted for a fixed peak power, improving the overall signal-to-noise ratio at the receiver. There are many solutions in the literature to reduce the effect of PAPR in OFDM signals. Some of these techniques are amplitude coding, clipping and filtering, partial transmit sequence (PTS),

selected mapping (SLM) and interleaving. These techniques achieve PAPR reduction at the expense of transmit signal power increase, bit error rate (BER) increase, data rate loss, computational complexity increase, and so on.

AMPLITUDE CLIPPING AND FILTERING
A threshold value of the amplitude is set in this process, and any sub-carrier having an amplitude greater
than that value is clipped, or that sub-carrier is filtered, to bring down the PAPR value (a small sketch of this clipping approach is given after the applications section below).

SELECTED MAPPING
In this technique a set of sufficiently different data blocks, representing the same information as the original data block, is generated, and the data block with the lowest PAPR value is selected for transmission.

PARTIAL TRANSMIT SEQUENCE
Transmitting only part of the data of the varying sub-carriers, which together cover all the information to be sent in the signal as a whole, is called the Partial Transmit Sequence technique.

V APPLICATIONS
ADSL
OFDM is used in ADSL connections that follow the G.DMT (ITU G.992.1) standard, in which existing copper wires are used to achieve high-speed data connections. Long copper wires suffer from attenuation at high frequencies. The fact that OFDM can cope with this frequency-selective attenuation and with narrow-band interference is the main reason it is frequently used in applications such as ADSL modems. However, DSL cannot be used on every copper pair; interference may become significant if more than 25% of the phone lines coming into a central office are used for DSL. For experimental amateur radio applications, users have even hooked up commercial off-the-shelf ADSL equipment to radio transceivers, which simply shift the bands used to the radio frequencies the user has licensed.

LAN and MAN
OFDM is extensively used in wireless LAN and MAN applications, including IEEE 802.11a/g/n and WiMAX. IEEE 802.11a/g/n, operating in the 2.4 and 5 GHz bands, specifies per-stream airside data rates ranging from 6 to 54 Mbit/s. If both devices can utilize the "HT mode" added with 802.11n, the top 20 MHz per-stream rate is increased to 72.2 Mbit/s, with the option of data rates between 13.5 and 150 Mbit/s using a 40 MHz channel. Four different modulation schemes are used: BPSK, QPSK, 16-QAM, and 64-QAM, along with a set of error-correcting rates (1/2 to 5/6). The multitude of choices allows the system to adapt the optimum data rate to the current signal conditions.

DVB-T
By directive of the European Commission, all television services transmitted to viewers in the European Community must use a transmission system that has been standardized by a recognized European standardization body, and such a standard has been developed and codified by the DVB Project: Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for digital terrestrial television. Customarily referred to as DVB-T, the standard calls for the exclusive use of COFDM for modulation. DVB-T is now widely used in Europe and elsewhere for terrestrial digital TV.
added with 802.11n then the top 20 MHz per-stream rate is increased to 72.2 Mbit/s with the option of data rates between 13.5 and 150 Mbit/s using a 40 MHz channel. Four different modulation schemes are used: BPSK, QPSK, 16-QAM, and 64QAM, along with a set of error correcting rates (1/25/6). The multitude of choices allows the system to adapt the optimum data rate for the current signal conditions. DVB-T By Directive of the European Commission, all television services transmitted to viewers in the European Community must use a transmission system that has been standardized by a recognized European standardization body,[ and such a standard has been developed and codified by the DVB Project, Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for digital terrestrial television.[Customarily referred to as DVBT, the standard calls for the exclusive use of COFDM for modulation. DVB-T is now widely used in Europe and elsewhere for terrestrial digital TV. VI CONCLUSION

VI CONCLUSION
In this paper we studied the OFDM system and concluded that orthogonal frequency division multiplexing (OFDM) is a promising technique for broadband wireless communication systems.


A Comprehensive Study of Adaptive Resonance Theory

Abstract - In this paper we study Adaptive Resonance Theory (ART) and its extensions, and provide an introduction to ART by examining ART1, the first member of the family of ART neural networks, along with ART2, ART2-A, Fuzzy ART, ARTMAP and Fuzzy ARTMAP. ART was specially designed to overcome the stability-plasticity dilemma. This paper also describes the use of the unsupervised ART2 neural network for pattern recognition. We investigate the performance of ART2-A, which offers better recognition accuracy even when the illumination of the images is varied. This paper also explores the features of the Fuzzy ARTMAP neural network classifier. Fuzzy ARTMAP is both much faster and incrementally stable. Fuzzy ART is an unsupervised network which is the essential component of Fuzzy ARTMAP. Keywords: ART, neural network, pattern recognition
I. INTRODUCTION

Neural network: A neural network is a mathematical or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases an NN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.
Supervised learning: Supervised learning is the machine learning task of inferring a function from supervised training data. In this setting, each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function, which is called a classifier. The inferred function should predict the correct output value for any valid input object. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way.
Unsupervised learning: Unsupervised learning is a class of problems in machine learning where the goal is to determine how data is organized. Many methods employed here are based on data mining methods used to preprocess unlabeled examples. Unsupervised learning is closely related to the problem of density estimation in statistics; however, unsupervised learning also encompasses many other techniques that seek to summarize and explain key features of the data.

II. ADAPTIVE RESONANCE THEORY (ART)

Adaptive Resonance Theory [1], or ART, is both a cognitive and neural theory of how the brain quickly learns to categorize and predict objects and events in a changing world, and a set of algorithms which computationally embody ART principles and are used in large-scale engineering and technological applications where fast, stable, incremental learning about complex changing environments is needed. ART clarifies the brain processes from which conscious experience emerges. ART predicts how top-down attention works and regulates fast, stable learning of recognition categories. In particular, ART articulates a critical role for "resonant" states in driving fast, stable learning; hence the name adaptive resonance. These resonant states are bound together, using top-down attentive feedback in the form of learned expectations, into coherent representations of the world. ART hereby clarifies one important sense in which the brain carries out predictive computation. ART algorithms have been used in large-scale applications such as medical database prediction, remote sensing and airplane design.

Fig1: Basic structure of the ART network

The basic processing module of ART networks is an extended competitive learning network, as shown in Fig. 1 [2]. The m neurons of an input layer F1 register the values of an input pattern I = (i1, i2, ..., im). Every neuron of the output layer F2 receives a bottom-up net activity tj, built from all F1 outputs S = I. The vector elements of T = (t1, t2, ..., tn) can be perceived as the results of a comparison between the input pattern I and the prototypes W1 = (w11, ..., w1m), ..., Wn = (wn1, ..., wnm). These prototypes are stored in the synaptic weights of the connections between the F1- and F2-neurons.

The ART Architecture:

Fig. 2: The ART architecture. The system consists of two layers, F1 and F2, which are connected to each other via the LTM.

The input pattern is received at F1, whereas classification takes place in F2. As mentioned before, the input is not directly classified. First a characterization takes place by means of extracting features, giving rise to activation in the feature representation field. The expectations, residing in the LTM connections, translate the input pattern to a categorization in the category representation field. The classification is compared to the expectation of the network, which resides in the LTM weights from F2 to F1. If there is a match, the expectations are strengthened; otherwise the classification is rejected.

The simplified neural network model:

Fig. 3: The ART1 neural network.

The ART1 simplified model consists of two layers of binary neurons (with values 1 and 0), called F1 and F2. Each neuron in F1 is connected to all neurons in F2 via the continuous-valued forward long term memory (LTM) Wf, and vice versa via the binary-valued backward LTM Wb. The other modules are gain 1 and gain 2 (G1 and G2), and a reset module. Each neuron in the comparison layer receives three inputs: a component of the input pattern, a component of the feedback pattern, and the gain G1. A neuron outputs a 1 if and only if at least two of these three inputs are high: the 'two-thirds rule'. The neurons in the recognition layer each compute the inner product of their incoming (continuous-valued) weights and the pattern sent over these connections. The winning neuron then inhibits all the other neurons via lateral inhibition. Gain 2 is the logical 'or' of all the elements in the input pattern x. Gain 1 equals gain 2, except when the feedback pattern from F2 contains any 1; then it is forced to zero. Finally, the reset signal is sent to the active neuron in F2 if the input vector x and the output of F1 differ by more than some vigilance level.

Operation: The network starts by clamping the input at F1. Because the output of F2 is zero, G1 and G2 are both on and the output of F1 matches its input. The pattern is sent to F2, and in F2 one neuron becomes active. This signal is then sent back over the backward LTM, which reproduces a binary pattern at F1. Gain 1 is inhibited, and only the neurons in F1 which receive a 'one' from both x and F2 remain active. If there is a substantial mismatch between the two patterns, the reset signal will inhibit the neuron in F2 and the process is repeated.

1. Initialization:
where N is the number of neurons in F1, M the number of neurons in F2, 0 <= i < N, and 0 <= j < M. Also, choose the vigilance threshold p, 0 <= p <= 1;
2. Apply the new input pattern x;
3. Compute the activation values of the neurons in F2;
4. Select the winning neuron k (0 <= k < M);
5. Vigilance test: if the match exceeds the vigilance level (where the match is measured via the inner product), go to step 7, else go to step 6. Note that x . Wkb essentially is an inner product, which will be large if x and Wkb are near to each other;
6. Neuron k is disabled from further activity. Go to step 3;

7. Set, for all l, 0 <= l < N, the backward and forward weights of the winning neuron;
8. Re-enable all neurons in F2 and go to step 2.

III. EXTENSIONS IN ADAPTIVE RESONANCE THEORY
The extensions of ART are discussed as follows:

i) Adaptive Resonance Theory 1 (ART1)
ART1 is an efficient algorithm that emulates the self-organizing [2] pattern recognition and hypothesis testing properties of the ART neural network architecture for horizontal and vertical classification of 0-9 digit recognition. The ART1 model can self-organize in real time, producing stable and clear recognition while receiving input patterns beyond those originally stored. It can also preserve its previously learned knowledge while keeping its ability to learn new input patterns, which can be saved in such a fashion that the stored patterns cannot be destroyed or forgotten. ART1 is suited to clustering binary (nonzero) input vectors and allows direct user control of the degree of similarity among patterns placed on a cluster unit. The learning process is designed such that patterns are not necessarily presented in a fixed order and the number of patterns for clustering may be unknown in advance. Updates for both the bottom-up and top-down weights are controlled by differential equations; however, this process may be finished within a learning trial. In other words, the weights reach equilibrium during each learning trial.

The ART1 algorithm:
Step 1: Initialize parameters L > 1 and 0 < p <= 1. Initialize weights 0 < bij(0) < L/(L - 1 + n), tji(0) = 1.
Step 2: While the stopping condition is false, perform steps 3-14.
Step 3: For each training input, do steps 4-13.
Step 4: Set the activations of all F2 units to zero. Set the activations of the F1(a) units to the input vector s.
Step 5: Compute the norm of s: ||s|| = sum(si).
Step 6: Send the input signal from the F1(a) layer to the F1(b) layer.
Step 7: For each F2 node that is not inhibited: if yj != -1, then yj = sum(bij xi).
Step 8: While reset is true, perform steps 9-12.
Step 9: Find J such that yJ >= yj for all nodes j. If yJ = -1, then all nodes are inhibited and this pattern cannot be clustered.
Step 10: Recompute the activation x of F1(b): xi = si tJi.
Step 11: Compute the norm of vector x: ||x|| = sum(xi).
Step 12: Test for reset: if ||x||/||s|| < p, then yJ = -1 (inhibit node J) and continue executing step 8 again. If ||x||/||s|| >= p, then proceed to step 13.
Step 13: Update the weights for node J: biJ(new) = L xi/(L - 1 + ||x||), tJi(new) = xi.
Step 14: Test for the stopping condition.

ii) Adaptive Resonance Theory 2 (ART2)
ART2 accepts continuous-valued vectors. ART2 networks [3] are both plastic and stable in that they can learn new data without erasing currently stored information; thus ART2 networks are suitable for continuous, incremental online training. The main advantage of ART networks is that they do not suffer from the stability-plasticity problem of supervised networks and thus are more suitable for continuous online learning of the classification task. The difference between ART2 and ART1 reflects the modifications needed to accommodate patterns with continuous-valued components. The architecture of an ART2 network is delineated in Fig. 4. In this particular configuration, the feature representation field (F1) consists of four loops. An input pattern is circulated in the lower two loops first. Inherent noise in the input pattern is suppressed [this is controlled by the parameters a and b and the feedback function f()] and prominent features in it are accentuated. Then the enhanced input pattern is passed to the upper two F1 loops and excites the neurons in the category representation field (F2) via the bottom-up weights. The established class neuron in F2 that receives the strongest stimulation fires. This neuron reads out a top-down expectation in the form of a set of top-down weights, sometimes referred to as class templates. This top-down expectation is compared against the enhanced input pattern by the vigilance mechanism. If the vigilance test is passed, the top-down and bottom-up weights are updated and, along with the enhanced input pattern, circulate repeatedly in the two upper F1 loops until stability is achieved. The time taken by the network to reach a stable state depends on how close the input pattern is to passing the vigilance test. If it passes the test comfortably, i.e. the input pattern is quite similar to the top-down expectation, stability is quick to achieve; otherwise, more iterations are required. After the top-down and bottom-up weights have been updated, the current firing neuron becomes an established class neuron. If the vigilance test fails, the current firing neuron is disabled. Another search within the remaining established class neurons in the F2 layer is conducted. If none of the established class neurons has a top-down expectation similar to the input pattern, an unoccupied F2 neuron is assigned to classify the input pattern. This procedure repeats itself until either all the patterns are classified or the memory capacity of F2 has been exhausted.

The ART2 algorithm:
Step 1: Initialize the parameters a, b, c, d, e, theta, p and the learning rate.
Step 2: Perform steps 3-13 up to the specified number of training epochs.

Step 3: For each input vector, do steps 4-12.
Step 4: Update the F1 unit activations: ui = 0, xi = si/(e + ||s||), wi = si, qi = 0, pi = 0, vi = f(xi). Update the F1 unit activations again: ui = vi/(e + ||v||), wi = si + a ui, pi = ui, xi = wi/(e + ||w||), qi = pi/(e + ||p||), vi = f(xi) + b f(qi).
Step 5: Compute the signals to the F2 units: yj = sum(bij pi).
Step 6: While reset is true, perform steps 7-8.
Step 7: Among the F2 units, choose the unit J with the largest signal yJ.
Step 8: Check for reset: ui = vi/(e + ||v||), pi = ui + d tJi, ri = (ui + c pi)/(e + ||u|| + c ||p||). If ||r|| < p - e, then yJ = -1 (inhibit J); since reset is true, go to step 6. If ||r|| >= p - e, then wi = si + a ui, vi = f(xi) + b f(qi); reset is false, so go to step 9.
Step 9: Perform steps 10-12 up to the specified number of learning iterations.
Step 10: Update the weights for the winning unit J: tJi(new) = d ui + {1 + d(d - 1)} tJi(old), bJi(new) = d ui + {1 + d(d - 1)} bJi(old).
Step 11: Update the F1 activations.
Step 12: Test the stopping condition for weight updates.
Step 13: Test the stopping condition for the number of epochs.

iii) Adaptive Resonance Theory 2-A (ART2-A)
The ART networks are designed to allow the user to control the degree of similarity of patterns placed in the same cluster. The resulting number of clusters then depends on the distance between all input patterns presented to the network during training. The ART2-A network can be characterized by its preprocessing, choice, match and adaptation stages, where choice and match define the search circuit for a fitting prototype [2]. The central functions of the ART2-A algorithm [2] are as follows.
Preprocessing
No negative input values are allowed, and all input vectors A are encoded to unit Euclidean length, denoted by the function symbol N, as:

(1)
Carpenter and Grossberg suggested an additional method of noise suppression and contrast enhancement: setting to zero all input values which do not exceed a certain bias theta, as defined by

(2)
This kind of contrast enhancement only makes sense if the characteristics of the input patterns that lead to a distribution over different clusters are coded in their highest values, with theta bounded by

Fig. 4: Architecture of an ART2 network

(3)
The upper limit leads to complete suppression of all patterns having the same constant value for all elements.
Choice
Bottom-up activities, leading to the choice of a prototype, are determined by


(4)
Bottom-up net activities are determined differently for previously committed and uncommitted prototypes. The choice parameter alpha >= 0 again defines the maximum depth of search for a fitting cluster. With alpha = 0, all committed prototypes are checked before an uncommitted prototype is chosen as winner. The simulations in this paper apply alpha = 0.
Match
Resonance and adaptation occur either if J is the index of an uncommitted prototype, or if J is a committed prototype and
(5)
Adaptation
Adaptation of the final winning prototype requires a shift towards the current input pattern,
(6)
ART2-A type networks always use the fast-commit, slow-recode mode. Therefore the learning rate is set to 1 if J is an uncommitted prototype and to lower values for further adaptation. Since match and choice do not evaluate the values of uncommitted prototypes, there is no need to initialize them with specific values.

iv) Fuzzy ART
Fuzzy ART is a neural network introduced by Carpenter, Grossberg and Rosen in 1991 [4]. It is a modified version of the binary [5] ART1, which is notably able to accept analog fuzzy input patterns, i.e. vectors whose components are real numbers between 0 and 1. Fuzzy ART is an unsupervised neural network capable of incremental learning, that is, it can learn continuously without forgetting what it has previously learned.
Fuzzy ART network:

Fig. 5: Sample Fuzzy ART network

A Fuzzy ART network is formed of two layers of neurons, the input layer F1 and the output layer F2, as illustrated in Fig. 5. Both layers have an activity pattern, schematized in the figure with vertical bars of varying height. The layers are fully interconnected, each neuron being connected to every neuron on the other layer. Every connection is weighted by a number lying between 0 and 1. A neuron of F2 represents one category formed by the network and is characterized by its weight vector wj (j is the index of the neuron). The weight vector's size is equal to the dimension M of layer F1. Initially all of the weight vectors' components are fixed to 1. Until the weights of a neuron are modified, we say that it is uncommitted; conversely, once a neuron's weights have been modified, this neuron is said to be committed. The network uses a form of normalization called complement coding. The operation consists of taking the input vector and concatenating it with its complement. The resulting vector is presented to layer F1. Therefore, the dimension M of layer F1 is double the input vector's dimension. Complement coding can be deactivated, in which case layer F1 has the same dimension as the input vector. Unless specified otherwise, we will always suppose that complement coding is active. Fuzzy ART learns by placing hyperboxes in the M/2-dimensional hyperspace, M being the size of layer F1. As said earlier, each neuron of layer F2 represents a category formed by the network. The position of the box in the space is encoded in the weight vector of the neuron.
The general structure of Fuzzy ART: A typical ART network includes three layers. The layers F0, F1 and F2 are the input, comparison and recognition layers, respectively. The input layer F0 gets the attributes of the peers which need to be classified. Each peer advertises its capability in the comparison layer F1 for competence comparison. The nodes in F0 and F1 are composed of the entities of the ontology. The corresponding nodes of layers F0 and F1 are connected together via one-to-one, non-modifiable links.

Nodes in the recognition layer F2 are candidates for the semantic clusters. There are two sets of distinct connections between the layers: bottom-up (F1 to F2) and top-down (F2 to F1). There is also a vigilance parameter which defines a tolerance for the comparison of vectors. F2 is a competitive layer, which means that only the node with the largest activation becomes active and the other nodes are inactive (in other words, each node in F2 corresponds to a category). Therefore, every node in F2 has its own unique top-down weight vector, also called the prototype vector (it is used to compare the input pattern to the prototypical pattern associated with the category for which the node in F2 stands).
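The Fuzzy ART operations described above (complement coding, the choice function, the vigilance test and fast learning) can be summarised in a short Python sketch. It follows the standard formulation of Carpenter, Grossberg and Rosen [4]; the parameter values and the class interface are our own illustrative choices.

```python
import numpy as np

def complement_code(a):
    """Complement coding: an input a in [0, 1]^d becomes I = (a, 1 - a)."""
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a])

class FuzzyART:
    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho, self.alpha, self.beta = rho, alpha, beta  # vigilance, choice, learning
        self.w = []                                         # one weight vector per category

    def present(self, a):
        i = complement_code(a)
        # Choice function T_j = |I ^ w_j| / (alpha + |w_j|), with ^ the element-wise minimum.
        scores = [np.minimum(i, w).sum() / (self.alpha + w.sum()) for w in self.w]
        for j in np.argsort(scores)[::-1]:
            if np.minimum(i, self.w[j]).sum() / i.sum() >= self.rho:   # vigilance test
                self.w[j] = self.beta * np.minimum(i, self.w[j]) + (1 - self.beta) * self.w[j]
                return j
        self.w.append(i.copy())                             # no resonance: commit new category
        return len(self.w) - 1

# Example: two similar inputs share a category, a dissimilar one gets its own.
net = FuzzyART(rho=0.8)
print(net.present([0.1, 0.9]), net.present([0.15, 0.85]), net.present([0.9, 0.1]))
```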

Fig. 6: The general structure of Fuzzy ART

V) ARTMAP
ARTMAP, also known as Predictive ART [5], combines two slightly modified ART-1 or ART-2 units into a supervised learning structure, where the first unit takes the input data and the second unit takes the correct output data, which is then used to make the minimum possible adjustment of the vigilance parameter in the first unit in order to make the correct classification. ARTMAP models combine two unsupervised modules to carry out supervised learning. Many variations of the basic supervised and unsupervised networks have since been adapted for technological applications and biological analyses. The modules ARTa and ARTb self-organize categories for the vector sets a and b. ARTa and ARTb are connected by an inter-ART module that consists of the Map Field and the control nodes called the Map Field gain control and the Map Field orienting subsystem. Inhibitory paths are denoted by a minus sign; other paths are excitatory. ARTa and ARTb are here connected by an inter-ART module that in many ways resembles ART1. This inter-ART module includes a Map Field that controls the learning of an associative map from ARTa recognition categories to ARTb recognition categories. This map does not directly associate exemplars a and b, but rather associates the compressed and symbolic representations of families of exemplars a and b. The Map Field also controls match tracking of the ARTa vigilance parameter. A mismatch at the Map Field between the ARTa category activated by an input a and the ARTb category activated by the input b increases ARTa vigilance by the minimum amount needed for the system to search for and, if necessary, learn a new ARTa category whose prediction matches the ARTb category. This inter-ART vigilance resetting signal is a form of "back propagation" of information, but one that differs from the back propagation that occurs in the Back Propagation network. For example, the search initiated by inter-ART reset can shift attention to a novel cluster of visual features that can be incorporated through learning into a new ARTa recognition category. This process is analogous to learning a category for "green bananas" based on taste feedback. However, these events do not "back propagate" taste features into the visual representation of the bananas, as can occur using the Back Propagation network. Rather, match tracking reorganizes the way in which visual features are grouped, attended, learned, and recognized for purposes of predicting an expected taste.

Fig7: Block diagram of an ARTMAP system.

vi) Fuzzy ARTMAP
Fuzzy ARTMAP, introduced by Carpenter et al. in 1992 [6], is a supervised network which is composed of two Fuzzy ART networks, identified as ARTa and ARTb. The parameters of these networks are designated respectively by the subscripts a and b. The two Fuzzy ART networks are interconnected by a series of connections between the F2 layers of ARTa and ARTb. The connections are weighted, i.e. a weight wij between 0 and 1 is associated with each one of them. These connections form what is called the map field Fab. The map field has two parameters (its vigilance and learning rate) and an output vector.

Fig. 8: Sample Fuzzy ARTMAP network

The input vector a of ARTa is put in complement-coded form, resulting in vector A. Complement coding is not necessary in ARTb, so we present the input vector B directly to this network. Fig. 8 presents the structure of the Fuzzy ARTMAP. The weights of the map field's connections are illustrated by a vertical bar with a height proportional to the size of the weight. The weights of the map field are all initialized to 1. ARTMAP is represented by many rectangles; in other words, it learns to distinguish the data by mapping boxes into the space, enclosing each category with a certain number of these boxes.
Classifying: Once the Fuzzy ARTMAP is trained [6], it can be used as a classifier. In this case, the ARTb network is not used. We present a single input vector to ARTa, which is propagated until resonance, with a (temporary) vigilance of 0. Thus, the first category selected by the choice function is accepted. If the map field used fast learning, the output vector contains only 0's except for a single 1; the index of this component is the number of the category in which the input vector A has been classified. The use of the map field is thus to associate a category number with each neuron of ARTa's F2 layer, i.e. with each box in the hyperspace. If fast learning was not used, one neuron of ARTa's F2 layer can correspond to many categories with different degrees. One way to determine the category number could be to select the index of the maximal component. If needed, the desired output vector can be restored from the weights of the neuron of ARTb's F2 layer whose index is the category number.

III. COMPARISON BETWEEN DIFFERENT EXTENSIONS
Among these models, ARTMAP and Fuzzy ARTMAP are supervised models, while ART1, ART2, ART2-A and Fuzzy ART are unsupervised models. ART1 is used to cluster binary (nonzero) input vectors, whereas ART2 accepts continuous-valued vectors. ART2-A is designed to control the degree of similarity of patterns. Fuzzy ART is a modified version of binary ART1. ARTMAP combines two slightly modified ART1 or ART2 units in a supervised learning structure. Fuzzy ARTMAP is a supervised network which is composed of two Fuzzy ART modules.

IV. CONCLUSION
ART is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. It describes a number of neural network models. ART1 is the simplest variety of ART networks, accepting only binary inputs; it is an unsupervised learning model specially designed for recognizing binary inputs. ART2 extends the network capabilities to support continuous inputs. The main advantage of ART2 networks is that they do not suffer from the stability-plasticity problem of supervised networks and thus are more suitable for continuous online learning of the classification task. ART2-A is the best classifier when the images are corrupted by additive random noise; ART2-A is much less time-consuming than the other neural networks and its adaptation is fast when a new sample is introduced. Fuzzy ART implements fuzzy logic into ART's pattern recognition, thus enhancing generalizability. An optional (and very useful) feature of Fuzzy ART is complement coding, a means of incorporating the absence of features into pattern classification, which goes a long way towards preventing inefficient and unnecessary category proliferation. Fuzzy ARTMAP is a powerful neural network model with many useful characteristics, including stability and online learning. Fuzzy ARTMAP gave better performance and fewer rules than other machine learning algorithms and newer models. The ARTMAP learning process is a lot faster. Moreover, Fuzzy ARTMAP is capable of incrementally stable learning.

REFERENCES:
[1] G. A. Carpenter and S. Grossberg, "Adaptive Resonance Theory," in The Handbook of Brain Theory and Neural Networks.
[2] T. Frank, K. F. Kraiss, and T. Kuhlen, "Comparative Analysis of Fuzzy ART and ART-2A Network Clustering Performance," IEEE Transactions on Neural Networks, Vol. 9, pp. 544-559.
[3] G. A. Carpenter and S. Grossberg, "Self-organization of stable category recognition codes for analog input patterns."
[4] G. A. Carpenter, S. Grossberg, and D. B. Rosen, "Fuzzy ART: Fast, stable learning and categorization of analog patterns by an adaptive resonance system," Neural Networks, 4:759-771, 1991.
[5] G. A. Carpenter and S. Grossberg, "A massively parallel architecture for a self-organizing neural pattern recognition machine," Computer Vision, Graphics, and Image Processing, 1987.
[6] G. A. Carpenter, S. Grossberg, N. Markuzon, J. H. Reynolds, and D. B. Rosen, "Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps," IEEE Transactions on Neural Networks, 3:698-713, 1992.

98

Is Wireless Network Purely Secure?

Abstract: Wireless technology has been gaining popularity rapidly for some years. Adoption of a standard depends on the ease of use and the level of security it provides. A comparison between wireless usage and security standards shows that security is not keeping up with the growth pace of end-user usage. Current wireless technologies in use allow hackers to monitor and even change the integrity of transmitted data. The lack of rigid security standards has caused companies to invest millions in securing their wireless networks. When discussing the security of wireless technologies, there are several possible perspectives: different authentication, access control and encryption technologies all fall under the umbrella of security. During the beginning of the commercialization of the Internet, organizations and individuals connected without concern for the security of their systems or networks. While current access points provide several security mechanisms, my work shows that all of these mechanisms are completely ineffective. As a result, organizations with deployed wireless networks are vulnerable to unauthorized use of Internet-enabled systems and access to their internal infrastructure.

1. Introduction

Organizations are rapidly deploying wireless infrastructures based on the IEEE 802.11 standard. Unfortunately, the 802.11 standard provides only limited support for confidentiality through the Wired Equivalent Privacy protocol, which contains significant flaws in its design. Furthermore, the standards committee for 802.11 left many of the difficult security issues, such as key management and a robust authentication mechanism, as open problems. As a result, many of the organizations deploying wireless networks use either a permanent fixed cryptographic variable (a key) or no encryption whatsoever. Over the last few years, organizations have expended considerable effort to protect their internal infrastructure from external compromise. As a result, organizations have channelled their external network traffic through distinct openings protected by firewalls. The idea is simple: by limiting external connections to a few well-protected openings, the organization can better protect itself. Unfortunately, the deployment of a wireless network opens a back door into the internal network that permits an attacker access beyond the physical security perimeter of the organization.

2. 802.11 Wireless Networks

802.11 wireless networks operate in one of two modes: ad-hoc or infrastructure mode. The IEEE standard defines the ad-hoc mode as Independent Basic Service Set (IBSS) and the infrastructure mode as Basic Service Set (BSS). In the remainder of this section, we explain the differences between the two modes and how they operate. In ad-hoc mode, each client communicates directly with the other clients within the network, as shown in figure 1.

Figure 1: Example ad-hoc network

Ad-hoc mode is designed such that only the clients within transmission range (within the same cell) of each other can communicate. If a client in an ad-hoc network wishes to communicate outside of the cell, a member of the cell must operate as a gateway and perform routing.

In infrastructure mode, each client sends all of its communications to a central station, or access point (AP). The access point acts as an Ethernet bridge and forwards the communications onto the appropriate network: either the wired network or the wireless network, see figure 2.

Figure 2: Example infrastructure network

Prior to communicating data, wireless clients and access points must establish a relationship, or association. Only after an association is established can the two wireless stations exchange data. In infrastructure mode, the clients associate with an access point. The association process is a two-step process involving three states: 1. unauthenticated and unassociated, 2. authenticated and unassociated, and 3. authenticated and associated. To transition between the states, the communicating parties exchange messages called management frames. We will now walk through a wireless client finding and associating with an access point. All access points transmit a beacon management frame at a fixed interval. To associate with an access point and join a BSS, a client listens for beacon messages to identify the access points within range. The client then selects the BSS to join in a vendor-independent manner. For instance, on the Apple Macintosh, all of the network names (or service set identifiers, SSIDs), which are usually contained in the beacon frame, are presented to the user so that they may select the network to join. A client may also send a probe request management frame to find an access point affiliated with a desired SSID. After identifying an access point, the client and the access point perform a mutual authentication by exchanging several management frames as part of the process. The two standardized authentication mechanisms are described in sections 4.1 and 4.2. After successful authentication, the client moves into the second state, authenticated and unassociated. Moving from the second state to the third and final state, authenticated and associated, involves the client sending an association request frame and the access point responding with an association response frame. After following this process, the client becomes a peer on the wireless network and can transmit data frames on the network.

3. Traditional Wireless Security

Wireless security can be broken into two parts: authentication and encryption. Authentication mechanisms can be used to identify a wireless client to an access point and vice versa, while encryption mechanisms ensure that it is not possible to intercept and decode data.

Authentication: Access points support MAC authentication of wireless clients, which means that only traffic from authorized MAC addresses will be allowed through the access point. The access point determines whether a particular MAC address is valid by checking it against either a RADIUS server external to the access point or a database within the non-volatile storage of the access point. For many years, MAC access control lists have been used for authentication.

Encryption: Much attention has been paid recently to the fact that the Wired Equivalent Privacy (WEP) encryption defined by 802.11 is not an industrial-strength encryption protocol. For many years, 802.11 WEP has been used for encryption.

4. 802.11 Standard Security Mechanisms

4.1 Open System Authentication
Open system authentication is the default authentication protocol for 802.11. As the name implies, open system authentication authenticates anyone who requests authentication. Essentially, it provides a NULL authentication process. Experimentation has shown that stations do perform a mutual authentication using this method when joining a network, and our experiments show that the authentication management frames are sent in the clear even when WEP is enabled.

4.2 Shared Key Authentication
Shared key authentication uses a standard challenge and response along with a shared secret key to provide authentication. The station wishing to authenticate, the initiator, sends an authentication request management frame indicating that they wish to use shared key authentication. The recipient of the authentication request, the responder, responds
by sending an authentication management frame containing 128 octets of challenge text to the initiator. The challenge text is generated by using the WEP pseudo-random number generator (PRNG) with the shared secret and a random initialization vector (IV). Once the initiator receives the management frame from the responder, they copy the contents of the challenge text into a new management frame body. This new management frame body is then encrypted with WEP using the shared secret along with a new IV selected by the initiator. The encrypted management frame is then sent to the responder. The responder decrypts the received frame and verifies that the 32-bit CRC integrity check value (ICV) is valid, and that the challenge text matches that sent in the first message. If they do, then authentication is successful. If the authentication is successful, then the initiator and the responder switch roles and repeat the process to ensure mutual authentication. The entire process is shown in figure 4, and the format of an authentication management frame is shown in figure 3.

Figure 3: Authentication management frame. The general management frame format contains the fields Frame Control, Duration, Destination Address, Source Address, BSSID, Sequence Number, Frame Body and FCS; the authentication frame body contains the fields Algorithm Number, Sequence Number, Status Code, Element ID, Length and Challenge Text.

The format shown is used for all authentication messages. The value of the status code field is set to zero when successful, and to an error value if unsuccessful. The element identifier indicates that the challenge text is included. The length field identifies the length of the challenge text and is fixed at 128. The challenge text includes the random challenge string. Table 1 shows the possible values and when the challenge text is included, based on the message sequence number.

Table 1: Message format based on sequence number

Sequence number   Status code   Challenge text   WEP used
1                 Reserved      Not present      No
2                 Status        Present          No
3                 Reserved      Present          Yes
4                 Status        Not present      No

4.3 Closed Network Access Control
Lucent has defined a proprietary access control mechanism called Closed Network. With this mechanism, a network manager can use either an open or a closed network. In an open network, anyone is permitted to join the network. In a closed network, only those clients with knowledge of the network name, or SSID, can join. In essence, the network name acts as a shared secret.

4.4 Access Control Lists
Another mechanism used by vendors (but not defined in the standard) to provide security is the use of access control lists based on the Ethernet MAC address of the client. Each access point can limit the clients of the network to those using a listed MAC address. If a client's MAC address is listed, then they are permitted access to the network; if the address is not listed, access to the network is prevented.

4.5 Wired Equivalent Privacy protocol
The Wired Equivalent Privacy (WEP) protocol was designed to provide confidentiality for network traffic using the wireless protocol. WEP aims to provide the security of a wired LAN by encrypting the traffic between the two sides of a data communication with the RC4 algorithm.

A. On the sender side:

WEP uses four operations to encrypt the data (plaintext). First, the secret key used in the WEP algorithm is 40 bits long, and a 24-bit Initialization Vector (IV) is concatenated to it to act as the encryption/decryption key. Second, the resulting key acts as the seed for a Pseudo-Random Number Generator (PRNG). Third, the plaintext is passed through an integrity algorithm and the resulting integrity check value (ICV) is concatenated to the plaintext. Fourth, the key sequence and the plaintext with its ICV are combined by the RC4 algorithm. The final encrypted message is made by attaching the IV in front of the ciphertext. Figure 5 shows the objects and the details of these operations.

Figure 5: WEP encryption algorithm (sender side)

B. On the recipient side: WEP uses five operations to decrypt the received data (IV + ciphertext). First, the pre-shared key and the IV are concatenated to make the secret key. Second, the ciphertext and the secret key go into the RC4 algorithm and a plaintext comes out as a result. Third, the ICV and the plaintext are separated. Fourth, the plaintext goes through the integrity algorithm to produce a new ICV (ICV'), and finally the new ICV (ICV') is compared with the original ICV. Figure 6 shows the objects and the details of these operations schematically.

Figure 6: WEP encryption algorithm (recipient side)
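
As a rough, self-contained illustration of the sender-side and recipient-side operations above, the following Python sketch builds a WEP-style frame body from a pure-Python RC4 and a CRC-32 integrity check value; the key and IV are hypothetical, and this is a simplified teaching model rather than an interoperable 802.11 implementation. The last line also shows how a known plaintext and its captured ciphertext expose the per-IV key stream, the weakness exploited later in section 5.3.

import zlib

def rc4(key, n):
    # RC4 key scheduling followed by n bytes of key stream
    S = list(range(256)); j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(secret_40bit, iv_24bit, plaintext):
    icv = zlib.crc32(plaintext).to_bytes(4, "little")        # integrity check value
    stream = rc4(iv_24bit + secret_40bit, len(plaintext) + 4)
    cipher = bytes(p ^ s for p, s in zip(plaintext + icv, stream))
    return iv_24bit + cipher                                  # IV is sent in the clear

key = b"\x01\x02\x03\x04\x05"     # hypothetical 40-bit shared secret
iv = b"\xaa\xbb\xcc"              # hypothetical 24-bit initialization vector
frame = wep_encrypt(key, iv, b"hello wireless world")

# Known plaintext XOR captured ciphertext reveals the RC4 key stream for this IV:
keystream = bytes(c ^ p for c, p in zip(frame[3:], b"hello wireless world"))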

4.6 WPA (Wi-Fi Protected Access)
WPA came with the purpose of solving the problems of the WEP cryptography method without users needing to change their hardware. The main improvement of WPA over WEP is that WPA allows more complex data encryption through the TKIP protocol (Temporal Key Integrity Protocol), assisted also by MIC (Message Integrity Check), whose function is to avoid the bit-flipping attacks easily applied to WEP, by using a hashing technique. Figure 7 shows the whole picture of the WPA process. TKIP uses the same RC4 technique as WEP, but performs a hash before invoking the RC4 algorithm. A duplicate of the initialization vector is made: one copy is sent to the next step, and the other is hashed (mixed) with the base key.

Figure 7: WPA encryption algorithm (TKIP)

After performing the hashing, the result generates the per-packet key, which is joined with the first copy of the initialization vector before being fed to the RC4 algorithm. After that, a key sequence is generated and XOR-ed with the text to be encrypted, producing the ciphertext. Finally, the message is ready to send. Decryption is performed by inverting the process.

5. Weaknesses in 802.11 Standard Security Mechanisms

This section describes the weaknesses in the access control mechanisms of currently deployed wireless network access points.

5.1 Weakness of Lucent's access control mechanism
In practice, security mechanisms based on a shared secret are robust provided the secrets are well protected in use and when distributed. Unfortunately, this is not the case with Lucent's access control mechanism. Several management messages contain the network name, or SSID, and these messages are broadcast in the clear by access points and clients. The actual message containing the SSID depends on the vendor of the access point. The end result, however, is that an attacker can easily sniff the network name, determining the shared secret and gaining access to the protected network. This flaw exists even with WEP enabled, because the management messages are broadcast in the clear.
5.2 Weakness of Ethernet MAC Address Access Control Lists
In theory, access control lists provide a reasonable level of security when a strong form of identity is used. Unfortunately, this is not the case with MAC addresses, for two reasons. First, MAC addresses are easily sniffed by an attacker, since they must appear in the clear even when WEP is enabled; second, almost all wireless cards permit the changing of their MAC address via software. As a result, an attacker can easily determine the MAC addresses permitted access via eavesdropping, and can then masquerade as a valid address by programming the desired address into the wireless card, bypassing the access control and gaining access to the protected network.

5.3 Weaknesses of WEP
In the WEP shared key mechanism, the access point sends the random challenge (plaintext, P) and the station responds with the encrypted random challenge (ciphertext, C). The attacker can capture both the random challenge (P) and the encrypted random challenge (C), and the IV is carried in plaintext in the packet. Because the attacker now knows the random challenge (plaintext, P), the encrypted challenge (ciphertext, C), and the public IV, the attacker can derive the pseudo-random stream produced by WEP with the shared key, K, and the public initialization vector, IV. Thus we can see:

C XOR P = WEP[K, IV, PR] (pseudo-random stream)

The attacker now has all of the elements needed to successfully authenticate to the target network without knowing the shared secret K. The attacker requests authentication from the access point it wishes to associate with and join. The access point responds with an authentication challenge in the clear. The attacker then takes the random challenge text, R, and the pseudo-random stream, WEP[K, IV, PR], and computes a valid authentication response frame body by XOR-ing the two values together. The attacker then computes a new Integrity Check Value (ICV), responds with a valid authentication response message, associates with the AP and joins the network.

WEP has further weaknesses. WEP does not prevent replay attacks: an attacker can simply record and replay packets as desired, and they will be accepted as legitimate. WEP uses RC4 improperly: the keys used are very weak and can be brute-forced on standard computers in hours to minutes using freely available software. WEP reuses initialization vectors, so a variety of available cryptanalytic methods can decrypt data without knowing the encryption key. WEP also allows an attacker to undetectably modify a message without knowing the encryption key.

5.4 Weaknesses of WPA
WPA Personal uses a Pre-Shared Key (PSK) to establish security using an 8 to 63 character passphrase. The PSK may also be entered as a 64-character hexadecimal string. WPA Personal is secure when used with good passphrases or a full 64-character hexadecimal key. Weak PSK passphrases can be broken using off-line dictionary attacks by capturing the messages in the four-way exchange when the client reconnects after being deauthenticated. This weakness is based on the pairwise master key (PMK), which is derived from the concatenation of the passphrase, the SSID, the length of the SSID and a nonce (a number or bit string used only once in each session). The resulting string is hashed 4,096 times to generate a 256-bit value, which is then combined with nonce values. The information required to generate and verify this key (per session) is broadcast with normal traffic and is readily obtainable; the challenge then becomes the reconstruction of the original values. The pairwise transient key (PTK) is a keyed HMAC function based on the PMK; by capturing the four-way authentication handshake, the attacker has the data required to subject the passphrase to a dictionary attack. Wireless suites such as aircrack-ng can crack a weak passphrase in less than a minute.
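
The pairwise master key derivation mentioned above can be illustrated as follows; WPA-Personal derives a 256-bit PMK from the passphrase and SSID using PBKDF2-HMAC-SHA1 with 4,096 iterations, which is exactly what an off-line dictionary attack recomputes for each candidate passphrase. The passphrase and SSID below are hypothetical.

import hashlib

def wpa_psk(passphrase, ssid):
    # PMK = PBKDF2-HMAC-SHA1(passphrase, SSID, 4096 iterations, 256 bits)
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)

# An off-line dictionary attack repeats this derivation for every candidate
# passphrase and checks the result against the captured four-way handshake.
pmk = wpa_psk("correct horse battery", "ExampleSSID")
print(pmk.hex())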

6. Conclusions and Future Work

My work demonstrates serious flaws in all of the security mechanisms used by the vast majority of access points supporting the IEEE 802.11 wireless standard. The end result is that all deployed 802.11 wireless networks are at risk of compromise, providing a network access point into internal networks beyond the physical security controls of the organization operating the network.

Unfortunately, fixing the problem is neither easy nor straightforward. The only good long-term solution is a major overhaul of the current standard, which may require replacement of current APs (although in some cases a firmware upgrade may be possible). Fortunately, the 802.11 standards body is currently working on significant improvements to the standard. However, it is too late for deployed networks and for those networks about to be deployed. A number of vendors are now releasing high-end access points claiming that they provide an increase in security. Unfortunately, few of the products we have examined provide enough information to determine the overall assurance that the product will provide, and worse, several of the products that do provide enough information use unauthenticated Diffie-Hellman, which suffers from a well-known man-in-the-middle attack. The use of unauthenticated Diffie-Hellman introduces a greater vulnerability to the organization's network. The increase in risk occurs because an attacker can insert themselves into the middle of the key exchange between the client and the access point, obtaining the session key K. This is significantly worse than the current situation, where the attacker must first determine the pseudo-random stream produced for a given key K and public IV, and then use the stream to forge packets.



An Innovative Digital Watermarking Process

Abstract: The seemingly ambiguous title of this paper, using the terms criticism and innovation in concord, signifies the imperative of every organisation's need for security within the competitive domain. Where organisational security criticism and innovativeness were traditionally considered antonymous, the assimilation of these two seemingly contradictory notions is fundamental to the assurance of long-term organisational prosperity. Organisations are required, now more than ever, to grow and be secure, with their innovation capability rendering consistent innovative outputs. This paper describes research conducted to consolidate the principles of digital watermarking and to identify the fundamental components that constitute organisational security capability. The process of conducting a critical analysis is presented here. A brief description is provided of the basic field of digital watermarking, followed by a description of the advantages and disadvantages that were evaluated in the process. The paper concludes with a summary of the analysis and potential findings for future research.

Keywords: Digital Watermarking, Intellectual property protection, Steganography

Fig 1. A digital watermarked picture

In visible digital watermarking, the information is visible in the picture or video. Typically, the information is text or a logo which identifies the owner of the media. The image on the right has a visible watermark. When a television broadcaster adds its logo to the corner of transmitted video, this also is a visible watermark.

Introduction
Digital watermarking is the process of embedding information into a digital signal in a way that is difficult to remove. In digital watermarking, the signal may be audio, pictures, or video. If the signal is copied, then the information also is carried in the copy. A signal may carry several different watermarks at the same time.

Fig 2. General digital watermarking process

In invisible digital watermarking, information is added as digital data to audio, picture, or video, but it cannot be perceived as such (although it may be
possible to detect that some amount of information is hidden in the signal). The watermark may be intended for widespread use and thus is made easy to retrieve, or it may be a form of steganography, where a party communicates a secret message embedded in the digital signal. In both cases, as in visible watermarking, the objective is to attach ownership or other descriptive information to the signal in a way that is difficult to remove. It also is possible to use hidden embedded information as a means of covert communication between individuals.

Applications
Digital watermarking may be used for a wide range of applications, such as: copyright protection; source tracking (different recipients get differently watermarked content); broadcast monitoring (television news often contains watermarked video from international agencies); and covert communication.

Digital watermarking life-cycle phases

Fig 3. Digital watermarking life-cycle phases, with embedding, attacking, and detection and retrieval functions

The information to be embedded in a signal is called a digital watermark, although in some contexts the phrase digital watermark means the difference between the watermarked signal and the cover signal. The signal where the watermark is to be embedded is called the host signal. A watermarking system is usually divided into three distinct steps: embedding, attack, and detection. In embedding, an algorithm accepts the host and the data to be embedded, and produces a watermarked signal. The watermarked digital signal is then transmitted or stored, usually transmitted to another person. If this person makes a modification, this is called an attack. While the modification may not be malicious, the term attack arises from copyright protection applications, where pirates attempt to remove the digital watermark through modification. There are many possible modifications, for example, lossy compression of the data (in which resolution is diminished), cropping an image or video, or intentionally adding noise. Detection (often called extraction) is an algorithm which is applied to the attacked signal to attempt to extract the watermark from it. If the signal was unmodified during transmission, then the watermark still is present and it may be extracted. In robust digital watermarking applications, the extraction algorithm should be able to produce the watermark correctly, even if the modifications were strong. In fragile digital watermarking, the extraction algorithm should fail if any change is made to the signal.

Classification:
A digital watermark is called robust with respect to transformations if the embedded information may be detected reliably from the marked signal, even if degraded by any number of transformations. A digital watermark is called robust if it resists a designated class of transformations. Robust watermarks may be used in copy protection applications to carry copy and no access control information. Typical image degradations are JPEG compression, rotation, cropping, additive noise, and quantization. For video content, temporal modifications and MPEG compression often are added to this list. A digital watermarking method is said to be of quantization type if the marked signal is obtained by quantization. Quantization watermarks suffer from low robustness, but have a high information capacity due to rejection of host interference. A digital watermark is called imperceptible if the watermarked content is perceptually equivalent to the original, unwatermarked content [1]. A digital watermark is called perceptible if its presence in the marked signal is noticeable, but non-intrusive. A digital watermarking method is referred to as spread-spectrum if the marked signal is obtained by

an additive modification. Spread-spectrum watermarks are known to be modestly robust, but also to have a low information capacity due to host interference. A digital watermarking method is referred to as amplitude modulation if the marked signal is embedded by additive modification, similar to the spread-spectrum method, but embedded particularly in the spatial domain. Reversible data hiding is a technique which enables images to be authenticated and then restored to their original form by removing the digital watermark and replacing the image data that had been overwritten. Digital watermarking for relational databases has emerged as a candidate solution to provide copyright protection, tamper detection, traitor tracing, and maintenance of the integrity of relational data.

Literature Review
The literature surveyed prior to the research process, and throughout the duration of this project, constituted hundreds of documents. From this large literature set, some documents were identified as core, directly addressing the subject of digital watermarking. These documents were sourced from many locations, including peer-reviewed journals, conference proceedings, white papers, electronic books, etc. The core documents were further subdivided into 9 groups. The topics were tabulated and were used to perform the critical analysis. The first step was a detailed manual analysis and interpretation of the documents thus extracted (supplementing the initial literature study), and the second step was a critical approach towards every document for analysis.

Critical analysis
The analysis was done on the basis of the topics indexed, and the present work, the future scope, the methodology used and the conclusion were tabulated for concluding the report on the analysis of digital watermarking, considering the potential advantages and disadvantages of this technology.

Advantages:
Content verification: Invisible digital watermarks allow the recipient to verify the author's identity. This might be very important with certain scientific visualizations, where a maliciously altered image could lead to costly mistakes. For example, an oil company might check for an invisible watermark in a map of oil deposits to ensure that the information is trustworthy. Watermarks provide a secure electronic signature.
Determine rightful ownership: Scientific visualizations are not just graphs of data; they are often artistic creations. It is therefore entirely appropriate to copyright these images. If an author is damaged by unauthorized use of such an image, the author is first obligated to prove rightful ownership. Invisible digital watermarking provides another method of proving ownership (in addition to posting a copyright notice and registering the image).
Track unlawful use: This technology might allow an author to track how his or her images are being used. Automated software would scan randomly selected images on the Internet (or any digital network) and flag those images which contain the author's watermark. This covert surveillance of network traffic would detect copyright violations, thereby reducing piracy.
Avoid malicious removal: The problem with copyright notices is that they are easily removed by pirates. However, an invisible digital watermark is well hidden and therefore very difficult to remove. Hence, it could foil a pirate's attack.

Disadvantages
Degrade image quality: Even an invisible watermark will slightly alter the image during embedding.
Therefore, they may not be appropriate for images which contain raw data from an experiment. For example, embedding an invisible watermark in a medical scan might alter the image enough to lead to false diagnosis. May lead to unlawful ownership claims for images not yet watermarked: While invisible digital watermarks are intended to reduce piracy, their widespread acceptance as a means of legal proof of ownership may actually have the opposite effect. This is because a pirate could embed their watermark in older images not yet containing watermarks and make a malicious claim of ownership. Such claims might be difficult to challenge. No standard system in place: While many watermarking techniques have been proposed, none of them have become the standard method. Furthermore, none of these schemes have yet been tested by a trial case in the courts. Therefore, they do not yet offer any real copyright protection. May become obsolete: This technology only works if the watermarks cannot be extracted from an image. However, technological advances might allow future pirates to remove the watermarks of today. It is very difficult to ensure that a cryptographic method will remain secure for all time.
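
To make the embedding and detection steps discussed above concrete, the following Python sketch adds a pseudo-random spread-spectrum watermark by additive modification and detects it by correlating against the same key-seeded sequence; the signal, key, strength and threshold are hypothetical illustration values, and this is a generic sketch rather than any of the schemes reviewed in the table below.

import numpy as np

def embed(host, key, strength=0.1):
    # Additive spread-spectrum embedding: w is a key-seeded +/-1 sequence
    rng = np.random.default_rng(key)
    w = rng.choice([-1.0, 1.0], size=host.shape)
    return host + strength * w

def detect(signal, key, threshold=0.05):
    # Correlate the (possibly attacked) signal with the same sequence
    rng = np.random.default_rng(key)
    w = rng.choice([-1.0, 1.0], size=signal.shape)
    return float(np.mean(signal * w)) > threshold

host = np.random.default_rng(0).normal(size=4096)   # stand-in for image/audio samples
marked = embed(host, key=1234)
attacked = marked + np.random.default_rng(1).normal(scale=0.05, size=host.shape)  # noise attack
print(detect(attacked, key=1234), detect(attacked, key=999))   # True for the right key only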

Table 1: Critical analysis of the literature reviewed

1. Watermarking for 3D Polygons using Multiresolution Wavelet Decomposition
Present work: The proposed watermarking method is based on the wavelet transform (WT) and the multiresolution representation (MRR) of the polygonal model.
Future work: In future the watermarking should be made more robust to possible geometric operations, noise imposition and intentional attack. The embedding capacity should also be increased and the processing time decreased. A method to extract the watermark without the original polygon should be proposed. It must be expanded to the free-form surface model or solid model, which has to be more secret than the polygonal model in the CAD/CAM area.
Method: First the requirements and features of the proposed watermarking method are discussed. Second the mathematical formulations of WT and MRR of the polygonal model are shown. Third the algorithm for embedding and extracting the watermark is proposed.
Conclusion: The effectiveness of the proposed watermarking method is shown through several simulation results.

2. A Practical Method for Watermarking Java Programs
Present work: A practical method that discourages program theft by embedding Java programs with a digital watermark. Embedding a program developer's copyright notation as a watermark in Java class files will ensure the legal ownership of class files.
Future work: In the future, in order to make watermarks more tamper-resistant, we are to apply error correcting codes to our watermarking method.
Method: The embedding method is indiscernible by program users, yet enables us to identify an illegal program that contains stolen class files.
Conclusion: The result of the experiment to evaluate our method showed that most of the watermarks (20 out of 23) embedded in class files survived two kinds of attacks that attempt to erase watermarks: an obfuscator attack and a decompile-recompile attack.

3. A Digital Audio Watermark Embedding Algorithm
Present work: Proposed an audio digital watermarking algorithm based on the wavelet transform and the complex cepstrum transform (CCT), combined with a human auditory model and using the masking effect of human ears.
Future work: This algorithm does not compromise the robustness and inaudibility of the watermark.
Method: The algorithm embeds a binary image watermark into the audio signal and improves the imperceptibility of the watermark.
Conclusion: Experimental results show that this algorithm has better robustness against common signal processing such as noise, filtering, resampling and lossy compression.

4. Soft IP Protection: Watermarking HDL Codes
Present work: Leverages the unique features of Verilog HDL design to develop watermarking techniques. These techniques can protect both new and existing Verilog designs.
Future work: We are currently collecting and building more Verilog and VHDL circuits to test our approach. We are also planning to develop CAD tools for HDL protection.
Method: We watermark SCU-RTL and ISCAS benchmark Verilog circuits, as well as an MP3 decoder. Both original and watermarked designs are implemented on ASICs and FPGAs.
Conclusion: The results show that the proposed techniques survive the commercial synthesis tools and cause little design overhead in terms of area/resources, delay and power.

5. Secure Spread Spectrum Watermarking for Multimedia
Present work: This paper presents a secure (tamper-resistant) algorithm for watermarking images, and a methodology for digital watermarking that may be generalized to audio, video, and multimedia data.
Future work: The experiments presented are preliminary and should be expanded in order to validate the results. We are conducting ongoing work in this area. Further, the degree of precision of the registration procedures used in undoing affine transforms must be characterized precisely across a large test set of images.
Method: The use of Gaussian noise ensures strong resilience to multiple-document, or collusional, attacks.
Conclusion: Experimental results are provided to support these claims, along with an exposition of pending open problems.

6. Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise
Present work: A communications perspective on digital watermarking is used to compute upper performance limits on blind digital watermarking for simple AWGN attacks and for attacks by amplitude scaling and additive white noise.
Future work: An important result is that the practical ST-SCS watermarking scheme achieves at least 40% of the capacity of ICS, which can still be improved by further research.
Method: We show that this case can be translated into effective AWGN attacks, which enables a straightforward capacity analysis based on the previously obtained watermark capacities for AWGN attacks. Watermark capacity for different theoretical and practical blind watermarking schemes is analyzed.
Conclusion: Analysis shows that the practical ST-SCS watermarking achieves at least 40% of the capacity of an ideal blind watermarking scheme.

7. Digital watermark mobile agent
Present work: A digital watermark agent travels from host to host on a network and acts like a detective that detects watermarks and collects evidence of any misuse. Furthermore, we developed an active watermark method which allows the watermarked documents themselves to report their own usage to an authority if detected.
Future work: The second component that we are developing is a data-mining and data-fusion module to intelligently select the next migration hosts based on multiple sources of information such as related business categories and results of web search engines.
Method: This system enables an agency to dispatch digital watermark agents to agent servers, and an agent can perform various tasks on a server. Once all the actions have been taken, a report is sent to the agency's database and the agent can continue to travel to another agent server.
Conclusion: Development of an active watermark method which allows the watermarked documents themselves to report their own usage to an authority if detected.

8. Analysis of Watermarking Techniques for Graph Coloring Problem
Present work: A theoretical framework to evaluate watermarking techniques for intellectual property protection (IPP). Based on this framework, we analyze two watermarking techniques for the graph coloring (GC) problem, since credibility and overhead are the most important criteria for any efficient watermarking technique.
Future work: __
Method: Formulae are derived that illustrate the tradeoff between credibility and overhead.
Conclusion: Asymptotically we prove that arbitrarily high credibility can be achieved with at most 1-color overhead for both proposed watermarking techniques.

9. Practical capacity of digital watermark as constrained by reliability
Present work: A theoretical analysis of watermark capacity. A simplified watermark scheme is postulated in which detection yields a multidimensional vector, with each dimension assumed to be i.i.d. (independent and identically distributed) and to follow a Gaussian distribution.
Future work: Some more experiments can be performed.
Method: Reliability is represented by three kinds of error rates: the false positive error rate, the false negative error rate, and the bit error rate.
Conclusion: Experiments were performed to verify the theoretical analysis, and it was shown that this approach yields a good estimate of the capacity of a watermark.
Conclusion
This paper concludes with a discussion on the relevance and applicability of innovative digital watermarking to digital piracy and potential further research. The first point pertains to the requirement and the need of digital watermarking as an answer to digital piracy. The economic impact of digital piracy on the media industry is a credible threat to the sustainment of the industry. The advantages of digital watermarking are the following: content verification, determining rightful ownership, tracking unlawful use, and avoiding malicious removal. Its disadvantages are that it may degrade image quality, may lead to unlawful ownership claims for images not yet watermarked, has no standard system in place, and may become obsolete.

References
[1] A Practical Method for Watermarking Java Programs, The 24th Computer Software and Applications Conference (COMPSAC 2000), Taipei, Taiwan, Oct. 2000.
[2] A Digital Audio Watermark Embedding Algorithm, School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China.
[3] Yue Sun, Hong Sun, and Tian-ren Yao, Digital audio watermarking algorithm based on quantization in wavelet domain, Journal of Huazhong University of Science and Technology, 2002, Vol. 30(5), pp. 12-15.
[4] Hong-yi Zhao, Chang-nian Zhang, Digital signal processing and realization in MATLAB, Publishing company of chemical industry, Beijing, 2001, pp. 129-131.
[5] Secure and Robust Digital Watermarking on Grey Level Images, SERC Journals IJAST, Vol. 11, 1.
[6] Secure Spread Spectrum Watermarking for Multimedia, IEEE Transactions on Image Processing, Vol. 6, No. 12, December 1997.
[7] Chaelynne M. Wolak, Digital Watermarking, DISS 780 Assignment Twelve, School of Computer and Information Sciences, Nova Southeastern University, July 2000.
[8] Hebah H.O. Nasereddin, Digital watermarking: a technology overview, Middle East University, P.O. Box 144378, Code 11814, Amman, Jordan, IJRRAS 6 (1), January 2011, www.arpapress.com/Volumes/Vol6Issue1/IJRRAS_6_1_10.pdf
[9] Joachim J. Eggers, Bernd Girod, Robert Bäuml, Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise, 4th Intl. ITG Conference on Source and Channel Coding, Berlin, Jan. 28-30, 2002.
[10] Jian Zhao and Chenghui Luo, Digital Watermark Mobile Agents, Fraunhofer Center for Research in Computer Graphics, Inc., 321 South Main Street, Providence.
[11] Wayne Pafko, Digital Watermarks in Scientific Visualization, Term Paper (Final Version), SciC 8011, Paul Morin, May 8, 2000.

Design of a Reconfigurable SDR Transceiver using LabVIEW

Abstract: Software defined radio has assumed a lot of significance in the recent past. It is by now an established fact that most future radios will be implemented using software, which offers the advantage of on-the-fly repair, maintenance and modification. Software defined radio is also one of the best available approaches to the global roaming problem currently faced in the field of information technology: unless we overcome the global roaming problem, we shall not be able to access any information anytime, anywhere. Because different software modules can be switched on depending upon the availability of different types of radio transmissions, software defined radio offers portability, miniaturization, easy software updates and global roaming. This paper presents the design and implementation of a QAM, QPSK and reconfigurable SDR based radio transmitter and receiver. Both the transmitter and receiver are implemented using LabVIEW, a graphical language. The radio transceiver was tested on a PC by simulating a software-defined transmission channel. Audio signals and music were first recorded using a sound capture VI (Virtual Instrument); these were then modulated, transmitted, received and demodulated at the receiver. We were able to receive an exact replica of the transmitted audio signals. Both transmitter and receiver were implemented on the same PC. Work is in progress where one PC will act as the transmitter and another PC will act as the receiver.

Keywords: Software Defined Radio, Virtual Instruments, MODEM, LabVIEW, QAM Modem, QPSK

INTRODUCTION
One of the revolutionary technologies of the 21st century is Software Defined Radio, which is attempting to solve the seamless/global roaming problem and to provide all-frequency radio transceivers. Of late, a need has been felt, especially in the defense sector, to design a reconfigurable radio that can switch over to any desired frequency.

Software defined radio (SDR) is defined as a radio in which some or all of the physical layer functions, such as modulation or coding schemes, are software defined. SDR further refers to a technology wherein software modules running on generic hardware platforms consisting of Digital Signal Processors (DSPs), general purpose microprocessors, microcontrollers and personal computers are used to implement radio functions such as generation of the modulation modules in the transmitter, and demodulation, tuning and amplification in the receiver. The SDR architecture evolved in three phases. The phase-1 architecture implements channel coding, source coding and control functionality in software on DSP/microcontroller programmable logic. This architecture is already in use for today's digital phones and allows a measure of new service introduction onto the phone in the field; in effect it allows reconfiguration at the very top of the protocol stack, the applications layer. In the second phase, the baseband modem functionality is implemented in software. This step allows the realization of new and adaptive modulation schemes under either self-adaptive or download control. A further extension, phase three, shown in figure 1, involves a major change to the overall architecture to implement the intermediate frequency (IF) signal processing digitally in software, and will allow a single terminal to adapt to multiple radio interface standards by software reconfigurability. It is clear that the processing power required to implement a phase-3 handset exceeds that available from generic low-power DSPs in the near future by a large margin. The main interests of the scientific community in SDR are due to its following advantages: reconfigurability; ubiquitous connectivity; interoperability; over-the-air upload of software modules to subscribers; faster deployment of new services; and quick remote diagnostics and defect rectification. Due to these advantages, DRDO India has also launched ambitious projects for replacing almost all hardware based radios by their SDR counterparts. In a communiqué dated 18th Nov. 2010, DRDO targeted the deployment of SDR in the Army, Air Force and Navy by the year 2013. A generic architecture of an SDR is given below.

Figure 1. Architecture of SDR (wideband RF front end, A/D and D/A conversion, digital IF processing, baseband modem processing, bit-stream and data interfaces, speech codec, mux/demux, man-machine interface, control, and software processing)

METHODOLOGY
LabVIEW (Laboratory Virtual Instrument Engineering Workbench) simulation is used in order to design the SDR transceiver. LabVIEW is a graphical programming language from National Instruments which allows systems to be designed in an intuitive block-based manner in shorter times than the commonly used text-based programming languages. LabVIEW programs are called virtual instruments, or VIs, because their appearance and operation imitate physical instruments such as multimeters and oscilloscopes. LabVIEW is commonly used for data acquisition, instrument control, and industrial automation. Simply put, a Virtual Instrument (VI) is a LabVIEW programming element. A VI consists of two windows, the front panel and the block diagram, and an icon that represents the program. The code is in one window and the user interface (inputs and outputs) appears in a separate window, as shown in figure 2.

Figure 2. Parts of a VI

Front Panel: This is the user interface where the program is controlled and executed. Any information that is needed during the simulation can be found in the controls and indicators on the Front Panel.
Block Diagram: This is the internal circuit where the program code is written.
Icon/Connector: The icon, which is a visual representation of the VI, has connectors for the program inputs and outputs.

DIGITAL MODULATION
Digital modulation is used in many communications systems. The move to digital modulation provides more information capacity, compatibility with digital data services, higher data security, better quality communications and quicker system availability. Examples of digital modulation include QPSK (Quadrature Phase Shift Keying), FSK (Frequency Shift Keying), MSK (Minimum Shift Keying) and QAM (Quadrature Amplitude Modulation).

DESIGN STEPS AND RESULTS
The design steps for implementing the SDR are as follows: to design, implement and simulate VIs for capturing and reproducing audio signals in the LabVIEW environment; to design, implement and simulate a software-based QAM MODEM; to design, implement and simulate a software-based QPSK MODEM; and to implement the reconfigurable SDR transceiver on a single PC using a simulated channel between the transmitter and receiver.

Figure 3. Design steps (Voice Capture VI → Modulator VI → Demodulator VI → Voice Regeneration VI)

Figure 4. Voice Capture VI

The VI shown in figure 4 is used to capture sound information in LabVIEW. This VI inputs the sound into LabVIEW without the use of the I/O drivers that are otherwise necessary to provide external inputs to the LabVIEW environment. The sound input can be provided using a microphone. The response of this SDR transmitter was analyzed by giving different types of audio inputs, such as speech and music, from the microphone. The following observations and results were obtained after giving these inputs to the system.

Figure 5. Sound Input Graph

Voice Regenerator VI
In a similar manner, the voice regeneration VI can be designed using the sub-VIs for sound output to regenerate the original sound information. The regenerated sound can be heard using the speaker system. This VI again eliminates the need for the external I/O devices otherwise required to send the LabVIEW signal to the external environment.

QAM MODEM
Quadrature amplitude modulation (QAM) is a modulation technique which exploits the amplitude and phase information of the carrier signal. The VI used to modulate the carrier using the captured sound is shown in figure 6. The QAM modulation scheme is used because of some of its merits over other modulation schemes. The received modulated information is demodulated using a QAM demodulator to recover the original information. Demodulation is the process of extracting the original information from the modulated signal; here the original sound information is extracted from the modulated sound signal by using the QAM demodulator. The QAM demodulated signal s(t) is obtained by performing the multiplication between the complex

carrier consisting of a cosine and a sine waveform and the real part of the modulated signal. At the receiver, these two modulated signals can be demodulated using a coherent demodulator. Such a receiver multiplies the received signal separately with both a cosine and sine signal to produce the received estimates of I (t) and Q (t) respectively. Because of the orthogonal property of the carrier signals, it is possible to detect the modulating signals independently.
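
The coherent demodulation just described can be illustrated numerically; the sketch below (Python/NumPy) is only a conceptual stand-in for the LabVIEW QAM MODEM VI, with hypothetical carrier frequency, sample rate and message signals.

import numpy as np

fs, fc = 100_000.0, 10_000.0                 # hypothetical sample rate and carrier
t = np.arange(0, 0.01, 1 / fs)
i_msg = np.cos(2 * np.pi * 300 * t)          # hypothetical in-phase message I(t)
q_msg = np.sin(2 * np.pi * 500 * t)          # hypothetical quadrature message Q(t)

# QAM modulation: the amplitude and phase of the carrier carry I and Q together
s = i_msg * np.cos(2 * np.pi * fc * t) - q_msg * np.sin(2 * np.pi * fc * t)

# Coherent demodulation: multiply by cosine and sine, then low-pass filter
def lowpass(x, taps=201, cutoff=2_000.0):
    h = np.sinc(2 * cutoff / fs * (np.arange(taps) - (taps - 1) / 2))
    h *= np.hamming(taps); h /= h.sum()
    return np.convolve(x, h, mode="same")

i_hat = 2 * lowpass(s * np.cos(2 * np.pi * fc * t))    # recovered estimate of I(t)
q_hat = -2 * lowpass(s * np.sin(2 * np.pi * fc * t))   # recovered estimate of Q(t)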

Figure 6. QAM MODEM

Figure 7. Output signal at the QAM MODEM

QPSK MODEM VI
The VI used to modulate the carrier using the captured sound is shown in figure 8.

Figure 8. QPSK MODEM

Figure 9. Output signal at the QPSK MODEM

After designing the QPSK and QAM modems separately in LabVIEW, the next step is to implement the reconfigurability feature in our design. For this we have used a case structure which works on two cases, a true case and a false case. In the true case the QAM modem is used, whereas in the false case the QPSK modem is used. The selection between the two is based upon their frequency ranges: if the input signal frequency lies between 5 MHz and 42 MHz then the QPSK modem is used; otherwise, for higher frequency ranges, the QAM modem is used. Other modulation schemes can also be implemented by selecting their frequency ranges. The figures above show the output at the QAM and QPSK modems, and figures 10 and 11 show the front panel of our complete reconfigurable SDR transceiver design.
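
The true/false case-structure logic described above amounts to a simple frequency-based selection, sketched below; the 5-42 MHz boundary follows the text, while the function name and labels are hypothetical placeholders for the QPSK and QAM modem VIs.

def select_modem(frequency_hz):
    # Mirrors the case structure: QPSK for 5-42 MHz, QAM for higher frequencies
    if 5e6 <= frequency_hz <= 42e6:
        return "QPSK"
    return "QAM"

for f in (10e6, 60e6):
    print(f / 1e6, "MHz ->", select_modem(f))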

Figure 10. Front panel of the reconfigurable SDR transceiver

Figure 11. Front panel of the reconfigurable SDR transceiver

CONCLUSION
This paper presented the design and implementation of a reconfigurable software-defined radio transceiver using LabVIEW. Using this design, our audio signal was modulated by two different modulation methods at different carrier frequencies, and the design was found to work efficiently. When the modulated signal was demodulated at the receiver, it was observed that the recovered audio signal was an exact replica of the transmitted signal. However, for the transmission of the modulated signal the RF front end of the SDR is still implemented by analog circuitry; this is the main limitation of our work, as ideal SDRs differ from practical SDRs in that all radio components, including the RF front end, would be implemented in software. Because of technology limitations, however, ideal SDRs are not yet achievable.


A Modified Zero Knowledge Identification Scheme using ECC

Abstract: In this paper we present a Fiat-Shamir-like zero-knowledge identification scheme based on elliptic curve cryptography. As we know, in an open network computing environment a workstation cannot be trusted to identify its users correctly to network services. Zero-knowledge (ZK) protocols are designed to address these concerns by allowing a prover to demonstrate knowledge of a secret while revealing no information that could be used by the verifier to convey the demonstration of knowledge to others. The reason that ECC has been chosen is that it provides a methodology for obtaining higher-speed implementations of authentication protocols and encryption/decryption techniques while using fewer bits for the keys. This means that ECC systems require smaller chip size and less power consumption.

Key Words: Identification, Security, Zero-Knowledge, Elliptic Curve.

I. INTRODUCTION

Communication between the computer and a remote user is currently one of the most vulnerable aspects of a computer system. In order to secure it, a cryptographic system must be built into the user terminal, and suitable protocols must be developed to allow the computer and the user to recognize each other upon initial contact and to maintain continued assurance of the secrecy of messages. In particular, zero-knowledge proofs (ZKP) can be used whenever there is a need to prove possession of critical data without a real need to exchange the data itself. Examples of such applications include credit card verification, digital cash systems, digital watermarking, and authentication. Most of the messaging systems in use rely on secret sharing to provide identification. Unfortunately, once you tell a secret it is no longer a secret; this is how identity theft and credit card fraud happen. Authentication and key exchange protocols have been proposed and implemented to limit the amount of information shared in order to provide positive identification. Several of these techniques have some weaknesses and are particularly susceptible to man-in-the-middle, off-line and impersonation attacks [1]. Zero-knowledge proof techniques are powerful tools in such critical applications for providing both security and privacy at the same time.

II. ZERO KNOWLEDGE SCHEME

A zero knowledge interactive proof system allows one person to convince another person of some fact without revealing the information about the proof. In particular, it does not enable the verifier to later convince anyone else that the prover has a proof of the theorem or even merely that the theorem is true [2]. A zero-knowledge proof is a two-party protocol between a prover and a verifier, which allows the prover to convince the verifier that he knows a secret value that satisfies a given relation (zero-knowledge property). Zero-knowledge protocols are instances of an interactive proof system, where prover and verifier exchange messages (typically depending on random events). 1. Security: An impostor can comply with the protocol only with overwhelmingly small probability. 2. Completeness: An interactive proof is complete if the protocol succeeds (for a honest proofer and a honest verifier) with overwhelming probability p > 1/2. (Typically, p ~ 1). 3. Soundness: An interactive proof is sound if there is an algorithm M with the following properties: i M is polynomial time. ii If a dishonest prover can with nonnegligible probability successfully execute

117
the protocol with the verifier, then M can be used to extract knowledge from this prover which with overwhelming probability allows successful subsequent protocol executions. (In effect, if someone can fake the scheme, then so can everyone observing the protocol e.g. by computing the secret of the true prover). 4. Zero-Knowledge (ZK) Property: There exists a simulator (an algorithm) that can simulate (upon input of the assertion to be proven, but without interacting with the real prover) an execution of the protocol that for an outside observer cannot be distinguished from an execution of the protocol with the real prover. The concept of zero-knowledge, first introduced by Goldwasser, Micali [4] and Rackoff is one approach to the design of such protocols. Particularly, in Feige, Fiat, and Shamir show an elegant method for using an interactive Zero-Knowledge proof to prove identity in [2] a cryptographic protocol. Fiat-Shamir Zero-Knowledge identification scheme is based on discrete logarithmic. In this paper, we modify Fiat-Shamir Zero-Knowledge identification scheme using Elliptic Curve Cryptography.
III. FIAT-SHAMIR PROTOCOL

The Fiat-Shamir protocol is based on the difficulty of calculating a square root modulo a large composite. The claimant proves knowledge of a square root modulo a large modulus n. Verification can be done in four steps, as shown in Fig. 1.

Fig. 1 Fiat-Shamir user identification process

Fiat-Shamir User Identification Process: The process of user identification can be understood as follows.
Key Generation Process:
i. The trusted centre chooses two large prime numbers p and q.
ii. The trusted centre then calculates n = p*q and publishes n as the modulus.
iii. Each potential claimant (prover) selects a secret number s which should be coprime to n.
iv. Each potential claimant (prover) calculates v = s^2 mod n as its public key and publishes it.
Verifying Process: The following steps are performed to identify the authenticated user.
i. The prover chooses a random number r and sends x = r^2 mod n (the witness x) to the verifier.
ii. The verifier randomly selects a single bit c = 0 or c = 1, and sends c to the prover.
iii. The prover computes the response y = r*s^c mod n and sends it to the verifier.
iv. The verifier rejects the proof if y = 0 and accepts if y^2 = x*v^c mod n.
Informally, the challenge (or exam) c selects between two answers: the secret r (to keep the claimant honest) or one that can only be known from s. If a false claimant were to know that the challenge is c = 1, then he could provide an arbitrary number a and send the witness a^2/v; upon receiving c = 1, he sends y = a, and then y^2 = (a^2/v)*v. If the false claimant were to know that the challenge is c = 0, then he could select an arbitrary number a and send the witness a^2. This property allows us to simulate runs of the protocol that an outside observer cannot distinguish from real runs (where the challenges c are truly random).
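To make the round structure concrete, the following is a minimal Python sketch of one round of the classical Fiat-Shamir identification described above; the toy primes, the secret s and the round count are illustrative values only, not parameters taken from the paper.

```python
# One round of classical Fiat-Shamir identification (toy parameters).
import math
import random

# Trusted centre: n = p*q is public, p and q stay secret.
p, q = 1009, 1013
n = p * q

# Prover's key pair: secret s coprime to n, public v = s^2 mod n.
s = 123457
assert math.gcd(s, n) == 1
v = (s * s) % n

def fiat_shamir_round():
    # Commitment: prover picks random r and sends the witness x = r^2 mod n.
    r = random.randrange(1, n)
    x = (r * r) % n
    # Challenge: verifier sends a random bit c.
    c = random.randint(0, 1)
    # Response: y = r * s^c mod n.
    y = (r * pow(s, c, n)) % n
    # Verification: reject y == 0, accept iff y^2 == x * v^c (mod n).
    return y != 0 and (y * y) % n == (x * pow(v, c, n)) % n

# Repeating the round t times drops a cheater's success chance to about 2^-t.
print(all(fiat_shamir_round() for _ in range(20)))
```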

IV. ELLIPTIC CURVE CRYPTOGRAPHY (ECC)

Elliptic Curve Cryptography (ECC) is a form of public key cryptography. In public key cryptography, each user or device taking part in the communication generally has a pair of keys, a public key and a private key, and a set of operations associated with the keys for performing the cryptographic operations. Only the particular user knows the private key, whereas the public key is distributed to all users taking part in the communication. Some public key algorithms may require a set of predefined constants to be known by all devices taking part in the communication; the domain parameters in ECC are an example of such constants.

Public key cryptography, unlike private key cryptography, does not require any shared secret between the communicating parties, but it is much slower than private key cryptography. The mathematical operations of ECC are defined over the elliptic curve y^2 = x^3 + ax + b, where (4a^3 + 27b^2) mod p ≠ 0. Each value of a and b gives a different elliptic curve. All points (x, y) which satisfy the above equation, plus a point at infinity, lie on the elliptic curve. The public key is a point on the curve and the private key is a random number. The public key is obtained by multiplying the private key with the generator point G on the curve. The generator point G and the curve parameters a and b, together with a few more constants, constitute the domain parameters of ECC. One main advantage of ECC is its small key size: a 160-bit key in ECC is considered to be as secure as a 1024-bit key in RSA.
The elliptic curve addition operation differs from ordinary addition. Assuming that P = (x1, y1) and Q = (x2, y2) are two points on the elliptic curve, the sum P + Q = (x3, y3) is obtained through the following rules:
x3 = (λ^2 - x1 - x2) mod p --- [1]
y3 = {λ(x1 - x3) - y1} mod p --- [2]
where
λ = (y2 - y1) / (x2 - x1) for P ≠ Q
λ = (3x1^2 + a) / (2y1) for P = Q
The dominant operation in ECC cryptographic schemes is point multiplication. Point multiplication is simply calculating kP, as shown in Fig. 2, where k is an integer and P is a point on the elliptic curve defined over the prime field.

Fig. 2 Point Multiplication

All reported methods for computing kP parse the scalar k and, depending on the bit value, perform either an ECC-ADD or an ECC-DOUBLE operation. In fact, ECC is no longer new, and has withstood in recent years a great deal of cryptanalysis and a long series of attacks, which makes it appear a mature and robust cryptosystem at present. ECC has a number of advantages over other public-key cryptosystems, such as RSA, which make it an attractive alternative. In particular, for a given level of security, the size of the cryptographic keys and operands involved in the computation of EC cryptosystems is normally much shorter than in other cryptosystems and, as the computational power available for cryptanalysis grows, this difference becomes more and more noticeable.
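A minimal sketch of the point addition rules [1]-[2] and the double-and-add computation of kP follows, using a toy curve; the small prime p = 97 and the point G = (3, 6) are illustrative choices, not values from the paper.

```python
# Point addition and double-and-add scalar multiplication on
# y^2 = x^3 + a*x + b over GF(p). The point at infinity is written as None.
p, a, b = 97, 2, 3          # toy curve; real ECC uses ~160-bit and larger primes
G = (3, 6)                  # 6^2 = 36 = 3^3 + 2*3 + 3 (mod 97), so G lies on the curve

def ec_add(P, Q):
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                        # P + (-P) = point at infinity
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope (doubling)
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mult(k, P):
    # Double-and-add: scan the bits of k, doubling at each step and adding P
    # whenever the current bit is 1 (the ECC-ADD / ECC-DOUBLE pattern).
    R = None
    while k > 0:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

print(ec_add(G, G))        # doubling G gives (80, 10) on this toy curve
print(scalar_mult(20, G))  # kP computed with double-and-add
```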

V. MODIFIED FIAT-SHAMIR PROTOCOL

We modify the Fiat-Shamir zero-knowledge identification scheme using Elliptic Curve Cryptography, as shown in Fig. 3.

Fig. 3 Fiat-Shamir scheme using ECC

Modified Fiat-Shamir User Identification Process: The process of user identification can be understood as follows.
Key Generation Process:
i) The third party chooses the values of a and p for the elliptic curve Ep(a, b).
ii) The value of b is selected by the claimant such that the equation satisfies the condition (4a^3 + 27b^2) mod p ≠ 0.
iii) The values of a and p are announced publicly, whereas b remains secret to the claimant.
iv) The claimant chooses a secret point s on the curve and calculates v = 2s mod p. The claimant keeps s as its private key and registers v as its public key with the third party.
Verifying Process: The following steps are performed to identify the authenticated user.
i) Alice, the claimant, chooses a random point r on the curve. She then calculates x = 2r mod p, called the witness, and sends x to Bob.
ii) Bob, the verifier, sends the challenge c to Alice. The value of c is a prime number lying between 1 and p-1.
iii) Alice calculates the response y = (r + c·s) mod p. Note that r is the random point selected by Alice in the first step, s is the secret point, and c is the challenge sent by Bob; she sends the response y to Bob.
iv) Bob calculates (x + c·v) mod p and 2y mod p. If these two values are congruent, then Alice knows the value of s and she is an authenticated person (she is honest). If they are not congruent, she is not an authenticated person and the verifier can reject her request.
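The following is a minimal sketch of one round of the modified scheme just described, again over a toy curve; the base point, the secret multiplier, and the way the "random point" r is generated (here as a random multiple of a base point, which the paper leaves unspecified) are assumptions made only for illustration.

```python
# One round of the ECC-based identification: v = 2s, witness x = 2r,
# response y = r + c*s, verifier checks 2y == x + c*v.
import random

p, a, b = 97, 2, 3
G = (3, 6)                                   # base point on y^2 = x^3 + 2x + 3 (mod 97)

def ec_add(P, Q):
    if P is None: return Q
    if Q is None: return P
    x1, y1 = P; x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0: return None
    lam = ((3*x1*x1 + a) * pow(2*y1, -1, p) if P == Q
           else (y2 - y1) * pow(x2 - x1, -1, p)) % p
    x3 = (lam*lam - x1 - x2) % p
    return (x3, (lam*(x1 - x3) - y1) % p)

def smul(k, P):
    # Double-and-add scalar multiplication.
    R = None
    while k:
        if k & 1: R = ec_add(R, P)
        P = ec_add(P, P); k >>= 1
    return R

# Claimant's keys: secret point s, public point v = 2s.
s = smul(17, G)
v = smul(2, s)

# One identification round.
r = smul(random.randrange(1, p), G)          # random commitment point (assumed construction)
x = smul(2, r)                               # witness sent to the verifier
c = random.randrange(1, p)                   # verifier's challenge
y = ec_add(r, smul(c, s))                    # prover's response y = r + c*s
print(smul(2, y) == ec_add(x, smul(c, v)))   # verifier accepts iff 2y = x + c*v
```
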
VI. RESULT

The security of the system is directly tied to the relative hardness of the underlying mathematical problem. We can easily prove that 2y is the same as x + c·v in modulo-p arithmetic, as shown below:
2y = 2(r + c·s) = 2r + 2c·s = x + c·v --- [3]
The challenge (or exam) c, selected between 1 and p-1, forces the claimant to combine the secret r (which keeps the claimant honest) with a value that can only be known from s. If a false claimant were to know the challenge c, then he could provide an arbitrary point m and send a correspondingly adjusted witness. Since b is chosen by the claimant and generates the points on the equation of the elliptic curve Ep(a, b), no other person can guess on which equation the points are generated and which point is randomly selected by the claimant. If a false claimant sends m as the witness, it will definitely not match the final verification, as only the claimant knows the values of r and s, and the public key depends on the value of s. The absence of a sub-exponential time algorithm for the scheme means that significantly smaller parameters can be used in ECC than with DSA or RSA. This will have a significant impact on a communication system, as the relative computational performance advantage of ECC versus RSA is indicated not by the key sizes but by the cube of the key sizes. The difference becomes even more dramatic as the greater increase in RSA key sizes leads to an even greater increase in computational cost.

VII. CONCLUSIONS AND FUTURE WORK

A unique feature of the new identification scheme is that it is based on Elliptic Curve Cryptography (ECC). In [8], the authors conclude that the Elliptic Curve Discrete Logarithm Problem is significantly more difficult than the Integer Factorization Problem; for instance, it was found that to achieve reasonable security, RSA should employ a 1024-bit modulus, while a 160-bit modulus is sufficient for ECC. Our identification scheme is also faster than the Fiat-Shamir scheme [5] and Guillou-Quisquater [7], because our scheme depends on point addition operations while those schemes depend on modular exponentiation. In future work, the dominant proof techniques that have emerged in security proofs, among which are probabilistic polynomial-time reducibility between problems, simulation proofs, the hybrid method, and random self-reducibility, can be introduced and a comparative performance study can be carried out.

REFERENCES
[1] Ali M. Allam, Ibrahim I. Ibrahim, Ihab A. Ali, Abd El-rahman H. Elsawy, "Efficient Zero-Knowledge Identification Scheme with Secret Key Exchange", IEEE, 2004.
[2] Ali M. Allam, Ibrahim I. Ibrahim, Ihab A. Ali, Abdel Rahman H. Elsawy, "The Performance of an Efficient Zero-Knowledge Identification Scheme", IEEE, 2004.
[3] Sultan Almuhammadi, Nien T. Sui, and Dennis McLeod, "Better Privacy and Security in E-Commerce: Using Elliptic Curve-Based Zero-Knowledge Proofs", IEEE, 2004.
[4] S. Goldwasser, S. Micali, and C. Rackoff, "The knowledge complexity of interactive proof systems", SIAM J. Comput., 18(1), pp. 186-208, February 1989.
[5] U. Feige, A. Fiat, and A. Shamir, "Zero knowledge proofs of identity", Journal of Cryptology, 1(2), pp. 77-94, 1988.
[6] Chengming Qi, Beijing Union University, "A Zero-Knowledge Proof of Digital Signature Scheme Based on the Elliptic Curve Cryptosystem", 2009 Third International Symposium on Intelligent Information Technology Application.
[7] L. Guillou and J. Quisquater, "A 'Paradoxical' Identity-Based Signature Scheme Resulting from Zero-Knowledge", Proc. CRYPTO '88.
[8] W. Stallings, Cryptography and Network Security, 3rd edition, Prentice Hall, 2003.
[9] Behrouz A. Forouzan, Cryptography and Network Security, TMH.


Security and Privacy of Conserving Data in Information Technology

Abstract- In this paper we present a security infrastructure design to ensure safety in the electronic government system: a combination of well-known security solutions, including Public Key Infrastructure, Shibboleth, smart cards and the Lightweight Directory Access Protocol. In this environment we give an overview of privacy preservation and security for data mining processes. The original target of supplying services through the internet has evolved into a focus on the impact of e-Government programmes in delivering better and more efficient services to citizens in an inclusive society, with emphasis on the quality of the services provided and the extent to which online services meet user needs.

Keywords- Security, Integration, Single Sign On, Privacy, Cryptography.

INTRODUCTION

Member countries of the European Union are speeding towards the digitalization of government services, with countries currently offering a surplus of interactive services which are increasing in availability and sophistication. International attempts to develop integrated, customer-oriented administrative services represent efforts to alleviate the problems of bureaucracy and improve the provision of administrative services. Since the launch of the European strategy for the development of e-Government, with the e-Europe 2002 initiative presented in March 2000 at the Lisbon European Council, a change of focus has occurred. The original target of supplying services through the internet has evolved into a focus on the impact of e-Government programmes in delivering better and more efficient services to citizens in an inclusive society, with emphasis on the quality of the services provided and the extent to which online services meet user needs. Identified as a major aspect is safe access to services European Union-wide, by establishing secure systems for the mutual recognition of national electronic identities for public administration websites and services (European Commission, 2006). The necessity of an interoperable and scalable security and identity infrastructure has been identified by all implicated parties, focusing on the effectiveness of the solutions provided.

SECURITY AND ELECTRONIC GOVERNMENT

Electronic government services are being rapidly deployed throughout Europe. Security is the main concern in this process, creating the need for an interoperable secure infrastructure that will meet all current and future needs. It is a necessity that such an infrastructure provides a horizontal level of service for the entire system and is accessible by all applications and sub-systems in the network. Delivering electronic services will largely depend upon the trust and confidence of citizens. To this end, means have to be developed to achieve the same quality and trustworthiness of public services as provided in the traditional way. At the level of systems design, some fundamental security requirements have to be met:
- Identification of the sender of a digital message.
- Authenticity of a message and its verification.
- Non-repudiation of a message or a data-processing act.
- Avoiding risks related to availability and reliability.
- Confidentiality of the existence and content of a message.
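As a small illustration of the identification, authenticity and non-repudiation requirements listed above, the sketch below signs and verifies a message with the Python `cryptography` package; the Ed25519 key type is chosen only for brevity and is not an algorithm mandated by the infrastructure described in this paper, which relies on X.509 certificates issued by certification authorities.

```python
# Illustrative digital signature: the sender signs, the receiver verifies.
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

private_key = ed25519.Ed25519PrivateKey.generate()   # held only by the sender
public_key = private_key.public_key()                # published / certified by a CA

message = b"tax declaration, citizen 12345"          # hypothetical message content
signature = private_key.sign(message)                # sender authenticates the message

try:
    public_key.verify(signature, message)            # receiver checks authenticity
    print("signature valid: sender identified, message unaltered")
except InvalidSignature:
    print("signature invalid: reject the message")
```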

The best solution makes use of coexisting and complementary technologies which ensure safety throughout all interactions. Such a system provides assurances of interoperability by using widely recognized standards and open source software. This evolutionary infrastructure design is based on a collaboration of existing cutting-edge technologies in a unique manner: public key infrastructure, single sign-on techniques and LDAP collaborate effectively, guaranteeing efficient and secure communications and access to resources.

Fig. 1 Security and E-Government

A Public Key Infrastructure (PKI), based on asymmetric keys and digital certificates, is the fundamental architecture enabling the use of public key cryptography in order to achieve strong authentication of the involved entities and secure communication. PKIs have reached a stage of relative maturity due to the extensive research that has occurred in the area over the past two decades, becoming the necessary trust infrastructure for every e-business (e-commerce, e-banking, e-cryptography). The main purpose of PKI is to bind a public key to an entity. The binding is performed by a certification authority (CA), which plays the role of a trusted third party. The user identity must be unique for each CA. The CA digitally signs a data structure which contains the name of the entity and the corresponding public key, besides other data. Such a pervasive security infrastructure has many and varied benefits, such as cost savings, interoperability (inter- and intra-enterprise) and the consistency of a uniform solution.

Fig. 2 Public Key Encryption (PKI)

A PKI smart card is a hardware-based cryptographic device for securely generating and storing private and public keys and digital certificates and for performing cryptographic operations. Implementing digital signatures in combination with advanced cryptographic smart cards minimizes user-side complexity while maintaining reliability and security (only an identity in possession of a smart card, a smart card reader and the Personal Identification Number (PIN) can use the smart card). Smart cards provide the means for performing secure communications with minimal human intervention. In addition, smart cards are suitable for electronic identification schemes as they are engineered to be tamper proof.
The Lightweight Directory Access Protocol, or LDAP, is the Internet-standard way of accessing directory services that conform to the X.500 data model. LDAP has become the predominant protocol in support of PKIs accessing directory services for certificates and certificate revocation lists (CRLs) and is often used by other (web) services for authentication. A directory is a set of objects with similar attributes organized in a logical and hierarchical manner. An LDAP directory tree often reflects various political, geographic, and/or organizational boundaries, depending on the model chosen. LDAP deployments today tend to use Domain Name System (DNS) names for structuring the topmost levels of the hierarchy. The directory contains entries representing people, organizational units, printers, documents, groups of people or anything else which corresponds to a given tree entry (or multiple entries).
Single Sign-On (SSO) is a method of access control that enables a user to authenticate once and gain access to the resources of multiple independent software systems. Shibboleth is standards-based, open source middleware which provides web Single Sign-On (SSO) across or within organizational boundaries. It allows sites to make informed authorization decisions for individual access to protected online resources in a privacy-preserving manner.

Shibboleth is based on the Security Assertion Markup Language (SAML), with a focus on federating research and educational communities. Key concepts within Shibboleth include:
- Federated Administration: The origin campus (home to the browser user) provides attribute assertions about that user to the target site. A trust fabric exists between campuses, allowing each site to identify the other speaker and assign a trust level. Origin sites are responsible for authenticating their users, but can use any reliable means to do this.
- Access Control Based on Attributes: Access control decisions are made using those assertions. The collection of assertions might include identity, but many situations will not require this (e.g. accessing a resource licensed for use by all active members of the campus community, or accessing a resource available to students in a particular course).
- Active Management of Privacy: The origin site (and the browser user) controls what information is released to the target. A typical default is merely "member of community". Individuals can manage attribute release via a web-based user interface. Users are no longer at the mercy of the target's privacy policy.
The collaboration of the independent technologies presented previously leads to an evolutionary horizontal infrastructure. Introducing federations in e-government, in association with PKI and LDAP technology, will lead to efficient trust relationships between the involved entities. A federation is a group of legal entities that share a set of agreed policies and rules for access to online resources. These policies enable the members to establish trust and a shared understanding of language or terminology. A federation provides a structure and a legal framework that enables authentication and authorization across different organizations. In general, the underlying trust relationships of the federation are based on Public Key Infrastructure (PKI), and certificates enable mutual authentication between the involved entities. This is performed using the SSL/TLS protocol and XML digital signatures using keys contained in X.509 certificates obtained from e-school Certification Authorities. An opaque client certificate can contain information about the user's home institution and, optionally, the user's pseudonymous identity. Shibboleth technology relies on a third party to provide the information about a user, named attributes. Attributes refer to the characteristics of a user and not to the user directly: a set of attributes about a user, rather than a name, is what is actually needed to give the user access to a resource. In the hypothesized architecture, this is performed by the LDAP repository, which is also responsible for the association of user attributes. Additionally, LDAP contains a list of all valid certificates and revoked certificates. Digital signatures are used to secure all information in transit between the various subsystems. This infrastructure leverages a system of certificate distribution and a mechanism for associating these certificates with known origin and target sites at each participating server. User-side complexity is guaranteed to be minimal without any cutbacks on the overall security and reliability.
The model presented in this paper offers the advantages of each single technology used and deals with their deficiencies through their combined implementation:
- A hybrid hierarchical PKI infrastructure delegates trust to subordinate CAs, permitting the creation of trust meshes, under a central CA, between independent organizations.
- Interoperability is simply addressed.
- PKI supports single sign-on with the use of Shibboleth, and Shibboleth coordinates with PKI to develop enhanced, complexity-free authorization and authentication processes.
- The user becomes part of the designed system using Single Sign-On (SSO) technology, which simplifies access to multiple resources with only one access procedure. In practice this enhances the security of the whole infrastructure, among other evident technical benefits, because a sufficient level of usability is assured. Providing a security infrastructure is not enough; the user must also be able to make use of the security features. Otherwise, the designed service will fail, because user behaviour is often the weakest link in a security chain.
The combination of the above-mentioned techniques creates strong trust relationships between users and e-Government services by implementing a zero-knowledge procedure for very strong authorization.

Zero-knowledge is an interactive method for one entity to prove possession of a secret without actually revealing it, resulting eventually in not revealing anything about the entity's personal information. The combined techniques mitigate the problem of memorizing many passwords and reduce the vulnerability of using the same password to access many web services.

Fig. 3 Single Sign-On (SSO)

It is essential to distinguish the authentication process from the authorization process. During the authentication process a user is required to navigate to his home site and authenticate himself. During this phase information is exchanged between the user and his home site only, with all information on the wire being encrypted. After the successful authentication of a user, permission to access resources is either granted or rejected according to the user's attributes/credentials. The process in which the user exchanges his attributes with the resource server is the authorization process, during which no personal information is leaked; it can only be performed after successful authentication. User authentication is performed only once, when the user identifies himself inside the trust mesh. Once authenticated inside the trust mesh, users are not required to re-authenticate themselves. When a user navigates to a resource store inside the trust mesh, the authorization process is executed. During this process the service provider requires the user's Identity Provider to present the user's access credentials. The Identity Provider, after successfully identifying the user and checking whether he has previously been authenticated, retrieves the user's credentials for the required resource. If the user has not previously been authenticated, the authentication process is initialized.
The Shibboleth Identity Provider contains four primary components: the Attribute Authority (AA), the Handle Service (HS), attribute sources, and the local sign-on system (SSO). Shibboleth interacts with the LDAP infrastructure to retrieve user credentials. From the Identity Provider's point of view, the first contact will be the redirection of a user to the Handle Service, which will then consult the SSO system to determine whether the user has already been authenticated. If not, the browser user will be asked to authenticate, and then sent back to the SP URL with a handle bundled in an attribute assertion. Next, a request from the Service Provider's Attribute Requester (AR) will arrive at the AA, which will include the previously mentioned handle. The AA then consults the ARPs for the directory entry corresponding to the handle, queries the directory for these attributes, and releases to the AR all attributes the requesting application is entitled to know about that user.
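A minimal sketch of the kind of directory lookup the Attribute Authority performs is shown below; it uses the `ldap3` Python package, and the server address, bind DN, base DN, handle value and attribute names are hypothetical placeholders rather than values from the described deployment.

```python
# Resolve a pseudonymous handle to a directory entry and fetch only the
# attributes the requesting service provider is entitled to receive.
from ldap3 import Server, Connection, ALL

server = Server("ldap.example.org", get_info=ALL)           # hypothetical directory host
conn = Connection(server, user="cn=aa,dc=example,dc=org",   # hypothetical service account
                  password="secret", auto_bind=True)

conn.search(search_base="ou=people,dc=example,dc=org",
            search_filter="(uid=handle-42)",                # hypothetical handle value
            attributes=["eduPersonAffiliation", "mail"])    # attributes permitted by the ARP

for entry in conn.entries:
    print(entry.entry_dn, entry.entry_attributes_as_dict)

conn.unbind()
```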

PRIVACY PRESERVING DATA MINING

In large intra-organizational environments, data are usually shared among a number of distributed databases, for security or practicality reasons, or due to the organizational structure of the business. Data can be partitioned either horizontally, where each database contains a subset of complete transactions, or vertically, where each database contains shares of each transaction. The role of a data warehouse is to collect and transform the dispersed data into an acceptable format before they are forwarded to the Data Mining (DM) subsystem. Such a central repository raises privacy concerns, especially if it is used in an inter-organizational setting where several entities, mutually untrusted, may desire to mine their private inputs both securely and accurately. Alternatively, data mining can be performed locally, at each database (or intranet), and then the sub-results combined to extract knowledge, although this will most likely affect the quality of the output. If a general discussion were to be made about protecting privacy in distributed databases, we would point to the literature on access control and audit policies, authorization and information flow control (e.g., multilevel and multilateral security strategies), security in the application layer (e.g., database views), and operating systems security, among others. However, in this paper we assume that appropriate security and access control exist in the intra-organizational setting, and we mainly focus on the inter-organizational setting, where a set of mutually untrusted entities wish to execute a miner on their private databases. As an alternative layer of protection, original data can be suitably altered or anonymized before being given as input to a miner, or queries in statistical databases may be perturbed. The problem with data perturbation is that, in highly distributed environments, preventing the inference of unauthorized information by combining authorized information is not an easy problem. Furthermore, in most perturbation techniques there lies a trade-off between protecting the privacy of the individual records and at the same time establishing accuracy of the DM results.
At a high abstraction level, the problem of privacy preserving data mining between mutually untrusted parties can be reduced to the following problem for a two-party protocol: each party owns some private data, and both parties wish to execute a function F on the union of their data without sacrificing the privacy of their inputs. In a DM environment, for example, the function F could be a classification function that outputs the class of a set of transactions with specific attributes, a function that identifies association rules in partitioned databases, or a function that outputs aggregate results over the union of two statistical databases. In the above distributed computing scenario, an ideal protocol would require a trusted third party who would accept both inputs and announce the output. However, the goal of cryptography is to relax or even remove the need for trusted parties. Contrary to other strategies, cryptographic mechanisms usually do not pose dilemmas between the privacy of the inputs and the accuracy of the output.
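To illustrate the flavour of computing a function F over the union of private inputs without a trusted third party, here is a toy additive secret-sharing sketch in Python for a simple aggregate (a sum); it is an illustration of the idea only, not one of the SMC protocols surveyed in the literature cited here.

```python
# Each party splits its private value into additive shares modulo a large
# prime, so no single recipient learns any individual input, yet the sum of
# all shares reconstructs F(inputs) = sum(inputs) exactly.
import random

PRIME = 2**61 - 1                      # modulus for the additive shares

def share(value, n_parties):
    """Split value into n_parties additive shares mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Three mutually untrusted parties with private inputs (e.g. local counts).
private_inputs = [42, 17, 99]
n = len(private_inputs)

# Party i sends share j of its input to party j.
all_shares = [share(v, n) for v in private_inputs]

# Each party adds up the shares it received; none of these partial sums
# reveals another party's input on its own.
partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]

# Publishing only the partial sums reconstructs the aggregate result.
print(sum(partial_sums) % PRIME)       # 158, with no input revealed directly
```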

Fig. 4 Kinds of SMC Solutions

In the academic literature on privacy preserving data mining, following the line of work that began with Yao, most theoretical results are based on the Secure Multiparty Computation (SMC) approach. SMC protocols are interactive protocols, run in a distributed network by a set of entities with private inputs, who wish to compute a function of their inputs in a privacy-preserving manner. We believe that research on privacy preserving DM could borrow knowledge from the vast body of literature on secure e-auction and e-voting systems. These systems are not strictly related to data mining, but they exemplify some of the difficulties of the multiparty case (this has been pointed out in earlier work, but only for e-auctions, while we extend it to include e-voting systems as well). Such systems also tend to balance well the efficiency and security criteria, in order to be implementable in medium to large scale environments. Furthermore, such systems fall within our distributed computing scenario and have similar architecture and security requirements, at least at our abstraction level. In a sealed-bid e-auction, for example, the function F, represented by an auctioneer, receives several encrypted bids and declares the winning bid. In a secure auction, there is a need to protect the privacy of the losing bidders, while establishing accuracy of the auction outcome and verifiability for all participants. Similarly, in an Internet election, the function, represented by an election authority, receives several encrypted votes and declares the winning candidate. Here the goal is to protect the privacy of the voters (i.e., unlinkability between the identity of the voter and the vote that has been cast), while also establishing eligibility of the voters and verifiability of the election result. During the last decade, a number of cryptographic schemes for conducting online e-auctions and e-elections have been proposed in the literature. Research has shown that it is possible to provide both privacy and accuracy assurances in a distributed computing scenario, where all participants may be mutually untrusted, without the presence of an unconditionally trusted third party.

CONCLUSIONS

Internationally, numerous government services are becoming available online every day. As separate efforts at addressing electronic government are implemented globally, the need for an interoperable horizontal security infrastructure is stressed. The effective security infrastructure design presented in this paper is a solution which makes use of coexisting and complementary open source technologies and standards. It provides secure and effective communication supported by ease of use for the end user. Scalability and interoperability are advantages of this design, making it suitable to meet the needs of electronic government. In this environment we also studied the context of DM security; furthermore, further research is needed to choose and then adapt specific cryptographic techniques to the DM environment, taking into account the kind of databases to work with, the kind of knowledge to be mined, as well as the specific DM technique to be used.

REFERENCES
[1] Rakesh Agrawal, Ramakrishnan Srikant: Privacy-Preserving Data Mining. SIGMOD Conference 2000: 439-450.
[2] Josep Domingo-Ferrer, Antoni Martínez-Ballesté, Francesc Sebé: MICROCAST: Smart Card Based (Micro)Pay-per-View for Multicast Services. CARDIS 2002: 125-134.
[3] Pho Duc Giang, Le Xuan Hung, Sungyoung Lee, Young-Koo Lee, Heejo Lee: A Flexible Trust-Based Access Control Mechanism for Security and Privacy Enhancement in Ubiquitous Systems. MUE 2007: 698-703.
[4] Murat Kantarcioglu, Chris Clifton: Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data. DMKD 2002.
[5] Yehuda Lindell, Benny Pinkas: Privacy Preserving Data Mining. J. Cryptology 15(3): 177-206 (2002).
[6] M. Naor and B. Pinkas: Computationally Secure Oblivious Transfer. Advances in Cryptology: Proceedings of Crypto 1999.
[7] Benny Pinkas: Cryptographic Techniques for Privacy-Preserving Data Mining. SIGKDD Explorations 4(2): 12-19 (2002).
[8] Roland Traunmüller: Electronic Government, Second International Conference, EGOV 2003, Prague, Czech Republic, September 1-5, 2003, Proceedings. Springer 2003.


Barriers to Entrepreneurship - An analysis of Management students

Abstract - Management education has been at the vanguard of higher education in India. With a booming economy and ever-increasing job opportunities, the MBA has become the most preferred postgraduate degree in the country. The only area of concern is that management institutes are creating more job seekers than job creators. Entrepreneurship is still not considered a serious career option by most management graduates, owing to the availability of jobs and hefty pay packages in some industries. The present study observes that most management students at the Sant Longowal Institute of Engineering & Technology come from a service parental background, do not prefer to take risks, and prefer jobs in MNCs or the corporate sector.

Introduction: Third world countries still face socio-economic problems such as unemployment, poverty, inflation and low productivity. In India, despite sixty-two years of development, about 30 percent of the total population, i.e. around 300 million people, are still living below the poverty line. In order to improve their living standards, they need to be productively employed. There is a growing worldwide appreciation of the fact that micro enterprises can be a vibrant option for employment generation, and our institutions of higher learning can create entrepreneurs (job creators) who may in turn create jobs for others. Entrepreneurship has been acknowledged as one of the essential dynamic factors determining the socio-economic growth of any country, because it increases employment, efficiency, productivity, GNP and the standard of living. The significance of institutions of higher education is augmented manifold when they are able to produce not only skilful and employable human resources but also help to develop among their students the attitude to opt for entrepreneurship as a career choice, because institutions of the education system are distinctive places of knowledge transfer and innovation which help nurture entrepreneurial activities.
Some experts envisage that India and China will rule the world in the 21st century. For over a century the United States has been the largest economy in the world, but major developments have since taken place in the world economy, leading to a shift of focus from the US and the rich countries of Europe to the two Asian giants, India and China. In recent years the rich countries of Europe have seen the greatest decline in global GDP share, by 4.9 percentage points, followed by the US and Japan with a decline of about 1 percentage point each. Within Asia, the rising share of China and India has more than made up for the declining global share of Japan since 1990. During the seventies and the eighties the ASEAN countries, and during the eighties South Korea, along with China and India, contributed to the rising share of Asia in world GDP.

On the other side, according to some experts, the share of the US in world GDP is expected to fall (from 21 per cent to 18 per cent) and that of India to rise (from 6 per cent to 11 per cent by 2025), and hence the latter will emerge as the third pole in the global economy after the US and China. By 2025 the Indian economy is projected to be about 60 per cent of the size of the US economy. The transformation into a tri-polar economy will be complete by 2035, with the Indian economy only a little smaller than the US economy but larger than that of Western Europe. By 2035, India is likely to be a larger growth driver than the six largest countries in the EU, though its impact will be a little over half that of the US. India, which is now the fourth largest economy in terms of purchasing power parity, will overtake Japan and become the third major economic power within 10 years. There is therefore a paramount need to promote an entrepreneurial attitude among the masses; as industry expands at a faster rate and the agricultural labour force has to be integrated, a world-class infrastructure is required for the attainable growth of all sectors of the economy. Foreign Direct Investment is coming up in a number of areas of the economy.
An entrepreneur is one who always takes up challenges, builds new products through innovative ideas and responds to changes quickly. Entrepreneurs are normally considered agents of change in the socio-economic development of a country. They are also seen as innovators, risk takers, decision makers and people with a definite vision. They have different characteristics compared to people accepting jobs or wage employment. There is a misconception that entrepreneurship is the monopoly of some communities, but it has been proved that entrepreneurs are not born; they can be identified, trained and developed through a proper environment. Peter F. Drucker defines an entrepreneur as one who always searches for change, responds to it and exploits it as an opportunity. Innovation is the specific tool of entrepreneurs, the means by which they exploit change as an opportunity for a different business or service. Entrepreneurship development is a complex phenomenon.

The productive activity undertaken by the entrepreneur and his constant endeavour to sustain and improve it are the outward expressions of this process of personality development. Such a process is the crystallization of the social milieu from which he comes, his family influences, the make-up of his mind, personal attitudes, educational level, parental occupation and so on. Employment has been the biggest indicator of any economy, and this employment is governed by many factors like industrial culture, market opportunities and, most importantly, the development of businesses. It is a cyclic process: more business, more employment, improved buying power and more purchases lead to more business opportunities, and so on. On the other hand, India's needs are very clear: to remove the poverty of our millions as speedily as possible; to provide health for all; to provide good education and skills for all; to provide employment opportunities for all; to be a net exporter; to be self-reliant in national security; and to build up capabilities to sustain and improve on all these in the future. Keeping this in mind, a country like ours needs to emphasize technical education and training, as it is an essential element in capacity building for socio-economic growth and development. Management and technical education have assumed greater importance due to globalization, international competition and the need for sustainable economic growth. Increased brain drain is another reason for the increased importance of technical education. The size of our technical manpower is about six million and is the third largest in the world. In the year 1947-48, the country had 38 degree-level institutes with an intake capacity of 3,670; the intake for postgraduates was 70. As per the AICTE Annual Report 2003-2004, there are approximately 1,500 colleges of engineering around the country enrolling approximately 4 lakh students at the undergraduate level, whereas postgraduate studies and research have been limited to 26,203 candidates in 268 AICTE-approved institutions of engineering, in addition to 54,167 candidates enrolled in MCA in 1,012 institutions. Our professional institutions are the major source of manpower for employment by industry; they provide technological and managerial solutions to the problems arising in industry, both short term and long term, through consultancy and R&D. Industry and institutions depend on each other and derive benefit from mutual interaction, which can be made possible by giving a boost to the establishment of industries and by inculcating entrepreneurial competencies among students. The Ministry of Human Resource Development (MHRD), Government of India (GOI), has implemented the scheme of Strengthening Existing Institutions and Establishment of New Institutions for Non-corporate and Unorganised Sectors. To cater to the needs of these sectors, the New Education Policy of 1986 emphasized the following:
- To encourage students to consider self-employment as a career option, training in entrepreneurship will be provided through modular or optional courses, in degree and diploma programmes. (NPE 6.10)
- In order to increase the relevance of management education, particularly in the non-corporate and under-managed sectors, the management education system will study and document the Indian experiences and create a body of knowledge and specific educational programmes suited to these sectors. (NPE 6.7)
- Continuing education, covering established as well as emerging technologies, will be promoted. (NPE 6.4)
The number of unemployed graduates in the engineering and management disciplines has been increasing of late due to the mushrooming of institutions. Such an increase has been causing quite some concern in the country, primarily because of two factors. Firstly, a large amount of money is invested in teaching a student to become a graduate or postgraduate in management or engineering. Secondly, the modern aspects of technology taught to such students go to naught when the students are not provided with adequate opportunities to utilize their skills. Recognizing the problems posed by such an increase in unemployed graduates, the Government has been concerned with promoting enterprises which would exploit the know-how, talent and skill available in the scientific and technological population emerging from this educational system. Realizing the need to mitigate the problem of unemployment among science, engineering and technology persons, the Government of India established the National Science and Technology Entrepreneurship Development Board (NSTEDB) in the Department of Science and Technology in January 1982 to promote avenues of gainful self/wage employment for science and technology persons, with an emphasis on entrepreneurship development.

Objectives: The study has been undertaken with the following objectives:
1. To know about entrepreneurship among the students of technical institutions.
2. To study whether education plays any role in the development of entrepreneurship.

Review of related studies: Studies conducted in India and abroad on entrepreneurship development and the role of education have stressed that, in an era of knowledge-intensive work environments, it is important to establish a knowledge infrastructure in engineering institutions to foster technology innovation and technology incubation. These institutions can play a vital role in the dissemination of the knowledge pool among the various sections of society. The present paper focuses on the role of institutions in the development of entrepreneurship, with special reference to the Sant Longowal Institute of Engineering and Technology (SLIET), a central government institution in district Sangrur. Some of the studies conducted in India and abroad highlight the importance of entrepreneurship.

128
As described by David McClelland (1961), entrepreneurs are primarily motivated by an overwhelming need to achieve and strong urges to build. Therefore, it can be viewed that entrepreneurs are high achievers and have a strong desire to create new things. Collins and Moore (1970) argued that entrepreneurs are tough and pragmatic, driven by needs of independence and achievement; they do not easily submit to authority. Cooper, Woo and Dunkelberg (1970) argued that entrepreneurs exhibit extreme optimism in their decision-making processes. Bird (1992) described entrepreneurs as mercurial, that is, prone to insights, brainstorms, deceptions, ingeniousness and resourcefulness; they are cunning, opportunistic, creative and unsentimental.

Sample Selection, Methodology and Findings: The present paper is an attempt to find out the role of technical institutions in promoting techno-entrepreneurship in district Sangrur (Punjab). Punjab is situated in the northwest of India. It is bordered by Pakistan on the west, the Indian states of Jammu & Kashmir on the north and Himachal Pradesh on the northeast, and Haryana and Rajasthan to its south. The total population of the state is 2,42,89,000 (2001 census), out of which 29.55 percent of the population lives in urban areas and 70.45 percent in villages. It has 19 districts, namely Amritsar, Fatehgarh Sahib, Gurdaspur, Ferozepur, Ludhiana, Jalandhar, Kapurthala, Hoshiarpur, Mansa, Moga, Muktsar, Nawanshahr, Rupnagar, Faridkot, Patiala, Bathinda, Mohali, Barnala and Sangrur. The literacy rate of district Sangrur is only 45.99 percent, of which 41.25 percent of rural and 60.42 percent of urban people are literate. There are three higher educational professional institutions in the district: two institutions of engineering and technology in the government sector and one in the private sector cater to the needs of the district and other areas of the country. Therefore, to understand the road ahead for entrepreneurship and also to analyse the attitude of the youth towards entering business, the researcher prepared a questionnaire in which various parameters such as family background, parents' occupation, future planning, schooling, income level, and need for achievement, keeping entrepreneurship in view as a career option, were taken into account, and a sample of 53 MBA students from the final and pre-final years of the Sant Longowal Institute of Engineering and Technology (SLIET) was taken into consideration as first-hand information. We contacted all 60 students, but only 53 came forward and shared the information. After collection, the data were analysed in percentages.

Table 1. (S = Service, B = Business, A = Agriculture, H = Higher Studies, A = Settled Abroad). Source: Personal Investigation.

129

Table 2: Profile of the surveyed SLIET students (percentages)
Father's Occupation: Service 50.94%, Business 18.86%, Agriculture 30.18%
Mother's Occupation: Service 13.20%, Business 1.88%, House Wife 84.90%
Matric Education: P.S.E.B. 37.73%, C.B.S.E. 33.96%, I.C.S.E. Nil, Other 28.30%
Family Income (P.A.): up to 1 Lac 1.88%, 1-2 Lac 22.64%, 2-4 Lac 54.71%, 4-6 Lac 26.41%
Future Planning: Business 39.62%, Service 56.60%, Higher Study 13.20%, Abroad 9.43%

Tables 1 and 2 depict that, of the 53 students surveyed at the Sant Longowal Institute of Engineering and Technology, 50.94 percent of the students' fathers were in service, 18.86 percent in business, and 30.18 percent engaged in the agriculture sector. As far as the mothers' support to the family is concerned, 13.20 percent were in service, 1.88 percent in business, and 84.90 percent were housewives engaged in household chores. In the case of family income, 1.88 percent had up to Rs. 1 lac per annum, 22.64 percent Rs. 1-2 lac, 54.71 percent Rs. 2-4 lac, and 26.41 percent Rs. 4-6 lac per annum. As regards the educational background of the surveyed students, 37.73 percent had passed their matric examination from the Punjab School Education Board (P.S.E.B.), 33.96 percent from the Central Board of Secondary Education (C.B.S.E.), and 28.30 percent from other state boards. Regarding the future planning of the students, 39.62 percent were interested in business, 56.60 percent wanted to join service, 13.20 percent were interested in pursuing higher studies, and 9.43 percent were interested in going abroad. No one was interested in pursuing a career in the agriculture sector. Most students want to take up a job because of job security. Furthermore, the information elicited from the students reveals that the students interested in higher studies are mainly interested in landing a coveted dream job. Thus we can draw the conclusion that students want job security and hesitate to take risks in starting small ventures.
Suggestions:
1. Management institutions should focus on improving the overall personality of the students; individual motivation and counselling are the need of the hour, apart from classroom teaching.

2. The Government should make it mandatory to teach entrepreneurship as a subject in all management institutions; every institution should fix a target for generating entrepreneurs and should sign MOUs with financial institutions so that students do not have to move from pillar to post to get financial assistance.
Conclusion: Going global, India has emphasized the need for accelerating the development of small and medium enterprises. The target articulated in the preamble of the Small and Medium Enterprises Development Bill, 2005 affirms "to provide for facilitating the promotion and development and enhancing the competitiveness of small and medium enterprises". The findings of the present study indicate that the majority of the students are more interested in service as a secure profession rather than in a risk-taking occupation. Some inputs relating to entrepreneurship are being given to the students, but these are not sufficient, and more proactive steps are required in this direction.

References:
1. Ajay Singh, Shashi Singh & Girish Tyagi, "Role of economic status, secondary & engineering level education in development of entrepreneurial attitudes & activities of technical undergraduate students", 9th Biennial Conference on Entrepreneurship, Feb 16-18, 2011, EDI Ahmedabad.
2. Dr. Abhilasa Singh, "Role of Entrepreneurship in Small Scale Business Management", A Bi-annual Journal of Management & Technology, Vol. 1, Issue 1, July-Dec 2006, pp. 147-151.
3. Kuratko, Donald F., "The Emergence of Entrepreneurship Education: Development, Trends and Challenges", Entrepreneurship Theory and Practice, September issue.
4. Mohit A. Parekh, Devagni Devashragi, "Entry Barriers to Entrepreneurship", 9th Biennial Conference on Entrepreneurship, Feb 16-18, 2011, EDI Ahmedabad.
5. Prof. P. B. Sharma, "Role of Technical Institutions in Technology Innovation and their Impacts on National Security", The Journal of Engineering Education, Jan 2006, p. 16.
6. S. G. Patil, Lata S. Patil, Rabindra D. Patil, "Role of Women in Socio-economic Development of the Region, with Special Reference to North Maharashtra", 9th Biennial Conference on Entrepreneurship, Feb 16-18, 2011, EDI Ahmedabad.
7. Vasant Desai, Dynamics of Entrepreneurial Development and Management, Himalaya Publishing House, New Delhi, 1999.


Design of data link layer using WiFi MAC protocols


K.Srinivas (M.Tech) (C R Reddy College of Engg) CRS Murthy (M.Tech) (C R Reddy College of Engg)

Abstract: Wi-Fi is the name of a popular wireless networking technology that uses radio waves to provide wireless high-speed Internet and network connections. The Wi-Fi Alliance, the organization that owns the Wi-Fi (registered trademark) term, specifically defines Wi-Fi as any "wireless local area network (WLAN) product that is based on the Institute of Electrical and Electronics Engineers' (IEEE) 802.11 standards".

The main objective of this paper is to model the CSMA/CA, physical and MAC layers of the IEEE 802.11 standard for the transmitter and receiver. VHDL (VHSIC, or Very High Speed Integrated Circuit, Hardware Description Language) is defined by the IEEE as a tool for the creation of electronic systems because it supports the development, verification, synthesis and testing of hardware designs, the communication of hardware design data, and the maintenance, modification and procurement of hardware. It is a common language for electronics design and development prototyping.
The designed data link layer comprises the MAC layer and the physical layer; the data link layer communicates with the other three layers. The layers designed are capable of transmitting at 1 and 2 Mbit/s, i.e. frequency hopping spread spectrum in the 2.4 GHz band and infrared. Beyond the standard functionality usually performed by MAC layers, the 802.11 MAC provides protocols for fragmentation, packet retransmission and acknowledgements, and allows an 802.11 network to interwork with another 802 LAN [3]. The MAC layer defines two access methods: the Distributed Coordination Function (DCF) and the Point Coordination Function (PCF).

Section 1

Users expect Internet connectivity wherever they travel, and many of their devices, such as iPods and wireless cameras, rely on local-area Wi-Fi access points (APs) to obtain connectivity. Even smartphone users may employ Wi-Fi instead of 3G and WiMAX to improve the performance of bandwidth-intensive applications or to avoid data charges. Fortunately, there is often a large selection of commercial APs to choose from; for example, JiWire [6], a hotspot directory, reports 395 to 1,071 commercial APs.
The main core of the IEEE 802.11b standard is the CSMA/CA, physical and MAC layers, but only the MAC layer for the transmitter is modelled in this paper using VHDL. However, all is not perfect in the WLAN world. Offering nominal bit rates of 11 Mbps (802.11b) and 54 Mbps (802.11a and 802.11g), the effective throughputs are actually much lower owing to packet collisions, protocol overhead, and interference in the increasingly congested unlicensed bands at 2.4 GHz and 5 GHz. Further, operation in these bands entails a strict regulatory transmit power constraint, thus limiting range and even bit rates beyond a certain distance [5].
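To illustrate the DCF contention behaviour mentioned above, the sketch below is a toy Monte Carlo model of stations drawing random backoffs from a contention window and doubling the window after collisions; the window bounds and station count are illustrative assumptions, and the model ignores many details of the real 802.11 DCF (slot timing, DIFS/SIFS, retry limits).

```python
# Toy model of 802.11 DCF contention with binary exponential backoff:
# the smallest backoff wins the medium, equal minima collide and the
# colliding stations double their contention window.
import random

CW_MIN, CW_MAX = 15, 1023   # 802.11-style contention window bounds

def run_dcf(n_stations=5, n_events=10000, seed=1):
    random.seed(seed)
    cw = [CW_MIN] * n_stations
    successes = collisions = 0
    for _ in range(n_events):
        backoff = [random.randint(0, cw[i]) for i in range(n_stations)]
        winner_slot = min(backoff)
        contenders = [i for i, b in enumerate(backoff) if b == winner_slot]
        if len(contenders) == 1:
            successes += 1
            cw[contenders[0]] = CW_MIN                 # reset window after success
        else:
            collisions += 1
            for i in contenders:                       # exponential backoff after collision
                cw[i] = min(2 * (cw[i] + 1) - 1, CW_MAX)
    return successes, collisions

print(run_dcf())   # (successful transmissions, collision events)
```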


Results:

Various individual modules of the Wi-Fi transmitter have been designed, verified functionally using a VHDL simulator, and synthesized by the synthesis tool. This design of the Wi-Fi transmitter is capable of transmitting the standard frame formats, which include all 802.11 frames, i.e. the MAC frame, RTS frame, CTS frame and ACK frame. The transmitter is also capable of generating error-checking codes like HEC and CRC. It can handle variable data transfer.
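As a quick illustration of the CRC generation mentioned in the results, the snippet below computes an IEEE CRC-32 (the polynomial used for the 802.11 frame check sequence) over a toy frame using Python's standard zlib module; this stands in for, and is not, the authors' VHDL CRC generator.

```python
# Compute and check a CRC-32 frame check sequence over a toy 802.11-style frame.
import zlib

frame_body = b"\x08\x01" + b"\x00" * 22 + b"hello WLAN"   # toy MAC header + payload
fcs = zlib.crc32(frame_body) & 0xFFFFFFFF
print(f"FCS = 0x{fcs:08X}")

# A receiver recomputes the CRC over the received bits and compares it with
# the transmitted FCS; any mismatch marks the frame as corrupted.
assert zlib.crc32(frame_body) & 0xFFFFFFFF == fcs
```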

References:
[1] Blind deconvolution of spatially invariant image blurs with phase.
[2] Identification of image and blur parameters for the restoration of non-causal blurs.
[3] Total variation blind deconvolution.
[4] Maximum-likelihood parametric blur identification based on a continuous spatial domain model.
[5] Out-of-focus blur estimation and restoration for digital auto-focusing system.
[6] Simultaneous out-of-focus blur estimation and restoration for digital auto-focusing system.


Leveraging Innovation For Successful Entrepreneurship

Abstract - Innovation has become the most hyped word in the dictionary of business models of the present century. New ideas, concepts and products have provided the firm with a new tool to counter the dynamic forces of change that disrupt its establishment and force it to recreate itself based on new rules set by change. Innovation has given firms the power to lead change and create an impetus to transform the industry structure, rather than sitting and waiting for the storm to arrive and cast its effects on the surroundings. Innovation is a powerful tool in the hands of firms: it helps them create longevity, as it opens wide horizons for them to expand their reach and be one step ahead. The key process in economic change is the introduction of innovations, and the central innovator is the entrepreneur. There will always be continuous winners and losers in this system. The entrepreneur is the initiator in this change process. The entrepreneur, seeking profit through innovation, transforms the static equilibrium of the circular market flow into the dynamic process of economic development. He interrupts the circular flow and diverts labour and land to investment. "The function of entrepreneurs is to reform or revolutionize the pattern of production by exploiting an invention or, more generally, an untried technological possibility for producing a new commodity or producing an old one in a new way, opening a new source of supply of materials or a new outlet for products, or by reorganizing an industry." The notion of entrepreneurship is typically associated with new business creation and new product development and offerings by individuals. With the onset of intensifying global competition there is an increasing need for business organizations to become more entrepreneurial, not only to survive but to thrive and prosper. Hence corporate entrepreneurship has become an important paradigm in today's business environment. Corporate entrepreneurship is a much broader concept encompassing innovation, creativity, change and regeneration within the corporate climate or the entire organization. Corporate entrepreneurship is a

Corporate entrepreneurship is a concept by which corporate employees at any level of the company identify and construct a unique business model that offers significant growth opportunities for their company.

I. INNOVATION AND ENTREPRENEURSHIP

Throughout history, innovators and entrepreneurs have had a tremendous impact on development, exploration, trade, education, science, and integration. During the 20th century, innovation and entrepreneurship have been regarded as key drivers in technological progress and productivity development worldwide. New radical innovations from new fields of knowledge such as information and communication technologies and biotechnology have emerged to influence everyday life for most people. Realizing this, policy makers as well as individuals argue that innovative and entrepreneurial change processes need to be further implemented on the micro as well as macro levels in society .
II. INNOVATION

Innovation refers to radically new or incremental changes in ideas, products, processes or services. Following Joseph Schumpeter's (1934) original work, an invention is related to a new idea or concept, while an innovation refers to such ideas applied in practice. In The Theory of Economic Development (1934), Schumpeter defined innovation from an economic perspective as the introduction of a new good or of a new quality

140 of a good, the introduction of a new method of production, the opening of a new market, the conquest of a new source of supply of raw materials or half-manufactured goods, and the carrying out of a new organization of an industry. On the individual level, innovation comprises the origination of an idea through to its implementation, at which point it can be transformed into something useful. Since innovation is also considered a major driver of the economy, especially when it leads to new product or service categories, or to increasing productivity, the factors that stimulate individuals or groups to innovate should be of major interest to policy makers. In particular, public-policy incentives could be implemented to spur innovation and growth. On the organizational level, innovation may be used to improve performance and growth through new concepts and methods that increase efficiency, productivity, quality, competitive positioning, and market share. Innovation policies and practices may be implemented in a variety of organizations, such as industries, hospitals, universities, as well as local governments. While most forms and practices of innovation aim to add value, radical innovation may also result in a negative or destructive effect for some. Many new developments clear away or change aging practices, and those organizations that do not innovate effectively may be substituted by new organizations and firms that do. It is not only our understanding of the importance of innovations for development that is changing, but also the concept of how innovations are formed. New models of innovation are emerging that are shifting the concept of innovation from being shaped by a closed to an open paradigm (Hedner, Maack, Abouzeedan, & Klofsten, 2010). Such forms of innovation include, for example, user innovation, open innovation, crowd-sourcing, and crowd-casting, which all represent novel and interesting phenomena that may change our conception of how innovation of use, innovation in services, innovation in configuration of technologies, as well as innovation of novel technologies themselves are formed. In agreement with such open concepts of innovation, loosely formed groups of customers, users, scientific communities, or experts/researchers may collectively shape product or process innovations within a variety of sectors.
III. SCIENCE, TECHNOLOGY, AND INNOVATION POLICY

Technology is often attributed as one of the driving forces behind globalization (Bartlett & Goshal, 1996). With each wave of technological change the bar of knowledge required to obtain a level of sophistication changes. The result is generally a greater need for human capital, which has given rise to the increase in knowledge workers (Gilbert et al., 2004). An economic landscape characterized by the rise of international production, innovation networks, and the emergence of science-based technologies has emerged. With the technology-driven boom of the 1990s, Germany and Japan were replaced by the USA as the innovation policy exemplar. Innovation policy, which has been growing in interest and emphasis since the mid-1990s, has largely evolved from S&T policy (OECD, 2006). The first generation of innovation policy, based on the science push or linear model, focused primarily on funding of science-based research in universities and government laboratories. The second generation of innovation policy adopted more of a demand-led view based on interaction between users and producers of innovation in what are referred to as national innovation systems (NIS). Since then, innovation policy has shifted toward an innovation systems perspective, including demand-pull and interaction between users and producers of innovation. Innovation policy plays an important role in influencing innovation performance, but must be closely tailored to the specific needs, capabilities, and institutional structures of each country (OECD, 2005a), i.e. the national innovation system.
IV. THE INNOVATION SYSTEM CONCEPT

There is no common definition of the innovation system concept. Typically the concept includes activities of private as well as public actors; linkages; the role of policy and institutions. The analysis is carried out at the national level: R&D activities and the role played by the universities, research institutes, government agencies, and government policies are viewed as components of a single national system, and the linkages among these are viewed at the aggregate level (Carlsson, Jacobsson, Holmn, & Rickne, 2002). Lundvall, Johnson, Andersen, and Dalum (2002, p. 220) find it useful to think about innovation systems in two dimensions. One refers to the structure of the system what is produced in the system and what competences are most developed? The second refers to the institutional set-up how does production, innovation, and learning take place? The innovation system concept can be understood in a narrow as well as a broad sense (Lundvall, 1992). The narrow sense concentrates on those institutions that deliberately promote the acquisition and dissemination of knowledge and are the main sources of innovation. The broad sense recognizes that these narrow institutions are

141 embedded in a much wider socio-economic system. The concept has become popular among several important policymaking organizations, for example, both the OECD and EU have absorbed the concept as an integral part of their analytical perspective.2 Much of the literature on innovation systems insists on the central importance of national systems, but a number of authors have argued that globalization has greatly diminished or even eliminated the importance of the nation state (Freeman, 2002). As a result, there have been several new concepts emphasizing the systemic characteristics of innovation, but related to levels other than the nation state. Sometimes the focus is on a particular country or region which then determines the spatial boundaries of the system. The literature on regional systems of innovation has grown rapidly since the mid 1990s (e.g. Cooke, 1996; Maskell & Malmberg, 1999). In other cases, the main dimension of interest is a sector or technology. Carlsson and Jacobsson (1997) developed the concept technological systems while Breschi & Malerba (1997) uses the notion of sectoral systems of innovation. Usually these different concepts and dimensions reinforce each other and are not in conflict. Despite this growing interest in systems of innovation there have been few attempts to include entrepreneurship as a central component (Golden, Higgins, & Lee, 2003). In Europe, all European Union (EU) Member States and candidate countries have committed to the Lisbon Agenda and increased their public R&D expenditure. Thus, in the 2000s, European innovation policy has become somewhat biased toward a science push or linear model, in which R&D is supposed to lead to increased innovation and entrepreneurship. The third generation of innovation policy thinking calls for more horizontality, coordination and integration of innovation, and other policy domains (OECD, 2006) and stronger linkages with entrepreneurship as a component of the NIS (Golden et al., 2003) and through development of indicators to measure its importance as a driver of innovation (Arundel & Hollanders, 2006). Carlsson (2006) argues that in order to understand how successful innovation systems are in generating economic growth, one would have to include an assessment of the level of entrepreneurial activity and business formation outputs.
V. INNOVATION AS A POLICY AREA

Innovation as a policy area is primarily concerned with a few key objectives: ensuring the generation of new knowledge and making government investment in innovation more effective; improving the interaction between the main actors in the innovation system (universities, research institutes, and firms) to enhance knowledge and technology diffusion; and establishing the right incentives for private sector innovation to transform knowledge into economic value and commercial success (Commission of the European Communities, 2005c; OECD, 2002c). A review of innovation policy documents compiled by the OECD and the EU suggests that the framework for innovation policy could be illustrated as in Fig. 2. Here we find policy objectives for the increase of R&D intensity, the stimulation of climate and culture of innovation, as well as for the commercialization of technology. The last of these includes instruments and support which are important for many innovative start-ups, e.g. a support innovation infrastructure (such as technology transfer offices, science parks, and business/technology incubators), encourage the uptake of strategic technologies among SMEs; improve access to pre-commercialization funding and venture capital; and provide tax (e.g. R&D tax credits, favorable capital cost allowances) and other incentives and supports to accelerate the commercialization of new technologies and products. As we shall demonstrate later in this paper, this framework is not dissimilar from that for entrepreneurship policy. The major difference may be the types of policy measures included within each of the framework boxes.
VI. ENTREPRENEURSHIP

Entrepreneurship is the act of being an entrepreneur, which, following the French tradition, implies one who undertakes innovations, finance and business acumen in an effort to transform innovations into economic goods.

142 Entrepreneurs undertake such tasks in response to a perceived opportunity which in its most obvious form may be a new start-up company. However, the entrepreneurship concept has in recent years been extended to also include other forms of activity, such as social, political, and international entrepreneurship. Some of these new fields of entrepreneurship research and practice are to a large extent driven by eglobalization processes which are facilitated by new information technology tools. Social entrepreneurship, focusing on non-profit entrepreneurial activities, is a new area which is currently attracting more research. Other, developing perspectives include academic entrepreneurship, women entrepreneurship . Thompson & Jones-Evans, 2009), as well as ethnic entrepreneurship, the latter focusing on the role of immigrants as entrepreneurs in their new home countries (cf. Clark & Drinkwater, 2010; Smallbone, Kitching, & Athaya, 2010). In addition, there is also an increasing emphasis on specific sectors where entrepreneurs are active, such as in the medical, life sciences, services, and technology areas, with new paradigms emerging as a result. Needless to say, other new paradigms and concepts within the field of entrepreneurship will appear in the future as the concept of entrepreneurship takes on new forms and shifts into new frontiers. Certainly, it is within the nature of the metaphor entrepreneurship that such creativity and development should be anticipated. As such, the research in the entrepreneurship field needs to develop a better understanding of the important relationship between innovation, entrepreneurial activities, and economic development (Acs & Storey, 2004; Acs & Szerb, 2007; Carlsson, Acs, Audretsch, & Braunerhjelm, 2009; Reynolds, 1997; Reynolds, Carter, Gartner, & Greene, 2004; Stough, Haynes, & Campbell, 1998). The entrepreneur is an actor in microeconomics and, according to Schumpeter (1934), is a person who is willing and able to convert a new idea or invention into a successful innovation. In the classical sense, entrepreneurship employs what Schumpeter called the gale of creative destruction which means that entrepreneurial activities may partly or fully replace inferior practices across markets and industries, while new products or new business models are created simultaneously. According to this perspective, creative destruction is a driver of the dynamism of industries and long-term economic growth. A vital ingredient in entrepreneurship is therefore risk-taking. Knight (1961) classified three types of uncertainty facing an entrepreneur: risk, which could be measured statistically; ambiguity, which is difficult to measure statistically; and true uncertainty or Knightian uncertainty, which is impossible to statistically estimate or predict. Entrepreneurship is often associated with true uncertainty, in particular when it involves newto-the-world innovations. Innovation and technological change is developed and implemented more rapidly today than ever before. Entrepreneurs across the globe implement the process of commercialization resulting from innovation and technological change. Over several decades, our concepts of the innovation process have transitioned from being based on a technology push and need pull model of the 1960s and early 1970s, through the coupling model of the late 1970s to early 1980s, to today's integrated model. 
Thus, our concept of the innovation process has shifted from from one that presented innovation as a linear sequential process to our current perception of innovation as a shifting, parallel, networking and open phenomenon. As a result of internetization, communication, and eglobalization, innovation is moving more rapidly, is more dispersed and increasingly involves inter-company and inter-personal networking (Abouzeedan et al., 2009; Hedner et al., 2010). As a result, entrepreneurs are needed to develop and implement innovation. Needless to say, innovation and entrepreneurship policies need to be supported and firmly embedded in society (Norrman & Klofsten, 2009). Since entrepreneurship may be translated into economic growth, governments increasingly support the development of an entrepreneurial culture by integrating entrepreneurship into educational systems, encouraging business risktaking in start-ups, as well as national campaigns supporting a range of public entrepreneurship incentives. Over the last century, Alfred Nobel, the famous Swedish inventor and philanthropist, has personified the concept of innovation and

entrepreneurship on the individual level (Jorpes, 1959; Schück & Sohlman, 1929). Nobel (1833-1896) pursued a career as a chemist, engineer, innovator, and entrepreneur and became one of the great philanthropists of our time. Nobel held 355 different patents, including that of dynamite. He created an enormous fortune during his lifetime, and in his final will and testament he instituted the Nobel Prizes, the most prestigious scientific prizes of all time.
VII. SMALL- AND MEDIUM-SIZED ENTERPRISES (SMES) AND ENTREPRENEURSHIP

There is a long debate tracing back to the economist Josef Schumpeter about the role of small and large firms with respect to technological progress and innovation. While during the 1970s and early part of the 1980s the leading role of large enterprises was stressed amongst academics and policymakers, in the late 1980s and throughout the 1990s, the role and impact of SMEs was rediscovered. It is now well established that SMEs and entrepreneurship are important for economic growth and renewal (Acs, Audretsch, Braunerhjelm, & Carlsson, 2004; Birch, 1981; Davidsson, Lindmark, & Olofsson, 1994; Reynolds, Bygrave, Autio, Cox, & Hay, 2002; Wennekers and Thurik, 1999). As mentioned, entrepreneurship policy has emerged primarily from SME policy, becoming particularly evident as a policy area in the late 1990s and early 2000s (European Commission, 1998, 2004a; Hart, 2003; OECD, 1998, 2001a; Stevenson & Lundstrm, 2002). Although it is the company's size that is the crucial criterion to distinguish SMEs from other enterprises, when considering SMEs, in particular, there is much more that matters, like the applied business model, occupied market segment, sector alignment, growth-orientation, etc. However, there is an aspect that is obviously more closely linked to the company's size than all the others: the age of the company or more precisely the stage of the firm's life cycle (Ortega-Argils & Voigt, 2009). In fact, it makes a difference whether an enterprise is classified as an SME because it is a very recently established firm (entrepreneurial start-up) or because the company's size is rather the result of a market adjustment process (e.g. a limited niche market). Since the majority of new firms are born small,

it is natural that SMEs and entrepreneurial firms would, at least for a period of time, be seen as synonymous entities, and that SME policy and entrepreneurship policy would have overlapping domains, as illustrated in Lundstrm and Stevenson (2005). However, it is important to remember that there are also differences between SMEs and entrepreneurial firms; not all entrepreneurial firms stay small. Just as there are differences between SME policy and entrepreneurship policy. Whereas the main objective of SME policy is to protect and strengthen existing SMEs (i.e. firms), entrepreneurship policy emphasizes the individual person or entrepreneur. Thus, entrepreneurship policy encompasses a broader range of policy issues geared to creating a favorable environment for the emergence of entrepreneurial individuals and the start-up and growth of new firms. A critical issue for entrepreneurship policy is how to encourage the emergence of more new entrepreneurs and growing firms. .
VIII. ENTREPRENEURSHIP AS A POLICY AREA

Entrepreneurship policy, then, is primarily concerned with creating an environment and support system that will foster the emergence of new entrepreneurs and the start-up and early-stage growth of new firms (Lundström & Stevenson, 2005; Stevenson & Lundström, 2002). The framework of entrepreneurship policy measures includes policy actions in six areas: (1) promotion of entrepreneurship; (2) reduction of entry/exit barriers; (3) entrepreneurship education; (4) start-up support; (5) start-up financing; and (6) target group measures (Stevenson & Lundström, 2002). Major policy instruments and measures in this policy area include those to remove administrative and regulatory barriers to new firm entry and growth, and to improve access to financing, to information, and to other support infrastructure and services. Promoting a culture of entrepreneurship, exposing more students to entrepreneurship in the education system, and removing barriers to entrepreneurship among specific target groups within the population are further examples of major policy instruments (Gabr & Hoffman, 2006; Lundström & Stevenson, 2005).


Fig. 3. Framework of entrepreneurship policy areas.

When presenting their entrepreneurship policy typology, Stevenson and Lundström (2002) included four different categories of entrepreneurship policy. The first of these is the SME Policy Add-on, in which case initiatives to respond to the needs of starting firms or the broader stimulation of entrepreneurship are added on to existing SME programs and services, but at a somewhat marginalized and weakly resourced level. The second is the New Firm Creation Policy, in which case the government focuses on measures to reduce administrative and regulatory (government) barriers to business entry and exit, and generally simplify the start-up process so more people are able to pursue that path. In the Niche Entrepreneurship Policy the government formulates targeted measures to stimulate the level of business ownership and entrepreneurial activity around specified groups of the population. There are two types of targets for niche policies: (a) segments of the population which are under-represented as business owners (e.g. women, youth, ethnic minorities, the unemployed, etc.), where the objective is to address identified social, systemic, or other particular barriers to entry; and (b) techno-starters, where the objective is to encourage high-growth potential businesses based on R&D, technology or knowledge. Finally, the Holistic Entrepreneurship Policy is a comprehensive policy approach encompassing the full range of entrepreneurship policy objectives and measures. Clearly, the Niche Entrepreneurship Policy addressing techno-starters is highly relevant when discussing innovative entrepreneurship policy. However, as will be further discussed in the next section, the effectiveness of a niche policy as a stand-alone policy may be impeded if the entrepreneurial culture is underdeveloped. Bear in mind that, similar to what was argued earlier about the role of innovation policy (in the section on Science, Technology, and Innovation Policy), entrepreneurship policy plays an important role in influencing entrepreneurial performance, but the policy should be closely tailored to the specific needs, capabilities, and institutional structures of each country/region and innovation system.

IX. ENTREPRENEURSHIP AND ECONOMIC DEVELOPMENT
Joseph Alois Schumpeter pointed out over one hundred years ago that entrepreneurship is crucial for understanding economic development. Today, despite the global downturn, entrepreneurs are enjoying a renaissance the world over according to a recent survey in the Economist magazine (Woolridge, 2009). The dynamics of the process can be vastly different depending on the institutional context and level of development within an economy. As Baumol (1990) classified, entrepreneurship within any country can be productive, destructive or unproductive. If one is interested in studying entrepreneurship within or across countries, the broad nexus between

145 entrepreneurship, institutions, and economic development is a critical area of inquiry and one which can determine the eventual impact of that entrepreneurial activity. The interdependence between incentives and institutions, affect other characteristics, such as quality of governance, access to capital and other resources, and the perceptions of what entrepreneurs perceive. Institutions are critical determinants of economic behavior and economic transactions in general, and they can have both direct and indirect effects on the supply and demand of entrepreneurs (Busenitz & Spencer, 2000). Historically, all societies may have a constant supply of entrepreneurial activity, but that activity is distributed unevenly between productive, unproductive, and destructive entrepreneurship because of the incentive structure. To change the incentive structure you need to strengthen institutions, and to strengthen institutions you need to fix government. The role incentives play in economic development has become increasingly clear to economists and policymakers alike. People need incentives to invest and prosper. They need to know that if they work hard, they can make money and actually keep that money. As incentive structures change, more and more entrepreneurial activity is shifted toward productive entrepreneurship that strengthens economic development (Acemoglu & Johnson, 2005). This entrepreneurial activity tends to explode during the innovation-driven stage that culminates in a high level of innovation, with entrepreneurship leveling out as institutions are fully developed (Fukuyama, 1989).
X. PRODUCTIVE ENTREPRENEURSHIP

Technical change and economic development for most of the first part of the twentieth century was assumed to be a function of capital and labor inputs. Douglas (1934) at the University of Chicago compiled a time series of US labor supply (L) and a series of capital plant and equipment (K) for the period 1899-1922. The results suggested that labor received about 0.75 of output and capital about 0.25, and that deepening of the K/L ratio (more capital per worker) was important to technological change. Of

course the static interpretation was subject to much criticism. Solow (1957) at MIT updated the data on wages and capital returns, and improved on Douglas's simple estimation regressions by bringing in yearly data on profit/wage sharing. For the 1909-1949 time-span, Solow modified Douglas's earlier findings by a kind of exponential growth factor suggested by Schumpeter early in the century. As the Nobel Laureate Samuelson (2009, p. 76) recently pointed out, this residual, Solow proclaimed, demonstrated that much of post-Newtonian enhanced real income had to be attributed to innovational change (rather than, as Douglas believed, being due to deepening of the capital/labor K/L ratio). Fig. 1 shows the relationship between entrepreneurship and economic development. Entrepreneurship differs from innovation because it involves an organizational process; Schumpeter provided an early statement on this. In recent years, economists have come to recognize what Liebenstein (1968) termed the input-completing and gap-filling capacities of potential entrepreneurial activity in innovation and development. Entrepreneurship is considered to be an important mechanism for economic development through employment, innovation, and welfare. The intersection of the S-curve with the vertical axis is consistent with Baumol's (1990) observation that entrepreneurship is also a resource, and that all societies have some amount of economic activity, but that activity is distributed between productive, unproductive, and destructive entrepreneurship. As institutions are strengthened, more and more entrepreneurial activity is shifted toward productive entrepreneurship, strengthening economic development (Acemoglu & Johnson, 2005). This entrepreneurial activity explodes through the efficiency-driven stage and culminates in a high level of innovation, with entrepreneurship leveling out.

Fig. 1. The relationship between entrepreneurship and economic development and the corresponding stages of development as found in Porter et al. (2002).
XI. ENTREPRENEURSHIP AND SOCIAL VALUE CREATION

Baumol (1990) proposed a theory of the allocation of entrepreneurial talent in a seminal article, titled Entrepreneurship: Productive, Unproductive and Destructive. He makes an important observation that although entrepreneurship is typically associated with higher incomes, innovation and growth, the entrepreneur is fundamentally engaged only in activity aimed at increasing wealth, power and prestige (1990, p. 898). Therefore, entrepreneurship is not inherently economically healthy and can be allocated among productive, unproductive, and destructive forms. The framework presented by Baumol is useful in that it brings to attention the importance of the full range of entrepreneurial activity. The tradeoff between productive and unproductive activity has been studied, typically in developed countries, most often from the perspective of economic organization. Strong regulatory regimes often mean that policies typically oversee the direction of entrepreneurship in the economy. In contrast, many developed countries have designed economic policies specifically to minimize the ability of entrepreneurs to engage in unproductive activities, and to support productive entrepreneurship. In many developing countries, unproductive and destructive activities are substantial components, if not the substantial components in the economy. Even in rapidly developing countries, opportunities for profit can outpace the evolution of institutions, and this mismatch widens the scope of rent-seeking or worse activities. In the underdeveloped countries, economic activities are found to be predatory and extractive. Baumol originally proposed a framework to understand the allocation, rather than the supply, of entrepreneurship. He assumes that a certain proportion of entrepreneurs exist across and within societies. Baumol hypothesizes that the allocation of entrepreneurial talent is influenced by a structure of rewards in the economy. He suggests that the rules of the game determine the

outcome of entrepreneurial activity for the economy, rather than the objectives or supply of the entrepreneurs. According to Baumol (1990, p. 897), Schumpeter's analysis was not elaborate enough because it did not place value on moving between these forms of entrepreneurship. If activities are chosen based on perceived opportunity for profit (or other personal gain), it should not be assumed that the activities will be of a certain type. For this reason, Baumol (1990, p. 897) extends Schumpeter's list of entrepreneurial activities to include activities of questionable value to society, such as innovative new practices of rent-seeking. These activities of questionable value form Baumol's conception of unproductive entrepreneurship. Unproductive entrepreneurship is what Baumol refers to as a range of activities that threaten productive entrepreneurship. Specifically, he notes rent-seeking, tax evasion, and avoidance as the dominant forms of unproductive entrepreneurship. Within rent-seeking, he includes excessive legal engagement; within taxation, he notes that high-tax societies host a certain set of incentives for entrepreneurial effort. Baumol makes several useful propositions about productive and unproductive entrepreneurship, but he offers no insight into destructive entrepreneurship. In order to shed light on destructive entrepreneurship that is not captured in his existing framework, Acs and Desai proposed the theory of destructive entrepreneurship. They assume entrepreneurs operate to maximize utility and accept Baumol's proposition that the supply of entrepreneurs remains relatively constant. Acs and Desai then find most treatments of entrepreneurship allocation assuming the existence of occupational choice and limiting applicability (Desai & Acs, 2007; Desai, Acs, & Weitzel et al., 2010).
XII. LINKING INNOVATION AND ENTREPRENEURSHIP

Entrepreneurship and innovation are closely linked. Much of entrepreneurial activity most assuredly involves innovation, and, likewise, entrepreneurs are critical to the innovation

process. In addition, the turbulence (creative destruction) produced by a high rate of business entry and exit activity is in itself associated with higher levels of innovation in an economy. It is possible to observe convergence between innovation and entrepreneurship policy, particularly when the policy goal is to foster new high-growth innovative firms. In this section we discuss the start-up of innovative and rapidly growing firms, as well as how public policy can be deployed to promote innovative entrepreneurship.

Entrepreneurship and innovation policy as derivatives of other policy areas
Entrepreneurship and innovation policy are both derivations of other policy areas. While entrepreneurship policy has emerged primarily from SME policy, innovation policy has largely evolved from science and technology (S&T) or research and development (R&D) policy.

Fig. 2. Illustrative framework of innovation policy areas.

However, it is noted that policy measures to stimulate innovative entrepreneurship are often of a different form than those to foster general entrepreneurial activity, as are the target groups they seek to influence and the composition of system members (Lundström & Stevenson, 2005; Stevenson, 2002). Of course, innovation policy is broader than policy to foster innovative entrepreneurship, especially regarding objectives such as those to increase R&D investments or encourage the uptake of strategic technologies. As Lundström and Stevenson (2005) observed, it is possible for governments to have policies for innovation that do not incorporate much consideration for policies to foster entrepreneurial capacity, not even for innovative entrepreneurship.

XIII. CONCLUSION

The importance of innovations by entrepreneurs becomes even greater as global competition offers more entrepreneurial opportunities from a greater pool of people. The challenge is for firms to find and make use of these individuals for their survival. Innovation and entrepreneurship are the essence of the capitalist society. The entrepreneurs contribute to society as a whole by introducing new products and services that are often contradictory to institutional norms. Innovations in the form of a new product, process or service are an important factor in providing competitive advantage for SMEs. Continuous creation and recognition of new ideas and opportunities are common characteristics of innovation activity and entrepreneurship. At best, innovation helps small companies overcome the resource restrictions that hamper growth.

To sum up, it has been argued that both entrepreneurship and innovation are linked to economic growth and industrial renewal, but it is not entirely evident exactly how; the relationships between growth, entrepreneurship and innovation tend to be indirect rather than direct. Today it is a well-established fact that SMEs are important for economic growth and renewal. The carrying out of new combinations may, however, have less to do with the size of a firm or organization; instead, newness in the form of innovation and entrepreneurship has again caught the attention of many academics and policymakers. Much of entrepreneurial activity most assuredly involves innovation. Likewise, entrepreneurs are critical to the innovation process, and entrepreneurial capacity is a key element in the transfer of knowledge in the commercialization process. The turbulence produced by a high rate of business entry and exit activity is in itself associated with higher levels of innovation in an economy. The examination of existing work on entrepreneurship and innovation policy suggests that an important direction for the future is to link the two to each other. It is argued that public policy promoting innovation and economic growth must also involve instruments promoting entrepreneurship. For innovative entrepreneurship to be able to fully contribute to economic growth and development, it is recommended that its importance be further acknowledged in innovation as well as entrepreneurship policies.

The combination of entrepreneurship and innovation results in innovative entrepreneurship: new firms based on new (inventive) ideas and, sometimes but not always, research-based. Such firms often have relatively high-growth potential and may become future gazelles. Thus, the encouragement of innovative entrepreneurship has caught the attention of both policymakers and academics. In this paper it is, however, argued that policies in favor of innovative entrepreneurship should be considered in the context of a holistic entrepreneurship policy framework. The effectiveness of innovative entrepreneurship policy as a stand-alone policy may be impeded if the culture for entrepreneurship is under-developed, the density of business owners too thin, the full range of education support missing, and so on. Thus, in order to increase economic growth through innovative entrepreneurship it is suggested that at least three alternatives are considered. The first of these is the encouragement of entrepreneurship in general: not only does the establishment and expansion of new firms create additional new jobs, an increase in general entrepreneurial activity is also likely to result in a higher number of innovative high-growth firms. The two other options discussed are niche policies focusing on either increasing the frequency of high-growth firms among the innovative ones or increasing R&D and innovative activities among low-growth firms. All three alternatives can increase innovative entrepreneurship, but the actual policy instruments are very different.

There are many examples of highly successful innovations stemming from small enterprises which have revolutionized entire industries. Start-up companies, young entrepreneurs, university spin-offs, and small highly innovative firms more often than not produce the major technological breakthroughs and innovations, leaving behind the R&D efforts and innovation strategies of large global corporations. It has been argued that entrepreneurship takes on new importance in a knowledge economy because it serves as a key mechanism by which knowledge created in one organization becomes commercialized in another enterprise. New and small firms also serve as important vehicles for knowledge spill-overs when their ideas, competencies, products, strategies, innovations, and technologies are acquired, accessed, and commercialized by larger enterprises. Small- and medium-sized enterprises (SMEs) and entrepreneurship continue to be a key source of dynamism, innovation, and flexibility in advanced industrialized countries, as well as in emerging and developing economies. For innovative entrepreneurship to be able to fully contribute to economic growth and development, its importance will need to be further acknowledged in innovation as well as entrepreneurship policies.

References
Abouzeedan, A., Busler, M., & Hedner, T. (2009). Managing innovation in a globalized economy defining the open capital. Acemoglu, D., & Johnson, S. (2005). Unbundling institutions. Journal of Political Economy. Acemoglu, D., Johnson, S., & Robinson, J. (2001). The colonial origins of comparative development: An empirical investigation. American Economic Review. Ahmed (Ed.), World sustainable development outlook 2009, Part VII, knowledge management and education. Brighton, UK: World Association for Sustainable Development University of Sussex. Acs, Z., & Storey, D. (2004). Introduction: Entrepreneurship and economic development. Regional Studies. Acs, Z., & Szerb, L. (2007). Entrepreneurship, economic growth and public policy. Small Business Economics. Acs, Z. J., Audretsch, D. B., & Evans, D. S. ( 1994 ). Why does the self-employment rate vary across countries and over time? Discussion Paper no. 871, Centre for Economic Policy Research. Acs, Z., Braunerhjelm, P., Audretsch, D. B., & Carlsson, B. (2009). The knowledge spillover theory of entrepreneurship. Small Business Economics. Acs, Z. J., & Varga, A. (2005). Entrepreneurship, agglomeration and technological change. Small Business Economics.

149 Ahmad, N., & Hoffman, A. (2007). A framework for addressing and measuring entrepreneurship. Paris: OECD Entrepreneurship Indicators Steering Group. Aquilina, M., Klump, R., & Pietrobelli, C. (2006). Factor substitution, average firm size and economic growth. Small Business Economics. Audretsch, D. ( 2002 ). Entrepreneurship: A survey of the literature. Report for European Commission, Enterprise Directorate General. European Commission, Enterprise and Industry. Autio, E. (2007). GEM 2007 high-growth entrepreneurship report. Global Entrepreneurship Monitor. Bates, T. (1990). Entrepreneur human capital inputs and small business longevity. The Review of Economics and Statistics. Baumol, W. (1990). Entrepreneurship: Productive, unproductive and destructive. Journal of Political Economy. Baumol, W., Litan, R., & Schramm, C. (2007). Good capitalism, bad capitalism, and the economics of growth and prosperity. New Haven, CT: Yale University Press. Bhola, R., Verheul, I., Thurik, R., & Grilo, I. ( 2006 ). Explaining engagement levels of opportunity and necessity entrepreneurs. Birch, D. L., & Medoff, J. (1994). Gazelles. In L. C. Solmon, & A. R. Levenson (Eds.), Labor markets, employment policy and job creation (pp. 159167). Boulder, CO and London: Westview Press. Blanchflower, D. (2000). Self-employment in OECD countries. Labour Economics. Blanchflower, D., Oswald, A., & Stutzer, A. (2001). Latent entrepreneurship across nations. European Economic Review. Block, J., & Wagner, M. ( 2006 ). Necessity and opportunity entrepreneurs in Germany: Characteristics and earnings differentials. Bosma, N., Acs, Z. J., Autio, E., Coduras, A., & Levie, J. ( 2009 ). GEM executive report. Babson College, Universidad del Desarrollo, and Global Entrepreneurship Research Consortium . Busenitz, L., & Spencer, J. W. (2000). Country institutional profiles: Unlocking entrepreneurial phenomena. Academy of Management Journal. Bygrave, W., Hay, M., Ng, E., & Reynolds, P. (2003). Executive forum: A study of informal investing in 29 nations composing the global entrepreneurship monitor. Venture Capital, Carlsson, B., Acs, Z., Audretsch, D. B., & Braunerhjelm, P. (2009). Knowledge creation, entrepreneurship, and economic growth: A historical review. Industrial & Corporate Change. Clark, K., & Drinkwater, S. (2010). Recent tends in minority ethnic entrepreneurship in Britain. International Small Business Journal. Caliendo, M., Fossen, F. M., & Kritikos, A. S. (2009). Risk attitudes of nascent entrepreneurs new evidence from an experimentally validated survey. Small Business Economics, Davidsson, P. (2004). Researching entrepreneurship. New York: Springer. De Clercq, D., Sapienza, H. J., & Crijns, H. (2005). The internationalization of small and medium firms. Small Business Economics Desai, S., & Acs, Z. J. ( 2007 ). A theory of destructive entrepreneurship. JENA Economic Research paper No. 2007-085. Desai, S., Acs, Z. J., & Weitzel, U. ( 2010 ). A model of destructive entrepreneurship. United Nations University (UNU) WIDER Working Paper No. 2010/34. Djankov, S., La Porta, R., LopezdeSilanes, F., & Shleifer, A. (2002). The regulation of entry. Quarterly Journal of Economics, Douglas, P. H. (1934). The theory of wages. New York: Macmillan. Dreher, A. (2006). Does globalization affect growth? Evidence from a new index of globalization. Applied Economics, Fukuyama, F. (1989). The end of history?. The National Interest, 16, 318. Godin, K., Clemens, J., & Veldhuis, N. (2008). 
Measuring entrepreneurship conceptual frameworks and empirical indicators. Studies in Entrepreneurship Markets, 7. Gompers, P., & Lerner, J. (2004). The venture capital cycle. Cambridge, MA: MIT Press. Grilo, I., & Thurik, R. A. (2008). Determinants of entrepreneurship in Europe and the U.S. Industrial and Corporate Change. Guiso, L., Sapienza, P., & Zingales, L. ( 2006 ). Does culture affect economic outcomes? Hindle, K. (2006). A measurement framework for international entrepreneurship policy research: From impossible index to malleable

150 matrix. International Journal of Entrepreneurship and Small Business. Jorgenson, D. W. (2001). Information technology and the US economy. American Economic Review. Liebenstein, H. (1968). Entrepreneurship and development. American Economic Review.. Miller, T., & Holmes, K. R. eds, (2010). 2010 index of economic freedom: The link between entrepreneurial opportunity and prosperity. The Heritage Foundation and The Wall Street Journal. Minniti, M. (2005). Entrepreneurship and network externalities. Journal of Economic Behavior and Organization. Mueller, S., & Thomas, A. (2001). Culture and entrepreneurial potential: A nine country study of locus of control and innovativeness. Journal of Business Venturing. Murphy, K. M., Schleifer, A., & Vishny, R.W. (1993). Why is rent seeking so costly to growth. American Economic Review Papers and Proceedings, 83(2), 409414. OECD ( 2006 ). Understanding entrepreneurship: Developing indicators for international comparisons and assessments. Papagiannidis, S., & Li, F. (2005). Skills brokerage: A new model for business start-ups in the networked economy. European Management Journal. Porter, M., Sachs, J., & McArthur, J. (2002). Executive summary: Competitiveness and stages of economic development. Oxford University Press. Porter, M., Ketels, C. & Delgado, M. ( 2007 ) The microeconomic foundations of prosperity: Findings from the Business Competitiveness Index, Chapter 1.2. From The Global Competitiveness Report 20072008. World Economic Forum, Geneva Switzerland. Porter, M., & Schwab, K. (2008). The global competitiveness report 20082009. Geneva: World Economic Forum. Romn, Z. (2006). Small and medium-sized enterprises and entrepreneurship. Hungarian Central Statistical Office. Romer, P. (1990). Endogenous technological change. Journal of Political Economy. Rostow, W. W. (1960). The stages of economic growth: A non-communist manifesto. Cambridge: Cambridge University Press. Sala-I-Martin, X., Blanke, J., Hanouz, M., Geiger, T., Mia, I., & Paua, F. (2007). The Global Competitiveness Index: Measuring the productive potential of nations. In M. E. Samuelson, P. (2009). Advances in total factor productivity and entrepreneurial innovation. Schumpeter, J. (1934). The theory of economic development. Cambridge, MA: Harvard University Press. Shane, S., & Cable, D. (2003). Network ties, reputation, and the financing of new ventures. Management Science. Solow, R. M. (1957). Technical change and the aggregate production function. The Review of Economics and Statistics. Srensen, J. B., & Sorenson, O. (2003). From conception to birth: Opportunity perception and resource mobilization in entrepreneurship. Advances in Strategic Management. Weitzel, U., Urbig, D., Desai, S., Sanders, M., & Acs, Z. (2010). The good, the bad and the talented: Entrepreneurial talent and selfish behavior. Journal of Economic Behavior and Organization. Weitzman, M. (1970). Soviet post war economic growth and factor substitution. American Economic Review. Woolridge, A. (2009, March 14). Global heroes: A special report on entrepreneurship. The Economist. *****

* Head, Dept. of Management, Raj Kumar Goel Institute of Technology for Women, Ghaziabad, U.P., Pin 201306 ** Head, Dept. of Management, Rishi Chadha Viswas Girls Institute of Technology, Ghaziabad, U.P., Pin 201306 *** Lecturer, Dept. of Management, Raj Kumar Goel Institute of Technology for Women, Ghaziabad, U.P., Pin 201306


Performance Evaluation of Cache Replacement Algorithms for Cluster-Based Cross-Layer Design for Cooperative Caching (CBCC) in Mobile Ad Hoc Networks
1Madhavarao Boddu, 2Suresh Joseph K

Department of Computer Science, School of Engineering and Technology, Pondicherry University {1madhav.eee@gmail.com, 2sureshjosephk@yahoo.co.in}

Abstract The Cluster-Based Cross-layer design for Cooperative Caching (CBCC) approach is used to improve data accessibility and to reduce query delay in MANETs. An efficient cache replacement algorithm plays a major role in reducing query delay and improving data accessibility in MANETs. A comparative evaluation of the cache replacement algorithms LRU, LRU-MIN and LNC-R-W3-U is carried out based on Hit Ratio (HR) and Delay Savings Ratio (DSR) with respect to variable cache sizes. The AODV routing protocol is used for path determination. The experimental results show that LNC-R-W3-U outperforms LRU and LRU-MIN in HR and DSR under variable cache sizes.
Key terms: Ad hoc networks, cross-layer design, clustering, cooperative caching, prefetching

I. INTRODUCTION

A mobile ad hoc network (MANET) is a collection of wireless mobile nodes that dynamically form a network without the aid of any fixed infrastructure. In MANETs, wireless transmission is affected by attenuation, interference and multipath propagation, and because of node mobility the topology changes dynamically. As the topology changes, routes must be updated immediately by sending control messages, which causes overhead for route discovery and maintenance. Mobile nodes are also resource constrained in terms of power supply and storage space. First, accessing a remote information station via multi-hop communication leads to longer query latency and causes high energy consumption. Second, when many clients frequently access the database server they place a high load on the server and degrade its response time. Third, multi-hop communication degrades network capacity, especially when network partitions occur. To overcome the

above limitations, data caching is an efficient methodology to reduce query delay and bandwidth consumption. To further enhance the performance of data caching, cluster-based cross-layer design and prefetching techniques are used. The focus of our research is to improve the overall network performance by reducing the client query delay and response time. In this paper a comparative evaluation of the LRU, LRU-MIN and LNC-R-W3-U cache replacement algorithms is carried out for the cluster-based cross-layer design for cooperative caching in MANETs. The rest of the paper is organized as follows: Section II describes the related work. Section III describes the overview of the CBCC approach. Section IV describes the proposed cache replacement algorithm for the CBCC approach. Section V concludes the paper and suggests possible future work.
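Since the comparison in this paper is driven by the Hit Ratio (HR) and Delay Savings Ratio (DSR) metrics defined in the abstract, a minimal sketch of how the two metrics can be computed from a simulation trace is given below. The per-request record fields (hit flag, full-fetch delay, delay actually incurred) are assumptions about the trace format, not part of the CBCC implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Request:
    hit: bool             # served from some cache (local / cluster / remote)
    server_delay: float   # delay a full fetch from the data centre would have cost
    actual_delay: float   # delay actually experienced by the client

def hit_ratio(trace: Iterable[Request]) -> float:
    reqs: List[Request] = list(trace)
    return sum(r.hit for r in reqs) / len(reqs)

def delay_savings_ratio(trace: Iterable[Request]) -> float:
    reqs: List[Request] = list(trace)
    total = sum(r.server_delay for r in reqs)
    saved = sum(r.server_delay - r.actual_delay for r in reqs)
    return saved / total

# Example: three requests, two of which are cache hits
trace = [Request(True, 100.0, 10.0), Request(True, 100.0, 25.0), Request(False, 100.0, 100.0)]
print(hit_ratio(trace), delay_savings_ratio(trace))   # 0.666..., 0.55
```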
II. BACKGROUND AND RELATED WORK

Caching has been widely used in wired networks such as the Internet to increase the performance of web services. However, the existing cooperative caching schemes cannot be implemented directly in MANETs due to the resource constraints that characterize these networks; as a result, new approaches have been proposed to tackle the challenges. Many cooperative caching proposals are available for wireless networks. The proposals can be grouped based on the underlying routing protocol used, the cache consistency management and the cache replacement mechanism. In [1, 2] different approaches have been introduced to increase data accessibility and to reduce query delay. In cooperative cache based data access in ad hoc networks [1], a scheme is proposed in which cached data and cached path techniques are used for caching. Moreover,


the cache replacement algorithm used there is based only on least-recently-used information. LRU as a cache replacement algorithm has some limitations, and the proposed approach does not consider a prefetching technique. In [2], a similar approach is proposed for networks integrating ad hoc networks with the Internet. Cache replacement algorithms have a direct impact on cache performance. In [3-7], a considerable number of proposals give much higher priority to data accessibility as opposed to access latency; both factors are largely influenced by the caching scheme adopted by the cache management, and in [3-7] new cache replacement algorithms are used to make the best use of the cache space. However, the traditional replacement algorithms used there, such as LRU, LFU and LRFU, have problems, and caching alone is not sufficient to guarantee high data accessibility and low communication latency in a dynamic system. To overcome these drawbacks, a new approach is proposed in the cluster-based cross-layer design for cooperative caching in MANETs [8]. For the cache replacement mechanism, the proposal in [8] uses LRU-MIN. However, LRU-MIN has certain limitations. First, it prefers only small objects in order to raise the hit ratio. Second, it does not exploit the frequency information of memory accesses. Third, there is the overhead cost of moving cache blocks into the most recently used position each time a cache block is accessed. In order to address these limitations we carry out a comparative evaluation of different cache replacement algorithms (LRU, LRU-MIN, LNC-R-W3-U) based on recency and cost-based functions respectively. The cost-based greedy algorithm (LNC-R-W3-U), which makes use of frequency information while evicting objects from the cache, consistently provides better performance than LRU and LRU-MIN when the results are compared in terms of cache hit ratio and delay savings ratio, and it further enhances the performance of the cluster-based cross-layer design for cooperative caching in MANETs.
III. SYSTEM ARCHITECTURE

A. CBCC Architecture

CBCC is a cluster-based middleware which stays on top of the underlying network stack and provides caching and other data management services to the upper-layer applications in the MANET environment. An instance of CBCC runs in each mobile host. The network traffic information available at the data link layer can be retrieved by the middleware layer for prefetching purposes.

Fig. 1. System architecture for Cluster-Based Cooperative Caching (CBCC)

Application layer: It is responsible for providing an interface for users to interact with application services or networking services. The application layer uses protocols such as HTTP, FTP, TFTP and TELNET.

Middleware layer: It is responsible for service location, group communication and shared memory. The middleware layer consists of various blocks such as cache management, information search, prefetching and clustering.


Cache management: Cache management includes cache admission control, cache consistency maintenance, and cache replacement.
a. Cache admission control: A node caches all received data items until its cache space is full; once the cache space becomes full, a received data item is not cached if a copy of it already exists within the cluster.

b. Cache replacement: When a fresh data item arrives for caching and the cache space is full, the cache replacement algorithm is used to select one or more cached data items to remove from the cache. The replacement process involves two steps. First, if some of the cached data items have become obsolete, these items are removed to make space for the newly arrived data item. Second, if there is still not enough cache space after all obsolete items are removed, one or more cached data items are evicted from the cache according to some criterion.
c. Cache consistency: The cache consistency strategy keeps the cached data items synchronized with the original data items in the data source.

Information search: Deals with locating and fetching the data item requested by the client.

Prefetching: Responsible for determining the data items to be prefetched from the Data Centre for future use.

Transport layer: It is responsible for providing data delivery between the applications in the network using protocols such as TCP and UDP. It includes functionalities such as identifying the services, segmentation, sequencing, reassembling and error correction.

Network layer: It is responsible for providing logical addressing and path determination (routing). Routing protocols such as AODV, DSR and DYMO are responsible for performing path determination; the current system architecture uses the AODV protocol for path determination.

Data link layer: Provides transparent network services so that the network layer can be ignorant of the network topology, and provides access to the physical networking media. It includes error checking and flow control mechanisms.

B. Cluster Formation and Maintenance

Fig. 2. Clustering Architecture

Clustering is a method used to partition the network into several virtual groups based on some predefined method. For cluster formation we use the least cluster change (LCC) algorithm [8], which is an improvement of the lowest ID algorithm. Each mobile node has a unique id, and the node with the least id in a group is elected as the cluster head. The cluster head maintains a list holding the information of all other nodes in the group. In a cluster, the number of hops between any two nodes is not more than two, and in the whole network there is no direct connection between cluster heads. Fig. 2 illustrates the clustering architecture: nodes shown in pink are cluster heads, nodes shown in green are gateways, and the rest are cluster members. A cluster member is just an ordinary mobile node and does not have any extra functionality. The node that is common to two cluster heads is elected as a gateway, which provides the communication between the two cluster heads. Whenever a node requests data, the request is first checked against the cluster head's list; if the item is not available in the cluster head's list, the cluster head forwards the request for the data item to the other cluster via the gateway.


By using LCC we can reduce frequent changes of the cluster head. LCC adopts the lowest ID (LID) scheme to create clusters. If a cluster member moves out of the cluster, it will not affect the existing clustering architecture. If two cluster heads come to exist within one cluster, the mobile node with the lowest id is elected as the cluster head, and if a number of nodes move out of a cluster they form a new cluster.
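As a rough illustration of the lowest-ID election and the LCC re-election rule described above, the following sketch works on plain lists of node ids (hypothetical data structures; real clustering would be driven by radio neighbourhood information, not by these helpers).

# Illustrative sketch of lowest-ID cluster head election and the LCC rule
# applied when two cluster heads end up in one cluster.

def elect_cluster_head(group_ids):
    # Lowest-ID rule: the node with the least id becomes cluster head.
    return min(group_ids)

def lcc_merge(cluster_a, cluster_b):
    # When two cluster heads come to exist in one cluster, keep the head
    # with the lowest id; the other former head becomes an ordinary member.
    merged = sorted(set(cluster_a) | set(cluster_b))
    return elect_cluster_head(merged), merged

if __name__ == "__main__":
    head, members = lcc_merge([7, 12, 9], [3, 15])
    print("cluster head:", head)   # 3
    print("members:", members)     # [3, 7, 9, 12, 15]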
C. Information Search Operation

It mainly deals with locating and fetching the data item requested by the client from the cache. The information search includes four cases.
Case 1: Local hit: When a copy of the requested data item is stored in the local cache of the requester, the data item is retrieved to serve the query and no cooperation is necessary.

Case 2: Cluster hit: When the requested data item is stored at a client within the cluster of the requester, the requester sends a request to the cluster head and the cluster head returns the address of a client that has cached the data item.
Case 3: Remote hit: When the data item is found at a client belonging to a cluster other than the home cluster of the requester, along the routing path to the data source, it is retrieved from that client.
Case 4: Global hit: The data item is retrieved from the server.
When a client data request arrives at a mobile node, the node first checks its local cache (the local storage of the mobile node). If the item is available in the local cache, the node sends the reply back to the client. Otherwise the request is forwarded to the neighbours based on the current cache state information held by the cluster head. If the cluster head has the requested cache state information, it gives back a reply to the requester by indicating the cluster member id.

Fig. 3. Information Search Operation (flowchart: client request → local cache check → local hit with consistency check, or local miss → neighbours search in the cluster → cluster hit, or cluster miss → search in forwarding nodes → remote hit, or remote miss → retrieve data from the data centre → cache admission mechanism and replacement (LNC-R-W3-U) → return data to the client)

If the item is not available within the cluster, the request is forwarded to the other cluster through the gateway; there the request is processed in the same way and the reply is sent back to the requester. Otherwise the request reaches the data centre, which processes the data request and sends back the requested information to the client via multi-hop communication. The client then uses the cache admission control and the consistency check in the cluster: if the same data is already available within the cluster, it will not cache the object; if it is not available, it caches the data object and sends the cached information to the cluster head so that the cluster cache state can be updated.
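The four-case lookup can be summarised in a short sketch that mirrors the flow of Fig. 3. The function and helper objects below are assumptions introduced for illustration only; they are not the simulated protocol code.

# Sketch of the CBCC information search cascade: local -> cluster ->
# remote -> global, following the flow of Fig. 3.

def lookup(item_id, local_cache, cluster_head, forwarding_nodes, data_centre):
    # Case 1: local hit
    if item_id in local_cache:
        return local_cache[item_id], "local hit"
    # Case 2: cluster hit - the cluster head knows which member caches the item
    member = cluster_head.locate(item_id)
    if member is not None:
        return member.fetch(item_id), "cluster hit"
    # Case 3: remote hit - a node on the routing path to the source has it
    for node in forwarding_nodes:
        if node.has(item_id):
            return node.fetch(item_id), "remote hit"
    # Case 4: global hit - retrieve from the data centre (server)
    return data_centre.fetch(item_id), "global hit"

class _Stub:
    # Minimal stand-in so the sketch runs; a real system uses network messages.
    def __init__(self, items=None):
        self.items = dict(items or {})
    def locate(self, item_id):
        return self if item_id in self.items else None
    def has(self, item_id):
        return item_id in self.items
    def fetch(self, item_id):
        return self.items.get(item_id)

if __name__ == "__main__":
    data, case = lookup("d42", {}, _Stub(), [_Stub({"d42": "payload"})], _Stub({"d42": "payload"}))
    print(case)   # "remote hit"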
VI. CACHE REPLACEMENT ALGORITHMS FOR CBCC
Least Recently Used (LRU): It is one of the most widely used cache replacement algorithms; it evicts objects based on least-recently-used information. LRU maintains a hash table for fast access to the data. At the head of the table the most recently used information is placed, and at the tail of the table the least recently used information is stored. When a new data item is added to the cache, it is placed at the head of the table. Whenever a cache hit occurs, the access time of the requested data item is updated and it is moved to the head of the list. When the cache is full, the algorithm simply removes the tail element of the list.

LRU-MIN: It uses least-recently-used information with a minimal number of replacements. Like LRU, LRU-MIN maintains a sorted list of documents in the hash table based on least-recently-used information, i.e. on the time the document was last used. The only difference between LRU and LRU-MIN is the method of selecting the document for replacement. Whenever the cache needs to replace a document, it searches from the tail of the hash table and evicts only data items whose size is equal to or greater than the size of the newly arrived data item. If all cached documents are smaller than the new document, the search is repeated looking for the first two documents greater than half the size of the new document. The process of halving the size threshold and doubling the number of documents to be removed is repeated if large enough documents can still not be found for replacement.

LNC-R-W3-U: It is a cost-based greedy algorithm [12] which consists of both cache replacement and cache consistency mechanisms. The algorithm selects documents for replacement with least cost until enough cache space has been freed. The cost function can be calculated using the formula

profit_i = (rr_i * d_i + vr_i * vd_i) / S_i    - (1)

where rr_i is the mean reference rate of document i, d_i is the mean delay to fetch document i into the cache, vr_i is the mean validation rate of document i, vd_i is the mean validation delay for document i, and S_i is the size of document i.
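A minimal sketch of the LRU policy described above, using Python's OrderedDict (illustrative only; document sizes are ignored here):

from collections import OrderedDict

class LRUCache:
    # Minimal LRU sketch: entries ordered from least recently used (front)
    # to most recently used (back); eviction removes the front entry.
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = OrderedDict()

    def get(self, key):
        if key not in self.table:
            return None                  # cache miss
        self.table.move_to_end(key)      # mark as most recently used
        return self.table[key]

    def put(self, key, value):
        if key in self.table:
            self.table.move_to_end(key)
        self.table[key] = value
        if len(self.table) > self.capacity:
            self.table.popitem(last=False)   # evict the least recently used entry

if __name__ == "__main__":
    c = LRUCache(2)
    c.put("a", 1); c.put("b", 2); c.get("a"); c.put("c", 3)
    print(list(c.table))   # ['a', 'c']; "b" was least recently used and evicted

LRU-MIN would keep the same ordering but, on eviction, scan from the least-recently-used end for documents whose size is at least that of the incoming document, halving the size threshold when no such documents exist.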

We assume that the values rr_i, d_i, vr_i and vd_i are known a priori and are not functions of time. The cache consistency algorithm is a TTL-based algorithm. If a document with an expired TTL is referenced and found in the cache, its content is validated by sending a conditional GET to the server owning the document.

In order to get adequate cache space, the algorithm first considers for replacement all documents having just one reference sample in increasing profit order, then all documents with two reference samples in increasing profit order. The cache consistency algorithm sets the TTL for a newly received document i - (2) - from the document's Expires timestamp when one is available; otherwise, if the Expires timestamp is not available, the TTL is derived from the mean validation rate vr_i. Whenever a referenced document i has been cached longer than TTL_i units, the consistency algorithm validates the document by sending a conditional GET to the server specified in the document's URL. Whenever a new version of document i is received, the algorithm updates the sliding windows containing the last K distinct Last-Modified timestamps and the last K validation delays and recalculates vr_i, vd_i and TTL_i.
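A rough sketch of the profit computation of equation (1) and the least-profit-first eviction is given below. It is a simplified illustration under assumed names (the Doc container and evict_for helper are hypothetical); the full algorithm additionally orders candidates by reference-sample count and maintains the sliding-window bookkeeping described above.

from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    size: float      # S_i
    rr: float        # mean reference rate
    d: float         # mean fetch delay
    vr: float        # mean validation rate
    vd: float        # mean validation delay

def profit(doc):
    # Equation (1): profit_i = (rr_i * d_i + vr_i * vd_i) / S_i
    return (doc.rr * doc.d + doc.vr * doc.vd) / doc.size

def evict_for(space_needed, cached_docs):
    # Greedy step: remove least-profit documents first until enough space is free.
    victims, freed = [], 0.0
    for doc in sorted(cached_docs, key=profit):
        if freed >= space_needed:
            break
        victims.append(doc.doc_id)
        freed += doc.size
    return victims

if __name__ == "__main__":
    docs = [Doc("a", 10, 0.5, 2.0, 0.1, 0.3), Doc("b", 5, 0.1, 1.0, 0.05, 0.2)]
    print(evict_for(6, docs))   # lower-profit documents are evicted first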
VII. PERFORMANCE EVALUATION OF CACHE REPLACEMENT ALGORITHMS FOR CBCC APPROACH
A. Simulation Environment

In this section we evaluate the performance of the LRU, LRU-MIN and LNC-R-W3-U cache replacement algorithms in the CBCC approach using the ns2 simulation environment. The simulation parameters used in the experiments are shown in Table 1. The simulation was carried out in a grid of 4000 x 300 m with 50 to 100 nodes. The time interval between two consecutive queries generated from each node/client follows an exponential distribution; the mean query delay Tq is taken as 6 sec. The node density can be varied by selecting the number of nodes; here we consider 70 nodes by default. The bandwidth selected for transmission is 2 Mbps and a transmission range of 250 m is considered for the simulation. Each client generates a single stream of read-only queries. After a query is sent out, the client does not generate a new query until the pending query is served. Each client generates accesses to the data items following a Zipf distribution [9] with a skewness parameter (α) of 0.8. If α = 0, clients access the data items uniformly; as α increases, the access to the data items becomes more skewed. Similar to other studies [10][11] we choose α to be 0.8. The AODV routing protocol was used in the simulation. The nodes/clients move according to the random waypoint model. Initially, the clients are randomly distributed in the area. Each client selects a random destination and moves towards the destination with a speed selected randomly from [ , ]. After the client reaches its destination, it pauses for a period of time and repeats this movement pattern. The data are updated only by the server. The server serves the requests on a FCFS (first-come-first-serve) basis. When the server sends a data item to a client, it sends the TTL value along with the data. The TTL value is set exponentially with a mean value. After the TTL expires, the client has to get the new version of the data item either from the server or from another client (which has maintained the data item in its cache) before serving the query.

TABLE 1 SIMULATION PARAMETERS

Parameter                    | Range         | Default value
Simulation area              |               | 4000*300 m
Database size                |               | 750 items
Cache size (KB)              | 50-450        | 80
Size of the document (Smin)  |               | 1 kB
Size of the document (Smax)  |               | 10 kB
Transmission range           | 25-250 m      | 250 m
Number of clients            | 50-100        | 70
Zipf-like parameter          | 0.5-1.0       | 0.7
Time-To-Live (TTL)           | 200-1000 sec  | 500 sec
Mean query delay (Tq)        | 2-100 sec     | 6 sec
Bandwidth                    |               | 2 Mbps
Node speed                   | 2-20 m/s      | 2 m/s
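The random waypoint movement used in the simulation can be sketched as below. This is a minimal illustration with assumed parameter names; the area size and speed range merely echo the style of Table 1 and are not the actual simulation scripts.

import random

def random_waypoint(steps, area=(4000.0, 300.0), speed_range=(2.0, 20.0), pause=1.0, dt=1.0):
    # Pick a random destination, move towards it at a randomly chosen speed,
    # pause on arrival, then repeat (the random waypoint model).
    x, y = random.uniform(0, area[0]), random.uniform(0, area[1])
    dest = (random.uniform(0, area[0]), random.uniform(0, area[1]))
    speed = random.uniform(*speed_range)
    wait = 0.0
    path = []
    for _ in range(steps):
        if wait > 0:
            wait -= dt
        else:
            dx, dy = dest[0] - x, dest[1] - y
            dist = (dx * dx + dy * dy) ** 0.5
            if dist <= speed * dt:                   # destination reached
                x, y = dest
                dest = (random.uniform(0, area[0]), random.uniform(0, area[1]))
                speed = random.uniform(*speed_range)
                wait = pause
            else:
                x += speed * dt * dx / dist
                y += speed * dt * dy / dist
        path.append((x, y))
    return path

if __name__ == "__main__":
    print(random_waypoint(5)[:3])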
The Zipf-like parameter [9] can be expressed as

P_N(i) = Ω / i^α    - (3)

where Ω = 1 / (Σ_{i=1..N} 1 / i^α). Here N is the total number of data items and α is the skewness parameter.
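Equation (3) can be turned into a small access-pattern generator, as sketched below. This is an illustrative sketch only; the default values simply echo the text and Table 1 and are not taken from the authors' simulation scripts.

import random

def zipf_probabilities(n_items, alpha):
    # Equation (3): P_N(i) = Omega / i**alpha, with Omega normalising the sum to 1.
    weights = [1.0 / (i ** alpha) for i in range(1, n_items + 1)]
    omega = 1.0 / sum(weights)
    return [omega * w for w in weights]

def generate_queries(n_queries, n_items=750, alpha=0.8, mean_delay=6.0, seed=1):
    # Draw item ids from the Zipf-like distribution and exponential
    # inter-query times with mean Tq.
    rng = random.Random(seed)
    probs = zipf_probabilities(n_items, alpha)
    items = list(range(1, n_items + 1))
    t = 0.0
    queries = []
    for _ in range(n_queries):
        t += rng.expovariate(1.0 / mean_delay)
        queries.append((t, rng.choices(items, weights=probs, k=1)[0]))
    return queries

if __name__ == "__main__":
    for when, item in generate_queries(3):
        print(round(when, 2), item)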
B. Performance Metrics
Performance metrics are used to evaluate and improve the efficiency of the process. The performance metrics Hit Ratio (HR) and Delay Savings Ratio (DSR) are considered in the simulation experiments.
Hit ratio: It is defined as the ratio of the number of successful requests (requests served from the cache) to the total number of requests:

Hit ratio = (number of successful requests) / (total number of requests)    - (4)

Delay savings ratio: It is defined as the fraction of the total fetch delay that is saved by completing requests from the cache:

DSR = Σ_i (nr_i * d_i - nv_i * c_i) / Σ_i (f_i * d_i)    - (5)

where nr_i is the number of references to document i served from the cache, f_i is the total number of references to document i, nv_i is the number of validations performed on document i, and c_i is the delay to validate document i.
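The two metrics can be computed from per-document counters as in the sketch below; the field names are assumptions that mirror the symbols in equations (4) and (5), not the simulator's actual bookkeeping.

# Sketch of computing the Hit Ratio (eq. 4) and Delay Savings Ratio (eq. 5).

def hit_ratio(hits, total_requests):
    return hits / total_requests if total_requests else 0.0

def delay_savings_ratio(docs):
    # docs: iterable of dicts with keys nr (cache-served references),
    # f (total references), nv (validations), d (fetch delay), c (validation delay).
    saved = sum(doc["nr"] * doc["d"] - doc["nv"] * doc["c"] for doc in docs)
    total = sum(doc["f"] * doc["d"] for doc in docs)
    return saved / total if total else 0.0

if __name__ == "__main__":
    docs = [{"nr": 8, "f": 10, "nv": 2, "d": 0.5, "c": 0.1},
            {"nr": 3, "f": 9, "nv": 1, "d": 1.2, "c": 0.2}]
    print(round(hit_ratio(11, 19), 3), round(delay_savings_ratio(docs), 3))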

C. Cache Performance Comparison
We compared the performance of LRU, LRU-MIN and LNC-R-W3-U. As Fig. 4 indicates, LNC-R-W3-U consistently provides better performance than LRU and LRU-MIN for all cache sizes: it improves the Delay Savings Ratio (DSR) on average by 26.5 percent compared with LRU and by 8.12 percent compared with LRU-MIN. LNC-R-W3-U also improves the cache hit ratio compared with LRU and LRU-MIN: it improves the hit ratio by 34.35 percent over LRU and by 6.7 percent over LRU-MIN. In addition to improving cache performance, LNC-R-W3-U improves the consistency of the cached documents. The performance evaluations are plotted graphically: in Fig. 4 the X-axis represents cache size and the Y-axis represents DSR, and in Fig. 5 the X-axis represents cache size and the Y-axis represents the hit ratio.

TABLE 2 OBTAINED SIMULATED VALUES (DSR - Delay Savings Ratio, HR - Hit Ratio)

Cache Size (KB) | LRU DSR | LRU HR | LRU-MIN DSR | LRU-MIN HR | LNC-R-W3-U DSR | LNC-R-W3-U HR
80              | 0.14    | 0.08   | 0.22        | 0.11       | 0.25           | 0.14
100             | 0.19    | 0.12   | 0.25        | 0.18       | 0.28           | 0.19
150             | 0.22    | 0.15   | 0.28        | 0.23       | 0.32           | 0.24
280             | 0.26    | 0.19   | 0.32        | 0.28       | 0.35           | 0.28
340             | 0.32    | 0.25   | 0.38        | 0.34       | 0.39           | 0.34
400             | 0.36    | 0.30   | 0.41        | 0.39       | 0.42           | 0.43

VIII. CONCLUSION

In this paper, a comparative performance evaluation of the LRU, LRU-MIN and LNC-R-W3-U cache replacement algorithms over the cluster-based cooperative caching (CBCC) approach in MANETs is carried out using the NS2 simulation environment. The experimental results show that LNC-R-W3-U consistently provides better performance in terms of Delay Savings Ratio (DSR) and Hit Ratio (HR) when compared to LRU and LRU-MIN for various cache sizes.
REFERENCES
[1] G. Cao, L. Yin and C. R. Das, "Cooperative cache-based data access in ad hoc networks," IEEE Computer, vol. 37, 2004, pp. 32-39.
[2] M. K. Denko and J. Tian, "Cross-layer design for cooperative caching in mobile ad hoc networks," in Proc. 5th IEEE Consumer Communications and Networking Conf. (CCNC), 2008, pp. 375-380.
[3] L. Yin and G. Cao, "Supporting cooperative caching in ad hoc networks," IEEE Trans. Mobile Comput., vol. 5, no. 1, pp. 77-89, Jan. 2006.
[4] J. Zhao, P. Zhang, and G. Cao, "On cooperative caching in wireless P2P networks," in Proc. 28th Int. Conf. Distributed Computing Systems (ICDCS 2008), 2008.
[5] H. Artail, H. Safa, K. Mershad, Z. Abou-Atme, and N. Sulieman, "COACS: A cooperative and adaptive caching system for MANETs," IEEE Trans. Mobile Comput., vol. 7, no. 8, pp. 961-977, Aug. 2008.
[6] N. Chand, R. C. Joshi, and M. Misra, "Cooperative caching strategy in mobile ad hoc networks based on clusters," Wireless Personal Commun., pp. 41-63, Dec. 2006.
[7] J. Tian and M. K. Denko, "Exploiting clustering and cross-layer design approaches for data caching in MANETs," in Proc. 3rd IEEE Int. Conf. Wireless and Mobile Computing, Networking and Communications (WiMob), 2007, p. 52.
[8] M. K. Denko, J. Tian, T. K. R. Nkwe, and M. S. Obaidat, "Cluster based cross-layer design for cooperative caching in mobile ad hoc networks," IEEE Systems Journal, vol. 3, no. 4, Dec. 2009.
[9] L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker, "Web caching and Zipf-like distributions: Evidence and implications," IEEE INFOCOM, pp. 126-134, March 1999.
[10] L. Yin and G. Cao, "Supporting cooperative caching in ad hoc networks," IEEE INFOCOM, pp. 2537-2547, March 2004.
[11] H. Shen, S. K. Das, M. Kumar and Z. Wang, "Cooperative caching with optimal radius in hybrid wireless networks," NETWORKING, pp. 841-853, 2004.
[12] J. Shim, P. Scheuermann, and R. Vingralek, "Proxy cache algorithms: Design, implementation, and performance," IEEE Trans. Knowledge and Data Engineering, vol. 11, no. 4, July/August 1999, pp. 549-562.

Fig. 4: Performance comparison of DSR (X-axis: cache size in KB; Y-axis: Delay Savings Ratio; curves: LRU, LRU-MIN, LNC-R-W3-U)

Fig. 5: Performance comparison of HR (X-axis: cache size in KB; Y-axis: Hit Ratio; curves: LRU, LRU-MIN, LNC-R-W3-U)


Comparative Study of the phases of Wireless Intelligent Network


Rashid Ali Khan, Computer Science (M.Tech), SRMS Bareilly, Uttar Pradesh, India, Rashidalikhan25@yahoo.co.in
Ashok Kumar Verma, Computer Science (M.Tech), SRMS Bareilly, Uttar Pradesh, India, Verma.ashok89@gmail.com

Abstract: The primary weapon for empowering providers to deliver distinctive services with enhanced flexibility is the Wireless Intelligent Network (WIN). The Wireless Intelligent Network seeks to win and retain subscribers with a proven and scalable solution. Today's wireless subscribers are much more sophisticated telecommunications users than they were five years ago. No longer satisfied with just completing a clear call, today's subscribers demand innovative ways to use the wireless phone. Increasing complexity in telecommunications services requires ever more complex standards, and therefore the need for better means to write them. Over the years, scenario-driven approaches have been introduced in order to describe functional aspects of systems at several levels of abstraction. Their application to early stages of design and standardization processes raises new hopes in editing concise, descriptive, maintainable, and consistent documents that need to be understood by a variety of readers. In this context, this paper is a comparative study of the four phases of WIN and comments on their successive growth and services.
Index Terms: Wireless Intelligent Network (WIN), Telecommunications Industry Association (TIA), Interim Standard (IS), Personal Communications Service (PCS), Automatic Speech Recognition

Intelligent network (IN) solutions have revolutionized wireline networks; the creation and deployment of services has become the hallmark of a wireline network based on the IN (Intelligent Network) conceptual model.

INTRODUCTION
Wireless intelligent network (WIN) is a concept being developed by the Telecommunications Industry Association (TIA) Standards Committee TR45.2. The charter of this committee is to drive intelligent network (IN) capabilities, based on interim standard (IS)-41, into wireless networks. IS-41 is a standard currently being embraced by wireless providers because it facilitates roaming. Wireless service providers are deploying Intelligent Network technology in their networks to facilitate mobility management and to offer a variety of enhanced services to subscribers. Technical Marketing Services has recently completed a study showing that spending on Wireless Intelligent Networks is likely to be a good investment.

WIN is an evolving network architecture. It enhances mobile services and creates new capabilities. It performs intelligent networking, resulting in the fulfilment of customer needs. The features are controlled outside the switch. It is purposely defined for both wireline and wireless networks. Enhanced services will also entice potentially new subscribers to sign up for service and will drive up airtime through increased usage of PCS or cellular services. As the wireless market becomes increasingly competitive, rapid deployment of enhanced services becomes critical to a successful wireless strategy. Thus far, the telecommunications industry has deployed mobile networks that have focused mainly on the needs of retail consumers. These networks have advanced considerably from their analogue origins to encompass 3G mobile networks and broadband wireless networks such as WiFi and WiMax, and are now progressing towards LTE 4G networks. While wireless networks have evolved to support the needs of the mobile user, new applications for mobile data are emerging. The wireless intelligent network (WIN) has brought successful strategies into wireless networks.

Services
The WIN protocol facilitates the development of platform-independent, transport-independent and vendor-independent WIN services such as:
A) Hands-Free, Voice-Controlled Services: Voice-controlled services employ voice-recognition technology to allow the wireless user to control features and services using spoken commands, names, and numbers. There are two main types of automatic speech recognition (ASR): speaker-dependent recognition, which requires specific spoken phrases unique to an individual user, and speaker-independent recognition.
B) Voice-Controlled Dialing (VCD): VCD allows a subscriber to originate calls by dialing digits using spoken commands instead of the keypad. VCD may be used during call origination or during the call itself.
C) Voice-Controlled Feature Control (VCFC): VCFC allows a subscriber to dial a feature-control directory number, identify the calling party as an authorized subscriber with a mobile directory number and personal identification number (PIN), and specify feature operations via one or more feature-control strings.
D) Voice-Based User Identification (VUI): VUI permits a subscriber to place restrictions on access to services by using VUI to validate the identity of the speaker. VUI employs a form of ASR technology to validate the identity of the speaker rather than determine what was said by the speaker.
E) Calling Name Presentation (CNAP): CNAP provides the name identification of the calling party (e.g., personal name, company name, restricted, not available) to the called subscriber.
F) Password Call Acceptance (PCA): PCA is a call-screening feature that allows a subscriber to limit incoming calls to only those calling parties who are able to provide a valid password (a series of digits). Calls from parties who cannot provide a valid password will be given call refusal while PCA is active.
G) Selective Call Acceptance (SCA): SCA is a call-screening service that allows a subscriber to receive incoming calls only from parties whose calling party numbers (CPNs) are in an SCA screening list. Calls without a CPN will be given call-refusal treatment while SCA is active.

A. Some basic ruling factors of WIN services
2.5G: CDMA2000's 1xRTT is the first technology for the evolution of cdmaOne 2G networks to 2.5G networks. The major impetus for 2.5G is the "always-on" capability. Being packet based, 2.5G technologies allow for the use of infrastructure and facilities only when a transaction is required, rather than maintaining facilities in a session-like manner. This provides tremendous infrastructure efficiency and service delivery improvements.
3G: Third generation (3G) networks were conceived from the Universal Mobile Telecommunications Service (UMTS) concept for high speed networks enabling a variety of data intensive applications. 3G systems consist of the two main standards, CDMA2000 and W-CDMA, as well as other 3G variants such as NTT DoCoMo's Freedom of Mobile Multimedia Access (FOMA) and Time Division Synchronous Code Division Multiple Access (TD-SCDMA), used primarily in China.
AAA: Sometimes referred to as "triple-A" or just AAA, authentication, authorization, and accounting represent the "big three" in terms of IP based network management and policy administration. Authentication provides a vehicle to identify a client that requires access to some system and logically precedes authorization. The mechanism for authentication is typically undertaken through the exchange of logical keys or certificates between the client and the server.
Authorization follows authentication and entails the process of determining whether the client is allowed to perform and/or request certain tasks or operations. Authorization is therefore at the heart of policy administration. Accounting is the process of measuring resource consumption, allowing monitoring and reporting of events and usage for various purposes including billing, analysis, and ongoing policy management.
Advanced Messaging: Advanced messaging technologies will provide advanced capabilities beyond those provided by SMS. In fact, many believe that messaging is the single most important application to exploit the capabilities of 3G (and beyond) networks.
Billing: Billing systems collect, rate, and calculate charges for use of telecommunications services. For post-paid services, a collector at the switch gathers data and builds a call detail record (CDR). For prepay systems, a prepay processing system determines the appropriate charges and decrements the account accordingly. Both systems utilize a guiding process to match calls to customers' plans and a rating engine to rate individual calls.
General Packet Radio Service: General Packet Radio Service (GPRS) is a 2.5 generation packet based network technology for GSM networks. The major impetus for GPRS and other packet based mobile data technologies is the "always-on" capability. Being packet based, GPRS allows for the use of infrastructure and facilities only when a transaction is required, rather than maintaining facilities in a session-like manner. This provides tremendous infrastructure efficiency and service delivery improvements.
Calling Party Pays: Calling Party Pays (CPP) is the arrangement in which the mobile subscriber does not pay for incoming calls. Instead, the calling party pays for those calls. CPP is offered in many places, but has not been regulated in the United States where Mobile Party Pays (MPP) is still predominant.
Electronic Billing Presentation and Payment: Electronic Billing Presentation and Payment (EBPP) is the use of electronic means, such as email or a short message, for rendering a bill. The advantage of EBPP over traditional means is primarily the savings to the operator in terms of the cost to produce, distribute, and collect bills. EBPP may be used in lieu of a standard paper bill as a means to reduce operational costs.
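As a toy illustration of the guiding and rating steps mentioned in the billing description above, the sketch below matches a call detail record to a plan and prices it per started minute; the plans, rates and CDR fields are invented for illustration and do not reflect any operator's billing schema.

import math

PLANS = {
    "basic":  {"per_minute": 0.40, "included_minutes": 0},
    "bundle": {"per_minute": 0.25, "included_minutes": 100},
}

def rate_cdr(cdr, customer_plans, minutes_used_so_far=0):
    plan = PLANS[customer_plans[cdr["subscriber"]]]       # guiding: match call to plan
    minutes = math.ceil(cdr["duration_sec"] / 60)         # bill per started minute
    free = max(0, plan["included_minutes"] - minutes_used_so_far)
    billable = max(0, minutes - free)
    return billable * plan["per_minute"]                  # rating

if __name__ == "__main__":
    cdr = {"subscriber": "98xxxxxx01", "duration_sec": 185}
    print(rate_cdr(cdr, {"98xxxxxx01": "bundle"}, minutes_used_so_far=99))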

GETS: The Government Emergency Telecommunications Service (GETS) is an organization established to support the United States National Communications System (NCS). The role of GETS is to provide specialized call processing in the event of congestion and/or outages during an emergency, crisis, or war. GETS has already established capabilities to facilitate priority call treatment for wireline/fixed networks, including the local exchange and long distance networks. In the event of an emergency, the authorities would be able to gain faster access to telecommunications resources than the every-day citizen.
Intelligent Agents: A key enabling technology for personalization, intelligent agent technology provides a mechanism for information systems to act on behalf of their users. Specifically, intelligent agents can be programmed to search, acquire, and store information on behalf of the wants and needs of users. Intelligent agents are task-oriented.
Inter-operator Messages: Inter-carrier Messaging (ICM), sometimes referred to as inter-operator or inter-network messaging, refers to the ability to transmit messages between mobile communications networks regardless of the technologies involved (CDMA, GSM, iDEN, PDC, or TDMA) and regardless of the SMSC protocols deployed (CIMD, SMPP, UCP).
IWF: The Interworking Function (IWF) acts as a gateway between the mobile network and data network infrastructure such as a WAP gateway. The IWF is used to facilitate a circuit switched connection from the MSC to the WAP gateway. In addition the IWF can be used to support mobile originated and terminated calls for asynchronous data and fax.
Lawful Intercept: Lawful Intercept (LI) or CALEA (Communications Assistance to Law Enforcement Act) represent regulation requiring mobile network operators to enable legally authorized surveillance of communications. This means a "wireless tap" of the communications channel for voice and/or data communications. LI is being considered in Europe and is being mandated in the USA in 2002.
LDAP: LDAP is an important protocol to IP networking and is therefore important to the development and administration of mobile data applications. An important evolution of LDAP will involve the migration to DENs, which have the potential to considerably improve directory environments.
Mobile Instant Messaging: Simply put, Mobile Instant Messaging (MIM) is the ability to engage in IM from a mobile handset via various bearer technologies, which may include SMS, WAP, or GPRS. In a mobile environment, the user is constrained by bandwidth and the UI.
Mobile IN: This module provides a brief introduction to the concepts and technologies associated with intelligent networks for mobile communications. All intelligent networking for telecommunications involves the concept of a "query/response" system. This system entails the notion of distributed intelligence wherein a database is queried for information necessary for call processing.
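The "query/response" idea behind Mobile IN can be illustrated with a toy lookup. The control-point database, number and fields below are purely hypothetical and only illustrate the concept of a switch querying a remote database during call processing; they are not any standardized message set.

# Toy illustration of the IN query/response model: the switch suspends call
# processing, queries a control-point database keyed on the dialled digits,
# and applies the returned instruction. Entirely invented data.

SCP_DB = {
    "8005551234": {"route_to": "+14165550100", "charge_to": "called-party"},
}

def query_scp(dialled_digits):
    # The "query": look up service logic for this number; the "response"
    # tells the switch how to continue the call.
    record = SCP_DB.get(dialled_digits)
    if record is None:
        return {"action": "continue", "route_to": dialled_digits}
    return {"action": "route", **record}

if __name__ == "__main__":
    print(query_scp("8005551234"))
    print(query_scp("4165550199"))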
Mobile IP: Mobile IP is the underlying technology for support of various mobile data and wireless networking applications. For example, GPRS depends on mobile IP to enable the relay of messages to a GPRS phone via the SGSN from the GGSN without the sender needing to know the serving node IP address.
MVNO: A Mobile Virtual Network Operator (MVNO) is a mobile operator that does not own its own spectrum and usually does not have its own network infrastructure. Instead, MVNOs have business arrangements with traditional mobile operators to buy minutes of use (MOU) for sale to their own customers.
Personal Area Networks: Personal Area Networks (PAN) are formed by wireless communications between devices by way of technologies such as Bluetooth and UWB. PAN standards are embodied by the IEEE 802.15 family of "Home Wireless" standards, which superseded older infrared standards and HomeRF for dominance in this area of wireless communications.
Prepay Technology: While there are many technologies involved in deploying mobile prepay (prepaid wireless service), this paper provides an introduction to the various types of mobile prepay deployments and the associated technologies: point solutions, ISUP based solutions, Intelligent Network based solutions, handset solutions, hybrid solutions, and Call Detail Record (CDR) based solutions.
Presence & Availability: Presence and availability technologies provide the ability to determine the event in which a mobile user is present in a certain location and/or available for certain events to take place, such as mobile messaging, games, and other location based services.
Mobile Positioning: The terms mobile positioning and mobile location are sometimes used interchangeably in conversation, but they are really two different things. Mobile positioning refers to determining the position of the mobile device. Mobile location refers to the location estimate derived from the mobile positioning operation. There are various means of mobile positioning, which can be divided into two major categories: network based and handset based positioning.
Personalization: The goal of mobile operators is to increasingly make their service offerings more personalized towards their customers. This movement is led by the need to differentiate products against fierce competition while driving improved revenue per unit customer. Most of the emphasis on personalized services today is placed on mobile data services enabled by technologies such as GPRS.
Service Bureaus: A telecommunications service bureau is an organization or business that offers outsourced telecommunications services on a wholesale basis to other service providers, which typically offer retail services, directly or indirectly, to end-users. Many services may be obtained from a service bureau solutions provider. The typical services will be new, unproven services, applications that require economies of scale, and regulation-driven applications.
Softswitch: Simply put, Softswitch is the concept of separating the network hardware from network software. In traditional circuit switched networks, hardware and software are not independent. Circuit switched networks rely on dedicated facilities for inter-connection and are designed primarily for voice communications. The more efficient packet based networks use the Internet Protocol (IP) to efficiently route voice and data over diverse routes and shared facilities.

voice communications. The more efficient packet based networks use the Internet Protocol (IP) to efficiently route voice and data over diverse routes and shared facilities. Smart Cards Smart cards in the wireless marketplace provide: improved network security through user identification, a facility for storing user data, and a mechanism for recording various service data events. These capabilities enable improved service customization and portability in a secure environment, especially suited for various transaction based services. Smart cards are tamper resistant and utilize ISO-standardized Application Protocol Data Units (APDU) to communicate with host devices via PIN codes and cryptographic keys. SMS Short Message Service (SMS) is a mobile data service that allows alphanumeric messaging between mobile phones and other equipment such as voice mail systems and email. SMS is a store-and-forward system. Messages are sent to a Short Message Service Center (SMSC) from various devices such as another mobile phone or via email. The SMSC interacts with the mobile network to determine the availability of a user and the user's location to receive a short message. SS7 SS7 is a critical component of modern telecommunications systems. SS7 is a communications protocol that provides signaling and control for various network services and capabilities. While the Internet, wireless data, and related technology have captured the attention of millions, many forget or don't realize the importance of SS7. Every call in every network is dependent on SS7. Likewise, every mobile phone user is dependent on SS7 to allow inter-network roaming. SS7 is also the "glue" that sticks together circuit switched (traditional) networks with Internet protocol based networks. Unified Messaging Unified messaging (UM) is the concept of bringing together all messaging media such as voice messaging, SMS and other mobile text messaging, email, and facsimile into a combined communications experience. Minimally, the communications experience will take the form of a unified mailbox and/or alert service, allowing the end-user to have a single source for message delivery, repository, access, and notification. USSD Unstructured Supplementary Service Data (USSD) is a technology unique to GSM. It is a capability built into the GSM standard for support of transmitting information over the signaling channels of the GSM network. USSD provides sessionbased communication, enabling a variety of applications. USSD is defined within the GSM standard in the documents GSM 02.90 (USSD Stage 1) and GSM 03.90 (USSD Stage 2). VAS Value-added services (VAS) are unlike core services. They have unique characteristics and they relate to other services in a completely different way. They also provide benefits that core services can not. WAP Wireless Application Protocol (WAP) is an enabling technology based on the Internet client server architecture model, for transmission and presentation of information from the World Wide Web (WWW) and other applications utilizing the Internet Protocol (IP) to a mobile phone or other wireless terminal. Wireless 911/112 Wireless Emergency Services (WES) refers to the use of mobile positioning technology to pinpoint mobile users for purposes of providing enhanced wireless emergency dispatch services (including fire, ambulance, and police) to mobile phone users. 
Wireless Testing: The complexity of wireless networks will increase more quickly in the next five years than it did in the previous fifteen due to the rapid advent of broadband service. These networks will need to quickly move from supporting voice-centric applications to implementing data-intensive applications. This urgency comes from the need to defray the steep entry costs paid by wireless operators worldwide.
WLAN Roaming: Wireless Local Area Networks (WLAN) are increasingly becoming an attractive alternative to licensed spectrum. Initially thought of strictly as a competitor to 3G, WLAN is now thought of as a complement to cellular based data services, providing a high bandwidth alternative to 3G at a fraction of the cost.

Comparative Study
One of the vital solutions for this highly competitive and increasingly demanding market is to build a sophisticated Wireless Intelligent Network infrastructure that can flexibly support existing and new services. This approach can reduce the load on the wireless switches. Thus the WIN phases are constantly in progress, enhancing support for customer requirements.
WIN: A Telecommunications Industry Association/American National Standards Institute (TIA/ANSI) standard messaging protocol that enables subscribers in ANSI-41 based wireless networks to use intelligent network services. WIN also supports the network capabilities to provide wireless activities such as automatic roaming, incoming call screening, and voice-controlled services.
CAMEL: A European Telecommunications Standards Institute (ETSI) standard messaging protocol for including IN functions into GSM mobile networks. CAMEL is used when roaming between networks, allowing the home network to monitor and control calls made by its subscribers. The CAMEL API allows roaming subscribers access to their full portfolio of Intelligent Network (IN) services. CAMEL is a relatively inexpensive method of allowing telecom operators to add new services to the existing network infrastructure. A few typical applications include: prepaid calling, personal numbering, and location dependent services.
UMTS/GSM-MAP: An ETSI standard messaging protocol used in UMTS/GSM wireless networks to communicate among network elements to support user authentication, equipment identification, and roaming: Mobile Switching Center (MSC), Home Location Register (HLR), Visitor Location Register (VLR), Equipment Identity Register (EIR), Short Message Service Center (SMSC) and Authentication Center (AuC). Typical applications include: Intelligent Peripheral (IP), Service Control Point (SCP) and Enhanced Services Platform.

ANSI-41: A TIA/ANSI standard messaging protocol used in Code Division Multiple Access (CDMA) and Time Division Multiple Access (TDMA) wireless networks, primarily in the Americas and parts of Asia, to communicate among network elements (MSC, HLR, VLR, EIR, SMSC) to support inter-system handoff, automatic roaming, authentication, and supplementary call features. The ANSI-41D specification (formerly known as IS-41) is primarily used in the wireless network to provide services such as automatic roaming, authentication, intersystem hand-off, and short message service. All wireless network elements use this messaging protocol to communicate. Typical applications include: Intelligent Peripheral (IP), Service Control Point (SCP) and Enhanced Services Platform.
INAP: Intelligent Network Application Protocol (INAP), an ITU-T specification, allows applications to communicate between various nodes/functional entities of a wireline Intelligent Network. The protocol defines the operations required to be performed between nodes/functional entities for providing Intelligent Network services. A few typical applications include: Call Center solutions requiring handling of special service numbers (800 & 900 services); Local Number Portability; Calling Card registration and authentication, including charging and fraud management capabilities; Interactive Voice Response (IVR) systems for small and large business segments; Calling Name delivery; and Service Management systems for the study of traffic patterns as well as generating call reports and billing records at a central administration and billing center.
AIN protocol: AIN applications include: A) toll-free dialing and FreePhone facilities for subscribers; B) Virtual Private Network services for closed user groups operating over geographically distributed facilities; Universal Access Number (UAN); split charging, enabling the subscriber to separately charge personal and business calls made from the same instrument; call rerouting and redistribution based on traffic volume and/or time of day, suitable for telemarketing businesses and reservation centers with multiple locations; prepaid and calling card services; and televoting, whereby franchisees may cast their choice over secure voice response systems, preserving privacy and saving possible travel time as well as avoiding human tampering with results and other malpractices.

Comparative study table of CAMEL in four phases:

CAMEL Phase 1: Control of MO, MT and MF calls; any time interrogation.
CAMEL Phase 2: Additional EDPs; interaction between a user and a service using announcements, voice prompting and information collection via in-band or USSD interaction; control of call duration and transfer of AoC information to the MS; the CSE can be informed about the invocation of the supplementary services ECT, CD and MPTY; for easier post-processing, charging information from a serving node can be integrated in normal call records.
CAMEL Phase 3: Support of facilities to avoid overload; capabilities to support dialed services; capabilities to handle mobility events, such as not-reachable and roaming; control of GPRS sessions and PDP contexts; control of mobile originating SMS through both CS and PS serving network entities; interworking with SoLSA (Support of Localized Service Area) (optional); the CSE can be informed about the invocation of the SS CCBS.
CAMEL Phase 4: Support of Optimal Routing for CS mobile to mobile calls; capability of the CSE to create additional parties in an existing call; capability for the CSE to create a new call unrelated to any other existing call; capabilities for the enhanced handling of call party connections; capability for the CSE to control sessions in the IP Multimedia Subsystem (IMS); enhanced CSE capability for dialed services; the capability to report basic service changes during an ongoing call; the CSE capability to select between preferred and less preferred bearer services; the capability for the CSE to control trunk originated calls; the capability for the CSE to request additional dialed digits.

Matrix
The matrix below shows the IN wireline and wireless protocols supported by telecommunications standards.

Intelligent Network and Wireless Protocols

Codec Name | Description | Standard
Customized Application for Mobile network Enhanced Logic (CAMEL) | Phase 2, 3 & 4 | 3GPP TS 29.078 (v5.1.0 Release 5)
Universal Mobile Telecommunications System Mobile Application Part (UMTS-MAP), which includes Global System for Mobile communication Mobile Application Part (GSM-MAP) | Phase 1, 2, 2+ & 3 | UMTS MAP 3GPP TS 29.002 V4.2.1 (2000-12); 3G TS 29.002 v3.4.0 (2000-3)
Wireless Intelligent Network (WIN) | Phase I & II | TIA/EIA/IS-771; TIA/EIA/IS-826
ANSI-41 | ANSI-41D | TIA/EIA-41.(1-6) D, Dec 1997
Intelligent Network Application Protocol (INAP) | CS-1, CS-2 | ITU-T Q.1218, release 1095; ITU-T Q.1228, release 997

Conclusion
The first phase of WIN standards was published in 1999 and established the fundamental call models and operations required to support this flexible service architecture. Many service providers currently implement WIN Phase 1 in their networks. Examples of WIN Phase 1 services are calling name presentation and restriction, call screening, and voice-control services. Nearing completion are WIN Phase 2 standards that provide both additional service capabilities for wireless operators as well as greater harmonization of network capabilities and operations with emerging third-generation network requirements. WIN Phase 2 includes MSC triggers for an IN prepaid solution. WIN Phase 3 incorporates enhancements to support location-based services; these requirements are based on four service drivers: location-based charging, fleet and asset management service, enhanced call routing service, and location-based information service. WIN Phase 4 is currently in requirements review by the WIN standards group. Wireless Intelligent Networking allows the service provider to rapidly introduce new services; Mobile Pre-Pay is a common application. There are two overall standards employed today: CAMEL and WIN. Maintaining and monitoring the Common Channel Signaling (CCS) network is critical to its success. Understanding and troubleshooting the SS7 protocol is a key part of that success.


Empowerment And Total Quality Management For Innovation And Success In Organisations

Abstract: While struggling with the changing climate in technological communication with access to the Internet and electronic services, many organisations have brought in external experts who advise them on quality and restructuring. To survive in a competitive environment characterized by deregulation and converging markets, complex customer needs, corporate restructuring, and downsizing, today's organizational leaders are searching for innovative ways to enhance the creative potential of their workforce. As with total quality management and reengineering, empowerment has become one of the mantras of the 1990s. Employee empowerment has become a new topic that attracts research academics and practitioners. An empowerment approach encourages employees to have more discretion and autonomy in organizing their own work. It also involves a quality service delivery system in which employees can face the customers free of rulebooks and are encouraged to do whatever is necessary to satisfy them. However, many academics contend that implementation of the empowerment principle is a rather difficult task. In their promotional literature, earlier advocates of employee empowerment often prescribed simple, step-by-step procedures to be followed, and predicted certain success as a result. Recently, many researchers have challenged this view. They point out that property management empowerment, which was often omitted in earlier research, is a critical factor in determining the success of customer satisfaction.
Index Terms: Empowerment, Responsibility, Total Quality Management

EMPOWERMENT
Empowerment is the process of giving staff real authority in their work to achieve continuous improvement and job satisfaction in an organisation's performance, for better quality products and customer service, in order to remain competitive. Empowerment encourages and allows individuals to take personal responsibility for quick response times to consumer needs and complaints, with greater warmth and enthusiasm. In recent years, empowerment has become a separate discipline that has attracted widespread discussion.

According to Spreitzer (1992) there are four characteristics most empowered employees have in common:
1. a sense of self-determination, to choose how to do the work;
2. a sense of competence, to perform the work well;
3. a sense of meaning, to care about what they are doing; and finally
4. a sense of impact, to have influence on the larger environment.
Empowerment is a mind-set: employees have an overall feeling of psychological empowerment about their role in the organisation. Greater job autonomy, increasing meaningfulness of the job, mentoring behaviours of the immediate supervisors and job satisfaction increase the organisational commitment of employees and increase their psychological empowerment in the workplace. It is observed that employee empowerment is strongly associated with the nature of the job and the leadership commitment in developing an empowered workforce. An empowered workplace should be structured to encourage front-line employees to exercise initiative and imagination in solving problems, improving processes and meeting customer needs. There is a need to create enthusiasm and commitment through the development of organisational values and visions that are congruent with workers' values and visions. Furthermore, the role of management from this perspective is to create a culture of participation by providing a compelling mission, a structure that emphasizes flexibility and autonomy, rewards for participation and a lack of punishment of risk taking, as well as ongoing involvement programmes.

EMPOWERMENT AND TOTAL QUALITY MANAGEMENT


Empowerment programmes can transform a stagnant organisation into a vital one by creating a shared purpose among employees, encouraging greater collaboration

and, most importantly, delivering enhanced value to customers. It has been found that organisations with a commitment to employee involvement and empowerment also have a commitment to total quality. This concept stems from the current international strategy towards total quality management (TQM). It is often based on a desire to gain competitive advantage, increase productivity and improve customer relationships through quality assurance issues. As to the origins of empowerment, empowered groups have often resulted from organizational change such as downsizing or the adoption of a flatter structure. Therefore, employees often perceive empowerment as receiving additional tasks. Effecting such organizational change is probably the hardest aspect of establishing TQM. However, effective empowerment can bring most organisations many successes and achievements as employees learn about the connection between their decisions, actions and customer value. In addition, they become self-directed decision makers aligned with the shared purpose. In order to stimulate employees to become involved and empowered in business improvement programmes, employees at all levels need to be given power, knowledge, information, and rewards that are relevant to business performance.

THE EMPOWERMENT PROCESS


The key to achieving empowerment for improved performance is for everyone in an organization to have a clear understanding of what they are trying to achieve by empowerment and what they must do to achieve their purpose. The empowerment process management model identifies the following six key steps in the planning, initiating and evaluating of an organisation's initiative to extend and strengthen empowerment.

Figure 1: The Empowerment Process Management Model
These steps make a closed-loop process whose output is continuous improvement (Kinlaw, 1995):
1. Define and communicate the meaning of empowerment to every member of the organization.
2. Set goals and strategies that become the organizing framework for staff at every organizational level as they undertake their own efforts to extend and strengthen empowerment.
3. Train staff to fulfil their new roles and perform their functions in ways that are consistent with the organisation's goals for extending and strengthening empowerment.
4. Adjust the organization's structure so that it demands a flatter format and creates greater autonomy and freedom to act.
5. Adjust the organization's systems (like planning, rewarding, promoting, training, and hiring) to support the empowerment of staff.
6. Evaluate and improve the process of empowerment by measuring improvement and the perceptions of the organization's members.
These six elements in the model are linked together within a single rectangle to emphasize their relatedness. Around this large rectangle are a series of smaller rectangles which identify sources of critical inputs. The empowerment process can only be undertaken successfully if the following kinds of information and knowledge are well understood:
A) the meaning of empowerment;
B) the payoffs expected;
C) targets for empowerment, which provide a set of alternatives that everyone can use in targeting specific opportunities to empower themselves and others;
D) strategies for empowerment, which provide multiple alternatives for reaching the targets which individuals and organizations identify;
E) how controls for empowerment differ from traditional controls and how these controls can be developed;
F) new roles and functions in which property managers and other members of the organization must become competent if their performance is to be compatible with the meaning and purposes of empowerment.

ACHIEVING EMPOWERMENT USING THE EMPOWERMENT STRATEGY GRID


The Empowerment Strategy Grid is a management tool constructed and refined from research and intervention at several US corporations. Integrating the fundamental concepts of empowerment, team building, and diversity, the Empowerment Strategy Grid facilitates organizational assessment of work team development and progress towards achieving company empowerment strategies. What follows is a comprehensive description of the Empowerment Strategy Grid and its practical application in the context of designing and implementing effective, firm-specific, empowered work teams. A demonstrated corporate need to minimize the potential confusion and transform work groups into empowered work teams at the leading US multinational corporation, Multicorp, led to the development of the Empowerment Strategy Grid. The Empowerment Strategy Grid is a holistic framework for analysing work groups and mapping team progress towards empowerment strategies. Capturing within its simple schematic the fundamental variables that structure and categorize empowered work teams (for example, degree and use of group activity, manager-subordinate relationships, and decision-making authority), the Grid offered a logical starting point for the organization to engage in discussions about its current environment and human resource strategies by provoking questions such as the following:
A) How interdependent are group members?
B) Do we reinforce individual or group performance?
C) Do members strongly identify with each other?
D) Do demographically or culturally diverse groups look and feel substantially different?
E) How much authority, control, and hierarchical trust do members of various groups have?

Developing the grid as a framework for simultaneous assessment
In order to perform simultaneous assessments of Multicorp's team-building efforts and empowerment directives, we constructed the Grid from two continua. The horizontal continuum refers to the distinction described above between co-acting groups and real teams. For any organization, placement along this continuum involves analysis of group/team variables such as individual roles, nature of tasks, problem-solving and learning styles. The Grid's vertical continuum illustrates possible team transitions from disempowered to empowered. To ensure an accurate assessment and avoid cross-organizational confusion, the disempowered-empowered continuum is derived from a firm-specific definition of empowerment. Specifically, the empowered end of Multicorp's continuum was characterized first by what employees described as decision-making authority; that is, the permission to make a particular choice from a range of options. Organizational participants frequently used the words "good" and "responsible" to describe the kind of decisions they expected to make when empowered, implying some alignment of personal values with organizational priorities. In contrast, at the disempowered end of the same continuum, Multicorp employees referred to a lack of decision-making autonomy and management control systems that were not only disempowering but also demoralizing. Lastly, trust at the disempowered end of the continuum referred to a fear of negative consequences, often related to direct repercussions experienced in the past.
The intersection of the Grid's two continua creates four quadrants into which groups may fall following firm-specific evaluation: empowering managers (upper left), empowered work teams (upper right), platoons (lower right), and automatons (lower left).

By identifying the quadrant in which a group is anchored, we have found that organizational participants involved in the change process can better assess their team model, identify gaps between espoused human resource strategy and practice, design interventions to empower teams fully, and measure the transition path into their target area of the Grid.
Overall, automatons are told how, when, and where to perform their work. They work in close proximity to one another, report to the same manager, and gain authority through function, level, or role.
Next, sliding along the co-actors-team continuum to the lower right quadrant of the Grid, the platoons quadrant incorporates those individuals who identify with, trust, and respect their team members yet are not empowered. Named by several of Multicorp's participants, "platoon" is intended to evoke imagery of a controlled or conditional work environment: teams in this quadrant see themselves as peers with varying talents who contribute synergistically to the team's performance, as competent and capable within their delineated sphere of action, and as constructively managing conflict. In contrast to empowered work teams, platoon members must report to and obey a higher authority. Ultimately, these teams are bounded by external rules and managerial controls, which often creates an "us versus them" culture. Within the team, group performance is reinforced, recognized, and rewarded. Platoon-like teams can operate very efficiently under certain conditions. Similar to empowering managers, platoons may be a viable strategy: teams are motivated to learn, members perform well with one another, and work teams produce. These teams are simply not as effective as empowered work teams in rapid, autonomous, creative problem solving, since they must comply with set rules and appeal to authority. Given the possible benefits of this type of team and the firm's competitive environment, organizational practitioners and beneficiaries must decide that empowered work teams are more effective and efficient prior to moving from platoons along the empowerment-disempowerment continuum to the empowered work teams quadrant.
Finally, in Multicorp's empowered work teams quadrant, individual roles are fluid and assigned relative to team needs and member competences, tasks are interdependent, and team members share decision-making authority based on member skills and the specific knowledge needed to manage problems. Team members may report to the same or to different managers; identification with other members may be more important than physical proximity. In this self-managed, collective autonomy, team members are often self-learners and rely on group problem solving. Team members manage conflict creatively and focus on producing integrative solutions. Members trust each other and demand fair organizational processes. In an environment in which team output is a collective goal, performance measurement, recognition, and rewards are based on the team.

Figure 2: The Empowerment Strategy Grid
According to Multicorp definitions of empowerment, work groups located in the empowering managers quadrant are characterized by both the attributes of a co-acting group and a participatory management style of decision making. Group members are individuals who work in close proximity to one another, report to the same manager, or perform functionally similar, yet independent, tasks. These co-actors are empowered to contribute ideas to the decision-making process but not to influence, make, or implement final decisions. Moreover, given the dynamic of this group structure, group conflict is handled through compromise, individual domination, or authority intervention. Finally, in an environment in which output and performance are additive, compensation, rewards, and productivity measures are individually based. The pattern of empowered relations described by the empowering managers quadrant defined for Multicorp appears to be the strategy with which many managers in existing organizational structures are most adept. In response to empowerment-type initiatives, both managerial and individual employee roles shift to meet high-level expectations of empowerment.
Moving counter-clockwise and down Multicorp's empowerment-disempowerment continuum, the name automatons is attributed to Frederick II who said, in describing his soldiers, "in general, they are veritable machines with no other forward movement than that which you give them." In a state of machine-like, automatic operation, members of work groups in the automaton quadrant have clearly defined roles and tasks and limited autonomy.

well as drive potential improvements in productivity, satisfaction, turnover, and absenteeism. Given the potential intrinsic and extrinsic rewards for both company and employee, the empowered work team quadrant is clearly an ideal empowerment target for many companies. As a result, organizational members at various levels are often committed, or simply chartered by senior management, to build empowered work teams. Yet implementers and presumed beneficiaries are typically overwhelmed by the enormity of the tasks, time, and resources required to develop truly empowered teams. Many seem to doubt the possibility of replacing the assumptions, habits, and practices embedded in one quadrant with those that support empowered work teams. Challenges also exist in knowing where to start and in planning interventions which stimulate, monitor, and sustain the change process. At a minimum, by providing rich, firm-specific descriptors of four possible environments, the Grid can help an organization rapidly identify both its current group status and gaps between desired strategy and practice.

Figure 3: The Empowerment Strategy Matrix Grid Quadrant Descriptors
The overall process stifles individual initiative and decision-making autonomy, group problem solving, and innovative ways of thinking about balancing work and family commitments. Moreover, organizational outcomes include systemic inconsistencies and confusion, individualization of manager-employee interaction, hindrance of team development, challenges to diversity, employee dissatisfaction, and turnover.


IMPLICATIONS FOR ORGANIZATIONS STRIVING TO ACHIEVE EMPOWERMENT POTENTIAL


Clearly, organizational transformation from one quadrant, such as automatons, to empowered work teams is no simple feat. To make these moves, up-front assessment is critical in order to avoid the root problems associated with empowerment programmes that have no organizational meaning. Equally important are understanding the gaps between the organization's current position and target quadrant and then measuring and monitoring progress to close these gaps. The Empowerment Strategy Grid allows organizational practitioners to identify relevant gaps by evaluating their team models and comparing outcomes with managerial intent. But the Grid can also give guidance in measuring and monitoring the effectiveness of interventions designed to promote change. Following team model assessment, gap identification, and development planning, practitioners can use the Grid to perform iterative analyses of team progress in response to specific team-building or empowerment interventions. To do this, we strongly recommend that companies identify relevant organizational measures at the beginning of their change process. This will help anchor the continua according to internally-defined goals and then determine initial placement of the organization's groups in one of the Grid's quadrants.

CONCLUSION
Organizations, and people in them, face uncertainty, change, complexity and huge pressures. Among the factors causing this are: the demand for higher quality and value for money (more for less); higher expectations of quality of life at work and elsewhere; increasing globalization of the economy; efforts to contain growth in public expenditure and transfer services from the public to the private sector; the growing urgency of both equal opportunities and ecological issues and awareness of inequities in the global economic system; and, lately, international recession. How can we find our way through this complex situation which is at once exciting and daunting? It is likely that we shall find ways forward most successfully first, by releasing creative energy, intelligence and initiative at every level; and second, by learning how to unite people in solving common problems, achieving common purposes and respecting and valuing difference. Organizations which do this will have the best chance of surviving and prospering. They will

attract the most able people and have the best relationships with customers. This implies a different culture: leadership that is inspiring, empowering and nurturing, rather than controlling; an atmosphere of high expectations, appreciation and excitement; a balance between yin and yang; recognition that, normally, internal competition is destructive and there are elegant or win-win solutions; an attitude of wanting everyone to excel; acceptance that in today's conditions we are bound to have difficult feelings and that understanding how to deal with our feelings, and how to assist others with theirs, is a key skill. We also need to learn how to tap into the energy to improve things, so often expressed as complaint, criticism and blame, and help people deal with feelings of hopelessness, often masquerading as cynicism. It is believed that greater employee empowerment is the breakthrough opportunity for all businesses to leverage in improving their sustainable performance. TQM programmes that do not have management commitment and employee empowerment are bound to fail. Property managers believe that, with top management commitment, by involving employees in problem solving, decision making, and business operations, performance and productivity will increase. To be able to participate in empowerment, employees need to be sufficiently educated. Employees should be encouraged to control their destiny and participate in the processes of the organisation. To be effective, employees should be given power, information, knowledge and rewards that are relevant to business performance. Successful empowerment can not only drive decision-making authority down to those employees closest to the product or customer, improving quality and customer service, but also inculcate a sense of job ownership, commitment, and efficacy among empowered individuals and work teams. TQM calls for a change of culture with the support of management that requires employee empowerment for quality improvement at all levels. Empowerment also leads to greater levels of satisfaction among the workforce, and empowered employees give faster and friendlier service to customers as well. There is a significant, positive relationship between success with organisational process improvement and the presence of all three cultural elements related to quality improvement: customer focus, employee empowerment, and continuous improvement. Given the potential financial and employee benefits associated with empowered work teams, empowerment may be a viable survival solution for companies facing intensifying competitive pressures and changing workforce dynamics. Yet, empowerment in practice is

more than just a 1990s buzzword. Instead, if leadership involves envisaging future strategy, attracting and enabling diverse organizational members at every level to embrace the leader's vision, and persistently operationalizing this vision in a meaningful and congruent manner throughout the entire organization, then empowerment is a significant leadership challenge. The Empowerment Strategy Grid helps companies like Multicorp avoid the implementation pitfalls associated with group differences, variations in the definition and degree of empowerment across an organization, and interventions which unintentionally disempower. By helping organizational practitioners assess their team development, interventions, and progress towards achieving corporate empowerment strategies, the Empowerment Strategy Grid can help companies reap the full potential from their empowerment programmes.
REFERENCES
[25] Alderfer, C.P. (1986), "An intergroup perspective on group dynamics", in Lorsch, J. (Ed.).
[26] Bandura, A. (1977), "Self-efficacy: toward a unifying theory of behavior change", Psychological Review, Vol. 84, pp. 191-215.
[27] Bandura, A. (1982), "Self-efficacy mechanism in human agency", American Psychologist, Vol. 37, No. 2, pp. 122-47.
[28] Conger, J.A., Kanungo, R.N. (1988), "The empowerment process: integrating theory and practice", Academy of Management Review, Vol. 13, No. 3, pp. 471-82.
[29] Follett, M.P. (1940), in Metcalf, H.C. and Urwick, L. (Eds).
[30] Greenhaus, J.H., Parasuraman, S., Wormley, W.M. (1990), "Effects of race on organizational experiences, job performance evaluations, and career outcomes", Academy of Management Journal, Vol. 33, No. 1, pp. 64-86.
[31] Hackman, J.R., Oldham, G.R. (1980), Work Redesign, Addison-Wesley, Reading, MA.
[32] Hill, L.A. (1992), Becoming a Manager: How New Managers Master the Challenges of Leadership, Harvard Business School Press, Boston, MA.
[33] Johnson, R.D. (1994), "Where's the power in empowerment?: definition, differences, and dilemmas of empowerment in the context of work-family boundary management", unpublished doctoral dissertation, Harvard University.
[34] Kotter, J.P. (1990), A Force for Change: How Leadership Differs from Management, Free Press, New York, NY.
[35] Kouzes, J.M., Posner, B.Z. (1987), The Leadership Challenge: How to Get Extraordinary Things Done in Organizations, Jossey-Bass, San Francisco, CA.
[36] Lawler, E.E., Mohrman, S.A., Ledford, G.E. (1992), Employee Involvement and Total Quality Management: Practice and Results in Fortune 1000 Companies, Jossey-Bass, San Francisco, CA.
[37] Maznevski, M.L. (1995), "Process and performance in multicultural teams", submitted to the Organizational Behavior Division of the Academy of Management Annual Meetings, Vancouver, BC.
[38] Miller, W.H. (1995), "General Electric; Auburn, Maine", Industry Week.
[39] Nemeth, C. (1987), "Influence processes, problem solving, and creativity", in Zanna, M., Olson, J., Hermer, C. (Eds).
[40] "Powerless empowerment".
[41] Tajfel, H., Turner, J.C. (1986), "The social identity of intergroup behavior", in Worchel, S., Austin, W.G. (Eds).
[42] Thomas, K.W., Velthouse, B.A. (1990), "Cognitive elements of empowerment: an interpretive model of intrinsic task motivation", Academy of Management Review, Vol. 15, No. 4, pp. 666-81.
[43] Treitschke, H. von (1915), The Confessions of Frederick the Great with Treitschke's Life of Frederick, G.P. Putnam and Sons, New York, NY.
[44] Tsui, A., Egan, T., O'Reilly, C. (1992), "Being different: relational demography and organizational attachment", Administrative Science Quarterly, Vol. 37, No. 4, pp. 549-79.
[45] Watson, W.E., Kumar, K., Michaelson, L.K. (1993), "Cultural diversity's impact on interaction processes and performance: comparing homogeneous and diverse task groups".
[46] Morgan, G. (1988), Riding the Waves of Change, Jossey-Bass, San Francisco, CA.
[47] Nixon, B. (1992), "Developing an Empowering Culture in Organizations", Empowerment in Organizations, Vol. 2, pp. 14-24.
[48] Senge, P. (1993), The Fifth Discipline, Century Business, London.
[49] Simmons, M. (1993), "Creating a New Leadership Initiative", Management Development Review, Vol. 6, No. 5.
[50] Stacey, R. (1993), Strategic Management and Organizational Dynamics, Pitman, Marshfield, MA.
[51] Batten, J. (1994), "A total quality culture", Management Review, Vol. 83, No. 5.
[52] Becker, F. (1990), The Total Workplace, Van Nostrand Reinhold, USA.
[53] Brymer, R.A. (1991), "Employee empowerment: a guest-driven leadership strategy", The Cornell HRA Quarterly, pp. 58-68.
[54] Honold, L. (1997), "A review of the literature of employee empowerment", Empowerment in Organisations, Vol. 5, No. 4.
[55] Huyton, J., Baker, S. (1992), "Empowerment: a way to increase productivity and morale", Education Forum Proceedings on Direction 2000, Hong Kong, pp. 511-18.
[56] Jones, P., Davies, A. (1992), "Empowerment: a study of general managers of four-star hotel properties in the UK", International Journal of Hospitality Management, Vol. 10, No. 3, pp. 211-17.
[57] Kinlaw, D.C. (1995), The Practice of Empowerment: Making the Most of Human Competence, Gower, Hampshire.
[58] Mohrman, S., Lawler, E., Ledford, G. (1996), "Do employee involvement and TQM programmes work?", Journal of Quality and Participation, Vol. 19, No. 1, pp. 6-10.
[59] Potterfield, T.A. (1999), The Business of Employee Empowerment, Quorum Books, USA.
[60] Simmons, P., Teare, R. (1993), "Evolving a total quality culture", International Journal of Contemporary Hospitality Management, Vol. 5, No. 3.


A New Multiple Snapshot Algorithm for Direction-of-Arrival Estimation using Smart Antenna
Lokesh L, Sandesha Karanth, Vinay T, Roopesh, Aaquib Nawaz

Abstract: In this paper, a new Eigen Vector algorithm for direction-of-arrival (DOA) estimation is developed, based on eigenvalue decomposition and normalization of the covariance matrix. Unlike the classical Maximum Likelihood Method (MLM) and Maximum Entropy Method (MEM) algorithms, the proposed method only involves the determination of the noise-subspace eigenvectors, which provides better resolution and bias as compared to existing DOA algorithms. The performance of the proposed method is demonstrated by numerical results for widely spaced and closely spaced sources. Keywords: array signal processing, direction of arrival, MUSIC, MLM, MEM

INTRODUCTION

Wireless networks face ever-changing demands on their spectrum and infrastructure resources. Increased minutes of use, capacity-intensive data applications, and the steady growth of worldwide wireless subscribers mean carriers will have to find effective ways to accommodate increased wireless traffic in their networks. However, deploying new cell sites is not the most economical or efficient means of increasing capacity. Wireless carriers have begun to explore new ways to maximize the spectral efficiency of their networks and improve their return on investment. Smart antennas have emerged as one of the leading innovations for achieving highly efficient networks that maximize capacity and improve quality and coverage. Smart antennas provide greater capacity and performance benefits than standard antennas because they can be used to customize and fine-tune antenna coverage patterns to the changing traffic or radio frequency (RF) conditions.

Fig. 1: Smart Antenna System
A smart antenna system at the base station of a cellular mobile system is depicted in Fig. 1. It consists of a uniform linear antenna array for which the current amplitudes are adjusted by a set of complex weights using an adaptive beamforming algorithm. The adaptive beamforming algorithm optimizes the array output beam pattern such that maximum radiated power is produced in the directions of desired mobile users and deep nulls are generated in the directions of undesired signals representing co-channel interference from mobile users in adjacent cells. Prior to adaptive beamforming, the directions of users and interferers must be obtained using a direction-of-arrival estimation algorithm. The paper is organized as follows: Section II develops the theory of smart antenna systems; Section III describes the Maximum Likelihood Method; Section IV describes the Maximum Entropy Method; Section V describes the MUSIC method; Section VI presents performance results for the smart antenna direction-of-arrival algorithms. Finally, conclusions are given in Section VII.

PROBLEM FORMULATION
A) Signal Model

Fig. 1: Uniform Linear Array
Consider a uniform linear array geometry with L elements numbered 0, 1, ..., L - 1, with half-a-wavelength spacing between the elements. Because the array elements are closely spaced, we can assume that the signals received by the different elements are correlated. A propagating wave carries a baseband signal, s(t), that is received by each array element, but at a different time instant. It is assumed that the phase of the baseband signal, s(t), received at element 0 is zero. The phase of s(t) received at each of the other elements is measured with respect to the phase of the signal received at the 0th element. To measure the phase difference, it is necessary to measure the difference between the time the signal s(t) arrives at element 0 and the time it arrives at element k. The steering vector is such a measure, and for an array of L antenna elements it is given by

S(\theta) = [\,1,\; e^{j2\pi(d/\lambda)\sin\theta},\; \ldots,\; e^{j2\pi(d/\lambda)(L-1)\sin\theta}\,]^{T}    (1)

The combination of all possible steering vectors forms a matrix A known as the array manifold matrix. Hence, for M impinging sources, the received signal at the array can be expressed in terms of the steering vectors as

x_k(n) = \sum_{i=0}^{M-1} b_i(n)\, a(\theta_i)    (2)

B) Formation of Array Correlation Matrix

The spatial covariance matrix of the antenna array can be computed as follows. Assume that the signal b(n) and the noise n(n) are uncorrelated, and that n(n) is a vector of Gaussian white noise samples with zero mean. The spatial covariance matrix is then given by

R = E[x_n x_n^{H}] = E[(A b_n + n_n)(A b_n + n_n)^{H}]    (3)

The spatial covariance matrix is divided into signal and noise subspaces, and hence we obtain

R = A R_{ss} A^{H} + \sigma_n^{2} I    (4)
where R is the array correlation (spatial covariance) matrix, A is the array manifold matrix, A^{H} is the Hermitian transpose of A, R_{ss} is the source correlation matrix, and \sigma_n^{2} is the noise variance.
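As an illustration of Eqs. (1)-(4), the following minimal NumPy sketch builds ULA steering vectors and estimates the spatial covariance matrix from snapshots. The element spacing, source angles, snapshot count and noise level used here are illustrative assumptions made for the sketch, not values taken from the paper.

```python
import numpy as np

def steering_vector(theta_deg, L, d=0.5):
    """ULA steering vector of Eq. (1); d is the element spacing in wavelengths."""
    k = np.arange(L)
    return np.exp(1j * 2 * np.pi * d * k * np.sin(np.deg2rad(theta_deg)))

def sample_covariance(X):
    """Sample estimate of the spatial covariance R = E[x x^H] (Eq. 3) from L x N snapshots."""
    _, N = X.shape
    return X @ X.conj().T / N

# Illustrative scenario (assumed values): two sources at 10 and 50 degrees, 8-element ULA.
rng = np.random.default_rng(0)
L, N, angles = 8, 200, [10, 50]
A = np.column_stack([steering_vector(a, L) for a in angles])          # array manifold matrix
B = rng.standard_normal((len(angles), N)) + 1j * rng.standard_normal((len(angles), N))
noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N)))
X = A @ B + noise                                                     # snapshots x(n) = A b(n) + n(n)
R = sample_covariance(X)                                              # spatial covariance matrix
```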

Maximum Likelihood Method


The Capon AOA estimate is known as the Minimum Variance Distortionless Response (MVDR). It is, alternatively, a maximum likelihood estimate of the power arriving from one direction while all other sources are considered as interference. Thus the goal is to maximize the Signal to Interference Ratio (SIR) while passing the signal of interest undistorted in phase and amplitude. The source correlation matrix is assumed to be diagonal. This maximized SIR is accomplished with a set of array weights given by

w_{MVDR} = \frac{R_{xx}^{-1}\, a(\theta)}{a^{H}(\theta)\, R_{xx}^{-1}\, a(\theta)}    (5)

where R_{xx}^{-1} is the inverse of the un-weighted array correlation matrix and a(\theta) is the steering vector for an angle \theta. The MLM pseudo spectrum is given by


P_{MLM}(\theta) = \frac{1}{a^{H}(\theta)\, R_{xx}^{-1}\, a(\theta)}    (6)

where a^{H}(\theta) is the Hermitian transpose of the steering vector and R_{xx}^{-1} is the inverse of the autocorrelation matrix.
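A short NumPy sketch of Eqs. (5)-(6) follows, under the same half-wavelength ULA assumption as the covariance snippet above; the scan grid is an arbitrary choice. Peaks of the returned pseudo spectrum indicate candidate DOAs.

```python
import numpy as np

def steering_vector(theta_deg, L, d=0.5):
    k = np.arange(L)
    return np.exp(1j * 2 * np.pi * d * k * np.sin(np.deg2rad(theta_deg)))

def mlm_spectrum(R, scan_deg):
    """Capon/MVDR (MLM) pseudo spectrum of Eq. (6): P(theta) = 1 / (a^H R^-1 a)."""
    L = R.shape[0]
    R_inv = np.linalg.inv(R)
    spectrum = []
    for theta in scan_deg:
        a = steering_vector(theta, L)
        spectrum.append(1.0 / np.real(a.conj() @ R_inv @ a))
    return np.array(spectrum)

# Usage (with R estimated as in the covariance sketch above):
# scan = np.arange(-90, 90.5, 0.5); peaks of mlm_spectrum(R, scan) give the estimated DOAs.
```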

Maximum Entropy Method


This method [8] finds a power spectrum such that its Fourier transform equals the measured correlation subjected to the constraint that its entropy is maximized. The solution to this problem requires an infinite dimensional search. The problem has to be transformed to a finite dimensional search. One of the algorithms proposed by Lang and McClellan has power spectrum given by

Fig. 2: Proposed MUSIC Vector System
As shown in Fig. 2, the algorithm estimates the covariance matrix and then performs an eigenvalue decomposition to form the subspaces. The noise subspace is then normalized to obtain better resolution as compared to other DOA algorithms. One must know in advance the number of incoming signals, or one must search the eigenvalues to determine the number of incoming signals. If the number of signals is M, the number of signal eigenvalues and eigenvectors is M, and the number of noise eigenvalues and eigenvectors is L - M (L is the number of array elements). Because the Eigen Vector method exploits the noise eigenvector subspace, it is sometimes referred to as a subspace method. The eigenvalues and eigenvectors of the correlation matrix are found. The M eigenvectors associated with the signals and the L - M eigenvectors associated with the noise are separated. The eigenvectors associated with the smallest eigenvalues are chosen to calculate the power spectrum. For uncorrelated signals, the smallest eigenvalues are equal to the variance of the noise. The L x (L - M) dimensional subspace spanned by the noise eigenvectors is given by

P_{ME}(\theta) = \frac{1}{S^{H}(\theta)\, C\, C^{H}\, S(\theta)}    (7)

where C is a column of R^{-1} and S(\theta) is the steering vector. P_{ME}(\theta) is based on selecting one of the L array elements as a reference and attempting to find weights to be applied to the remaining L - 1 received signals so that their sum gives a minimum mean square error fit to the reference. Since there are L possible references, there are in general L different P_{ME}(\theta), obtained from the L possible column selections of R^{-1}.
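The following is a rough sketch of Eq. (7), reusing the steering-vector convention of the earlier snippets. The exact normalization and reference-column convention of the Lang-McClellan formulation may differ; this is only an assumed reading of the formula as printed.

```python
import numpy as np

def steering_vector(theta_deg, L, d=0.5):
    k = np.arange(L)
    return np.exp(1j * 2 * np.pi * d * k * np.sin(np.deg2rad(theta_deg)))

def mem_spectrum(R, scan_deg, ref=0):
    """Maximum Entropy pseudo spectrum of Eq. (7), using column `ref` of R^-1
    (one of the L possible column selections mentioned in the text)."""
    L = R.shape[0]
    c = np.linalg.inv(R)[:, ref]
    spectrum = []
    for theta in scan_deg:
        s = steering_vector(theta, L)
        spectrum.append(1.0 / np.abs(np.vdot(s, c)) ** 2)   # 1 / (S^H c c^H S)
    return np.array(spectrum)
```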

MUSIC Method or Eigen Vector Method


The MUSIC method promises to provide unbiased estimates of the number of signals, the angles of arrival, and the strengths of the waveforms. The MUSIC method makes the assumption that the noise in each channel is uncorrelated, making the noise correlation matrix diagonal. The incident signals may be correlated, creating a non-diagonal signal correlation matrix. However, under high signal correlation the traditional MUSIC algorithm breaks down and the Eigen Vector method must be implemented to correct this weakness.

E_N = [\,e_1\; e_2\; e_3\; \ldots\; e_{L-M}\,]

where e_i is the eigenvector associated with the i-th smallest eigenvalue. The noise-subspace eigenvectors are orthogonal to the array steering vectors at the angles of arrival. Because of this orthogonality condition, one can show that the Euclidean distance d^2 = a^{H}(\theta) E_N E_N^{H} a(\theta) goes to zero for each and every angle of arrival. Placing this distance expression in the denominator creates sharp peaks at

the angles of arrival. The MUSIC pseudo spectrum is given by

P_{MUSIC}(\theta) = \frac{1}{a^{H}(\theta)\, E_N E_N^{H}\, a(\theta)}    (8)

In the case where the sources are widely separated and a small number of antenna elements is used, as shown in Fig. 3, it is found that MEM, MNM and the MUSIC method all detect the directions of the sources, and all DOA algorithms produce good output. Case 2: Closely spaced sources with a small number of antenna elements.

where a(\theta) is the steering vector for an angle \theta, E_N is the matrix comprising the noise eigenvectors, and \lambda_i is the i-th eigenvalue.
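A minimal NumPy sketch of Eq. (8) follows, assuming the half-wavelength ULA steering-vector convention used in the earlier snippets; the covariance matrix R would typically be the sample estimate built from snapshots as shown above.

```python
import numpy as np

def steering_vector(theta_deg, L, d=0.5):
    k = np.arange(L)
    return np.exp(1j * 2 * np.pi * d * k * np.sin(np.deg2rad(theta_deg)))

def music_spectrum(R, M, scan_deg):
    """MUSIC pseudo spectrum of Eq. (8) for M sources and an L-element ULA."""
    L = R.shape[0]
    _, eigvecs = np.linalg.eigh(R)            # eigh returns eigenvalues in ascending order
    En = eigvecs[:, :L - M]                   # noise subspace: the L - M smallest eigenvalues
    spectrum = []
    for theta in scan_deg:
        a = steering_vector(theta, L)
        spectrum.append(1.0 / np.real(a.conj() @ En @ En.conj().T @ a))
    return np.array(spectrum)

# Usage: music_spectrum(R, M=2, scan_deg=np.arange(-90, 90.5, 0.5))
```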

Simulation Results
Here the DOA algorithms, namely MNM, MEM and the new MUSIC method, are simulated using MATLAB. Assumptions: 1. The distance between antenna elements is half a wavelength, to avoid grating lobes. 2. Signal and noise are uncorrelated. Case 1: Widely spaced sources with a small number of antenna elements.

Table 1: Input to MNM, MEM and MUSIC Method (widely spaced sources, fewer antenna elements)
Number of array elements: 8; Number of sources: 2; Directions of sources: [10°, 50°]; Amplitude of sources: [1, 2] V

Table 2: Input to Delay and Sum, MEM and MUSIC Method (closely spaced sources, fewer antenna elements)
Number of array elements: 8; Number of sources: 2; Directions of sources: [25°, 30°]; Amplitude of sources: [1, 1] V
Fig. 4: Comparison of MNM, MEM and MUSIC Method for closely spaced sources with fewer antenna elements
In the case where the sources are closely spaced and a small number of antenna elements is used, as shown in Fig. 4, it is found that MNM and MEM perform badly while the new MUSIC method yields better output. Case 3: Widely spaced sources with a larger number of antenna elements.

Fig3: Comparisons of MNM, MEM and MUSIC Method for widely spaced sources with less antenna elements

Table 3: Input to MNM, MEM and MUSIC Method (widely spaced sources, more antenna elements)
Number of array elements: 50; Number of sources: 2; Directions of sources: [20°, 60°]; Amplitude of sources: [1, 3] V

In the case where the sources are widely separated and a larger number of antenna elements is used, as shown in Fig. 5, it is found that the performance of the MNM, MEM and MUSIC methods is good.
In the case where the sources are closely spaced and a larger number of antenna elements is used, as shown in Fig. 6, it is found that the performance of the MNM, MEM and MUSIC methods is good because the EM wave strikes a larger number of antenna elements.

CONCLUSION
The Direction of Arrival (DOA) block of smart antenna systems based on classical and subspace methods is presented. The new Eigen Vector method is compared with the existing MEM and Delay & Sum methods. From the MATLAB simulation results the conclusions are: when the sources are widely spaced and a small number of antenna elements is used, the performance of the MNM, MEM and MUSIC methods is good. When the sources are closely spaced and a small number of antenna elements is used, the performance of MNM and MEM is worst and the Eigen Vector algorithm is best suited in this case. When the sources are widely spaced and a larger number of antenna elements is used, all algorithms perform well. When the sources are closely spaced and a larger number of antenna elements is used, the performance of MUSIC is improved.
REFERENCES
[61] R. M. Shubair and A. Al-Merri, "Robust algorithms for direction finding and adaptive beamforming: performance and optimization," Proc. of IEEE Int. Midwest Symp. Circuits & Systems (MWSCAS'04), Hiroshima, Japan, July 25-28, 2004, pp. 589-592.
[62] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, "Computationally efficient DOA estimation in a multipath environment using covariance differencing and iterative spatial smoothing," Proc. of IEEE Int. Symp. Circuits & Systems (ISCAS'05), Kobe, Japan, May 23-26, 2005, pp. 3805-3808.
[63] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, "Investigation of high-resolution DOA estimation algorithms for optimal performance of smart antenna systems," Proc. of 4th IEE Int. Conf. on 3G Mobile Communications (3G'03), London, UK, 25-27 June, 2003, pp. 460-464.
[64] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, "Performance evaluation of direction finding algorithms for adaptive antenna arrays," Proc. of 10th IEEE Int. Conf. Electronics, Circuits & Systems (ICECS'03), Sharjah, United Arab Emirates, 14-17 December, 2003, Vol. 2, pp. 735-738.
[65] R. M. Shubair and A. Al-Merri, "Convergence study of adaptive beamforming algorithms for spatial interference rejection," Proc. of Int. Symp. Antenna Technology & Applied Electromagnetics (ANTEM'05), Saint-Malo, France, June 15-17, 2005.
[66] R. M. Shubair and W. Al-Jessmi, "Performance analysis of SMI adaptive beamforming arrays for smart antenna systems," Proc. of IEEE Int. Symp. Antennas & Propagation (AP-S'05), Washington, D.C., USA, July 3-8, 2005, pp. 311-314.
[67] R. M. Shubair, A. Al-Merri, and W. Al-Jessmi, "Improved adaptive beamforming using a hybrid LMS/SMI approach," Proc. of IEEE Int. Conf. Wireless and Optical Communications Networks (WOCN'05), Dubai, UAE, March 6-8, 2005, pp. 603-606.

Fig. 5: Comparison of MNM, MEM and MUSIC Method for widely spaced sources with more antenna elements
Case 4: Closely spaced sources with a larger number of antenna elements.

Table 4: Input to MNM, MEM and MUSIC Method (closely spaced sources, more antenna elements)
Number of array elements: 70; Number of sources: 2; Directions of sources: [30°, 33°]; Amplitude of sources: [1, 2] V

Fig. 6: Comparison of MNM, MEM and MUSIC Method for closely spaced sources with more antenna elements


Quality Metrics for TTCN-3 and Mobile-Web Applications

Abstract: A Web-based application is essentially a client-server system, which combines traditional effort logic and functionality, usually server based. This paper has been designed to predict Web metrics for evaluating the efficiency and maintainability of hyperdocuments in terms of the Testing and Test Control Notation (TTCN-3) and mobile-wireless web applications. In the modern era of Information and Communication Technology (ICT), the Web and the Internet have brought significant changes in Information Technology (IT) and its related scenarios. The quality of a web application can be measured from two perspectives, the programmer's view and the user's view; here maintainability is perceived by the programmers, and efficiency is experienced by the end-user. Index Terms: Web-based effort estimation, Web-based design, Web metrics, E-commerce, web application.

effort measurement models for Web-based hypermedia applications based on the implementation phase of the development life cycle. For this work, we studied various size measures at different points in the development life cycle of Web-based systems to estimate effort, and these have been compared based on several predictions. The main objective of design metrics is to provide basic feedback on the design being measured. In this paper we introduce some new metrics for quality factors in terms of TTCN-3 and mobile-wireless web applications. Here we calculate only two quality factors, Maintainability and Efficiency.

QUALITY FACTORS
Many software quality factors have already been defined, and in this paper we define quality factors for web metrics for evaluating efficiency and maintainability. It is already well established that a website should be treated as a set of components. Our interest is to consider the nature of these components, and how they affect the web site's quality. We put a lot of emphasis on maintainability and efficiency in this paper, since for most of the life of a web site it is being actively maintained. Web sites differ from most software systems in a number of ways. They are changed and updated constantly after they are first developed. As a result of this, almost all of the effort involved in running a web site is maintenance. We will use the following criteria for estimating maintainability and efficiency (from ISO 9126):

INTRODUCTION

The diverse nature of web applications makes it difficult to measure them using existing quality measurement models. Web applications often use large numbers of reusable components, which makes traditional measurement models less relevant. Through a client Web browser, users are able to perform business operations and then to change the state of business data on the server. The range of Web-based applications varies enormously, from simple Web sites (static Web sites) that are essentially hypertext document presentation applications, to sophisticated high-volume e-commerce applications often involving supply, ordering, payment, tracking and delivery of goods or the provision of services (i.e. dynamic and active Web sites). We have focused on the implementation and comparison of

terms of analysability and changeability and for locating issues, an initial set of appropriate TTCN-3 metrics has been developed. To ensure that these metrics have a clear interpretation, their development was guided by the Goal Question Metric approach. First the goals to achieve were specified, e.g. Goal 1: Improve changeability of TTCN-3 source code, or Goal 2: Improve analysability of TTCN-3. Coupling metrics are used to answer the questions of Goal 1, and counting the number of references is used to answer the questions of Goal 2. The resulting set of metrics not only uses well-known metrics for general-purpose programming languages but also defines new TTCN-3-specific metrics. As a first step, some basic size metrics and one coupling metric are used:
A) Number of lines of TTCN-3 source code, including blank lines and comments, i.e. physical lines of code
B) Number of test cases
C) Number of functions
D) Number of altsteps
E) Number of port types
F) Number of component types
G) Number of data type definitions
H) Number of templates
I) Template coupling, which is computed over the sequence stmt of behaviour entities referencing templates in a test suite, where n is the number of statements in stmt and stmt(i) denotes the i-th statement in stmt.
Template coupling measures the dependence of test behaviour and test data in the form of TTCN-3 template definitions. On the basis of template coupling we have calculated the quality factors and the sub-factors of maintainability and efficiency. A. MAINTAINABILITY: Web-based software applications have a higher frequency of new releases, or update rate. Maintainability is a set of attributes that bear on the effort needed to make specified modifications (ISO 9126: 1991, 4.5). The ability to identify and fix a fault within a software component is what the maintainability characteristic addresses. Sub-characteristics of maintainability are analysability, changeability, stability and testability. a. Analysability: Analysability is measured as the attributes of the software that have a bearing on the effort needed for

A) Maintainability: i) Analysability, ii) Changeability, iii) Stability, iv) Testability
B) Efficiency: i) Time-based efficiency, ii) Resource-based efficiency
We now look at each of these criteria and their metrics in the following subsections, which cover TTCN-3-specific metrics and mobile web applications.
III. METRICS FOR TTCN-3

Testing and Test Control Notation (TTCN-3), has shown that this maintenance is a non-trivial task and its burden can be reduced by means of appropriate concepts and tool support. The test specification and test implementation language TTCN-3 has the look and feel of a typical general-purpose programming language, i.e. it is based on a textual syntax, referred to as the core notation.For assessing the overall quality of software, metrics can be used. Since this article treats quality characteristics such as maintainability of TTCN-3 test specifications, only internal product attributes are considered in the following. For assessing the quality of TTCN-3 test suites in

diagnosis and modification of deficiencies and causes of failures. For optimal analysability, most templates may be inline templates. - Metric 1.1: complexity violation := . a. Time-based efficiency: the time behaviour describes, for instance, processing times and throughput rates. b. Resource-based efficiency: resource behaviour means the amount of resources used and the duration of use. IV. METRICS FOR MOBILE-WIRELESS WEB APPLICATIONS. b. Changeability: For changeability we are interested in how easily the data, formatting and program logic in the website can be changed. For good changeability, a decoupling of test data and test behaviour might be advantageous. Mobile and wireless devices and networks enable "any place, any time" use of information systems, providing advantages such as productivity enhancement, flexibility, service improvements and information accuracy. This research develops a methodology to define and quantify the quality components of such systems. In this section we describe the metrics development process and present examples of metrics.
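To make the TTCN-3 size metrics listed in Section III concrete, the following is a rough, hedged Python sketch that approximates them by keyword counting over TTCN-3 source text. The regular expressions and metric names are illustrative assumptions made here; a real measurement tool would use a proper TTCN-3 parser and would not count keywords that occur inside comments or string literals.

```python
import re

# Regex-based approximation of the basic TTCN-3 size metrics (Section III).
SIZE_METRIC_PATTERNS = {
    "test_cases":      r"\btestcase\s+\w+",
    "functions":       r"\bfunction\s+\w+",
    "altsteps":        r"\baltstep\s+\w+",
    "port_types":      r"\btype\s+port\s+\w+",
    "component_types": r"\btype\s+component\s+\w+",
    "templates":       r"\btemplate\b",
}

def ttcn3_size_metrics(source: str) -> dict:
    """Count physical lines of code (including blanks/comments) and rough construct counts."""
    metrics = {"physical_loc": len(source.splitlines())}
    for name, pattern in SIZE_METRIC_PATTERNS.items():
        metrics[name] = len(re.findall(pattern, source))
    return metrics

# Usage: metrics = ttcn3_size_metrics(open("suite.ttcn3").read())
```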

c. Stability Stability is the tolerance of the application towards unexpected effects of modifications. This metric measures the number of all component variables and timers referenced by more than one function,testcase,or altstep and relates them to the overall number of component variables and timers. - metric : global variable and timer usage :=

d. Testability: Testability is the effort needed for validating modifications. There are only a few special considerations that should be made when measuring testability for a web site, since the site can be tested through a web browser, exactly like black-box testing. While most of these metrics mainly describe the overall quality of test suites (an example is the Template coupling metric), some of them can also be used to improve a test suite by identifying the location of individual issues. B. EFFICIENCY: Efficiency is a set of attributes that bear on the relationship between the level of performance of the software and the amount of resources used, under stated conditions (ISO 9126: 1991, 4.4). This characteristic is concerned with the system resources used when providing the required functionality. The amount of disk space, memory, network use, etc. provides a good indication of this characteristic. The sub-characteristics of efficiency are time behaviour and resource utilization.

In this section we have to measure only two quality attributes,Maintainability and Efficiency. A. Maintainability Increasing the quality of the development processes and products/program code in the areas of maintainability will help to lower the cost when adding a new target platform .Maintainability (ISO9126) includes the analyzability, changeability, stability, and testability sub-characteristics.This set of characteristics reflects mainly the technical stakeholders' viewpoint, such as the developers and

maintenance people. a. Use of standard protocol. B. Efficiency: Efficiency (ISO 9126) includes the time behaviour and resource utilization sub-characteristics. a. Time efficiency: The time behaviour sub-characteristic is very important in mobile-wireless applications because the price of each minute of data transfer is very high, and users will avoid expensive systems. i) Response time to get information from the server, ii) Response time to get information from the client. b. Resource efficiency: Mobile devices include small memory and low processing resources, so applications must be aware of these restrictions and optimize resource utilization. i) Size of the application in the mobile device, ii) Size of memory in the mobile device, iii) Device memory cleanup after completing the task, iv) Network throughput. Finally, the limited processing and network resources require efficient use of the available resources. V. CONCLUSION & FUTURE WORK: The paper has discussed the ISO 9126 norm with respect to the development of mobile web applications and TTCN-3. This paper introduced two subjects with respect to quality attributes (ISO 9126). First, for TTCN-3 it described the metrics for the quality factors (ISO 9126). Second, it covered mobile-wireless information systems, for which the quality factors (ISO 9126) are also measured. In this paper we have calculated only two quality factors, Maintainability and Efficiency, in terms of TTCN-3 and mobile-wireless web applications. The research can be expanded to calculate other quality factors such as Functionality, Reliability, Usability and Portability in terms of TTCN-3 and mobile-wireless web applications. REFERENCES
[1] Stefan, M., Xenos, M., "A model for assessing the quality of e-commerce systems", Proceedings of the PC-HCI 2001 Conference on Human Computer Interaction, 2001.
[2] Asunmaa, P., Inkinen, S., Nyknen, P., Pivrinta, S., Sormunen, T., & Suoknuuti, M. (2002), "Introduction to mobile internet technical architecture", Wireless Personal Communications, 22, 253-259.
[3] Boehm, B.W., Brown, J.R., Kaspar, J.R., Lipow, M. & Maccleod, G., Characteristics of Software Quality, Amsterdam: North Holland, 1978.
[4] Bache, R., Bazzana, G., Software Metrics for Product Assessment, McGraw-Hill, 1994.
[5] Calero, C., Ruiz, J., & Piattini, M. (2004), "A web metrics survey using WQM", Proceedings ICWE 2004, LNCS 3140, Springer-Verlag Heidelberg, 147-160.
[6] Coleman, D., Ash, D., Lowther, B., Oman, P., "Using Metrics to Evaluate Software System Maintainability", Computer, Vol. 27, No. 8, pp. 44-49.
[7] Ejiogu, L., Software Engineering with Formal Metrics, QED Publishing, 1991.
[8] Weinberg, G.M., The Psychology of Computer Programming, 1979.
[9] Hordijk, W., & Wieringa, R. (2005), "Surveying the factors that influence maintainability", Proceedings of the 10th European Software Engineering Conference held jointly with the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC-FSE'05), Lisbon, Portugal, 385-388.
[10] ISO 9000 (2000), Quality management systems - Requirements, Geneva, Switzerland: International Organization for Standardization.
[11] Eisenstein, J., Vanderdonckt, J., Puerta, A., "Applying Model-Based Techniques to the Development of UIs for Mobile Computers", Proceedings on Intelligent User Interfaces, Santa Fe, 2001.
[12] Offutt, J., "Quality Attributes of Web Software Applications", IEEE Software, March/April 2002, pp. 25-32.
[13] Satyanarayanan, M., "Fundamental Challenges in Mobile Computing", Symposium on Principles of Distributed Computing, 1996.
[14] McCall, J.A., Richards, P.K. & Walters, G.F., Factors in Software Quality, Vol. 1-2, AD/A-049-014/015/055, Springfield, VA: National Technical Information Service, 1977.
[15] McConnell, S., "Real Quality for Real Engineers", IEEE Software, March/April 2002, pp. 5-7.


A Unique Pattern Matching Algorithm Using The Prime Number Approach


Nishtha Kesswani, Bhawani Shankar Gurjar

Abstract: There are several pattern-matching algorithms available in the literature. The Boyer-Moore algorithm performs pattern matching from right to left and computes the bad-character and good-suffix shifts [8]. The Rabin-Karp pattern-matching algorithm performs pattern matching on the basis of the modulus of a number, but it suffers from the drawback that there may be spurious hits in the result. In this paper we suggest a unique prime-factor based approach to pattern matching. This algorithm gives better results in several cases as compared to contemporary pattern-matching algorithms.
Introduction
Pattern matching is the problem of finding a pattern in a text. This problem becomes even more interesting when we try to solve it in optimum time and with minimum complexity. There can be many ways of finding a pattern in a text. The basic approach is to compare each element of the pattern with the text. For example, for the following text T and pattern P we have

In this way we have a large number of comparisons. If the size of the text is n and the size of the pattern is m, then in the worst case the number of comparisons required can be described by the following equation: Number of comparisons = (n - m + 1)m = nm - m^2 + m. So, the primary emphasis of every optimum algorithm for pattern matching is to reduce these comparisons. As the size of the problem increases, the task becomes hard and time consuming. In this algorithm we try to make the minimum number of comparisons so that it can be an optimum pattern-finding algorithm.
The Proposed Algorithm
The main idea of this algorithm is a property of prime numbers: each natural number can be expressed uniquely as a product of primes. First, we convert the original problem into the form of numbers (we make a one-to-one mapping with the natural numbers). We compute a unique number from the pattern to be searched. The algorithm calculates the maximum prime factor of the pattern using

181

max_prime_factor and checks whether the text window is fully divisible by the maximum prime factor thus calculated. Only if it is so, the pattern is checked against the text; otherwise not.
Algorithm 1: Search a pattern P in the text T
Algorithm Prime_Pattern_Search(P, T)
1. Let P[1..m] be the pattern, T[1..n] be the text, and r be the radix.
2. P1 := Value(P, r, m)
3. d := Max_Prime_Factor(P1)
4. T1 := Value(T[1..m], r, m)
5. for i := 1 to n - m + 1
6.    if T1 mod d = 0 then
7.        compare P[1..m] with T[i..i+m-1]; if all positions match, print "pattern found at shift i"
8.    if i <= n - m then T1 := Value(T[i+1..i+m], r, m)   // value of the next text window
9. end for

Algorithm 2: Return the numeric value corresponding to a pattern/text window
Algorithm Value(P, r, m)
1. P0 := 0
2. for i := 1 to m
3.    P0 := r * P0 + P[i]
4. end for
5. return P0

Algorithm 3: Return the maximum prime factor of n
Algorithm Max_Prime_Factor(n)
1. if n <= 2 then return n
2. i := 2, p := 1
3. while n > 1
4.    if n mod i = 0 then
5.        p := i; n := n / i
6.    else
7.        i := i + 1
8. end while
9. return p
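A runnable Python sketch of the same idea is given below, written as an illustration of the algorithms above rather than as the authors' implementation. The radix of 256 and the rolling window update are conveniences assumed here (Python's arbitrary-precision integers avoid the overflow that would otherwise force a modulus, as in Rabin-Karp); candidate windows are verified character by character to discard spurious hits.

```python
def max_prime_factor(n: int) -> int:
    """Largest prime factor of n by trial division, as in Algorithm 3."""
    if n <= 3:
        return n
    largest, i = 1, 2
    while i * i <= n:
        if n % i == 0:
            largest, n = i, n // i
        else:
            i += 1
    return max(largest, n)              # whatever remains (> 1) is itself prime

def value(s: str, radix: int = 256) -> int:
    """Map a string to a natural number in the given radix (Algorithm 2)."""
    v = 0
    for ch in s:
        v = v * radix + ord(ch)
    return v

def prime_pattern_search(pattern: str, text: str, radix: int = 256):
    """Report start indices where the pattern occurs; divisibility by the pattern's
    largest prime factor is used as a cheap pre-filter (Algorithm 1)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    d = max_prime_factor(value(pattern, radix))
    high = radix ** (m - 1)
    t = value(text[:m], radix)          # numeric value of the current window
    matches = []
    for s in range(n - m + 1):
        if t % d == 0 and text[s:s + m] == pattern:
            matches.append(s)
        if s + m < n:                   # roll the window to the next shift
            t = (t - ord(text[s]) * high) * radix + ord(text[s + m])
    return matches

# Example: prime_pattern_search("aba", "abacabad") returns [0, 4]
```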

Experimental Results
The algorithm was tested on different patterns and texts, and it was found that, as compared to contemporary pattern-matching algorithms such as Rabin-Karp, this prime-number based approach is more efficient because it uses the maximum-prime-factor filter. Although the worst-case complexity of this algorithm is O(nm), the same as that of the Rabin-Karp algorithm, it may produce better results in some cases: when a window value is not divisible by the maximum prime factor, no character comparisons are needed at that shift, thus reducing the total number of comparisons. With this algorithm we can solve a problem using very few comparisons as compared to other pattern-matching algorithms such as the Rabin-Karp, Boyer-Moore and Knuth-Morris-Pratt algorithms, which may require a large number of comparisons; the prime-number based approach requires fewer comparisons when the maximum prime factor filters out most windows, and that is the best case for this algorithm. The worst case arises if Ti is divisible by the maximum prime factor each time.
Conclusions and Future work
Although this algorithm gives better results when the maximum-prime-factor test rejects most windows, it may also generate some spurious results. As a future enhancement, this algorithm can be generalized to remove the spurious results as far as possible. One of the main drawbacks of the algorithm is that it can only be used for those patterns and texts that can be expressed as numbers, as


this algorithm primarily uses the prime-number based approach.
References
[1] Christian Lovis, Robert H. Baud, "Fast Exact String Pattern-matching Algorithms Adapted to the Characteristics of the Medical Language", Journal of the American Medical Informatics Association, Jul-Aug 2000, v. 7(4), pp. 378-391.
[2] Aho, A.V., 1990, "Algorithms for finding patterns in strings", in Handbook of Theoretical Computer Science, Volume A, Algorithms and Complexity, J. van Leeuwen ed., Chapter 5, pp. 255-300, Elsevier, Amsterdam.
[3] Aoe, J.-I., 1994, Computer Algorithms: String Pattern Matching Strategies, IEEE Computer Society Press.
[4] Baase, S., Van Gelder, A., 1999, Computer Algorithms: Introduction to Design and Analysis, 3rd Edition, Chapter 11, pp. ??-??, Addison-Wesley Publishing Company.
[5] Baeza-Yates, R., Navarro, G., Ribeiro-Neto, B., 1999, "Indexing and Searching", in Modern Information Retrieval, Chapter 8, pp. 191-228, Addison-Wesley.
[6] Beauquier, D., Berstel, J., Chretienne, P., 1992, Éléments d'algorithmique, Chapter 10, pp. 337-377, Masson, Paris.
[7] Cole, R., 1994, "Tight bounds on the complexity of the Boyer-Moore pattern matching algorithm", SIAM Journal on Computing, 23(5):1075-1091.
[8] Cormen, T.H., Leiserson, C.E., Rivest, R.L., 2010, Introduction to Algorithms, Chapter 34, pp. 853-885, MIT Press.
[9] Crochemore, M., Hancart, C., 1999, "Pattern Matching in Strings", in Algorithms and Theory of Computation Handbook, M.J. Atallah ed., Chapter 11, pp. 11-1--11-28, CRC Press Inc., Boca Raton, FL.


Study and Implementation of Power Control in Ad hoc Networks

Abstract An ad hoc network facilitates communication between nodes without the existence of an established infrastructure. Random nodes are connected to one another using Ad hoc networking and routing among the nodes is done by forwarding packets from one to another which is decided dynamically. The transmission of packets among the nodes is done on a specified power level. Power control is the method used for transmission of the packets at an optimized power level so as to increase the traffic carrying capacity, reduce the usage of battery power and minimize the interference to improve the overall performance of the system with regards to the usage of power. This paper tells us regarding COMPOW (Common Power) and CLUSTERPOW (Cluster Power) protocols, which are two existing protocols for power control in homogeneous and non-homogeneous networks respectively. We have implemented these two protocols in Java Platform and run it for different number of nodes. From the implementation we have come up with the power optimal route among the nodes and the routing table for each node for both homogeneous and nonhomogeneous networks. COMPOW (Common Power) protocol is an asynchronous, distributed and adaptive algorithm for calculating the common optimized power for communication among different nodes. CLUSTERPOW (Cluster

Power) protocol is a protocol designed for optimizing the transmit power and establish efficient clustering and routing in nonhomogeneous networks. INTRODUCTION 1.1WIRELESS NETWORKS Any type of computer network which is wireless is known as a wireless network. It is commonly used in telecommunication network where wire is not the mode of connectivity among the nodes[5]. The various types of wireless networks are: 1. Wireless LAN (Local Area Network) 2. Wireless PAN (Personal Area Network) 3. Wireless MAN (Metropolitan Area Network) 4. Wireless WAN (Wide Area Network) 5. Mobile Devices Network Wireless networks are basically used for sending information quickly with greater reliability and efficiency. The usage of wireless networks range from overseas communication to daily communication among people through cellular phones .One of the most extensive use of wireless network is Internet connectivity among the countries. However, wireless networks are more prone to outside threat from malicious hackers and are hence vulnerable. 1.2 WIRELESS AD HOC NETWORK


Ad hoc networks are a new paradigm of wireless communication for mobile hosts (which we call nodes). In an ad hoc network, there is no fixed infrastructure such as base stations or mobile switching centers. Mobile nodes that are within each other's radio range communicate directly via wireless links, while those that are far apart rely on other nodes to relay messages as routers. Node mobility in an ad hoc network causes frequent changes of the network topology [1]. In a wireless ad hoc network, routing is done at each node by forwarding the data to other nodes, and the forwarding by nodes is decided dynamically based on the connectivity among the nodes. Being decentralized and easy to set up, wireless ad hoc networks find use in a variety of applications where a central node is not reliable. However, in most ad hoc networks the competition among the nodes results in interference, which can be reduced using various cooperative wireless communication techniques. 1.3 POWER CONTROL IN AD HOC NETWORKS Power control basically deals with the performance within the system. The intelligent selection of the transmit power level in a network is very important for good performance. Power control aims at maximizing the traffic carrying capacity, reducing the interference and latency, and increasing the battery life. Power control helps combat long-term fading effects and interference. When power control is administered, a transmitter will use the minimum transmit power level that is required to communicate with the desired receiver. This ensures that the necessary and sufficient transmit power is used to establish link closure. This minimizes interference caused by this transmission to others in the vicinity. This improves both bandwidth and

energy consumption. However, unlike in cellular networks, where base stations make centralized decisions about power control settings, in ad hoc networks power control needs to be managed in a distributed fashion [2]. Power control is a cross-layer design problem. In the physical layer it can enhance the quality of transmission. In the network layer it can increase the range of transmission and the number of simultaneous transmissions. In the transport layer it can reduce the magnitude of interference. 1.4 PROJECT DESCRIPTION The aim of this work is to find the lowest common power level for an ad hoc network at which the network is connected, for both homogeneous and non-homogeneous networks. For this there are two existing protocols [3][4]: the COMPOW protocol and the CLUSTERPOW protocol. These power control protocols find the lowest common power levels for homogeneous and non-homogeneous networks, respectively. We found the routing table for each node in the network and then found the optimized route using the Bellman-Ford algorithm with power as the metric. From the optimized route we show the connectivity among the nodes at different power levels and compare the connectivity and efficiency of transmission among the nodes at different power levels. POWER CONTROL 2.1 INTRODUCTION Power control is the intelligent selection of the lowest common power level in an ad hoc network at which the network remains connected. The power-optimal route for a sender-receiver pair is calculated and the power level used for this transmission is set


as the lowest power level for that particular transmission. In the case of multiple nodes, a power-optimal route for each transmission is calculated. The importance of power control arises from the fact that it has a major impact on the battery life and the traffic carrying capacity of the network. In the subsequent topics we discuss how power control affects various layers. 2.2 TRANSMIT POWER The transmit power level is the power level at which the transmission among the nodes takes place. Increasing the transmit power level has its own advantages. A higher transmit power level means a higher signal power at the receiver end, so the signal-to-noise ratio is significantly increased and the error in the link is thus reduced. When the signals in a network keep fading, it is advantageous to use a high transmit power so that the signals received at the receiver's end are not too weak. However, a high transmit power has quite a few disadvantages. The overall consumption of battery by the transmitting device will be high. Interference in the same frequency band increases drastically. Hence the need arises for an algorithm that can select an optimum transmit power level in a network.
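The project description above mentions computing power-optimal routes with the Bellman-Ford algorithm using power as the metric. The following is a hedged Python sketch of that idea; the distance-squared link-cost model and the node/link data layout are illustrative assumptions made here, not details taken from the paper or from the COMPOW/CLUSTERPOW implementations.

```python
import math

def link_power(a, b, alpha=2.0):
    """Illustrative link cost: transmit power needed for the hop, modelled here as
    distance**alpha (a free-space path-loss style assumption, not from the paper)."""
    return math.dist(a, b) ** alpha

def bellman_ford_power_routes(nodes, links, source):
    """Bellman-Ford shortest paths using per-hop transmit power as the metric.
    nodes: {node_id: (x, y)} positions; links: iterable of undirected (u, v) pairs."""
    cost = {v: math.inf for v in nodes}
    pred = {v: None for v in nodes}
    cost[source] = 0.0
    edges = [(u, v, link_power(nodes[u], nodes[v])) for u, v in links]
    edges += [(v, u, w) for (u, v, w) in list(edges)]     # treat every link as bidirectional
    for _ in range(len(nodes) - 1):                       # standard |V| - 1 relaxation rounds
        for u, v, w in edges:
            if cost[u] + w < cost[v]:
                cost[v] = cost[u] + w
                pred[v] = u
    return cost, pred          # following pred back to `source` gives the power-optimal route

# Usage sketch:
# pos = {0: (0, 0), 1: (10, 0), 2: (10, 10)}
# cost, pred = bellman_ford_power_routes(pos, [(0, 1), (1, 2), (0, 2)], source=0)
```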

COMPOW PROTOCOL 3.1 INTRODUCTION The COMPOW (Common Power) protocol provides an asynchronous, distributed and adaptive algorithm which finds the smallest power level at which the network remains connected. The protocol provides bidirectionality of links and connectivity of the network, asymptotically maximizes the traffic carrying capacity, provides power-aware routes and reduces MAC layer contention. [3] 3.2 CONNECTIVITY We generate nodes randomly on a surface area of S square metres and assign specific x and y coordinates to each randomly generated node. For each source-destination pair we check whether the distance between them is less than m (the range in metres of each node). Then we check for interference in transmission among the nodes. We take an interference parameter n (assumed to be


much less than the range m). We check that the distance between the nodes of two simultaneous transmissions is less than (1+n)*m. Suppose the rate at which the sender wants to send a data packet to the receiver is a bits per second. There is a reciprocal dependency between the rate a and the range m [3]. Hence we need to decrease the value of m. However, very low values of m may result in a disconnected network. So we need to choose a value of m at which the network remains connected, and this suffices for our aim of finding the lowest common power level at which the network remains connected. 3.3 ADVANTAGES The COMPOW protocol increases the traffic carrying capacity, reduces the battery consumption (i.e. increases the battery life), reduces the latency, reduces interference, guarantees bidirectional links, provides power-aware routes and can be used with any proactive routing protocol. [3] Another feature of the COMPOW protocol is its plug-and-play capability. It is among the very few protocols that have been implemented and tested in a real wireless test bed. [3]
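As an illustration of the connectivity criterion described in Section 3.2, here is a hedged Python sketch that picks the smallest discrete power level whose radio range keeps the whole network connected. The mapping from power level to range and the example power levels are assumptions made for the sketch; they are not specified in the paper or by the COMPOW protocol itself.

```python
import math
from collections import deque

def is_connected(positions, radio_range):
    """True if every node can reach every other node over hops of length <= radio_range."""
    ids = list(positions)
    if not ids:
        return True
    seen, queue = {ids[0]}, deque([ids[0]])
    while queue:                                   # breadth-first search over in-range links
        u = queue.popleft()
        for v in ids:
            if v not in seen and math.dist(positions[u], positions[v]) <= radio_range:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(ids)

def lowest_common_power(positions, power_levels, range_of):
    """COMPOW-style selection: the smallest discrete power level whose radio range keeps
    the network connected. `range_of` maps a power level to a range in metres; this
    mapping is hardware specific and assumed here, not given by the protocol."""
    for p in sorted(power_levels):
        if is_connected(positions, range_of(p)):
            return p
    return None

# Usage sketch (assumed power levels and range model):
# lowest_common_power(pos, [1, 5, 20, 100], range_of=lambda p: 30 * math.sqrt(p))
```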

3.4 LIMITATIONS When the nodes in a network are clustered the COMPOW protocol may settle for an unnecessarily high power level [3] .Even a single node outside the cluster may result in a high power level selection for the whole network. COMPOW protocol works only for homogeneous networks. 3.5 ENHANCEMENT we describe how to find the lowest common power level in a cluster using an existing protocol CLUSTERPOW that fills the loopholes of the COMPOW protocol.

CLUSTERPOW 4.1 INTRODUCTION CLUSTERPOW protocol provides us with implicit, adaptive, loop free and distributed clustering based on transmit power level. The routes discovered in this protocol consist of a non-increasing sequence of transmit power levels. CLUSTERPOW is an enhanced version of COMPOW as CLUSTERPOW is used in non-homogenous network whereas COMPOW is used where the network is homogenous. We can use CLUSTERPOW with both reactive and proactive routing protocol. It finds the lowest transmit power at which the network is connected.[4] 4.2 CONNECTIVITY We generate nodes randomly on a surface area S square meters and assign them with specific x and y coordinates for each node generated randomly. The next hop in CLUSTERPOW is found by consulting the lowest power routing table where the destination is reachable. As we go from the


source toward the destination the power level at every intermediate node is nonincreasing. That is, for every destination D, the entry (row) in the kernel routing table is copied from the lowest power routing table where D is reachable, i.e., has a finite metric. The kernel routing table has an additional field called the transmit power (txpower) for every entry, which indicates the power level to be used when routing packets to the next hop for that destination. [4] 4.3 ADVANTAGES

CLUSTERPOW recursive lookup scheme can be modified so that it is indeed free of infinite loops. This is done by tunneling the packet to its next hop using lower power levels, instead of sending the packet directly. One mechanism to achieve this is by using IP in IP encapsulation. Thus, while doing a recursive lookup for the next hop, we also recursively encapsulate the packet with the address of the node for which the recursive lookup is being done. The decapsulation is also done recursively when the packet reaches the corresponding destination [4]. CONCLUSION

The CLUSTERPOW protocol increases the network capacity, reduces the battery consumption (i.e. increases the battery life), reduces interference, and is loop free. It takes care of non-homogeneous networks. By taking into consideration the additional relaying burden of using small hops versus the interference caused by long hops, it can be shown that it is optimal to reduce the transmit power level, which improves the traffic-carrying capacity of the network [4]. 4.4 LIMITATIONS CLUSTERPOW does not take care of the consumption of energy in transmitting the packets in the network. While CLUSTERPOW takes care of the network capacity, the power consumed in processing while transmitting and receiving is typically higher than the radiative power required to actually transmit the packet [4]. 4.5 ENHANCEMENTS The limitations of the CLUSTERPOW protocol can be overcome using two existing protocols: Tunneled CLUSTERPOW and the MINPOW protocol. The MINPOW protocol reduces the energy consumption in sending packets over the network. In Tunneled CLUSTERPOW, the recursive lookup is made loop free by recursively encapsulating the packet, as described above.

In this paper we have implemented two existing protocols for power control in ad hoc networks: COMPOW and CLUSTERPOW. We have done the simulation in Java. Simulation of the protocols was done with a constant number of nodes, and power level was taken as the metric to compare performance. We have constructed the routing tables for the transmission of data among the nodes and calculated the minimum transmit power required for the transmission. The results of the simulation confirm that the COMPOW protocol is better for homogeneous networks and is not suitable for clusters, whereas CLUSTERPOW is better for non-homogeneous networks. So we can conclude that no single protocol supersedes the others; the performance of the protocols depends upon the different scenarios they are subjected to. REFERENCES
[1] Zhou Lidong and Haas Zygmunt J., Securing Ad Hoc Networks, IEEE Network Magazine, special issue on networking security, Vol. 13, No. 6, November/December (1999), pp. 24-30. [2] Agarwal Sharad, Krishnamurthy Srikanth V., Katz Randy H., Dao Son K., Distributed Power Control in Ad-hoc Wireless Networks, Proc. of IEEE International Symposium on Personal, Indoor and
Mobile Radio Communications, San Diego, CA, Vol. 2, (2001), pp. 59-66. [3] Narayanaswamy Swetha, Kawadia Vikas, Sreenivas R.S. and Kumar P.R., Power Control in Ad-hoc Networks: Theory, Architecture, Algorithm and Implementation of the COMPOW Protocol, Proc. of European Wireless Conference, (2002), pp. 156-162. [4] Kawadia Vikas and Kumar P.R., Power Control and Clustering in Ad Hoc Networks, Proc. of IEEE INFOCOM, (2003), pp. 459-469. [5] Goldsmith Andrea, Wireless Communications, California: Cambridge University Press, 2005. [6] Tanenbaum Andrew S., Computer Networks, New Jersey: Prentice Hall, 2002. [7] Bellman Richard, On a Routing Problem, Quarterly of Applied Mathematics, (1958), 16(1), pp. 87-90. [8] Toh C.K., Ad Hoc Mobile Wireless Networks, New Jersey: Prentice Hall, 2002. [9] Ankit Saha and Chirag Hota, Study and Implementation of Power Control in Ad hoc Networks, National Institute of Technology Rourkela.


Improving The Performance Of Web Log Mining By Using K-Mean Clustering With Neural Network
Vinita Shrivastava

Abstract The World Wide Web has evolved in less than two decades into the major source of data and information for all domains. The Web has become today not only an accessible and searchable information source but also one of the most important communication channels, almost a virtual society. Web mining is a challenging activity that aims to discover new, relevant and reliable information and knowledge by investigating the web structure, its content and its usage. Though the web mining process is similar to data mining, the techniques, algorithms and methodologies used to mine the web are not limited to those specific to data mining, mainly because the web has a great amount of unstructured data and its changes are frequent and rapid. In the present work, we propose a new technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model uses a multi-layered network architecture with a back propagation learning mechanism to discover and analyse useful knowledge from the available Web log data. Index Terms Clustering algorithms, data mining, unsupervised learning algorithms, online learning algorithms, neural networks, k-means clustering, web usage mining.

INTRODUCTION Web mining is the application of machine learning techniques to web-based data for the purpose of learning or extracting knowledge. Web mining encompasses a wide variety of techniques, including soft computing. Web mining methodologies can generally be classified into one of three distinct categories: web usage mining, web structure mining, and web content mining. In web usage mining the goal is to examine web page usage patterns in order to learn about a web system's users or the relationships between the documents. For example, one such tool creates association rules from

web access logs, which store the identity of pages accessed by users along with other information such as when the pages were accessed and by whom; these logs are the focus of the data mining effort, rather than the actual web pages themselves. Rules created by this method could include, for example, "70% of the users that visited page A also visited page B". Web usage mining is useful for providing personalized web services, an area of web mining research that has lately become active. It promises to help tailor web services, such as web search engines, to the preferences of each individual user. In the second category of web mining methodologies, web structure mining, we examine only the relationships between web documents by utilizing the information conveyed by each document's hyperlinks. Data mining is a set of techniques and tools applied to the non-trivial process of extracting and presenting implicit, previously unknown, useful and reliable knowledge from a large set of data, with the object of describing, in an automatic way, models and detecting tendencies and patterns [1,2]. Web mining is the set of data mining techniques applied to the Web [7]. Web usage mining is the process of applying these techniques to detect usage patterns of Web pages [3,5]. Web usage mining uses the data stored in the log files of the Web server as its primary resource; in these files the Web server registers the accesses to each resource on the server by the users [4,6].

NEURAL NETWORK An Artificial Neural Network (ANN) is an information-processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process [9]. 2.1 Architecture of neural networks 2.1.1 Feed-forward networks Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organization is also referred to as bottom-up or top-down. 2.1.2 Feedback networks Feedback networks can have signals traveling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organizations.

MINING WEB USAGE DATA In Web mining, data can be collected at the server-side, the client-side and proxy servers. The information provided by these data sources can be used to construct several data abstractions, namely users, page-views, click-streams, and server sessions. A user is defined as a single individual that is accessing files from web servers through a browser. In practice, it is very difficult to uniquely and repeatedly identify users. A page-view consists of every file that contributes to the display on a user's browser at one time and is usually associated with a single user action such as a mouse-click. A click-stream is a sequential series of page-view requests. A server session (or visit) is the click-stream for a single user for a particular Web site. The end of a server session is defined as the point when the user's browsing session at that site has ended [3, 10]. The process of Web usage mining can be divided into three phases: pre-processing, pattern discovery, and pattern analysis [3, 8]. Pre-processing consists of converting the usage information contained in the various available data sources into the data abstractions necessary for pattern discovery. Another task is the treatment of outliers, errors, and incomplete data that can easily occur due to reasons inherent to web browsing. The data recorded in server logs reflects the (possibly concurrent) access of a Web site by multiple users, and only the IP address, agent, and server-side clickstream are available to identify users and server sessions. The Web server can also store other kinds of usage information such as cookies, which are markers generated by the Web server for individual client browsers to automatically track the site visitors [3, 4]. After each user has been identified (through cookies, logins, or IP/agent analysis), the click-stream for each user must be divided into sessions. As we cannot know when the user has left the Web site, a timeout is often used as the default method of breaking a user's click-stream into sessions [2]. The next phase is the pattern discovery phase. Methods and algorithms used in this phase have been developed from several fields such as statistics, machine learning, and databases. This phase of Web usage mining has three main operations of interest: association (i.e. which pages tend to be accessed together), clustering (i.e. finding groups of users, transactions, pages, etc.), and sequential analysis (the order in which web pages tend to be accessed) [3, 5]. The first two are the focus of our ongoing work. Pattern analysis is the last phase in the overall process of Web usage mining. In this phase the motivation is to filter out uninteresting rules or patterns found in the previous phase. Visualization techniques are useful to help application domain experts analyze the discovered patterns.
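As a concrete illustration of the timeout-based sessionization step mentioned above, the following sketch (Java; the PageView and Sessionizer names are hypothetical, and a 30-minute timeout is only a common convention, not a value prescribed here) splits one user's time-ordered click-stream into sessions.

import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Splits one user's time-ordered click-stream into sessions using a simple timeout rule. */
class Sessionizer {
    /** One page-view request in the click-stream (illustrative type). */
    record PageView(String url, Instant timestamp) {}

    static List<List<PageView>> sessionize(List<PageView> clickStream, Duration timeout) {
        List<List<PageView>> sessions = new ArrayList<>();
        List<PageView> current = new ArrayList<>();
        Instant last = null;
        for (PageView pv : clickStream) {
            // A gap longer than the timeout closes the previous session and opens a new one.
            if (last != null && Duration.between(last, pv.timestamp()).compareTo(timeout) > 0) {
                sessions.add(current);
                current = new ArrayList<>();
            }
            current.add(pv);
            last = pv.timestamp();
        }
        if (!current.isEmpty()) sessions.add(current);
        return sessions;
    }
}

A call such as Sessionizer.sessionize(clicks, Duration.ofMinutes(30)) would then yield the server sessions used in the pattern discovery phase.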

CONVENTIONAL METHOD USED IN WEB MINING Clustering Clustering is the process of partitioning a set of data into a set of meaningful subclasses known as clusters. It helps users understand the natural grouping or structure in a data set. Clustering is an unsupervised learning technique whose aim is to find structure in a collection of unlabeled data. It is being used in many fields such as data mining, knowledge discovery, pattern recognition and classification [3]. A good clustering method will produce high
quality clusters, in which intra-class similarity is high and inter-class similarity is low. The quality of clustering depends both upon the similarity measure used by the method and its implementation, and it is also measured by its ability to discover hidden patterns. Generally speaking, clustering techniques can be divided into two categories: pairwise clustering and central clustering. The former, also called similarity-based clustering, groups similar data instances together based on a data-pairwise proximity measure. Examples of this category include graph partitioning-type methods. The latter, also called centroid-based or model-based clustering, represents each cluster by a model, i.e., its centroid. Central clustering algorithms [4] are often more efficient than similarity-based clustering algorithms. We choose centroid-based clustering over similarity-based clustering because with similarity-based algorithms we could not efficiently obtain a desired number of clusters, e.g., 100 as set by users. Similarity-based algorithms usually have a complexity of at least O(N^2) (for computing the data-pairwise proximity measures), where N is the number of data instances. In contrast, centroid-based algorithms are more scalable, with a complexity of O(NKM), where K is the number of clusters and M the number of batch iterations. In addition, all these centroid-based clustering techniques have an online version, which can be suitably used for adaptive attack detection in a data environment. K-Means Algorithm The K-means algorithm is one of a group of algorithms called partitioning clustering algorithms [4]. The most commonly used partitional clustering strategy is based on the square error criterion. The general objective is to obtain the partition that, for a fixed number of clusters, minimizes the total square error. Suppose that the given set of N samples in an n-dimensional space has somehow been partitioned into K clusters {C1, C2, C3, ..., CK}. Each cluster Ck has nk samples and each sample is in exactly one cluster, so that n1 + n2 + ... + nK = N. The mean vector Mk of cluster Ck is defined as the centroid of the cluster: Mk = (1/nk) * sum_{i=1..nk} x_ik, where x_ik is the ith sample belonging to cluster Ck. The square-error for cluster Ck is the sum of the squared Euclidean distances between each sample in Ck and its centroid; this error is also called the within-cluster variation [5]: e_k^2 = sum_{i=1..nk} ||x_ik - Mk||^2. The square-error for the entire clustering space containing K clusters is the sum of the within-cluster variations: E_K^2 = sum_{k=1..K} e_k^2. The basic steps of the K-means algorithm are: 1. Select an initial partition with K clusters containing randomly chosen samples, and compute the centroids of the clusters; 2. Generate a new partition by assigning each sample to the closest cluster centre; 3. Compute the new cluster centres as the centroids of the clusters; 4. Repeat steps 2 and 3 until the optimum value of the criterion function is found or until the cluster membership stabilizes. 4.3 Problem identification: problems with k-means In k-means, the free parameter is k and the results depend on the value of k. Unfortunately, there is no general theoretical solution for finding an optimal value of k for any given data set. K-means takes more time to process large data sets, it can only handle numerical data sets, and the results depend on the metric used to measure ||x - mi||.
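A compact sketch of the basic K-means procedure described by steps 1-4 above (Java; the array-based data layout, the iteration cap and the seeded random initialisation are choices made only for illustration, not part of the proposed model):

import java.util.Random;

/** Minimal K-means over n-dimensional points, following steps 1-4 above. */
class KMeans {
    static double[][] cluster(double[][] points, int k, int maxIter, long seed) {
        Random rnd = new Random(seed);
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int j = 0; j < k; j++)                        // step 1: random initial centroids
            centroids[j] = points[rnd.nextInt(points.length)].clone();
        int[] assign = new int[points.length];
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {      // step 2: assign each sample to the closest centre
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int j = 0; j < k; j++) {
                    double d2 = squaredDistance(points[i], centroids[j]);
                    if (d2 < bestDist) { bestDist = d2; best = j; }
                }
                if (assign[i] != best) { assign[i] = best; changed = true; }
            }
            double[][] sums = new double[k][dim];          // step 3: recompute centroids as cluster means
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assign[i]]++;
                for (int d = 0; d < dim; d++) sums[assign[i]][d] += points[i][d];
            }
            for (int j = 0; j < k; j++)
                if (counts[j] > 0)
                    for (int d = 0; d < dim; d++) centroids[j][d] = sums[j][d] / counts[j];
            if (!changed) break;                           // step 4: stop once cluster membership stabilises
        }
        return centroids;
    }

    static double squaredDistance(double[] a, double[] b) {
        double s = 0;
        for (int d = 0; d < a.length; d++) s += (a[d] - b[d]) * (a[d] - b[d]);
        return s;
    }
}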

PROPOSED APPROACH: In the present work, the role of the k-means algorithm is to reduce the computation intensity of the neural network by reducing the input set of samples to be learned. This is achieved by clustering the input dataset using the k-means algorithm, and then taking only discriminative samples from the resulting clustering schema to perform the learning process. The number of fixed clusters can be varied to specify the coverage repartition of the samples. The number of selected samples for each class is also a parameter of the selection algorithm. Then, for each class, we specify the number of samples to be selected according to the class size. When the clustering is achieved, samples are taken from the different obtained clusters according to their relative intra-class variance and their density. The two measurements are combined to compute a coverage factor for each cluster. The number of samples taken from a given cluster is proportional to the computed coverage factor. Let A be a given class, to which we want to apply the proposed approach to extract S samples. Let k be the number of clusters fixed to be used during the k-means clustering phase. For each generated cluster cli, (i: 1..k), the relative variance is
computed using an expression based on the within-cluster distances, where Card(X) gives the cardinality of a given set X, and dist(x, y) gives the distance between the two points x and y. The most commonly used distance measure is the Euclidean metric, which defines the distance between two points x = (p1, ..., pN) and y = (q1, ..., qN) from R^N as dist(x, y) = sqrt((p1 - q1)^2 + ... + (pN - qN)^2). The density value Den(cli) corresponding to the same cluster cli is computed analogously, and the coverage factor Cov(cli) is obtained by combining the two. We can clearly see that 0 <= Vr(cli) <= 1 and 0 <= Den(cli) <= 1 for any cluster cli, so the coverage factor Cov(cli) also belongs to the [0,1] interval. Hence, the number of samples selected from each cluster is determined using the expression Num_samples(cli) = Round(S * Cov(cli)). Let A be the input class; k the number of clusters; S the number of samples to be selected (S >= k); Sam(i) the resulting selected set of samples for cluster i; Out_sam the output set of samples selected from the class A; Candidates a temporary array that contains the cluster points and their respective distances from the centroid; i, j, min, x intermediate variables; and epsilon the neighbourhood parameter. The proposed selection algorithm is:
1- Cluster the class A using the k-means algorithm into k clusters.
2- For each cluster cli (i: 1..k) do {
   Sam(i) := {centroid(cli)}; j := 1;
   For each x from cli do { Candidates[j].point := x; Candidates[j].location := dist(x, centroid(cli)); j := j + 1; };
   Sort the array Candidates in descending order with respect to the values of the location field;
   j := 1;
   While ((card(Sam(i)) < Num_samples(cli)) and (j < card(cli))) do {
      min := 100000;
      For each x from Sam(i) do { if dist(Candidates[j].point, x) < min then min := dist(Candidates[j].point, x); }
      if (min > epsilon) then Sam(i) := Sam(i) U {Candidates[j].point};
      j := j + 1;
   }
   if card(Sam(i)) < Num_samples(cli) then repeat { Sam(i) := Sam(i) U {Candidates[random].point} } until (card(Sam(i)) = Num_samples(cli));
}
3- For i = 1 to k do Out_sam := Out_sam U Sam(i);
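The per-cluster selection step of the algorithm above can be sketched as follows (Java). The code assumes that Num_samples(cli) and the centroid have already been computed; unlike the pseudocode, which tops up with random candidates, this sketch tops up with the farthest unused candidates, and all names are illustrative rather than part of the proposed model.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Greedy, epsilon-separated selection of samples from one cluster (illustrative names). */
class ClusterSampler {
    /** points: members of cluster cli; centroid: its centre; numSamples: Round(S * Cov(cli)). */
    static List<double[]> select(List<double[]> points, double[] centroid, int numSamples, double epsilon) {
        List<double[]> selected = new ArrayList<>();
        selected.add(centroid);                                   // Sam(i) starts with the centroid
        List<double[]> candidates = new ArrayList<>(points);      // sorted farthest-from-centroid first
        candidates.sort(Comparator.comparingDouble((double[] p) -> dist(p, centroid)).reversed());
        for (double[] cand : candidates) {
            if (selected.size() >= numSamples) break;
            double min = Double.MAX_VALUE;                        // distance to the closest already-chosen sample
            for (double[] s : selected) min = Math.min(min, dist(cand, s));
            if (min > epsilon) selected.add(cand);                // keep only well-separated points
        }
        // Top up if the quota is not reached (the pseudocode uses random candidates here).
        for (int i = 0; selected.size() < numSamples && i < candidates.size(); i++) {
            double[] c = candidates.get(i);
            if (!selected.contains(c)) selected.add(c);           // contains() compares references here
        }
        return selected;
    }

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int d = 0; d < a.length; d++) s += (a[d] - b[d]) * (a[d] - b[d]);
        return Math.sqrt(s);
    }
}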

Conclusion In this work, we study the possible use of neural network learning capabilities to classify web traffic data. The discovery of useful knowledge, user information and server access patterns allows Web-based organizations to mine user access patterns, which helps in future development, maintenance planning and also in targeting more rigorous advertising campaigns aimed at groups of users. Previous studies have indicated that the size of a Website and its traffic often impose a serious constraint on the scalability of the methods. As the popularity of the web continues to increase, there is a growing need to develop tools and techniques that will help improve its overall usefulness.

REFERENCES: [1] W.J. Frawley, G. Piatetsky-Shapiro, and C.J. Matheus, Knowledge Discovery in Databases: An Overview, in Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W.J. Frawley, eds., Cambridge, Mass.: AAAI/MIT Press, pp. 1-27, 1991. [2] Mika Klemettinen, Heikki Mannila, Hannu Toivonen, A Data Mining Methodology and Its Application to Semi-automatic Knowledge Acquisition, DEXA Workshop 1997, pp. 670-677. [3] R. Kosala and H. Blockeel, Web Mining Research: A Survey, SIGKDD Explorations, vol. 2(1), July 2000. [4] J. Borges and M. Levene, An average linear time algorithm for web usage mining, 2000. [5] J. Srivastava, R. Cooley, M. Deshpande, P.-N. Tan, Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data, SIGKDD Explorations, vol. 1, Jan 2000. [6] P. Batista, M.J. Silva, Mining web access logs of an on-line newspaper, (2002), http://www.ectrl.itc.it/rpec/RPEC-apers/11batista.pdf.
[7] Cernuzzi, L., Molas, M.L. (2004). Integrando diferentes técnicas de Data Mining en procesos de Web Usage Mining. Universidad Católica "Nuestra Señora de la Asunción", Asunción, Paraguay. [8] R. Iváncsy, I. Vajk, Different Aspects of Web Log Mining, 6th International Symposium of Hungarian Researchers on Computational Intelligence, Budapest, Nov. 2005. [9] Chau, M., Chen, H., Incorporating Web Analysis Into Neural Networks: An Example in Hopfield Net Searching, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Volume 37, Issue 3, May 2007, pp. 352-358. [10] Raju, G.T., Satyanarayana, P.S., Knowledge Discovery from Web Usage Data: Extraction of Sequential Patterns through ART1 Neural Network Based Clustering Algorithm, International Conference on Computational Intelligence and Multimedia Applications, 2007, Volume 2, 13-15 Dec. 2007, pp. 88-92. [11] Jalali, Mehrdad, Mustapha, Norwati, Mamat, Ali, Sulaiman, Md. Nasir B., A new classification model for online predicting users' future movements, International Symposium on Information Technology (ITSim 2008), 26-28 Aug. 2008, Volume 4, pp. 1-7, Kuala Lumpur, Malaysia.


Higher Education Through Entrepreneurship Development In India


Mrs Vijay, Research Scholar, Manav Bharti University, Solan, H.P.; MBA (HR/Mktg), M.Phil (Management), Pursuing Ph.D (Management). vijaykharb18@gmail.com, 9253722808.
Abstract Higher education plays a very precious role nowadays. Higher education develops new skills, new ideas and new strategies. The Indian higher education system is one of the largest systems in the world. Entrepreneurship can play a role in shaping institutional development in higher education. Entrepreneurship engagement is a rapidly expanding and evolving aspect of higher education that requires proper support and development. Higher education is changing the face of the entire society and helps to develop the nation.

INTRODUCTION Higher education plays a very precious role nowadays. Higher education develops new skills, new ideas and new strategies. The Indian higher education system is one of the largest systems in the world. Entrepreneurship can play a role in shaping institutional development in higher education. Entrepreneurship engagement is a rapidly expanding and evolving aspect of higher education that requires proper support and development. Higher education is changing the face of the entire society and helps to develop the nation.

CHANGING EDUCATION SYSTEMS IN INDIA History The education system in India has a long history. The Gurukul system of education was popular in olden days. The Gurukul gave the highest education for human development: physical, mental and spiritual. At the Gurukuls, the teacher imparted knowledge of Religion, Scriptures, Philosophy, Literature, Statecraft, Medicine, Astrology etc. Education was free, but students gave a voluntary gurudakshina after completing their studies. Education was mostly for males; all the work was done by males while women took care of the children and the home. In olden days a few popular centres of education were Takshila (medicine), Ujjain (astronomy) and Nalanda, the biggest centre of knowledge (10,000 students). In the 18th century education spread under British influence to every temple, village and region of the country. The subjects taught included Reading, Writing, Arithmetic, Theology, Law, Astronomy, Metaphysics, Ethics, Medical Science and Religion. The schools were attended by students representative of all classes of society. In the 20th century the education system came under British rule; Gandhi is said to have described the traditional educational system as a beautiful tree that was destroyed during the British rule. After Independence, education became the responsibility of the states. The central Government's role has been to co-ordinate technical and higher education and specify standards.

YEAR | Field | Education system
1964 | Education Commission under the Chairmanship of Dr. D.S. Kothari | Scientific field
1976 | Ministry of Human Resource Development's Department of Education | Education policy planning
1986-1992 | Government | Education compulsory up to 14 years
1998 | PM A.B. Vajpayee | Setting up of Vidya Vahini Network to link up universities, UGC and CSIR
2000 | Government | 6% of the Gross Domestic Product (GDP) spent on primary education
2000 onwards | Government | More emphasis on higher education

Expenditure on Education in India - The Government expenditure on education has greatly increased since the First Five-Year Plan. Most of the Government of India's spending on elementary education goes towards the payment of teachers' salaries. The Government has also established Bal Bhavans, distance education and education for women (30% of the seats have been reserved for women).

Higher Education Entrepreneurships Entrepreneurship is the act of being an entrepreneur, which can be defined as "one who undertakes innovations, finance and business acumen in an effort to transform innovations into economic goods". It has assumed great importance for accelerating economic growth in both developed and developing countries. It promotes capital formation and creates wealth in the country. It is the hope and dream of millions of individuals around the world. It reduces unemployment and poverty. It is the process of searching out opportunities in the market place and arranging the resources required to exploit these opportunities for long term gains. It is the process of planning, organizing, controlling and assuming the risk of a business enterprise. It is a creative and innovative skill and an adaptive response to the environment. Higher education in India has evolved in distinct and divergent streams, with each stream monitored by an apex body, indirectly controlled by the Ministry of Human Resource Development. There are 18 important universities called Central Universities, which are maintained by the Union Government. The private sector is strong in Indian higher education. The National Law School, Bangalore is highly regarded, with some of its students being awarded Rhodes Scholarships to Oxford University, and the All India Institute of Medical Sciences is consistently rated the top medical school in the country. The Indian School of Business, Hyderabad and the Indian Institutes of Management (IIMs) are the top management institutes in India. The University Grants Commission Act 1956 explains: "The right of conferring or granting degrees shall be exercised only by a University established or incorporated by or under a Central Act, or a State Act, or an Institution deemed to be University or an institution specially empowered by an Act of the Parliament to confer or grant degrees. Thus, any institution which has not been created by an enactment of Parliament or a State Legislature or has not been granted the status of a Deemed to be University is not entitled to award a degree." Accreditation for higher learning is overseen by autonomous institutions (13) established by the University Grants Commission: A) All India Council for Technical Education (AICTE) B) Indian Council for Agriculture Research (ICAR) C) National Council for Teacher Education (NCTE) D) Pharmacy Council of India (PCI)
E) Indian Nursing Council (INC) F) Dentist Council of India (DCI) G) Central Council of Homeopathy (CCH) H) Central Council of Indian Medicine (CCIM). The entire society and human life have both changed considerably over the past few years. Information Technology has increased the pace of development. Higher education helps entrepreneurs to develop entrepreneurship, and nowadays there are many successful women entrepreneurs. A number of courses in India help to develop entrepreneurship. For example, the MBA is one of the best courses for developing entrepreneurs. An MBA is an indicator of a person's eligibility for working in managerial positions and dealing with a heavy work load. Moreover, the employer can be sure that an applicant for certain positions has a solid experience background, because the MBA was initially designed for already working professionals who want to expand their skills, knowledge, networking opportunities, ability to start their own business, confidence, innovation and creativity. One of the main criteria for applying for an MBA is several years of working experience, depending on the program. The MBA is very attractive as it helps in handling situations and is useful for developing the economy of our country. Objectives (developed through education): 1. Develop and strengthen entrepreneurial qualities. 2. Analyze the environment. 3. Learn entrepreneurial disciplines. 4. Understand overall procedures. 5. Provide large scale employment. 6. Achieve effective mobilization of capital and skill. 7. Promote balanced regional development. 8. Develop trade. 9. Promote capital formation. 10. Develop a passion for integrity and honesty.

Role of Entrepreneurship According to Jones and Butler (1992), corporate entrepreneurship is the process by which firms notice opportunities and act to creatively organize transactions between factors of production so as to create surplus value. Different variables affect the development of entrepreneurship: 1. Economic Factors - Capital, Labour, Raw Materials and Market 2. Social Factors - Family, Society and Socio-Cultural factors 3. Psychological Factors - Primary and Secondary needs 4. Political Factors - Government Rules and Regulations.

Conclusion: The best education is that which cultivates the human mind. REFERENCES: [1] Akhouri, M.M.P. and Vinod Gupta, Sustaining Entrepreneurship, NIESBUD, New Delhi, 1990. [2] Brimmer, A.F., The Setting of Entrepreneurship in India, Quarterly Journal of Management, 20(4), November 1955. [3] Dhar, P.N. and H.F. Lydall, The Role of Small Enterprises in India's Economic Development, Asia Publishing House, Bombay, 1961. [4] Khanka, S.S., Entrepreneurial Development, S. Chand & Company, New Delhi, 2006. [5] Kilby, Peter (Ed.), Entrepreneurship and Economic Development, The Free Press, New York, 1971. [6] Singh, N.P., The Role of Entrepreneurship in Economic Development, Vora & Co Publishers (Pvt.) Ltd., Bombay, 1966. [7] Shane, Scott, A General Theory of Entrepreneurship: The Individual-Opportunity Nexus, Edward Elgar.


Concepts, Techniques, Limitations & Applications Of Data Mining

Abstract Data mining is a powerful new technique to extract useful information from large and unorganized databases. It is the search for relationships and global patterns which exist in large databases but are hidden among the very large amount of data. It is concerned with the analysis of data and the use of software techniques for finding patterns and regularities in sets of data. Data mining enables us to understand present market trends and makes us able to adopt proactive measures to get maximum profit from them. In the present paper the concepts, techniques, limitations and applications of data mining in marketing are discussed and analysed. The paper demonstrates the ability of data mining to improve the decision making process in the marketing field. Keywords: Mass marketing, Artificial Neural Networks, Proactive, Predictive, Data mining Applications.

INTRODUCTION Data mining can be defined as the process of data selection and exploration and building models using vast data stores to uncover previously unknown patterns [1]. It aims to identify valid, novel, potentially useful and understandable correlations and patterns in data by combing through copious data sets to sniff out patterns that are too subtle or complex for humans to detect [2]. The existence of medical insurance fraud and abuse, for example, has led many healthcare insurers to attempt to reduce their losses by using data mining tools to help them find and track offenders [3]. Data mining can improve decision making by discovering patterns and trends in large amounts of complex data [4]. Presently most of the industries which sell products and services require advertising

and promotion of their products and services. Banks, insurance companies and retail stores are typical examples. Usually two types of techniques are used for advertisement and promotion: (i) mass marketing and (ii) direct marketing. Mass marketing uses mass media such as television, radio and newspapers to broadcast messages to the public without discrimination. It has been an effective way of promotion when products were in huge demand by the public. The second technique of promotion is direct marketing. In this technique, instead of promoting to customers indiscriminately, direct marketing studies customers' characteristics and requirements and chooses certain customers as the target for promotion. It is expected that the response rate for the selected customers can be much improved. At present a large amount of information on customers is available in databanks; hence data mining can be very useful for direct marketing, and it has been used widely in direct marketing to target customers [5, 6]. In the medical community some authors refer to data mining as the process of acquiring information, whereas others refer to data mining as the utilization of statistical techniques within the knowledge discovery process [7]. Terror-related activities can be detected on the web using data mining techniques [8]. Terrorist cells use the internet infrastructure to exchange news and recruit new members and supporters. It is believed that the detection of terrorists on the web might prevent further terrorist attacks. Keeping this in view, law enforcement agencies are trying to detect terrorists by monitoring all ISP traffic using data mining [8].

Process of Data Mining:

The various steps [9] in the data mining process to extract useful information are: (i) Problem definition: This phase is to understand the problem and the domain environment in which the problem occurs. We need to clearly define the problem before proceeding further. The problem definition specifies the limits within which the problem needs to be solved. It also specifies the cost limitations in solving the problem. (ii) Creation of a database for data mining: This phase is to create a database where the data to be mined are stored for knowledge acquisition. The creation of the data mining database consumes about 50% to 90% of the overall data mining process. A data warehouse is also a kind of data storage where a large amount of data is stored for data mining. (iii) Searching of the database: This phase is to select and examine important data sets of a data mining database in order to determine their feasibility to solve the problem. Searching the database is a time consuming process and requires a good user interface and a computer system with good processing speed. (iv) Creation of a data mining model: This phase is to select variables to act as predictors. New variables are also built depending upon the existing variables, along with defining the range of variables in order to support imprecise information. (v) Building a data mining model: This phase is to create various data mining models and to select the best of these models. Building a data mining model is an iterative process. The data mining model which we select can be a decision tree, an artificial neural network or an association rule model. (vi) Evaluation of the data mining model: This phase is to evaluate the accuracy of the selected data mining model. In data mining the evaluating parameter is data accuracy, in order to test the working of the model. This is because the information generated in the simulated environment varies from the external environment. (vii) Deployment of the data mining model: This phase is to deploy the built and evaluated data mining model in the external working environment. A monitoring system should monitor the working of the model and produce reports about its performance. The information in the reports helps to enhance the performance of the selected data mining model.

Data mining process models: We need to follow a systematic approach to data mining for meaningful retrieval of data from large data banks. Several process models have been proposed by various individuals and organizations that provide systematic phases for data mining. The three most popular process models of data mining are described below. The CRISP-DM process model: In this process model, CRISP-DM stands for Cross Industry Standard Process for Data Mining. The life cycle of the CRISP-DM process model consists of six phases: (i) Understanding the business: This phase is to understand the objectives and requirements of the
business problem and to generate a data mining definition for the business problem. (ii) Understanding the data: This phase is to analyze the data collected in the first phase, study its characteristics and matching patterns, and propose a hypothesis for solving the problem. (iii) Preparing the data: This phase is to create the final datasets that are input to the various modeling tools. The raw data items are first transformed and cleaned to generate datasets which are in the form of tables, records and fields. (iv) Modeling: This phase is to select and apply different modeling techniques of data mining. We input the datasets prepared in the previous phase to these modeling techniques and analyze the generated output. (v) Evaluation: This phase is to evaluate the model or set of models generated in the previous phase for better analysis of the refined data. (vi) Deployment: This phase is to organize and implement the knowledge gained from the evaluation phase in such a way that it is easy for the end users to comprehend. The 5As process model: This process model stands for Assess, Access, Analyze, Act and Automate. The 5As process model of data mining generally begins by first assessing the problem in hand. The next logical step is to access or accumulate data related to the problem. After that we analyze the accumulated data from different angles using various data mining techniques. We then extract meaningful information from the analyzed data and act on the result in solving the problem in hand. At last, we try to automate the process of data mining by building software that uses the various techniques employed in the 5As process model. The six sigma process model: Six sigma is a data driven process model that eliminates defects, waste and quality control problems that generally occur in a production environment. Six sigma is very popular in various American industries due to its easy implementation and is likely to be implemented world wide. This process model is based on various statistical techniques, the use of various types of data analysis techniques and the implementation of systematic training of all the employees of an organization. The six sigma process model postulates a sequence of five stages called DMAIC, which stands for Define, Measure, Analyse, Improve and Control. The life cycle of the six sigma process model consists of five phases: (i) Define: This phase is to define the goals of a project along with its limitations. (ii) Measure: This phase is to collect information about the current process in which the work is done and to try to identify the basis of the problem. (iii) Analyze: This phase is to identify the root causes of the problem in hand and confirm those root causes by using various data analysis tools. (iv) Improve: This phase is to implement solutions that address the root causes of the problem in hand. (v) Control: This phase is to monitor the outcomes of all the previous phases and suggest improvement measures for each of the earlier phases. Data Mining Techniques: The most commonly used techniques [10] in data mining are: (a) Artificial Neural Networks: non-linear predictive models that learn through training and resemble biological neural networks in structure. (b) Decision trees: the decision tree methods include Classification and Regression Trees (CART) and Chi-square Automatic Interaction Detection (CHAID). These resemble tree shaped structures.
(c) Genetic algorithms: optimization techniques that use processes such as genetic combination, mutation and natural selection in a design based on the concepts of evolution. (d) Nearest neighbour method: a technique that classifies each record in a data set based on a combination of the classes of the k record(s) most similar to it in a historical dataset. It is sometimes called the k-nearest neighbour technique. (e) Rule induction: the extraction of useful if-then rules from data based on statistical significance. How Data Mining is applied: The technique that is used in data mining is called modeling. Modeling is simply the act of building a model in one condition where one knows the answer and then applying it to another condition where one does not. The following two examples may help in understanding the use of data mining for building a model for new customers. Let us suppose that the marketing director has a lot of information about his prospective customers, i.e. their age, sex, credit history etc. His problem is that he does not know the long distance calling usage of these prospects (because they are most likely now customers of his competition). He would like to concentrate on those prospects that have large amounts of long distance usage. He can obtain this by building the model [8] shown in Table-1. Table-1: Data mining for prospecting.
The aim in prospecting is to make some calculated guesses about the information in the lower right hand quadrant, based on the model that he builds while going from customers' general information to customers' proprietary information. With this model in hand, new customers can be selected as targets. Another common example of building such models is shown in Table-2. Test marketing is an excellent source of data for this kind of modeling. By mining the results of a test market, which represents a broad but relatively small sample of prospects, one can provide a base for identifying good prospects in the overall market. Table-2: Data mining for predictions.

Data mining can be used in direct marketing to get higher profit compared to mass marketing. Suppose the whole database contains 300,000 customers. In direct marketing only the 20% of the customers identified as likely buyers by data mining (i.e. 60,000 customers) are chosen to receive the promotion package in the mail. The mailing cost is thus reduced dramatically. At the same time, the response rate can be improved from 1% in mass mailing to 3% (a real improvement for a 20% rollout). We see from Table-3 that the net profit from the promotion becomes positive in direct mailing, compared to a loss in mass mailing.

Table-3: A comparison between a direct mail campaign and a mass mail campaign.
Details | Mass mailing | Direct mailing (20%)
Number of customers mailed | 300,000 | 60,000
Cost of printing and mailing (Rs 40.00 each) | 12,000,000 | 2,400,000
Cost of data mining | Nil | 1,000,000
Total promotion cost | 12,000,000 | 3,400,000
Response rate | 1.0% | 3.0%
Number of sales | 3,000 | 1,800
Profit from sales (Rs 3,000 each) | 9,000,000 | 5,400,000
Net profit from promotion | -3,000,000 | 2,000,000
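As a quick check of the figures in Table-3, the following snippet recomputes the net profit of both campaigns from the stated assumptions (300,000 customers, Rs 40 mailing cost per customer, Rs 1,000,000 data mining cost, Rs 3,000 profit per sale, 1% and 3% response rates); the class name is only illustrative.

/** Recomputes the Table-3 figures for the two campaigns (all amounts in Rs). */
class CampaignProfit {
    public static void main(String[] args) {
        long customers = 300_000;

        // Mass mailing: everyone is mailed, 1% respond.
        long massCost = customers * 40;                          // 12,000,000
        long massProfit = Math.round(customers * 0.01) * 3000    // 3,000 sales x Rs 3,000
                          - massCost;                            // = -3,000,000

        // Direct mailing: 20% selected by data mining (Rs 1,000,000), 3% respond.
        long mailed = customers / 5;                             // 60,000
        long directCost = mailed * 40 + 1_000_000;               // 3,400,000
        long directProfit = Math.round(mailed * 0.03) * 3000     // 1,800 sales x Rs 3,000
                            - directCost;                        // = 2,000,000

        System.out.println("Mass mailing net profit:   " + massProfit);
        System.out.println("Direct mailing net profit: " + directProfit);
    }
}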

Discussions: We see from Table-3 that the net profit from the promotion becomes positive in direct mailing compared to a loss in mass mailing. Thus the paper demonstrates the ability of data mining to improve the decision making process in the marketing field. Data mining issues: Although data mining has developed into a conventional, mature, trusted and powerful technique, there are certain issues [10, 11] related to data mining which are discussed below in detail. One should note that these issues are not exclusive and are not ordered in any way. (i) Issues related to data mining methods: It is often desirable to have different data mining methods available because different approaches mine data differently depending upon the data in hand and the mining requirements [9]. The algorithms that we use
in data mining assume that the stored data is always noise free, and in most cases that is a forceful assumption. Most data sets contain exceptions and invalid or incomplete information, which complicates the data analysis method. The presence of noisy data reduces the accuracy of mining results, due to which data pre-processing, i.e. the cleaning of data and its transformation, becomes essential. Data cleaning is a time consuming process, but it is one of the most important phases in the knowledge discovery method. Data mining techniques should be able to handle noisy or incomplete data. (ii) Issues related to data sources: Data mining systems rely on databases to supply the raw data for input, and this raises serious issues because databases are dynamic, incomplete, noisy and large [10]. The current trend [9] is to collect as much data as possible and mine it later as and when required. The concern is about the quality and type of the large data being collected; a very clear understanding is required to collect the right data in the proper amount and to distinguish between useful and useless data. Nowadays databanks are of different types and store data with complex and diverse data types. It is very difficult to expect a data mining system to effectively and efficiently achieve good mining results on all kinds of data and sources. Different data types and data sources require specialized mining algorithms and techniques. (iii) Issues related to the user interface: The knowledge discovered by data mining tools is useful only as long as it is interesting and understandable to the end user. Good data visualization simplifies the interpretation of data mining results as well as helping end users to better understand their needs. The major problems related to user interfaces and visualization are screen real estate, information rendering and interaction. Interactivity with stored data and data mining results is essential because it provides a means for the end user to focus and refine mining tasks and to visualize the discovered knowledge from different angles and at different conceptual levels. (iv) Issues related to security and social matters: Security is an important problem with any type of data collection that is shared and/or is intended to be used for strategic decision making [9]. Moreover, when data is collected for customer profiling, user behavior understanding, correlating personal data with other information etc., large amounts of sensitive and private information about individuals or companies is gathered and stored. This becomes controversial given the confidential nature of some of this data and the potential for illegal access to the information. Moreover, data mining could disclose new implicit knowledge about individuals or groups that could be against privacy policies. (v) Issues related to performance: Artificial intelligence and statistical methods for data analysis and interpretation are generally not designed for mining large data sets. Data set sizes in terabytes are common nowadays. As a result, processing large data sets raises the problems of scalability and efficiency of data mining methods. It is not practical to use algorithms with exponential or even medium order polynomial complexity for data mining; linear algorithms are generally used for mining large data. Applications of data mining: Data mining finds its applications in many fields in our daily life. It is very useful for small, middle and large organizations that produce large amounts of data every day.
Almost all present day organizations use data mining in their work, and some of them use it in all phases of their work. Some of these potential applications are mentioned below: (i) Data mining in marketing: Data mining provides the marketing and sales executive with various decision support systems that help in consumer acquisition, consumer segmentation, consumer retention and cross selling. In this way it enables us to better interact with consumers, improve the level of consumer services that we provide and establish a long lasting consumer relationship. One can demonstrate that data mining is an effective tool for direct marketing which can give more profit to the retail industry than the traditional means of mass marketing. (ii) Data mining in healthcare: Healthcare organizations and pharmaceutical organizations produce huge amounts of data in their clerical and diagnostic activities. Data mining enables such organizations to use machine learning techniques to analyze healthcare and pharmaceutical data and retrieve information that might be useful for developing new drugs. When medical institutions use data mining on their existing data they can discover new, useful and potentially life saving knowledge that would otherwise remain inert in their databases. (iii) Data mining in banking: Bank authorities can study and analyze the credit patterns of their consumers and prevent bad credit or detect fraud in any kind of banking transaction using data mining. By data mining the bank authorities can find hidden correlations between different financial indicators and can identify stock
trading rules from historical market data. By data mining, bank authorities can also identify customers who are likely to change their credit card affiliation. (iv) Data mining in the insurance sector: Data mining can help insurance companies to predict which customers will buy new policies, and can also identify the behavior patterns of risky customers and fraudulent behavior. (v) Data mining in stocks and investment analysis and management: Data mining enables us to first study the specific patterns of growth or downslides of various companies and then intelligently invest in a company which shows the most stable growth for a specific period. (vi) Data mining in computer security analysis and management: Data mining enables network administrators and computer security experts to combine its analytical techniques with business knowledge to identify probable instances of fraud and abuse that compromise the security of a computer or network. (vii) Crime analysis and management: Data mining enables security agencies and police organizations to analyze the crime rate of a city or a locality by studying the past and current trends that led to the crime, prevents the reoccurrence of such incidents and enables the concerned authorities to take preventive measures. References: [1] A. Milley (2000), Healthcare and data mining, Health Management Technology, 21(8), 44-47. [2] D. Kreuze (2001), Debugging hospitals, Technology Review, 104(2), 32. [3] T. Christy (1997), Analytical tools help health firms fight fraud, Insurance & Technology, 22(3), 22-26.

[4] S. Biafore (1999), Predictive solutions bring more power to decision makers, Health Management Technology, 20(10), 12-14. [5] T. Terano, Y. Ishino (1996), Interactive knowledge discovery from marketing questionnaires using simulated breeding and inductive learning methods, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 279-282. [6] V. Ciesielski, G. Palstra (1996), Using a hybrid neural/expert system for database mining in market survey data, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 38-43. [7] A. Wilson, L. Thabane, A. Holbrook (2003), Application of data mining techniques in pharmacovigilance, British Journal of Clinical Pharmacology, 57(2), 127-134. [8] M. Ingram (2001), Internet privacy threatened following terrorist attacks on US, URL: http://www.wsws.org/articles/2001/Sep 2001/isps24shtm [9] Data Mining, BPB Publications, B-14, Connaught Place, New Delhi-1, pp. 15-16. [10] A.C. Swami and V. Jain, System Analysis & Design, College Book House Pvt. Ltd., Chaura Rasta, Jaipur, pp. 328-329. [11] G. Sharma, Data Mining, Data Warehousing and OLAP, S.K. Katariya & Sons, Ansari Road, Daryaganj, New Delhi, pp. 14-15.


ICT for Energy Efficiency, Conservation & reducing Carbon Emissions

Abstract Energy efficiency and conservation is a component that requires in-depth analysis and is as important as alternative energy resources and other socially relevant issues like climate change. This conforms to what other reports have found: energy efficiency and conservation is important. The focus of the effort is not only to understand why energy efficiency and conservation is important and why the emphasis provided to it is justified, but to explore beyond that and look for initiatives society must take to conserve energy. Our goal is to unlock the efficiency potential and look for methods to conserve energy in the future. In this report we first target small households and try to establish what an individual can do to conserve energy while using smart appliances, before moving on to the industrial sector. The focus of the report is also on how large industry setups can contribute their bit to the environment through energy efficiency and conservation, and on the untapped economic savings potential of doing so. In this effort, the report also talks about smart buildings, an initiative that has developed recently to cut down on carbon emissions as well as on resources, and hence to use energy efficiently. The research also talks about the kind of impetus the Government of India has provided in its trade and development policies to favor energy efficiency, the kind of pilot projects the Indian Government has carried out and what more needs to be done. The research talks about sustainable development and a smarter future, which deals with energy efficiency and lowering carbon emissions by inculcating the newer technologies that have started to surface around the world. The research is an endeavor towards a smarter planet with contributions from individuals and from society as a whole.

INTRODUCTION

ICT has made progressive inroads in our society. Whether it deals with providing education to rural India or implementing a safer government, ICT is one of the fastest growing industries in the world. In the years to come, ICT will play a bigger, better

and safer role in shaping our future. ICT shares equal responsibility in building a smarter future, improving the quality of life by reducing carbon emissions and greenhouse gas emissions, as well as providing day-to-day solutions for seamless services and information. ICT shares equal responsibility towards energy utilization. The sudden growth in attention to energy utilization and its conservation is not only because of the gradual depletion of resources but also because of factors like reliance on unstable regions of the earth for petrol and other crude resources, and imbalances amongst different countries on such factors. The fortnightly increases in crude oil prices are one such reason why energy must be efficiently used and reserved. Apart from the ethical responsibility that we share towards our planet and the generations to come (what we have utilized belongs to the future generations as well), all of us share equal responsibility for reducing carbon emissions and other GHG emissions. While discussing energy efficiency, not only is energy conservation important, but so is embracing other renewable sources to provide energy to individuals and organizations. Utilizing renewable resources is another step ICT has to fulfill. The scenario we face today is indeed depressing, as we possess the knowledge and technology to slash energy use while at the same time increasing the level of comfort. The basic barriers faced today by the ICT sector in energy efficiency are three in nature, namely behavioral, organizational and financial. These barriers can be dealt with by valuing energy, transforming behavior and government incentives and policies. Greater energy efficiency is an important component in comprehensive national and strategic policy making for energy resources and climate change in the future. Efficient energy utilization would definitely lead us to lower carbon emissions and GHG emissions, which would be an important landmark for the generations to come. Lowering carbon emissions is an important and necessary step in which ICT will help in the years to come. Monetization of carbon emissions would be
a welcome step to show individuals and organizations alike the amount of money that could be saved by efficient energy conservation. Energy efficiency offers a vast, low-cost energy and economic resource for the country, but only if smartly unlocked. Most of the power grid laid out in the country for distributing electricity is decades old, and answering the energy crisis on the basis of such a grid is not a feasible solution. A report by IIT K in March 1999 states that the Transmission and Distribution losses in India account for roughly 4-6% in transmission and 15-18% in distribution, while India still suffers a gap of over 35-40% between supply and demand. Due to the energy deficit, power cuts are common in India and have adversely affected the country's economic growth. According to Wikipedia, theft of electricity, common in most parts of urban India, amounts to 1.5% of India's GDP. Despite an ambitious rural electrification program, some 400 million Indians lose electricity access during blackouts. Statistics indicate that most of the deficit is due to T&D losses. It is possible to bring down the distribution losses to a 6-8% level in India with the help of newer technological options (including information technology) in the electrical power distribution sector, which will enable better monitoring and control. About 70% of the electricity consumed in India is generated by thermal power plants, 21% by hydroelectric power plants and 4% by nuclear power plants. Of the 70% produced by thermal power plants, 20% is lost in transmission and distribution. Apart from the losses, considerable carbon emissions result, which further hamper ecological stability. More than 50% of India's commercial energy demand is met through the country's vast coal reserves. The Indian Government has looked to increase production of electricity with nuclear power plants, which according to some experts is a dangerous liaison: using nuclear reactors to produce energy is waiting for a Bhopal or Chernobyl to happen once again, or provides a potential target for terrorists. With ever more products demanding more and more energy, looking for other viable sources of energy is an important task in front of the Indian Government. Wind energy and solar energy are everlasting potential sources of energy that need to be efficiently utilized, and ICT can play a huge role in embracing such renewable sources with the currently existing infrastructure. The country has also invested heavily in recent years in renewable energy utilization, especially wind energy. In 2010, India's installed wind generated electric capacity was 13,064 MW. Additionally, India has committed a massive amount of funds for the construction of various nuclear reactors which would generate at least 30,000 MW. In July 2009, India unveiled a $19 billion plan to produce 20,000 MW of solar power by 2020. Even though India was one of the pioneers in renewable energy resources in the early 1980s, its success has been dismal: only 1% of energy needs are met by renewable energy sources. Apart from electricity, petroleum products are a huge concern for India. Conditions have changed over the past few decades and energy resources can't be utilized the way they were in the past. Petroleum products need to be utilized efficiently and ICT can play a pivotal role in this development. Hybrid cars have started coming into the market but with limited success due to the high operational cost involved.
Apart from that, electric cars have surfaced in India as well, but with little uptake. It is important that the government rolls out effective policies and business plans for their efficient utilization. Energy efficiency would not only maximize the lifespan of resources but also help mitigate climate change and reduce the carbon footprint. Business and policy makers now realize that climate change is a global problem that needs immediate and sustained attention, and the ICT industry can enable a large part of the required reduction. ICT solutions include measures not just for the efficient utilization of electricity; these efforts would also translate into gross energy and fuel savings.

What kind of policies does India require?


A) Better public-private collaboration: Most energy utilities are government operated, with almost 97% of electricity produced by the government. It is important that private companies are given ample opportunities, which would facilitate better utilization and minimum wastage.
B) Government incentives and policy making: The government should provide incentives and subsidies to private players that have taken steps towards addressing climate change and cutting down on energy use.
C) Setting up projects and monitoring them: Use a comprehensive building commissioning plan throughout the life of a project.
D) Advocate for cleaner energy at local and national level: Most companies rely on coal to spread their networks around the country. Strict action at the management level is required against such companies.

Also, advise these companies on how a reduced carbon footprint would mean increased savings on fuel and other resources.
E) Create a long-term strategy that supports market-based solutions.
F) Lead by example and support pilot projects: Investment of this kind can only be facilitated by government. It is important that India promotes these kinds of ICT solutions and supports pilot projects undertaken at the local level.
G) Monitor production plants, such as wind farms, and put power in the hands of citizens.

Thrust areas where ICT solutions can be used for energy efficiency/conservation.
Smart Grid and Smart Meters:
India is trying to expand electric power generation capacity, as current generation is seriously below peak demand. Although about 80% of the population has access to electricity, power outages are common, and the unreliability of electricity supplies is severe enough to constitute a constraint on the country's overall economic development. The government has targeted capacity increases of 100,000 megawatts (MW) over the next ten years; as of January 2001, total installed Indian power generating capacity was 112,000 MW. The electric grid of the country is still not complete, although the government has started on the unification of the state electricity boards (SEBs). At a time when the Indian Government is still talking about completing the electric grid, an immediate nationwide implementation of a smart grid looks unrealistic. A smart grid, in simple terms, is an electrical network managed using information technology. The smart grid is envisioned to overlay the ordinary electrical grid with an information and net-metering system, which includes smart meters. Smart grids are being promoted by many governments as a way of addressing energy independence, global warming and energy conservation issues. A smart grid is not a new grid that changes the way power is distributed; it is merely an enhancement. It involves two-way communication between the user and the supplier and is responsible for automating processes in distribution systems. A smart grid is useful in that it helps shift the peak loads that generally result in erratic power supply and blackouts: excessive demand is moved to off-peak hours. For example, a washing machine would be able to run at the time when demand is lowest. Likewise, a fridge could be allowed to cool to temperatures lower than usual when supply is ample and to drift to a slightly higher temperature during low or erratic power supply. Modernization of this kind is necessary for energy consumption efficiency, real-time management of power flows, and the bi-directional metering needed to compensate local producers of power.
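As a toy illustration of the load-shifting idea just described (not a description of any deployed system), the following Python sketch defers a flexible appliance to the cheapest contiguous window of a hypothetical hourly price signal; the prices and the two-hour run time are assumed values.

# Hypothetical hourly price signal (Rs/kWh) published by a smart grid operator.
hourly_price = [6.0, 5.5, 5.0, 4.8, 4.9, 5.2, 6.5, 8.0, 9.5, 9.0, 8.5, 8.0,
                7.5, 7.8, 8.2, 8.8, 9.8, 10.5, 10.0, 9.2, 8.0, 7.0, 6.5, 6.2]

def cheapest_window(prices, run_hours):
    """Return the start hour of the cheapest contiguous window for a deferrable load."""
    costs = [sum(prices[h:h + run_hours]) for h in range(len(prices) - run_hours + 1)]
    return costs.index(min(costs))

# A washing machine needing 2 hours is deferred to the cheapest window.
start = cheapest_window(hourly_price, run_hours=2)
print(f"Schedule washing machine at {start:02d}:00")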

Smart Grid implementation would mean:


1. Reducing transmission and distribution losses.
2. Real-time usage, i.e. two-way communication in which the grid knows exactly what kind of appliance is being used.
3. Renewable energy: an area where wind or solar energy is available can rely on it rather than on thermal or other conventional sources. ICT-enabled solutions would ensure that most of the demand in such an area is met by renewable sources while other sources are concentrated elsewhere. Hence, the smart grid provides a formidable way to embrace renewable sources of energy.

Smart Meter:
Another smart technology enabled by the smart grid is the smart meter. Smart meters aim to provide remote meter reading, reducing physical trips for maintenance and meter data collection. Another major advantage of ICT-enabled solutions is lowering electricity demand by communicating real-time electricity usage and prices through the smart meter: for example, letting customers know on an hourly basis how much energy they consumed in the last hour and what the bill for that usage would be, or alerting customers whenever the energy used in the last hour crosses their average usage. Smart meters would also give providers knowledge of how consumers react to prices, and thus better information on how to structure and price electricity to optimize efficiency. A further advantage would be thermostat-enabled meters that allow demand to be reduced or time-shifted based on price triggers.

Gas and water meters can be implemented on the same platform and provide further opportunities for efficiency. Linking smart meters with weather data would also allow consumers to decide how much energy they need to use and how much they can conserve. Aside from reducing carbon dioxide emissions, employing ICT solutions in the grid can lead to other benefits such as:
1. Increasing national security by enabling decentralized operations and energy supply.
2. Providing a platform for electric vehicles.
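A minimal sketch of the hourly usage alert described above, assuming a meter that simply reports hourly kWh readings; the readings, the flat tariff and the alert rule are illustrative assumptions rather than a real metering API.

from statistics import mean

# Hypothetical hourly consumption readings (kWh) taken from a smart meter.
hourly_kwh = [0.8, 0.7, 0.6, 0.9, 1.1, 1.4, 2.0, 2.6]
tariff_rs_per_kwh = 6.5  # assumed flat tariff used for the bill estimate

latest = hourly_kwh[-1]
average = mean(hourly_kwh[:-1])

# Tell the customer what the last hour cost and flag above-average usage.
print(f"Last hour: {latest:.1f} kWh, approx Rs {latest * tariff_rs_per_kwh:.2f}")
if latest > average:
    print(f"Alert: usage exceeded your running average of {average:.1f} kWh")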

Road Transportation
Another area where ICT can help with energy conservation and reducing carbon emissions is road congestion. With easier access to car loans and better salaries, Indians have given up two-wheelers and shifted to four-wheelers for better safety and comfort. The $2,000 Tata Nano and other low-cost cars have fuelled this change too. Quoting from "Traffic congestion in Indian cities: Challenges of a rising power" by Azeem Uddin, the statistics say Indians are rushing headlong to get behind the wheel: Indians bought 1.5 million cars in 2007, more than double the number in 2003; the cumulative growth of the passenger vehicles segment during April 2007 to March 2008 was 12.17 percent; and in 2007-08 alone, 9.6 million motorized vehicles were sold in India. India is now the fastest-growing car market in the world. With the increasing number of cars on Indian roads came the problem of congestion. India has some of the worst congestion in the world, understandable from the fact that while other countries have a continuous traffic of cars, motorcycles and heavy motor vehicles, India's roads also carry auto-rickshaws, hand-pulled rickshaws, bullock carts and cycles. Unlike the western world, India does not follow lane driving and safe-driving practices: no braking distance and continuous honking are common scenes on Indian roads. To add to it, non-functioning traffic signals, deteriorating road surfaces and a lax attitude towards restoring bridges and broken roads contribute to the congestion as well. Traffic congestion is a serious problem in most Indian metros, and the scorching pace of economic growth and the growing incomes of India's burgeoning middle class are only likely to make the situation worse. Public transport systems are overloaded, and there is a limit to how much additional infrastructure such as roads and rail lines a city can add. Apart from congestion, road safety is another important issue. The accident rate is among the highest in the world: India has about 1% of the world's cars (some 4.5 million) yet over 100,000 people are killed in traffic accidents each year, amounting to 10% of the entire world's traffic fatalities, whereas the U.S., with more than 40% of the world's cars, records about 43,000 fatalities. Alleviating congestion is important not only because it would help the timely movement of traffic and reduce accidents, but also for the climate: most petroleum and diesel products are consumed in this sector, whether for personal movement or commercial logistics. Removing congestion would reduce the traffic snarls and jams urban India faces every day and would translate into fuel savings and support climate control.

How can ICT solutions conserve energy and lower carbon emissions?
Since energy conservation leads to lower carbon emissions, any setup meant for energy conservation automatically translates into lower emissions and favorable climate change. Let us look at a few schemes that ICT can bring about.
1. Help individuals make better plans: Continuous updates on traffic movement and which roads to avoid are needed on the fly. Let citizens know, before they plan a trip, how much time they can save by taking an alternative route.
2. Improve the journey experience: The best way to tackle congestion is to keep the commuter informed. Set up LCD panels at major traffic junctions to inform motorists about the condition of the road ahead, and provide average traffic speed and estimated time of arrival at a particular location. This involves routing based on real-time information to avoid congestion.
3. Traffic lights and roads fitted with sensors: Rather than traffic lights being time-bound, let them be volume-bound, i.e. install sensors on traffic lights and roads that can judge the movement of traffic on every major road. If the traffic from a particular side is heavier than from the others, give that side a proportionally longer green phase (a small sketch of this idea appears at the end of this subsection).
4. Increase vehicle performance: Provide mobile apps that give drivers feedback on miles per gallon, and install chips in the engine that help a driver keep track of the vehicle's engine condition.

Such a mobile application can link up with revenue services as well, since it would let the customer make an informed decision about the quality of fuel he gets for his vehicle; the same application can inform the customer about services that are due for the vehicle and the service centres available nearby. The reduction potential of eco-driving is also large because its impact is still relatively new; it also extends to off-the-shelf devices that connect to a car's on-board computer to give drivers information about fuel use.
5. Social networking and other collaboration tools to ease car-pooling and car sharing: The government should take the onus of building websites that help in pooling cars. Private organizations should also take part and make sure that people coming from the same location and headed to the same destination can use the same vehicle. Incentives, such as a small gift voucher from petroleum companies, should be provided to the driver or vehicle owner.
6. Smart parking/reserve a spot: Another ICT solution involves reserving a parking spot for a particular time slot on the curb or in the marketplace; alternatively, GPS or satellite data can help track the next vacant parking spot. These solutions reduce circling time to a minimum and hence conserve fuel.
7. Commercial logistics: Logistics operators could use less carbon-intensive modes of transport and the same eco-driving measures as personal drivers. Periodic feedback to drivers on their driving habits could also push them towards optimization.
8. Premium toll rates using RFID: Levy a core-area charge to reduce traffic congestion in crowded business districts. For instance, in Delhi, every time you drive into Connaught Circus an RFID chip would log your entry and you would be charged for entering a core area.
Despite the benefits that ICT can provide in improving personal and commercial transport, it has to tackle challenges like lack of infrastructure, simplifying the user experience, raising awareness of car-pooling and public transport, improving road structure, and laying down sensors for the job.
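To illustrate the volume-bound signal idea from point 3 above, here is a small sketch; the sensor counts, cycle length and timing bounds are assumed values, not a field-tested controller.

# Hypothetical vehicle counts reported by road sensors on each approach.
approach_counts = {"north": 42, "south": 18, "east": 65, "west": 25}

CYCLE_SECONDS = 120          # total green time to share per signal cycle
MIN_GREEN, MAX_GREEN = 10, 60

def green_times(counts):
    """Split the cycle roughly in proportion to sensed volume, within safety bounds."""
    total = sum(counts.values())
    times = {}
    for approach, n in counts.items():
        share = CYCLE_SECONDS * n / total
        times[approach] = int(min(MAX_GREEN, max(MIN_GREEN, share)))
    return times

print(green_times(approach_counts))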

Smart Buildings:
The percentage of urban population in India increased from 18.0 in 1961 to 27.8 in 2001. Energy consumption rose threefold, from 4.16 to 12.8 quadrillion Btu between 1980 and 2001, putting India next only to the US, Germany, Japan and China in total energy consumption. According to the International Energy Outlook projections for 2030 of the US Department of Energy, China and India account for nearly one-half of the total increase in residential energy use in non-OECD countries. With increasing activity in the urban real estate and building sectors, urban buildings will soon become big polluters. The time to take initiatives in this direction is now, through popularizing what are called 'intelligent' and 'green' buildings. Smart buildings are intelligent buildings that incorporate information and communication technology and mechanical systems, making them more cost-effective, comfortable, secure and productive. An intelligent building is one equipped with a robust telecommunications infrastructure, allowing for more efficient use of resources and increasing the comfort and security of its occupants. It provides these benefits through automated control systems such as heating, ventilation and air conditioning (HVAC); fire safety; security; energy/lighting management; and other building management systems. For example, in the case of a fire, the fire alarm communicates with the security system to unlock the doors, and with the HVAC system to regulate airflow and prevent the fire from spreading.
The objective of a smart building is to optimize energy consumption and hence lower carbon emissions. Smart buildings require both a firm groundwork, i.e. the design, and the embedded technology: while the design sets the initial energy consumption of the building, technology optimizes the energy use of building operations. Smart buildings call for proper use of sunlight and ventilation and less reliance on the heating, ventilation and air-conditioning (HVAC) system. ICT can also provide software tools aimed at choosing materials, modeling lighting, assessing air flows and sizing HVAC systems. Apart from the energy efficiency it provides, a smart building has several additional benefits:
1. Higher quality of life.
2. Better air and access to sunlight while working.
3. Sites that generally feature reduced water consumption and environment-friendly locations.
Of course, the cost involved in setting up a smart building is high, but the savings tend to make the technology worthwhile. Let us look at a few ICT solutions that exist today for smart buildings. Smart appliances have been set up in smart buildings; these appliances are in communication with the smart grid and can switch on or off depending on energy supply and demand, and hence manage to shift load to off-peak hours.

Safety: Smart buildings bring with them impressive safety systems. In case of a fire, the alarm system will not only trip the water sprinklers and alert the police booths but also regulate the HVAC system to help control the fire.
Smart thermostat: The smart thermostat lets occupants set the temperature they require in their room and also works on real-time information such as weather alerts for climate control.
Occupancy sensors: These automatically switch off the lighting or air-conditioning system whenever a room is empty and switch it on as soon as a door or window is opened.
Intensity sensors: These gauge the amount of sunlight coming from outside and set the lighting system accordingly; the same principle applies to HVAC as well.
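A toy sketch of the occupancy- and intensity-sensor control described above; the sensor readings and thresholds are hypothetical and no real building-automation API is implied.

def lighting_command(occupied: bool, daylight_lux: float) -> str:
    """Decide the lighting level from an occupancy sensor and a light-intensity sensor."""
    if not occupied:
        return "off"          # empty room: switch the lights off
    if daylight_lux >= 500:
        return "off"          # enough sunlight, rely on daylight
    if daylight_lux >= 200:
        return "dim"          # partial daylight, dim the artificial light
    return "full"

# Example sensor readings for three rooms.
readings = {"cabin-1": (True, 650), "cabin-2": (True, 120), "meeting": (False, 40)}
for room, (occ, lux) in readings.items():
    print(room, "->", lighting_command(occ, lux))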

What is required?
Need for certifications
Worldwide, green buildings are certified through an independent body, the US Green Building Council (USGBC), through its LEED (Leadership in Energy and Environmental Design) certification programme. In India, the Indian Green Building Council (IGBC), set up by the Confederation of Indian Industry (CII), certifies smart buildings; it comprises all stakeholders in the green building value chain. But there are only 135 certified green buildings in the country as yet. A few achievements by India in the field of smart buildings include: the India Habitat Centre in New Delhi, whose exterior is so designed that it is cleaned every time it rains; despite its location at the intersection of two major roads with heavy traffic, the building is free of disturbance and protected against tropical sunshine due to its unique design, and the use of shaded canopies over large paved courts reduces the energy load on air conditioning and produces an improved climate for its occupants. The Confederation of Indian Industry Sohrabji Godrej Green Business Centre (CII-Godrej GBC) was the first structure in India to receive the prestigious 'platinum' rating from the USGBC. The Engineering Design and Research Centre (EDRC) of Larsen & Toubro's Engineering Construction and Contracts Division in Chennai is another such building: it has fully automated energy management, life-safety and telecommunication systems and is possibly the first building in India without any light switch; all cabins are equipped with infra-red detectors to sense occupancy, and entry is only through smart cards with built-in antennas. The Wipro Technologies Development Centre (WTDC) in Gurgaon is the largest platinum-rated green building in Asia felicitated by the USGBC. [Source: Hindu Business Line, "Urban Buildings: Green and smart, the way ahead".]
Even though the advantages and the reduction potential smart buildings provide are enormous, they face a number of challenges such as limited interoperability, limited deployment of the smart grid, high up-front cost and a shortage of expertise. Smart buildings require constant support from the government and increased incentives for organizations that implement smart buildings rather than traditional ones. The government also needs to address the shortage of expertise and support investments from both public and private partnerships. It also needs to commission new high-performing buildings and retrofit existing ones at all levels of government. This will contribute to a better understanding of building costs and expected energy efficiency, as well as increase the knowledge base for building professionals.

Conclusion
ICT provides a tremendous opportunity to use energy efficiently and reduce carbon emissions. Beyond these benefits, it would help put India on a path to energy independence, but this is a task that requires collective effort from organizations, citizens and the government alike. The adoption of energy efficiency policies would increase the productivity of employees and the overall quality of life. The reduction of energy use and carbon emissions is an opportunity for humanity to enable change. The IGBC's vision stands true: "To usher in a green building movement and facilitate India emerging as one of the world leaders in green buildings by 2015", which would help India unlock the potential of other sources of energy as well as optimize the use of present resources. The importance of monetizing carbon emissions, and the need for ubiquitous broadband throughout the country to implement a smart grid, cannot be overstated.

The search for alternative fuels like biodiesel, ethanol, hybrids, green batteries, flex-fuel vehicles and conversions, synfuels and solar-assisted fuel, and ways to implement them at a lower cost, is also required. The hybrid cars that have surfaced are expensive, and considerable research is required to lower their prices further. Development of infrastructure and of government policies and incentives would be the front-runners in implementing smart and green technologies. The government also needs to keep the private sector involved so as to make green-enabled IT commercially viable. Reducing carbon emissions and efficiently utilizing energy would not only increase savings and benefit the economy but also create green-collar jobs, an avenue of employment for talented and highly skilled Indian youth. Energy efficiency offers a vast, low-cost energy resource for the Indian economy, but only if the nation can craft a comprehensive and innovative approach to unlock it. Significant and persistent barriers will need to be addressed to stimulate demand for energy efficiency and manage its delivery across the nation.

REFERENCES:
1. Unlocking Energy Efficiency in the US Economy, McKinsey and Company, July 2009.
2. Energy Efficiency in Buildings: Business Realities and Opportunities.
3. World Business Council for Sustainable Development.
4. Energy Efficiency Best Practices, Foundation for Community Association Research.
5. Indian Green Building Council, www.igbc.in/site/igbc/index.jsp
6. Urban Buildings: Green and Smart, the Way to Go, www.hindubusinessline.in
7. Smart Buildings: Make Them Tech Smart, http://voicendata.ciol.com/
8. Traffic Congestion in Indian Cities: Challenges of a Rising Power, Kyoto of the Cities, Naples.
9. Smart grid, Wikipedia, en.wikipedia.org/wiki
10. Global Energy Network Institute, geni.org
11. Smart 2020, United States Report Addendum, GeSI, 2008.
12. Automation in Power Distribution, Vol. 2 No. 2, iitk.ac.in
13. Electricity Sector in India, Wikipedia.
14. Electrical Distribution in India: An Overview, Forum of Regulators.
15. Optimize Energy Use, WBDG Sustainable Development, 12-21-2010.


Study of Ant Colony Optimization For Proficient Routing In Solid Waste Management

Abstract. Routing in solid waste management is one of the key areas where 70% to 85% of the total system cost is spent on just the collection of the waste. During transportation, many collection points may be missed, and the path followed by the driver may be longer than the optimal path. The study presented in this paper intends to find the optimal route for collecting solid waste in cities. Ant colony optimization is a meta-heuristic technique inspired by the behaviour of real ants that helps in finding optimal solutions to such problems. The proposed system applies ant colony optimization to the solid waste management routing problem.
Key-Words: Solid waste management, Ant colony optimization (ACO), Routing, Waste collection.

INTRODUCTION

The collection of municipal solid waste is one of the most difficult operational problems faced by local authorities in any city. In recent years, due to a number of cost, health, and environmental concerns, many municipalities, particularly in industrialized nations, have been forced to assess their solid waste management and examine its cost effectiveness and environmental impacts, in terms of designing collection routes. During the past 15 years, there have been numerous technological advances, new developments and mergers and acquisitions in the waste industry. The result is that both private and municipal haulers are giving serious consideration to new technologies such as computerized vehicle solutions [1]. It has been estimated that, of the total amount of money spent on the collection, transportation, and disposal of solid waste, approximately 60-80% is spent on the collection phase [2, 4].

Therefore, even a small improvement in the collection operation can result in a significant saving in the overall cost. The present study is mainly focused on the collection and transport of solid waste from any loading spot in the area under study. The routing optimization problem in waste management has already been explored with a number of algorithms. Routing algorithms use a standard of measurement called a metric (e.g. path length) to determine the optimal route or path to a specified destination. Optimal routes are determined by comparing metrics, and these metrics can differ depending on the design of the routing algorithm used [3, 10]. The complexity of the problem is high due to the many alternatives that have to be considered. Fortunately, many algorithms have been developed and discussed in order to find an optimized solution, leading to various different results. The reason for this diversity is that the majority of routing algorithms rely on heuristics: heuristic algorithms are ad hoc, trial-and-error methods which do not guarantee to find the optimal solution but are designed to find near-optimal solutions in a fraction of the time required by exact methods.

RELATED WORK
In the literature of the past few years, much effort has been made in the domain of urban solid waste. The effort focuses either on theoretical approaches, including socio-economic and environmental analyses concerning waste planning and management, or on methods, techniques and algorithms developed for the automation of the process. The theoretical approaches examined in the literature refer to issues concerning the conflict between urban residents and the municipality for the selection of sites for waste treatment, transshipment

stations and disposal, the issue of waste collection and transport, and its impact on human health due to noise, traffic, etc. In this context, the calculation of the total cost of collection and transport for a specific scenario is implemented, and the identification of the most cost-effective alternative scenario and its application is simulated. In the literature, the methods and algorithms used for optimizing siting and routing aspects of solid waste collection networks were deterministic models including, in many cases, Linear Programming (LP) [Hsieh and Ho, 1993], [Lund and Tchobanoglous, 1994]. However, uncertainty frequently plays an important role in handling solid waste management problems. The random character of solid waste generation, the estimation errors in parameter values, and the vagueness in planning objectives and constraints are possible sources of uncertainty. Fuzzy mathematical programming approaches for dealing with such uncertainties have been broadly used in the last few years, for example for the siting planning of a regional hazardous waste treatment center [Huang et al., 1995], a hypothetical solid waste management problem in Canada [Koo et al., 1991], and an integrated solid waste management system in Taiwan [Chang and Wang, 1997]. To cope with non-linear optimization problems, such as deciding about efficient routing for waste transport, methods based on Genetic Algorithms (GA), Simulated Annealing (SA), Tabu Search and Ant Colony Optimization (ACO) have also been proposed [Chang and Wei, 2000], [Pham and Karaboka, 2000], [Ducatelle and Levine, 2001], [Bianchi et al., 2002], [Chen and Smith, 1996], [Glover and Laguna, 1992], [Tarasewich and McMullen, 2002]. The problem can be classified as either a Traveling Salesman Problem (TSP) or a Vehicle Routing Problem (VRP), and for this particular problem several solutions and models have been proposed. However, the complexity of the problem is high due to the many alternatives that have to be considered, and the number of possible solutions is considerably high, too. As mentioned above, the most popular algorithms used today in similar cases include Genetic Algorithms, Simulated Annealing (SA), Tabu Search and the Ant Colony Optimization (ACO) algorithm. Genetic Algorithms [Glover et al., 1992], [Pham and Karaboka, 2000], [Chen and Smith, 1996] use biological methods such as reproduction, crossover, and mutation to quickly search for solutions to complex problems. A GA begins with a random set of possible solutions. In each step, a fixed number of the best current solutions are saved and used in the next step to generate new solutions using genetic operators. Crossover and mutation are the most important genetic operations used. In the crossover function, parts of two random solutions are chosen and exchanged between the two solutions; as a result, two new child solutions are generated. The mutation function alters parts of a current solution, generating a new one. The Ant Colony Optimization (ACO) algorithm [Dorigo and Maniezzo, 1996] was inspired by the observation of swarm colonies and specifically ants. Ants are social insects and their behaviour is focused on colony survival rather than the survival of the individual. Specifically, the way ants find their food is noteworthy: although ants are almost blind, they build chemical trails using a chemical substance called pheromone, and the trails are used by ants to find their way to the food or back to their colony.
The ACO algorithm simulates this specific characteristic of ants to find optimum solutions to computational problems such as the Travelling Salesman Problem. As this work is mainly focused on the ACO algorithm and its application to the solid waste collection problem, ACO is described in detail in the next section.

Problem Formulation of Solid Waste Routing.


The solid waste collection operation in the city begins when workmen collect plastic bags containing residential solid waste from different points in the city. These bags are carried to the nearest pick-up point, where there are steel containers. The containers are unloaded into special compactor vehicles. Every morning the vehicles are driven from the garage to the regions, where they begin to collect residential solid waste from the pick-up points. There is no specific routing basis for the vehicles, the route being left to the driver's choice. Occasionally, some pick-up points may be missed, and the route followed by the driver may be longer than the optimal route. In regions which have two collection vehicles, the vehicles may meet at the same pick-up point several times. Once the solid waste is loaded into the vehicles, it is carried out to the disposal site located far away. The suggested procedure for solving the vehicle routing problem in the selected Region 2 begins with a particular node closest to the garage or the previous region and ends with the nodes closest to the disposal site. This reduces the number of permutations considerably. Further detailed node networks, descriptions and procedures can be found in the literature. The problem of routing in solid waste management is the main point of focus of this study.

There are many ways to solve the problem of solid waste management. Ant colony optimization is a relatively new technique for solving such optimization problems. Since routing in solid waste management is a challenging task, this study tackles it using the ant colony optimization technique.
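To make the formulation concrete, a minimal sketch of how such a routing instance could be represented is given below; the garage, pick-up points and disposal site coordinates are hypothetical and serve only to illustrate the distance-matrix view of the problem.

import math

# Hypothetical instance: a garage, a few pick-up points and a disposal site.
nodes = {
    "garage": (0, 0),
    "p1": (1, 3), "p2": (4, 2), "p3": (5, 6), "p4": (2, 7),
    "disposal": (9, 9),
}

# Symmetric distance matrix used as the cost metric for routing.
distance = {
    (a, b): math.dist(nodes[a], nodes[b])
    for a in nodes for b in nodes if a != b
}

# A route starts near the garage, visits every pick-up point once and
# ends at the disposal site; its cost is the sum of the edge distances.
def route_cost(route):
    return sum(distance[(route[i], route[i + 1])] for i in range(len(route) - 1))

print(round(route_cost(["garage", "p1", "p2", "p3", "p4", "disposal"]), 2))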

Ant Colony Optimization (ACO) algorithm.


Ant System was efficient in discovering good or optimal solutions for small problems with up to about 30 nodes, but for larger problems it requires an unreasonable amount of time to find such good results. Thus, Dorigo and Gambardella [3, 4] and Bianchi et al. [1] devised three main changes in Ant System to improve its performance, which led to the Ant Colony System (ACS), outlined below:

Initialize
Loop  (each loop is called an iteration)
    Each ant is placed on a starting customer point
    Loop  (each loop is called a step)
        Each ant constructs a solution (tour) by applying a state transition rule
        and a local pheromone updating rule
    Until all ants have constructed a complete solution
    A global pheromone updating rule is applied
Until the stopping criterion is met

Ant Colony System differs from Ant System in three main aspects. Firstly, the state transition rule gives a direct way to balance exploration of new edges against exploitation of a priori and accumulated knowledge about the problem. Secondly, the global updating rule is applied only to those edges which belong to the best ant tour. Lastly, while ants construct a tour, a local pheromone updating rule is applied. Basically, Ant Colony System (ACS) works as follows: m ants are initially positioned on n nodes chosen according to some initialization rule, such as choosing randomly, with at most one ant at each customer point. Each ant builds a tour incrementally by applying a state transition rule. While constructing the solution, ants also update the pheromone on the visited edges by a local updating rule. Once the ants complete their tours, the pheromone on the edges is updated again by applying a global updating rule [2]. During the construction of tours, ants are guided by both heuristic information and pheromone information. Heuristic information refers to the distances of the edges, where ants prefer short edges; an edge with a higher amount of pheromone is a desirable choice for ants. The pheromone updating rules are designed so that ants tend to leave more pheromone on the edges which should be visited by ants. The Ant Colony System algorithm is given as in Fig. 2 [3, 4]. In ACS, an artificial ant k, after serving customer r, chooses the customer s to move to from the set J_k(r) of customers that remain to be served by ant k, by applying the following state transition rule, also known as the pseudo-random-proportional rule:

$$ s = \begin{cases} \arg\max_{u \in J_k(r)} \{ [\tau(r,u)]\,[\eta(r,u)]^{\beta} \} & \text{if } a \le a_0 \ \text{(exploitation)} \\ S & \text{otherwise (biased exploration)} \end{cases} \qquad (1) $$

Where:
β = the control parameter of the relative importance of the visibility;
τ(r,u) = the pheromone trail on edge (r,u);
η(r,u) = a heuristic function, chosen to be the inverse of the distance between customers r and u;
a = a random number uniformly distributed in [0,1];
a0 (0 ≤ a0 ≤ 1) = a parameter;
S = a random variable selected according to the probability distribution which favours edges that are shorter and have a higher amount of pheromone. It is the same as in Ant System and is also known as the random-proportional rule, given as follows:

$$ p_k(r,s) = \begin{cases} \dfrac{[\tau(r,s)]\,[\eta(r,s)]^{\beta}}{\sum_{u \in J_k(r)} [\tau(r,u)]\,[\eta(r,u)]^{\beta}} & \text{if } s \in J_k(r) \\ 0 & \text{otherwise} \end{cases} \qquad (2) $$

where p_k(r,s) is the probability that ant k, after serving customer r, chooses customer s to move to. The parameter a0 determines the relative importance of exploitation versus exploration: when an ant that has served customer r has to choose the next customer s, a random number a (0 ≤ a ≤ 1) is generated; if a ≤ a0, the best edge according to Eq. (1) is chosen (exploitation), otherwise an edge is chosen according to Eq. (2) (biased exploration) [3].

While building a tour, ants visit edges and change their pheromone level by applying the following local updating rule:

$$ \tau(r,s) \leftarrow (1-\rho)\,\tau(r,s) + \rho\,\Delta\tau(r,s), \qquad 0 < \rho < 1 \qquad (3) $$

where ρ is a pheromone decay parameter and Δτ(r,s) = τ0, the initial pheromone level. Local updating makes the desirability of edges change dynamically: every time an ant uses an edge, its pheromone diminishes and the edge becomes less desirable due to the loss of some of its pheromone. In other words, local updating drives the ants to search not only in the neighbourhood of the best previous tour. Global updating is performed after all the ants have completed their tours. Among the tours, only the ant which produced the best tour is allowed to deposit pheromone; this choice is intended to make the search more directed. The pheromone level is updated by the following global updating rule:

$$ \tau(r,s) \leftarrow (1-\alpha)\,\tau(r,s) + \alpha\,\Delta\tau(r,s), \qquad 0 < \alpha < 1 \qquad (4) $$

where

$$ \Delta\tau(r,s) = \begin{cases} 1/L_{gb} & \text{if } (r,s) \in \text{global best tour} \\ 0 & \text{otherwise} \end{cases} \qquad (5) $$

L_{gb} is the length of the global best tour from the beginning of the trial and α is a pheromone decay parameter. Global updating is intended to provide a greater amount of pheromone to shorter tours, and Eq. (5) dictates that only those edges belonging to the globally best tour will receive reinforcement.
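Purely as an illustration of how the ACS rules above fit together, the following Python sketch applies Eqs. (1)-(5) to a handful of hypothetical pick-up points; the coordinates, the number of ants and the parameter values are assumptions, not taken from the paper.

import math
import random

points = {0: (0, 0), 1: (2, 1), 2: (5, 4), 3: (1, 6), 4: (6, 1)}  # hypothetical pick-up points
BETA, A0, RHO, ALPHA, TAU0 = 2.0, 0.9, 0.1, 0.1, 0.01              # assumed ACS parameters

def dist(r, s):
    (x1, y1), (x2, y2) = points[r], points[s]
    return math.hypot(x1 - x2, y1 - y2)

tau = {(r, s): TAU0 for r in points for s in points if r != s}      # pheromone trails

def choose_next(r, unvisited):
    """State transition rule (pseudo-random-proportional rule, Eqs. 1-2)."""
    scores = {s: tau[(r, s)] * (1.0 / dist(r, s)) ** BETA for s in unvisited}
    if random.random() <= A0:                       # exploitation
        return max(scores, key=scores.get)
    total = sum(scores.values())                    # biased exploration
    pick, acc = random.random() * total, 0.0
    for s, w in scores.items():
        acc += w
        if acc >= pick:
            return s
    return s

def build_tour(start):
    tour, unvisited = [start], set(points) - {start}
    while unvisited:
        nxt = choose_next(tour[-1], unvisited)
        tau[(tour[-1], nxt)] = (1 - RHO) * tau[(tour[-1], nxt)] + RHO * TAU0   # local update, Eq. 3
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(tour):
    return sum(dist(tour[i], tour[i + 1]) for i in range(len(tour) - 1))

best_tour, best_len = None, float("inf")
for _ in range(50):                                 # iterations
    for _ in range(3):                              # m ants
        t = build_tour(start=0)
        if tour_length(t) < best_len:
            best_tour, best_len = t, tour_length(t)
    for i in range(len(best_tour) - 1):             # global update on best tour, Eqs. 4-5
        edge = (best_tour[i], best_tour[i + 1])
        tau[edge] = (1 - ALPHA) * tau[edge] + ALPHA * (1.0 / best_len)
print(best_tour, round(best_len, 2))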

Conclusion

The study mainly concentrates on the carrying or moving of solid waste from different points of collection. Recent results indicated by the literature review show that ant colony optimization is a new field emerging in the area of network optimization problems. The general idea underlying the Ant System paradigm is that of a population of agents (ants), each guided by an autocatalytic process directed by a greedy force. Were an agent alone, the autocatalytic process and the greedy force would tend to make the agent converge to a suboptimal tour with exponential speed. We employ positive feedback as a search and optimization tool. The idea is that if at a given point an agent has to choose between different options and the one actually chosen turns out to be good, then in the future that choice will appear more desirable than it was before. The main purpose of this paper is to provide an adequately fast heuristic algorithm which yields a better solution than traditional methods. The ant system exploits the natural behaviour of ants to solve such optimization problems. The concept of the method is to find the a priori tour that gives the minimum total expected cost. In many such applications ACO has shown its ability to obtain good solutions.

REFERENCES
[1] Nikolaos V. Karadimas, Maria Kolokathi, Gerasimoula Defteraiou, Vassili Loumos, Ant Colony System vs ArcGIS Network Analyst: The Case of Municipal Solid Waste Collection, 5th WSEAS Int. Conf. on Environment, Ecosystems and Development, Tenerife, Spain, December 14-16, 2007.
[2] Municipality of Athens, Estimation, Evaluation and Planning of Actions for Municipal Solid Waste Services During Olympic Games 2004, Municipality of Athens, Athens, Greece, 2003.
[3] Parker, M., Planning Land Information Technology Research Project: Efficient Recycling Collection Routing in Pictou County, 2001.
[4] Karadimas, N.V., Doukas, N., Kolokathi, M. and Defteraiou, G., Routing Optimization Heuristics Algorithms for Urban Solid Waste Transportation Management, World Scientific and Engineering Society Transactions on Computers, Vol. 7, Issue 12, ISSN: 1109-2750, pp. 2022-2031, 2008.
[5] A. Awad, M. T. Aboul-Ela, and R. Abu-Hassan, "Development of a Simplified Procedure for Routing Solid Waste Collection", International Journal for Science and Technology (Scientia Iranica), 8 (1), 2001, pp. 71-75.
[6] Karadimas, N.V., Kouzas, G., Anagnostopoulos, I. and Loumos, V., Urban Solid Waste Collection and Routing: the Ant Colony Strategic Approach, International Journal of Simulation: Systems, Science & Technology, Vol. 6, 2005, pp. 45-53.
[7] M. Dorigo, Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italy, 1992.
[8] S. Goss, S. Aron, J.-L. Deneubourg and J.-M. Pasteels, The self-organized exploratory pattern of the Argentine ant, Naturwissenschaften, volume 76, pages 579-581, 1989.
[9] J.-L. Deneubourg, S. Aron, S. Goss and J.-M. Pasteels, The self-organizing exploratory pattern of the Argentine ant, Journal of Insect Behavior, volume 3, page 159, 1990.
[10] A. Colorni, M. Dorigo, and V. Maniezzo, Distributed optimization by ant colonies, Proceedings of ECAL'91, European Conference on Artificial Life, Elsevier Publishing, Amsterdam, 1991.
[11] M. Dorigo, V. Maniezzo, and A. Colorni, The ant system: an autocatalytic optimizing process, Technical Report TR91-016, Politecnico di Milano, 1991.
[12] Ingo von Poser, Adel R. Awad, Optimal Routing for Solid Waste Collection in Cities by Using Real Genetic Algorithm, 0-7803-9521-2/06 IEEE, pp. 221-226.
[13] M. Dorigo and Luca Maria Gambardella, Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem, IEEE Transactions on Evolutionary Computation, Vol. 1, No. 1, April 1997.
[14] Ismail, Z. and S.L. Loh, Ant Colony Optimization for Solving Solid Waste Collection Scheduling Problems, Journal of Mathematics and Statistics, 5(3): 199-205, 2009, ISSN 1549-3644.



Survey on Decision Tree Algorithm

Abstract
This paper aims to study the various classification algorithms used in data mining. All these algorithms are based on constructing a decision tree for classifying the data but differ from each other in the methods employed for selecting the splitting attribute and the splitting conditions. The algorithms studied are CART (Classification and Regression Trees), ID3 and C4.5.

Introduction

The extraction of hidden predictive information from large databases, known as data mining, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems. Data mining tools can answer business questions that were traditionally too time consuming to resolve: they search databases for hidden patterns, finding predictive information that experts may have missed because it lies outside their expectations. Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high-performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, "Which clients are most likely to respond to my next promotional mailing, and why?"

CLASSIFICATION ALGORITHMS

Classification is a data mining function that assigns items in a collection to target categories or classes. The goal of classification is to accurately predict the target class for each case in the data. The classification task begins with a data set in which the class assignments are known for each case. The classes are the values of the target. The classes are distinct and do not exist in an ordered relationship to each other; ordered values would indicate a numerical, rather than a categorical, target, and a predictive model with a numerical target uses a regression algorithm, not a classification algorithm. For example, customers might be classified as either users or non-users of a loyalty card. The predictors would be the attributes of the customers: age, gender, address, products purchased, and so on. The target would be yes or no (whether or not the customer used a loyalty card). In the model build (training) process, a classification algorithm finds relationships between the values of the predictors and the values of the target. Different classification algorithms use different techniques for finding relationships. These relationships are summarized in a model, which can then be applied to a different data set in which the class assignments are unknown.
Definition: Given a database D = {t1, t2, t3, ..., tn} of tuples (data records) and a set of classes C = {c1, c2, c3, ..., cm}, the classification problem is to define a mapping f: D -> C where each ti is assigned to one class. A class cj contains precisely those tuples mapped to it, that is, cj = {ti : f(ti) = cj, 1 <= i <= n, and ti belongs to D} [4].

Different types of classification algorithms


1. Statistical based algorithms

Regression
As with all regression techniques, we assume the existence of a single output variable and one or more input variables. The output variable is numerical. The general regression tree building methodology allows input variables to be a mixture of continuous and categorical variables. A decision tree is generated where each decision node in the tree contains a test on some input variable's value. The terminal nodes of the tree contain the predicted output variable values. A regression tree may be considered a variant of decision trees, designed to approximate real-valued functions instead of being used for classification tasks.

Bayesian Classification
The Naive Bayes classification assumes that the effect of a variable value on a given class is independent of the values of the other variables. This assumption is called class conditional independence. It is made to simplify the computation, and in this sense it is considered naive. This is a fairly strong assumption and is often not applicable. However, it is the estimates of the probabilities, not their exact values, that determine the classifications.

2. Distance based algorithms

Simple approach
In this approach, if we have a representation of each class, we can perform classification by assigning each tuple to the class to which it is most similar. A simple classification technique is to place each item in the class whose centre it is closest to; a predefined pattern can be used to represent each class. Once the similarity measure is defined, each item to be classified is compared to the predefined pattern, and the item is then placed in the class with the largest similarity value.

K Nearest Neighbors
The KNN technique assumes that the entire training set includes not only the data in the set but also the desired classification for each item. As a result, the training data become the model. When a classification is to be made for a new item, its distance to each item in the training set is determined, and only the K closest entries in the training set are considered; the new item is placed in the class that contains the most items from this set of K closest items.
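As a minimal illustration of the KNN idea described above (the training points, labels and the value of K are invented):

import math
from collections import Counter

# Hypothetical labelled training items: ((feature1, feature2), class)
training = [((1.0, 1.2), "A"), ((0.8, 0.9), "A"), ((3.1, 3.0), "B"),
            ((3.3, 2.8), "B"), ((0.5, 1.1), "A")]

def knn_classify(item, k=3):
    """Assign the majority class among the k closest training items."""
    neighbours = sorted(training, key=lambda t: math.dist(t[0], item))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

print(knn_classify((0.9, 1.0)))   # -> "A"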

3. Decision Tree-Based Algorithm
The decision tree approach is most useful for classification problems. In this technique a tree is constructed to model the classification process. Once the tree is built, it is applied to each tuple in the database and results in a classification for that tuple. It involves two basic steps: building the tree and applying the tree to the database.

ID3
The ID3 technique is based on information theory and attempts to minimize the expected number of comparisons. Its basic strategy is to choose the splitting attribute with the highest information gain first. The amount of information associated with an attribute value is related to its probability of occurrence. The concept used to quantify information is called entropy.

C4.5
C4.5 is an improvement on ID3. Here classification is performed via either decision trees or rules generated from them. For splitting purposes, it uses the largest gain ratio that ensures a larger-than-average information gain. This compensates for the fact that the gain ratio value is skewed towards splits where the size of one subset is close to that of the starting one.

CART
Classification and Regression Trees (CART) is a technique that generates a binary decision tree. As with ID3, entropy is used as a measure to choose the best splitting attribute and criterion. However, whereas ID3 creates a child for each subcategory, here only two children are created. The splitting is performed around what is found to be the best split point. The tree stops growing when no split will improve the performance.

4. Neural Network Based algorithm


A model representing how to classify any given database tuple is constructed with neural networks, just as with decision trees. The activation functions are typically sigmoid. When a tuple must be classified, certain attribute values from that tuple are input into the directed graph at the corresponding source nodes.

The ID3 Algorithm


ID3 is a non-incremental algorithm, meaning it derives its classes from a fixed set of training instances. An incremental algorithm revises the current concept definition, if necessary, with a new sample. The classes created by ID3 are inductive; that is, given a small set of training instances, the specific classes created by ID3 are expected to work for all future instances. The distribution of the unknowns must be the same as that of the test cases. Induced classes cannot be proven to work in every case, since they may classify an infinite number of instances. Note that ID3 (or any inductive algorithm) may misclassify data.

Data Description

The sample data used by ID3 has certain requirements, which are:

Attribute-value description - the same attributes must describe each example and have a fixed number of values.
Predefined classes - an example's attributes must already be defined, that is, they are not learned by ID3.
Discrete classes - classes must be sharply delineated. Continuous classes broken up into vague categories such as a metal being "hard, quite hard, flexible, soft, quite soft" are suspect.
Sufficient examples - since inductive generalization is used (i.e. not provable), there must be enough test cases to distinguish valid patterns from chance occurrences.

Attribute Selection

How does ID3 decide which attribute is the best? A statistical property, called information gain, is used. Gain measures how well a given attribute separates training examples into targeted classes. The one with the highest information gain (information being the most useful for classification) is selected. In order to define gain, we first borrow an idea from information theory called entropy. Entropy measures the amount of information in an attribute. Given a collection S of c outcomes,

$$ Entropy(S) = -\sum_{I=1}^{c} p(I)\,\log_2 p(I) $$

where p(I) is the proportion of S belonging to class I, the sum is over the c classes, and log2 is log base 2. Note that S is not an attribute but the entire sample set.

Algorithm

ID3 (Examples, Target_Attribute, Attributes)
  Create a root node Root for the tree.
  If all examples are positive, return the single-node tree Root, with label = +.
  If all examples are negative, return the single-node tree Root, with label = -.
  If the set of predicting attributes is empty, return the single-node tree Root, with label = the most common value of the target attribute in the examples.
  Otherwise begin:
    A = the attribute that best classifies the examples.
    Decision tree attribute for Root = A.
    For each possible value v of A:
      Add a new tree branch below Root, corresponding to the test A = v.
      Let Examples(v) be the subset of examples that have the value v for A.
      If Examples(v) is empty, then below this new branch add a leaf node with label = the most common target value in the examples.
      Else below this new branch add the subtree ID3(Examples(v), Target_Attribute, Attributes - {A}).
  End
  Return Root
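To make the entropy and information gain calculations above concrete, here is a small Python sketch; the toy weather-style dataset is invented for illustration and is not from the paper.

import math
from collections import Counter

# Toy dataset: (outlook, windy) -> play
samples = [("sunny", False, "no"), ("sunny", True, "no"), ("overcast", False, "yes"),
           ("rain", False, "yes"), ("rain", True, "no"), ("overcast", True, "yes")]

def entropy(rows):
    """Entropy(S) = -sum p(I) log2 p(I) over the classes in rows."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr_index):
    """Gain(S, A) = Entropy(S) - sum |S_v|/|S| * Entropy(S_v)."""
    total = len(rows)
    values = set(row[attr_index] for row in rows)
    remainder = 0.0
    for v in values:
        subset = [row for row in rows if row[attr_index] == v]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(rows) - remainder

# ID3 would split on the attribute with the highest gain.
print({"outlook": information_gain(samples, 0), "windy": information_gain(samples, 1)})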


C4.5 ALGORITHM
C4.5 is an algorithm used to generate a decision tree, developed by Ross Quinlan as an extension of his earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason C4.5 is often referred to as a statistical classifier. This section explains one of the algorithms used to create univariate decision trees. C4.5 is based on the ID3 algorithm and tries to find small (or simple) decision trees. We start by presenting some premises on which the algorithm is based, and afterwards we discuss the inference of the weights and tests in the nodes of the trees.


Construction
Some premises guide this algorithm, such as the following: if all cases are of the same class, the tree is a leaf and so the leaf is returned labeled with this class; for each attribute, calculate the potential information provided by a test on the attribute (based on the probabilities of each case having a particular value for the attribute). Also calculate the gain in information that would result from a test on the attribute (based on the probabilities of each case with a particular value for the attribute being of a particular class); depending on the current selection criterion, find the best attribute to branch on.

Algorithm
C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set S = s1, s2, ... of already classified samples. Each sample si = x1, x2, ... is a vector where x1, x2, ... represent attributes or features of the sample. The training data is augmented with a vector C = c1, c2, ... where c1, c2, ... represent the class to which each sample belongs [1]. At each node of the tree, C4.5 chooses one attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. Its criterion is the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data. The attribute with the highest normalized information gain is chosen to make the decision, and the C4.5 algorithm then recurses on the smaller sublists. This algorithm has a few base cases. All the samples in the list belong to the same class: when this happens, it simply creates a leaf node for the decision tree saying to choose that class. None of the features provide any information gain: in this case, C4.5 creates a decision node higher up the tree using the expected value of the class. An instance of a previously unseen class is encountered: again, C4.5 creates a decision node higher up the tree using the expected value.
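The normalized information gain used by C4.5 is commonly computed as a gain ratio; a rough, self-contained sketch under that assumption could look like this (the class counts in the example split are made up):

import math

def entropy(counts):
    """Entropy of a class-count distribution, e.g. {'yes': 4, 'no': 2}."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)

def gain_ratio(parent_counts, branch_counts):
    """Information gain normalised by split information, as in C4.5's criterion."""
    total = sum(parent_counts.values())
    remainder = sum(sum(b.values()) / total * entropy(b) for b in branch_counts)
    gain = entropy(parent_counts) - remainder
    split_info = -sum(
        (sum(b.values()) / total) * math.log2(sum(b.values()) / total)
        for b in branch_counts if sum(b.values())
    )
    return gain / split_info if split_info else 0.0

# Hypothetical split of 10 samples into two branches.
print(gain_ratio({"yes": 6, "no": 4}, [{"yes": 5, "no": 1}, {"yes": 1, "no": 3}]))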

CART ALGORITHM
Classification and regression trees (CART) is a nonparametric decision tree learning technique that produces either classification or regression trees, depending on whether the dependent variable is categorical or numeric, respectively. Trees are formed by a collection of rules based on the values of certain variables in the modeling data set. Rules are selected based on how well splits based on variable values can differentiate observations with respect to the dependent variable. Once a rule is selected and splits a node into two, the same logic is applied to each child node (i.e. it is a recursive procedure).

Splitting stops when CART detects that no further gain can be made, or some pre-set stopping rules are met. Each branch of the tree ends in a terminal node. Each observation falls into one and exactly one terminal node, and each terminal node is uniquely defined by a set of rules.

Tree Growing Process


The basic idea of tree growing is to choose a split among all the possible splits at each node so that the resulting child nodes are the purest. In this algorithm, only univariate splits are considered. That is, each split depends on the value of only one predictor variable. All possible splits consist of possible splits of each predictor For each continuous and ordinal predictor, sort its values from the smallest to the largest. For the sorted predictor, go through each value from top to examine each candidate split point (call it v, if x v, the case goes to the left child node, otherwise, goes to the right.) to determine the best. The best split point is the one that maximize the splitting criterion the most when the node is split according to it. The definition of splitting criterion is in later section. For each nominal predictor, examine each possible subset of categories (call it A, if x A , the case goes to the left child node, otherwise, goes to the right.) to find the best split. Find the nodes best split. Among the best splits found in step 1, choose the one that maximizes the splitting criterion. Split the node using its best split found in step 2 if the stopping rules are not satisfied. Splitting criteria and impurity measures At node t, the best split s is chosen to maximize a splitting criterion i(s,t) . When the impurity measure for a node can be defined, the splitting criterion corresponds to a decrease in impurity. In SPSS products, I (s, t) p(t) i(s, t) is referred to as the improvement. Stopping Rules Stopping rules control if the tree growing process should be stopped or not. The following stopping rules are used: If a node becomes pure; that is, all cases in a
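The splitting criterion above can be sketched in a few lines of Python, assuming the Gini index as the impurity measure i(t). The function and parameter names are ours, and the snippet is only a minimal illustration of the improvement computation and two of the stopping rules, not the implementation used in the study.

from collections import Counter

def gini(labels):
    """Gini impurity i(t) of the cases in a node."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def improvement(parent, left, right, p_t=1.0):
    """Decrease in impurity weighted by the probability p(t) of reaching the node:
    delta_I(s, t) = p(t) * [i(t) - p_L * i(t_L) - p_R * i(t_R)]."""
    n = len(parent)
    delta_i = gini(parent) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
    return p_t * delta_i

def best_split(xs, ys, min_node_size=2):
    """Scan candidate split points v of one continuous predictor (x <= v goes left)."""
    if len(ys) < min_node_size or len(set(ys)) == 1:   # stopping rules: node too small or pure
        return None
    best = None
    for v in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= v]
        right = [y for x, y in zip(xs, ys) if x > v]
        gain = improvement(ys, left, right)
        if best is None or gain > best[1]:
            best = (v, gain)
    return best

print(best_split([1, 2, 3, 4, 5, 6], ["a", "a", "a", "b", "b", "b"]))  # (3, 0.5)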

Conclusion and Future Work
Decision tree induction is one of the classification techniques used in decision support systems and machine learning. With the decision tree technique, the training data set is recursively partitioned using a depth-first (Hunt's method) or breadth-first greedy technique (Shafer et al., 1996) until each partition is pure or belongs to the same class/leaf node (Hunt et al., 1966; Shafer et al., 1996). The decision tree model is preferred among other classification algorithms because it is an eager learning algorithm and is easy to implement. Decision tree algorithms can be implemented serially or in parallel. Whichever implementation method is adopted, most decision tree algorithms in the literature are constructed in two phases: a tree growth phase and a tree pruning phase. Tree pruning is an important part of decision tree construction as it is used to improve the classification/prediction accuracy by ensuring that the constructed tree model does not overfit the data set (Mehta et al., 1996). In this study we focused on serial implementations of decision tree algorithms, which are memory resident, fast and easy to implement compared to parallel implementations, which are complex to implement. The disadvantages of serial decision tree implementations are that they are not scalable (disk resident) and that they cannot exploit the underlying parallel architecture of the processors in a computer system. Our experimental analysis of the performance of the commonly used decision tree algorithms on the Statlog data sets (Michie et al., 1994) shows that there is a direct relationship between the execution time in building the tree model and the volume of data records, and an indirect relationship between the execution time in building the model and the attribute size of the data sets. The experimental analysis also shows that the C4.5 algorithm has good classification accuracy compared to the other algorithms used in the study. The variation of the data sets' class size, number of attributes and volume of data records is used to determine which algorithm has better classification accuracy between the ID3 and CART algorithms. In future work we will perform an experimental analysis of commonly used parallel implementations of decision tree algorithms, compare them with the serial implementations, and determine which is better based on practical implementation.



PART - 2


Analysis of Multidimensional Modeling Related To Conceptual Level


Udayan Ghosh and Sushil Kumar
University School of Information Technology, Guru Gobind Singh Indraprastha University, Kashmere Gate, Delhi.
sushiljaisw@gmail.com, g_udayan@lycos.com
Abstract Many OLAP usages indicate that their usability and performance degrade due to wrong interpretation of business dimensions. In this paper, we focus on business dimensions through multidimensional data model structures for DWs. Multidimensionality is a design technique that separates information into facts and dimensions by understanding the business processes and the required dimensions. Many approaches have been suggested, but we will focus on the widely accepted star schema, with a slight improvement using the snowflake schema, a variation of the star schema in which the dimension tables are organized into a hierarchy by normalizing them. The multidimensional model presents information to the end user in a way that corresponds to his normal understanding of his business dimensions, key figures or facts from the different scenarios that influence user requirements.
Key Words: Business Dimensions, Multidimensional Modeling, Data Warehouse, OLAP, Star Schema, Snowflake Schema, Fact.
1. Introduction

DWs generalize and consolidate data in a multidimensional space. The construction of a DW involves data cleaning, data integration and data transformation, and can be viewed as an important preprocessing step for data mining. Moreover, data warehouses provide on-line analytical processing (OLAP) tools for the interactive analysis of multidimensional data of varied granularities, which facilitates effective data generalization and data mining. A data warehouse is a set of data and technologies aimed at enabling executives, managers and analysts to make better and faster decisions. DWs help manage information efficiently as the main organizational asset. Since DWs play a principal role in taking strategic decisions, their quality is fundamental.

Data warehouse systems are important tools in today's competitive, fast-changing era. In the last several years, many firms have spent millions of dollars in building enterprise-wide data warehouses. DWs have to provide inherent support for complex queries, while their maintenance does not impose a transactional load. These features cause the design techniques and the strategies used to be different from the traditional ones. Many people feel that, with competition mounting in every industry and domain, data warehousing is the latest must-have marketing weapon and panacea: a way to retain customers by learning more about their requirements.
Enterprise DW [3] - An enterprise DW provides a centralized database architecture for decision support for the enterprise.
Operational Data Store - It has a broader, enterprise-wide frame but, unlike the enterprise DW, data is refreshed in near real time and used for routine business processes.
Data Mart - A data mart is a subset of a data warehouse that supports a specific domain or business process.
1.1 - Characteristics of DW [8] - The main characteristics of a data warehouse are:
Subject oriented. A DW is organized around major subjects, such as supplier, product, customer and sales.
Separate. A DW is always a physically distinct store of data transformed from the application data found in the traditional OLTP environment. Due to this separation, a data warehouse does not require transaction processing, recovery and concurrency control mechanisms. It usually combines two operations in data accessing: initial loading of data and access of data.
Time variant. Problems have to be addressed; trends and correlations have to be explored. Data are time stamped and associated with defined periods of time [2].
Not dynamic. The data is updated only periodically, not on an individual basis.

Integrated performance. The data requested by the user has to perform well on all scales of integration. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on.
Consistency. The architecture and contents of the data are very significant and can only be ensured by the use of metadata; this is independent of the source and collection date of the data.
1.2 - Data warehouse building process [7]: To construct an effective data warehouse we have to analyze the business processes, dimensions and business environment. After obtaining the DW logical schema, we build it through the application of transformations to the source logical schema. The construction of a large and complex information system can be viewed as the construction of a large and complex building, for which the owner, architect, and builder have different views. These perspectives are merged to form a complex framework that represents the top-down, business-driven, or owner's perspective, as well as the bottom-up, builder-driven, or implementer's view of the information system. The multidimensional model transforms the visualization of a schema into a more business-focused framework. All these structures (cubes, measures and dimensions) interact with each other to provide an extremely powerful reporting environment.
Most of the multidimensional database systems used in business frameworks and decision support applications are of a particular kind. Generally, they can be categorized into two groups: the first is the traditional relational DBMS, which creates multidimensional schemas such as the star schema and snowflake schema by applying the mature theory of relational database systems; the second is the multidimensional database systems designed specially for online analysis. In the star form, all dimension tables are directly connected to the fact table and do not have connections with other dimension tables. However, it may be necessary to separate one dimension into many dimensions according to the business dimension mappings. Such a structure is called the snowflake model, a slight modification of the star schema adding the relational constraints of normalization.
Relational database systems are suitable for OLTP [4] applications, but they do not guarantee to meet the expectations of online analytical processing applications in a real-time environment. Relational OLAP [3] systems, which are inherently ORDBMSs, can only be classified as relational database systems because, once turned into systems supporting OLTP applications, only the relational approach can be used and the object features disappear. A multidimensional database is a type of database (DB) which is optimized for DW and OLAP applications. Multidimensional databases are mostly generated using data from existing relational databases; a multidimensional database allows a user to frame problems and questions related to summarizing business operations and analyzing trends. An OLAP application that processes data from a multidimensional database is formally referred to as a multidimensional OLAP application. A multidimensional database, or a multidimensional database management system, implies the ability to rapidly process the data in the database so that answers can be generated quickly. A number of vendors provide products that use multidimensional databases; their approaches to how data is stored and to the user interface differ. For multidimensional database systems, uniform specifications that would ease application development do not exist.
They are special database systems which do not support comprehensive querying. Four different views regarding the design of a data warehouse must be considered: the top-down view, the data source view, the data warehouse view, and the business query view. The top-down view allows the selection of the relevant information vital for the DW. This information resembles the current and future business requirements. The data source view reflects the information being captured, stored, and managed by operational systems. This information may be documented at various levels of detail and accuracy, from individual data source tables to integrated data source tables. Data sources are often modeled by traditional data modeling approaches, such as the entity-relationship model, or CASE (computer-aided software engineering) tools. The data warehouse view combines fact tables and dimension tables. It represents the information that is stored inside the DW, including predetermined aggregates and counts, as well as information pertaining to the source, date, and time of origin, added to provide historical context. Finally, the business query view is the perspective of the data in the data warehouse from the viewpoint of the end user.
2. Multidimensional Modeling [9]
It is a technique for formalizing and visualizing data models as a set of measures that are defined by common aspects of the business processes. Business dimensional modeling has two basic concepts.
Facts:

A fact is a collection of related data items, composed of business measures. A fact is a focus of interest for the decision-making business process.

Measures are continuously valued results that describe facts; a fact is a business statistic. Multidimensional database technology is a key element in the interactive analysis of large amounts of data for decision-making purposes. The multidimensional data model is introduced based on relational elements; dimensions are modeled as dimension relations. Data mining applications provide knowledge by searching semi-automatically for previously unknown patterns, trends and relationships in multidimensional database structures. OLAP software enables analysts, managers, and executives to gain insight into the performance of an enterprise through fast and interactive access to a wide range of views of data organized to reflect the multidimensional nature of the enterprise-wide data.
Dimension: The parameter over which we perform analysis of facts and data; the parameter that gives meaning to a measure. For example, the number of customers is a fact over which we perform analysis along the time dimension. Dimensional modeling has been a coherent architecture for building distributed DW applications. We may come up with more complex queries for our DW which involve three or more dimensions; this is where the multidimensional database plays an eminent role. Dimensions are the means by which summarized data can be distributed and used. Cubes are data manipulation units composed of fact tables and dimensions from the data warehouse (DW). Dimensional modeling has also emerged as the only coherent architecture for building distributed data warehouse applications.
3. Multi-Dimensional Modeling using business dimensions [9]
Multidimensional database technology has come a long way since its inception more than 30 years ago. It has recently begun to reach the mass market, with major providers now delivering multidimensional database engines along with their traditional relational database software, often at no extra cost. A multidimensional data model is typically used for the design of corporate data warehouses and departmental data marts. Such a model can be adopted with a star schema, snowflake schema, or fact constellation schema. The core of the multidimensional model is the data cube, which consists of a large set of facts (or measures) and a number of business dimensions. Business dimensions are the entities or perspectives with respect to which an organization wants to keep information, and they are hierarchical in nature. Multidimensional technology has also made significant gains in scalability and maturity to describe an organization's current business requirements. The multidimensional model is based on three key concepts: modeling business rules, cubes and measures, and dimensions.

3.1:- The Goals of Multi-Dimensional Data Models [11]
To enable the end user to access information in a way that corresponds to his normal understanding of his business: key figures or facts from the different perspectives that relate to the business environment that influences them. To facilitate a physical implementation that the software (the OLAP engine) recognizes, thus allowing a program to easily access the data required for processing.
3.2:- Usages of multi-dimensional modeling using business dimensions:
INFORMATION PROCESSING: support for querying, basic statistical analysis, and reporting using crosstabs, graphs, tables or charts. A current trend in data warehouse information processing is to construct low-cost Web-based application tools, integrated with Web browsers, for global access.
ANALYTICAL PROCESSING: using dimensions with OLAP, this includes OLAP operations such as slice-and-dice, drill-down, roll-up, drill-through, drill-across and pivoting. It generally operates on historical data in both summarized and detailed forms. The major strength of on-line analytical processing over information processing is the multidimensional analysis of data warehouse data.
DATA MINING: supported by KDD (knowledge discovery in databases), it helps to discover hidden patterns and associations, construct clusters, perform classification and prediction, and present the mining results using visualization tools.
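To make the OLAP operations of section 3.2 concrete, here is a small illustrative Python sketch of roll-up and slice over a handful of hypothetical sales records; the dimension names and figures are invented and are not drawn from the paper.

from collections import defaultdict

# Hypothetical fact rows: (time, location, product) dimensions with a sales measure.
facts = [
    {"month": "Jan", "city": "Delhi",  "product": "TV",    "sales": 120},
    {"month": "Jan", "city": "Mumbai", "product": "TV",    "sales": 90},
    {"month": "Feb", "city": "Delhi",  "product": "Radio", "sales": 40},
    {"month": "Feb", "city": "Delhi",  "product": "TV",    "sales": 75},
]

def roll_up(rows, dimension, measure="sales"):
    """Aggregate the measure over one dimension (drill-down is the inverse view)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[dimension]] += r[measure]
    return dict(totals)

def slice_(rows, dimension, value):
    """Fix one dimension to a single value (the OLAP 'slice' operation)."""
    return [r for r in rows if r[dimension] == value]

print(roll_up(facts, "city"))                              # {'Delhi': 235, 'Mumbai': 90}
print(roll_up(slice_(facts, "month", "Feb"), "product"))   # {'Radio': 40, 'TV': 75}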

3.3:- Logical Multidimensional Model

The multidimensional data model is important because it enforces simplicity. As Ralph Kimball states in his landmark book, The Data Warehouse Toolkit: "The central attraction of the dimensional model of a business is its simplicity... that simplicity is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently." The multidimensional data model is composed of logical cubes, measures, dimensions, hierarchies, levels, and attributes. The simplicity of the model is inherent because it defines objects that represent real-world business entities. Analysts know which business measures they are interested in examining, which dimensions and attributes make the data meaningful, and how the dimensions of their business are organized into levels and hierarchies. Multidimensional data cubes are the basic logical model for OLAP applications [12]. The focus of OLAP tools is to provide multidimensional analysis of the underlying information. To achieve this goal, these tools employ multidimensional models for the storage and presentation of data.

Figure 1: Keys of the multidimensional model

A logical model (Figure 1) for cubes is based on the key observation that a cube is not a self-existing entity, but rather a view over an underlying data set. Logical cubes provide a means of organizing measures that have the same shape, that is, they have exactly the same dimensions. Users can quickly and easily create multi-level queries. The multidimensional query model has one important advantage over relational querying techniques: each dimension can be queried separately. This allows users to divide what would be a very complex query into simple, manageable steps. The multidimensional model also provides powerful filtering capabilities. Additionally, it is possible to create conditions based on measures that are not part of the final report. Because the dimensional query is independent of the filters, it allows complete flexibility in determining the structure of the condition. The relational model forces users to manipulate all the elements as a whole, which tends to lead to confusion and unexpected result sets. In contrast, the multidimensional model allows end users to filter each dimension in isolation and uses more friendly terms such as Add, Keep and Remove. The relational implementation of the multidimensional data model is typically a star schema or a snowflake schema.

Figure 1: Diagram of the logical multidimensional model

3.4 Conceptual View:

Figure 2: Levels of view

The conceptual view describes the semantics of a domain, this being the scope of the model. For example, it may be a model of the interest area of an organization or industry. It consists of entity classes, representing kinds of things of significance in the domain, and relationships, assertions about associations between pairs of entity classes. A conceptual view specifies the kinds of facts or propositions that can be expressed using the model. In that sense, it defines the allowed expressions in an artificial 'language' with a scope that is limited by the scope of the model. Early phases of many software development projects emphasize the design of a conceptual data model. Such a design can be detailed into a logical data model [6]. In later stages, this model may be translated into a physical data model. However, it is also possible to implement a conceptual model directly. The Multidimensional Conceptual View provides a multidimensional data model that is intuitively analytical and easy to use. Business users' view of an enterprise is multidimensional in nature.

Therefore, a multidimensional data model conforms to how the users perceive business problems.
3.5 Star schema architecture with business dimension scenario: It consists of a fact table for a particular business process (for example, sales analysis would take Sales as the fact table) with a single table for each dimension. The star schema is a special design technique for multidimensional data representation. It optimizes data query operations instead of data update operations. The star schema is a relational database schema for representing multidimensional data. It is the simplest form of data warehouse schema that contains one or more dimensions and fact tables [15]. It is called a star schema because the entity-relationship diagram between the dimensions and fact tables resembles a star-like structure in which one fact table is connected to multiple dimensions. The center of the star schema consists of a large fact table, and it points towards the dimension tables. The advantages of the star schema are slicing down, performance increase and easy understanding of data.
Steps in designing a star schema: Identify a business process for analysis. Identify the measures or facts. Identify the dimensions for the facts. List the columns that describe each dimension. Determine the lowest level of summary in the fact table [15].
3.6 Snowflake schema: The snowflake schema is a variant of the star schema in which some dimension tables are normalized, thereby splitting the data further into additional tables [16]. The resulting schema graph forms a shape similar to a snowflake.
Important aspects of the star schema and snowflake schema: In a star schema every dimension has a primary key and a dimension table does not have any parent table, whereas in a snowflake schema a dimension table has one or more parent tables. Hierarchies for the dimensions are stored in the dimension table itself in the star schema, whereas hierarchies are broken into separate tables in the snowflake schema [16, 17]. These hierarchies help to drill down the data from the topmost hierarchies to the lowermost hierarchies. The snowflake schema is the normalized form of the star schema.
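The structural difference just described can be sketched in Python with invented table contents and key names: in the star form the product hierarchy is stored denormalized in one dimension table, while in the snowflake form it is normalized into a separate category table, so resolving an upper hierarchy level requires one extra lookup (join).

# Star schema: the product dimension is denormalized; the hierarchy sits in one table.
star_dim_product = {
    1: {"name": "TV",   "category": "Electronics", "category_head": "A. Rao"},
    2: {"name": "Soap", "category": "Grocery",     "category_head": "S. Iyer"},
}

# Snowflake schema: the same hierarchy split into a parent table and a child table.
snow_dim_category = {100: {"category": "Electronics", "category_head": "A. Rao"},
                     200: {"category": "Grocery",     "category_head": "S. Iyer"}}
snow_dim_product  = {1: {"name": "TV",   "category_key": 100},
                     2: {"name": "Soap", "category_key": 200}}

fact_sales = [{"product_key": 1, "amount": 500},
              {"product_key": 2, "amount": 30}]

def category_of(product_key, snowflake=True):
    """Resolving an upper hierarchy level needs one extra lookup in the snowflake form."""
    if snowflake:
        prod = snow_dim_product[product_key]
        return snow_dim_category[prod["category_key"]]["category"]
    return star_dim_product[product_key]["category"]

totals = {}
for row in fact_sales:                      # a simple roll-up of sales to category level
    cat = category_of(row["product_key"])
    totals[cat] = totals.get(cat, 0) + row["amount"]
print(totals)                               # {'Electronics': 500, 'Grocery': 30}
print(category_of(1, snowflake=False) == category_of(1))   # both forms give the same answer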

4. Proposed Model: In this section we summarize the basic concepts of the object-oriented multidimensional model [19]. The multidimensional model is the core of a comprehensive object-oriented model of a DW, containing all the details that are necessary to specify a data cube: the dimensions, the classification hierarchies, and the description of fact and measure attributes.


Figure 3: Hierarchical View of Proposed Model Schema

At the lowest layer, each vertex represents an occurrence of an attribute or measure, e.g. product name, day, customer city etc. A set of semantically related vertices is grouped together to construct an Elementary Semantic Group at the lower layer. Next, several related elementary semantic groups are grouped together to form a Semantic Group at the middle layer, and the next upper layer is constructed to represent any context of business analysis. The set of vertices of a middle layer group that determines the other vertices of the lower layer is called the Determinant Vertices. This layered structure may be further organized by combining two or more middle as well as lower layer groups to represent the next upper level layers; from the topmost layer the entire database appears to be a graph with middle layer groups as vertices and edges between middle layer objects. A Dimensional Semantic Group is a type of middle layer object that represents a dimension member; it is an encapsulation of one or more lower layer groups along with extension and/or composition of one or more constituent middle layer groups. A Fact Semantic Group (FSG) is a type of group that represents facts; it is an inheritance of all related lower and middle layer groups and a set of upper layer groups defined on measures. In order to materialize the Cube, one must ascribe values to the various measures along all dimensions, and the Cube can be created from the FSG.

Example: Let us consider an example based on a Sales application with sales amount as the measure and with four dimensions Customer, Model, Time and Location, with the sets of attributes {C_ID, C_NAME, C_ADDR}, {M_ID, M_NAME, P_ID, P_NAME, P_DESC}, {T_ID, T_MONTH, Q_ID, Q_NAME, YEAR} and {L_ID, L_CITY, R_ID, R_NAME, R_DESC} respectively. The Model, Time and Location dimensions have upper level hierarchies, say Product, QTR and Region respectively. Then, in the notation of the GOOMD model, there will be four middle layer groups {DCustomer, DModel, DLocation, DTime} with hierarchy.

Figure 4: Schema for Sales Application in Proposed Model

The model also provides an algebra of OLAP operators that operate on the different semantic groups. The Select operator is an atomic operator and extracts vertices from any middle layer group depending on some predicate P. The Retrieve operator extracts vertices from any Cube using some constraint over one or more dimensions or measures; it is helpful for realizing the slice and dice operations of OLAP. The Aggregation operators perform aggregation on Cube data based on relational aggregation functions such as SUM, AVG and MAX over one or more dimensions, and are helpful for realizing the roll-up and drill-down operations of OLAP.
5. Comparisons of Conceptual Design Models
Property 1 (Additivity of measures): DF, starER and OOMD support this property. Using the ME/R model, only static data structures can be captured; no functional aspect can be implemented with the ME/R model.
Property 2 (Many-to-many relationships with dimensions): starER and OOMD support this property. The DF and ME/R models do not support many-to-many relationships.
Property 3 (Derived measures): None of the conceptual models include derived measures as part of their conceptual schema except the OOMD model.


Property 4 (Non-strict and complete classification hierarchies): Although DF and ME/R can define certain attributes for classification hierarchies, the starER model can define exact cardinalities for non-strict and complete classification hierarchies. OOMD can represent non-strict and complete classification hierarchies.
Property 5 (Categorization of dimensions, specialization/generalization): All conceptual design models except DF support this property.
Property 6 (Graphic notation and specifying user requirements): All modeling techniques provide a graphical notation to help designers in the conceptual modeling phase. The ME/R model also provides state diagrams to model system behavior and provides a basic set of OLAP operations to be applied from these user requirements. OOMD provides a complete set of UML diagrams to specify user requirements and helps define OLAP functions.
Property 7 (CASE tool support): All conceptual design models except starER have CASE tool support [18].
6. Conclusion
This paper helps to improve our understanding of the multidimensional structure related to business processes and dimensions. The multidimensional data model combines facts and context dimensions using the star and snowflake schemas. This paper relates the various multidimensional modeling approaches according to multidimensional space, language aspects and physical representation of the traditional database model, and establishes the relationship of multidimensional data to object-oriented data.

References
[1] Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques, Second Edition. Morgan Kaufmann Publishers.
[2] S. Kelly. Data Warehousing in Action. John Wiley & Sons, 1997.
[3] Kimball, R. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. John Wiley and Sons, 1996. ISBN 0-471-15337-0.
[4] S. Chaudhuri, U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Record 26(1), 1997.
[5] G. Colliat. OLAP, relational and multi-dimensional database systems. SIGMOD Record 25(3), 1996.
[6] M. Golfarelli, D. Maio, and S. Rizzi. The dimensional fact model: a conceptual model for data warehouses. IJCIS, 7(2-3):215-247, 1998.
[7] M. Jarke, M. Lenzerini, Y. Vassiliou, and P. Vassiliadis, editors. Fundamentals of Data Warehousing. Springer-Verlag, 1999.
[8] L. Cabibbo and R. Torlone. A logical approach to multidimensional databases. In Proc. of EDBT-98, 1998.
[9] E. Franconi and U. Sattler. A data warehouse conceptual data model for multidimensional aggregation. In Proc. of the Workshop on Design and Management of Data Warehouses (DMDW-99), 1999.
[10] McGuff, F. Data Modeling for Data Warehouses. October 1996, from http://members.aol.corn/fmcguff7dwmodel/dwmodel.html
[11] Gyssens, M. and Lakshmanan, L.V.S. A foundation for multi-dimensional databases. Technical Report, Concordia University and University of Limburg, February 1997.
[12] M. Blaschka, C. Sapia, G. Höfling, and B. Dinter. Finding Your Way through Multidimensional Data Models. In DEXA 98, pages 198-203, 1998. http://www.pentaho.org (16.06.2006), 2006.
[13] Antoaneta Ivanova, Boris Rachev. Multidimensional Models: Constructing DATA CUBE. International Conference on Computer Systems and Technologies, CompSysTech 2004.
[14] Torben Bach Pedersen, Christian S. Jensen. Multidimensional Database Technology. Aalborg University.
[15] Rakesh Agrawal, Ashish Gupta, and Sunita Sarawagi. Modeling multidimensional databases. Research Report, IBM Almaden Research Center, San Jose, California, 1996.
[16] P. Vassiliadis and T.K. Sellis. A Survey of Logical Models for OLAP Databases. ACM SIGMOD Record, vol. 28, no. 4, 1999.
[17] L. Cabibbo and R. Torlone. A logical approach to multidimensional databases. In Proc. of EDBT-98, 1998.
[18] Deepti Mishra, Ali Yazici, Beril Pinar Basaran. A Case Study of Data Models in Data Warehousing.
[19] Anirban Sarkar, Swapan Bhattacharya. Object Relational Implementation of Graph Based Conceptual Level Multidimensional Data Model.


WIRELESS SENSOR NETWORKS USING CLUSTERING PROTOCOL


Gurpreet Singh*, Shivani Kang**
Faculty of Computer Applications*, Dept. of Computer Science**
Chandigarh Group of Colleges, Gharaun Campus, Mohali, Punjab.
gp_cgc@yahoo.com*

Abstract: This paper presents the Centralized Energy Management System (CEMS), a dynamic fault-tolerant reclustering protocol for wireless sensor networks. CEMS reconfigures a homogeneous network both periodically and in response to critical events (e.g. cluster head death). A global TDMA schedule prevents costly retransmissions due to collision, and a genetic algorithm running on the base station computes cluster assignments in concert with a head selection algorithm. CEMS performance is compared to the LEACH-C protocol in both normal and failure-prone conditions, with an emphasis on each protocol's ability to recover from unexpected loss of cluster heads.
Keywords: CEMS, TDMA, WSN, GA.

I. Introduction
Wireless sensor networks (WSNs) are increasingly deployed in a variety of environments and applications, ranging from the monitoring of medical conditions inside the human body to the reporting of mechanical stresses in buildings and bridges. In these and many other WSN applications the sensors cannot be recharged once placed, making energy expenditure the primary limiting factor in overall network lifetime. One standard WSN configuration consists of a set of sensors that communicate with the external world via a base station, or sink, that has no power constraints. The sensors number in the hundreds or even thousands, and are primarily constrained by a limited battery supply of available energy. While the sink is modeled as a single node, it may provide access to other systems upstream, such as distributed processing facilities or databases devoted to consolidating and cataloging the reported WSN data. Since the primary form of energy dissipation for wireless sensors is radio transmission and reception [1], a variety of network modifications have been proposed to limit radio use as much as possible. Sensor clustering at the network layer has been shown to be a scalable method of reducing energy dissipation. Rather than individual sensor nodes transmitting their data to the base station, they instead transmit to another sensor designated as the local cluster head. The cluster head then sends aggregated (and possibly compressed) sensor information to the sink as a single transmission. Note that clustering makes some nodes more important than others, while increasing the energy dissipation of those same nodes. This paper implements a novel reclustering technique that minimizes both energy expenditure and loss of network coverage due to the failure of cluster heads. CEMS (Centralized Energy Management System) moves almost all processing not directly related to data collection off of the energy-limited sensor nodes and onto the sink. Furthermore, the base station maintains a record of expected transmission times from the network's cluster heads, based on their location in the global TDMA schedule. If a cluster head consistently fails to transmit during its expected window of time, the sink triggers an emergency reclustering to restore network coverage.

II. Wireless Sensor Network

While WSNs have much in common with more traditional ad-hoc and infrastructure-mode wireless networks, they differ in several important ways. A WSN generally has a large number of sensors scattered over an area and a single node referred to as the base station, or sink, which is responsible for receiving data transmitted by sensors in the field. It bears some similarity to an access point in an infrastructure-mode network. The sink may or may not be located inside of the space being sensed, and is almost always considered to be a powered node operating without energy constraints. Depending on the application and configuration of the WSN, the base station may have additional responsibilities such as coordinating network activities, processing or formatting incoming data, or working with an upstream data analysis system to provide data matching any query requests that it receives [2]. In contrast to the single base station, WSNs may have hundreds or even thousands of sensors operating in

the field. These low-power devices are often battery powered and sometimes include solar panels or other alternative energy sources. During their limited lifespan (defined as the time interval during which sufficient energy remains to transmit data), sensors are tasked with monitoring a single aspect of their surrounding environment and reporting their sensed data via an onboard radio transceiver. Given the above characteristics, a few important differences from standard WLANs become apparent:
Most traffic is upstream, from the sensors to the base station. The small amount of downstream traffic tends to be dominated by broadcast traffic from the base station to provide generic updates to all sensor nodes.
Power use is a key performance metric, as sensors are battery powered.
Network links tend to be low-capacity, as high throughput is energy intensive and often unnecessary.
Long delays may be acceptable for many WSN applications (e.g. a network monitoring soil pH will be relatively immune to high latencies).

III. Components of Wireless Sensor Network

WSNs employ a variety of hardware platforms and software systems. Sensors themselves vary in size from a few millimeters upward, and even within similar sensor platforms the radio transceivers, sensors, and microprocessor facilities may vary. Given the wide variety of platforms available, any protocols developed for a WSN must consider the characteristics of the underlying hardware on which they will operate.

Hardware
Wireless sensor nodes have shrunk significantly as MEMS technology has progressed. Individual components are now often integrated into the same chip, and hardware design has evolved to reflect this change. A sensor node is composed of several independent components linked together to form one operative package. A power supply provides the necessary energy to a sensing unit designed to monitor the environment and produce a representative signal, a processor governing sensor operations, onboard flash memory, and the radio transceiver responsible for linking the sensor node to the rest of the network.

Sensing Unit
At their simplest, sensors are designed to generate a signal corresponding to some changing quantity in the surrounding environment. This may be anything from a simple Peltier diode to measure temperature (such as the Microchip TC74) to something as complex as a charge-coupled device to monitor video input (such as the Omnivision OV7640). The data are sent to an onboard microprocessor after being converted to a digital signal by the sensor electronics, which may perform some processing or culling before electing to transmit the information over the module's transceiver. Generally sensors are procured and attached to WSN sensor platforms by the purchaser, and are often manufactured by different companies than those which provide the platform itself.

Processing Unit
A sensor's processor must, at minimum, serve as an effective interface to the sensor module and regulate data flow from the sensing unit to the radio transceiver. There are currently three popular types of processing unit in general use: microcontrollers, microprocessors, and Field-Programmable Gate Arrays (FPGAs) [4]. Onboard storage, often in the form of flash memory, is also often included as part of a sensor's processing unit. Microcontrollers such as the 16-bit TI MSP430 [9] and the 8-bit Atmel AVR are the simplest and most common form of processor; they are unable to support complex operations, but they run at a low clock speed and consume the least amount of power. They are most often used when little data processing or decision making is necessary. Microprocessors such as the 32-bit Intel XScale are more general-purpose CPUs, and are potentially much more powerful than microcontrollers, with significantly higher clock speeds and more flexibility in terms of their programming. Field-Programmable Gate Arrays (FPGAs) use a hardware description language to allow sensor modules to be reconfigured in the field to rapidly process the data that their sensor units are reporting. This can be invaluable for real-time surveillance networks and target tracking, where image processing algorithms can be implemented at the hardware level without purchasing a dedicated GPU.

FPGAs are also the highest energy consumers of the three processor types, and may not be compatible with general-purpose WSN software systems.

Software
Software written for wireless sensor networks differs from that for more conventional platforms in several respects. The vast majority of WSN operating systems and network protocols that have been produced in academia or in the commercial sector are power-aware, due to the limited amount of energy available to sensor nodes. From a software design standpoint, this necessitates optimizing algorithms and program architectures to minimize the amount of energy dissipated per operation. Efficiency of execution in terms of running time is still a concern, but is of secondary importance. If an algorithm could finish rapidly but consume more power than a slower implementation, the slower version might still be selected for use in a WSN. Furthermore, sensor nodes are extremely limited in terms of resources. On-board RAM capacity is extremely small due to the energy drain of volatile memory. Software systems must therefore rely primarily on register-based operations and any flash-based storage medium that might be present. This requires that a program use a limited number of often-accessed data structures, and that it perform computations using as little memory as possible. Two other distinguishing features of WSN software arise less from hardware limitations and more from environmental constraints [5]. Since sensors can be deployed in a potentially inaccessible field (e.g. underwater, inside walls, in a combat zone), WSN software systems must be able to run unattended for long periods of time. Any logical or physical faults should be able to be dealt with, worked around, or minimized in impact without the intervention of human agencies. Support for any kind of graphical user interface, or even a terminal interface in field conditions, is not generally provided. Sensors may, however, be reprogrammed or configured in controlled conditions via software running on an external machine to which individual nodes may be connected.

Operating Systems
WSN-specific operating systems are distinguished from existing embedded OSs such as ChibiOS/RT or Nucleus RTOS by their lack of real-time processing constraints. Sensor networks are rarely interactive, and only a few specific applications such as live video surveillance impose any strict time constraints on data acquisition and processing. Since sensor hardware is often extremely limited in terms of both resources and available energy, a small footprint and efficient use of memory and processor cycles are key requirements for any WSN operating system (e.g. TinyOS).
WSN Simulation
Wireless sensor network research is largely directed at improving the energy efficiency, coverage, reliability, and security of sensors and networks. This translates into a need for detailed information about conditions on the lower levels of the network stack in an ad-hoc wireless environment. Conventional network simulators often have more support for packet-level network-layer simulation than for, e.g., frame-level information gathering and radio energy dissipation modules. To meet this need, a number of simulators have evolved or been adapted to serve the needs of WSNs (e.g. GloMoSim).

IV. The Centralized Energy Management System (CEMS)

The Centralized Energy Management System is a clustering protocol that exploits the predictable nature of TDMA-based channel access to rapidly detect and respond to critical failures. Almost all energy-intensive operations (such as cluster formation) are moved upstream to the base station, which is assumed to have no energy constraints. CEMS has two distinct phases: cluster formation and steady-state operation. The former is run at the beginning of each reclustering phase, which occurs both periodically and in response to cluster head death. The base station calculates cluster assignments and notifies the new cluster heads. If all heads acknowledge, the steady-state phase is initiated. During this phase, sensors report data to their cluster heads. The data are then compressed and aggregated before being forwarded to the base station. Two assumptions governed the creation of this system: the optimal clustering configuration changes over time as the residual energy of cluster heads decreases, and any node, including a cluster head, has a non-zero probability of failing at a given time due to random accidents.

The sink maintains state information on each node in the network, consisting of its location and its projected amount of residual energy. The assignment of nodes to clusters is calculated using a genetic algorithm (GA) which considers the nodes' spatial positions, and the assignment of a cluster head to each cluster is calculated using node energy and position. The number of cluster heads, determined a priori, is an input parameter to the system.
Clustering Phase
CEMS's clustering phase is initiated at network startup and at each subsequent reclustering, whether due to periodic triggers or in response to cluster head failure. Selection of cluster heads and cluster members is divided into two stages. A genetic algorithm first determines cluster membership for each sensor in the network during the cluster formation stage. This information, along with spatial coordinates and current energy levels, is then passed to a cluster head selection algorithm during the head selection stage. Once both cluster heads and members have been determined, the sink informs each sensor of its new assignment during the sensor notification stage. The genetic algorithm which determines cluster membership is implemented with the GAlib C++ library. It uses a fixed-length list of integers to describe a genome representing a potential network topology. The genome's length is always equal to the current number of living sensors, and each value in the list signifies a cluster ID (the number of clusters is determined a priori). Selection is accomplished through the minimizing objective function presented in Figure 1. First, a centroid for each cluster is determined. Each cluster is then assigned a score based on the sum of the squared distances between each cluster member and that cluster's centroid. The sum of all cluster scores is used as the objective score for that genome [7]. Each individual has a probability of being chosen for mating equal to its fitness score divided by the sum of fitness scores over that generation. Two individuals are chosen each generation, and the highest scoring genome is selected.

Figure 1 - Objective Function. For a genome with m clusters, each cluster z is assigned the score s_z, the sum over its n member sensors of the squared distance between member z_si and the cluster centroid z_c; the genome's objective score is the sum of s_z over all m clusters (lower is better). Here z_c is the centroid for cluster z, z_si is sensor i in cluster z, s_z is the score for cluster z, n is the number of sensors in a given cluster, and m is the number of clusters in the genome.
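A minimal Python sketch of this minimizing objective function, assuming sensor positions are 2-D coordinates and a genome is simply a list mapping each living sensor to a cluster ID; the variable names are ours, and this is not the GAlib-based implementation used in CEMS.

def objective(genome, positions):
    """Sum over clusters of squared distances from each member to its cluster centroid."""
    clusters = {}
    for sensor, cluster_id in enumerate(genome):
        clusters.setdefault(cluster_id, []).append(positions[sensor])
    total = 0.0
    for members in clusters.values():
        cx = sum(x for x, _ in members) / len(members)   # cluster centroid z_c
        cy = sum(y for _, y in members) / len(members)
        total += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)  # score s_z
    return total   # lower is better: the GA minimizes this score

# Hypothetical 5-sensor network assigned to 2 clusters.
positions = [(0, 0), (1, 1), (0, 1), (10, 10), (11, 9)]
genome = [0, 0, 0, 1, 1]
print(objective(genome, positions))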

Sensor Notification
After sensors have been assigned and heads elected to clusters, the base station broadcasts a message to each cluster head informing it of its new role, which sensors are in its cluster, the distance it must transmit to reach the base station, and the distances that its members must transmit [9]. The cluster head relays distance and membership data to each sensor in its cluster and sends an acknowledgement to the base station. Once all acknowledgments are received, the sink initiates the network's steady-state phase. If not all cluster heads send an acknowledgement before a timeout window expires, the sink reclusters and increases the missed transmission count of any cluster head which failed to acknowledge. Any sensor with three consecutive missed transmissions will be declared dead and removed from future clustering assignments.

TDMA Scheduling
CEMS employs a global TDMA schedule (i.e. all sensors and clusters participate) to manage channel access among the sensors and the base station. There is a single broadcast slot at the beginning of each cycle, while the remaining slots are strictly unicast. Each sensor is given a unique slot in the TDMA schedule during each reclustering phase. Note that there is no guarantee that a sensor's slot will be the same in two different rounds of operation. All sensors, including cluster heads, transmit to the base station on their slot. All nodes must also listen on slot 0, which is reserved for broadcast communications from the base station. Furthermore, cluster heads must listen during each slot in their cluster's range to receive data from their members. To minimize hardware delays resulting from switching between sleep and wake states, the slots within a single cluster always form a contiguous block of slots.

Sensors are in sleep mode at all other times, with their radio electronics turned completely off.
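The slot layout described above might be sketched as follows, assuming cluster membership is already known; the helper name and the example clusters are illustrative only. Slot 0 is the broadcast slot, and each cluster's members receive a contiguous block of unicast slots.

def build_tdma_schedule(clusters):
    """clusters: dict mapping cluster head id -> list of member ids (head included).
    Returns a dict sensor_id -> slot number; slot 0 is reserved for base-station broadcast."""
    schedule = {}
    slot = 1                      # slot 0: broadcast from the sink, everyone listens
    for head, members in clusters.items():
        for sensor in members:    # contiguous block per cluster limits sleep/wake switching
            schedule[sensor] = slot
            slot += 1
    return schedule

# Hypothetical two-cluster network with heads 3 and 7.
clusters = {3: [3, 1, 2], 7: [7, 5, 6]}
print(build_tdma_schedule(clusters))   # {3: 1, 1: 2, 2: 3, 7: 4, 5: 5, 6: 6}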

Failure Detection and Emergency Reclustering

CEMS uses the periodic nature of its global TDMA cycle to rapidly recover from coverage loss due to cluster head death. At the beginning of each steadystate phase, the base station computes the expected transmission times of each cluster head using its TDMA slot and the overall cycle length. If any cluster head fails to transmit during its expected time, the sink increments that sensors missed transmission count. Three missed transmissions result in that sensor being labeled as dead, and trigger an emergency reclustering event to reconnect the cluster to the WSN. A successfully received transmission resets the sensors missed transmission count. Note that a tradeoff exists between the recovery period and accurate classification of cluster head death [10]. The more missed transmissions required before a sensor is declared dead, the longer a cluster may be offline before emergency reclustering is triggered. A small missed transmission count, however, is vulnerable to false positives. In the field a sensors transmissions may be blocked by a mobile obstacle (e.g. a passing vehicle), interfered with by a spike in radio noise, etc. Misinterpretation of these temporary problems as permanent sensor death could lead to unnecessary energy expenditure and downtime due to reclustering.

Reclustering Period
The ideal duration of each reclustering period in CEMS is application-specific. Figures 2 and 3 show the tradeoff between coverage and lifetime for reclustering periods of 20 hours and 100 hours, respectively. Each graph shows the residual energy over time for each sensor in the network. Given an initial population of one hundred sensors, the simulation begins with five clusters. Sharp declines in energy correspond to being made a cluster head, while gradual energy loss represents cluster membership. The relatively narrow gap between the sensor with the lowest residual energy and that with the highest residual energy in Figure 2 indicates a fairly even balancing of energy costs over the network. This preserves coverage for as long as possible, after which all sensors die within a few hours of each other. Figure 3, conversely, begins to lose sensors almost immediately. In this configuration, cluster heads lost so much energy before being reassigned as cluster members that they die shortly after reclustering. Therefore, in dense networks with overlapping areas of coverage, or for networks which do not require complete coverage, a long reclustering period may be preferable. For networks where coverage must be maintained for as long as possible, a shorter reclustering period is desirable.

Figure 2: 20 Hour Reclustering Period

Figure 3: 100 Hour Reclustering Period

V. Conclusion

By reclustering infrequently, a network may have a longer operative lifespan at the expense of early and increasingly common gaps in its coverage. For denser networks or those monitoring conditions likely to

register on multiple sensors, this may be an acceptable tradeoff. For sparser or more precise networks, however, a decreased lifespan may be an acceptable cost for ensuring good coverage. A further tradeoff must be made between the number of clusters in a network and the expected failure rate of sensors due to accidents. A small number of cluster heads causes a significant loss of coverage if one of them fails, while a larger number of cluster heads causes a proportionally smaller coverage loss. A synchronized global TDMA schedule allows the base station to predict when transmissions from given cluster heads are expected and ensures that no sensors will act as hidden terminals. CEMS uses the former feature to implement a quick recovery system that rapidly restores network coverage in the event of cluster head death.

VI. References

[1] Heinzelman, W.; Chandrakasan, A.; Balakrishnan, H., "Energy-efficient communication protocol for wireless microsensor networks," Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, vol. 2, 10 pp., 4-7 Jan. 2000.
[2] Heinzelman, W.; Chandrakasan, A.; Balakrishnan, H., "An Application-Specific Protocol Architecture for Wireless Microsensor Networks," IEEE Transactions on Wireless Communications 1 (2002).
[3] Voigt, T.; Dunkels, A.; Alonso, J.; Ritter, H.; Schiller, J., "Solar-aware clustering in wireless sensor networks," in Proceedings of the Ninth International Symposium on Computers and Communications (ISCC'04), Volume 2 (June 28 - July 01, 2004), IEEE Computer Society, Washington, DC, 238-243.
[4] Manjeshwar, A.; Agrawal, D.P., "TEEN: a routing protocol for enhanced efficiency in wireless sensor networks," in Parallel and Distributed Processing Symposium, Proceedings 15th International, pages 2009-2015.
[5] Tang, Q.; Tummala, N.; Gupta, S.; Schwiebert, L., "Communication Scheduling to Minimize Thermal Effects of Implanted Biosensor Networks in Homogeneous Tissue," IEEE Transactions on Biomedical Engineering 52 (2005): 1285-1293.
[6] Mudundi, S.R.; Hasham, H.A., "A New Robust Genetic Algorithm for Dynamic Cluster Formation in Wireless Sensor Networks," Proceedings of the Seventh IASTED International Conference (2007).
[7] Hussain, S.; Matin, A.W.; Islam, O., "Genetic Algorithm for Energy Efficient Clusters in Wireless Sensor Networks," pp. 147-154, International Conference on Information Technology (ITNG'07), 2007.
[8] Grefenstette, J., "Optimization of control parameters for genetic algorithms," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-16(1), pp. 122-128, 1986.
[9] Haupt, R., "Optimum population size and mutation rate for a simple real genetic algorithm that optimizes array factors," Proc. of Antennas and Propagation Society International Symposium, 2000, Salt Lake City, Utah.
[10] Varga, A., OMNeT++ Discrete Event Simulation System. Computer software, Vers. 3.2. OMNeT++ Community Site. <http://www.omnetpp.org/>.


Performance Evaluation of Route optimization Schemes Using NS2 Simulation


MANOJ MATHUR1, SUNITA MALIK2, VIKAS3
Electronics & Communication Engineering Department D.C.R.U.S.T., MURTHAL (Haryana)
1 mnjmathur03@gmail.com, 2 sntmlk@yahoo.co.in, 3 vikaspanchal93@gmail.com

Abstract- In MIPv4 (Mobile Internet Protocol version 4) the main problem is triangular routing. A mobile node is able to deliver packets to a corresponding node directly through the foreign agent, but when the corresponding node sends a packet to the mobile node, the packet first reaches the foreign agent via the home agent and only then reaches the mobile node. This asymmetry is called triangle routing. It leads to many problems, like load on the network and delay in delivering packets. The next generation IPv6 is designed to overcome this kind of problem (triangle routing). To solve the triangle routing problem, three different route optimization schemes are used which exclude the inefficient routing paths by creating the shortest routing path: Liebsch's Route Optimization scheme, the Light Weight Route Optimization scheme, and the Enhanced Light Weight Route Optimization scheme. Throughput and packet delivery fraction are taken as the performance metrics to compare these three schemes using NS-2 simulations. Throughput is the rate of communication per unit time. Packet delivery fraction (PDF) is the ratio of the data packets delivered to the destinations to those generated by the CBR sources. Using these parameters, it is found that the Enhanced Light Weight Route Optimization scheme performs better than Liebsch's Route Optimization scheme and the Light Weight Route Optimization scheme.
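To make the two metrics concrete, the following illustrative Python sketch computes them from packet and bit counts; the numbers are invented stand-ins for values that would, in practice, be parsed from an NS-2 trace file.

def throughput(bits_received, duration_s):
    """Throughput: amount of data delivered per unit time (here bits per second)."""
    return bits_received / duration_s

def packet_delivery_fraction(packets_delivered, packets_generated):
    """PDF: ratio of data packets delivered to destinations to those generated by CBR sources."""
    return packets_delivered / packets_generated

# Hypothetical counts standing in for values parsed from an NS-2 trace.
print(throughput(bits_received=4_000_000, duration_s=50))                        # 80000.0 bit/s
print(packet_delivery_fraction(packets_delivered=950, packets_generated=1000))   # 0.95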

I. INTRODUCTION
With the growth of wireless network technology, the demand for accessing mobile networks has increased dramatically. Mobile Internet Protocol version 6 is a mobility protocol standardized by the Internet Engineering Task Force (IETF). In Mobile Internet Protocol version 6, communications are maintained even though the mobile node (MN) moves from its home network to a foreign network. This is because the MN sends a Binding Update (BU) message to its Home Agent (HA), located in the home network, to report its location information whenever the MN hands off (moves) to another network. For mobile nodes in the Internet, this requires that the MNs maintain mobility-related information and create their own mobility signaling messages; in other words, it burdens MNs that have limited processing power, battery and memory resources. To overcome such limitations, the IETF has proposed the Proxy Mobile IPv6 (PMIPv6) protocol. In PMIPv6, the MN's mobility is guaranteed by newly proposed network entities such as the local mobility anchor (LMA) and the mobile access gateway (MAG). PMIPv6 causes the triangle routing problem, which results in inefficient routing paths. In order to establish efficient routing paths, three different Route Optimization (RO) schemes have been introduced, which exclude the inefficient routing paths by creating the shortest routing path. The RO schemes use correspondent information (CI) messages. These are Liebsch's Route Optimization scheme, the Light Weight Route Optimization scheme, and the Enhanced Light Weight Route Optimization scheme. In this paper we compare these three schemes using NS-2 simulations.
II. Terminology
1. Local Mobility Anchor: The LMA is the Home Agent of an MN in a PMIPv6 domain. It is the topological anchor point for the MN's home network. The LMA provides charging and billing services to the MN when the MN accesses network resources and services, so communication between the MNs must pass through the LMA.
2. Mobile Access Gateway: It is the functional element that manages mobility-related signalling on behalf of MNs. It is responsible for detecting the MN's attachment to or detachment from an access network.
3. Binding Cache Entry: It provides the route information about a communicating node in the network. It can exist either within a Local Mobility Anchor or in a Mobile Access Gateway.


4. Proxy Mobility Agent: The PMA is the proxy mobility agent that resides in each of the access routers within a mobility domain. The PMA helps to send a proxy binding update to the LMA on behalf of the mobile node.
5. Proxy Binding Update: The PBU is a signaling message sent by a Mobile Access Gateway to the MN's Local Mobility Anchor for establishing or tearing down a binding for the Mobile Node. The Local Mobility Anchor acts as the home agent. The PBU informs the Local Mobility Anchor that the MN is now connected to or disconnected from the MAG.
6. Proxy Binding Acknowledgment (PBA): A PBA is a response message sent by an LMA in response to a PBU that it (the LMA) earlier received from the corresponding MAG. A success or positive response indicates that the MAG can start transmitting data packets on behalf of the MN through the responding LMA to the MN's correspondent node(s).
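The PBU/PBA exchange can be pictured as a simple registration dialogue between a MAG and the LMA. The sketch below is only an illustration of the fields discussed above (MN identifier, serving MAG, attachment state); the class and function names are hypothetical assumptions made for this sketch, not part of the paper or of any PMIPv6 implementation.

```python
# Minimal illustration of the PBU/PBA exchange described above.
# All names (ProxyBindingUpdate, handle_pbu, etc.) are hypothetical.
from dataclasses import dataclass

@dataclass
class ProxyBindingUpdate:          # sent by a MAG to the LMA
    mn_id: str                     # identifier of the mobile node
    mag_address: str               # MAG the MN is currently attached to
    attached: bool                 # True = registration, False = de-registration

@dataclass
class ProxyBindingAck:             # LMA's response to a PBU
    mn_id: str
    accepted: bool                 # positive response: MAG may forward MN traffic

class LocalMobilityAnchor:
    def __init__(self):
        self.binding_cache = {}    # mn_id -> serving MAG address

    def handle_pbu(self, pbu: ProxyBindingUpdate) -> ProxyBindingAck:
        if pbu.attached:
            self.binding_cache[pbu.mn_id] = pbu.mag_address   # register MN at this MAG
        else:
            self.binding_cache.pop(pbu.mn_id, None)           # remove stale binding
        return ProxyBindingAck(mn_id=pbu.mn_id, accepted=True)

# Example: MAG1 registers MN1 with the LMA on attachment.
lma = LocalMobilityAnchor()
ack = lma.handle_pbu(ProxyBindingUpdate("MN1", "MAG1", attached=True))
print(ack, lma.binding_cache)
```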

III THE RO SCHEMES
1. Liebsch's RO Scheme: In Liebsch's route optimization, the Local Mobility Anchor and the Mobile Access Gateways exchange RO messages to establish an RO path for the mobile nodes. It is the LMA which makes packet delivery possible between MN1 and MN2. When MN1 sends packets to MN2, the Local Mobility Anchor triggers RO for the data packets sent from MN1 to MN2. This is because the LMA has all the network topology information in the localized mobility domain. At the beginning of the route optimization procedure, the LMA sends the RO Init message to Mobile Access Gateway 2 (MAG2). MAG2 then sends an RO Init Acknowledgement back to the Local Mobility Anchor. The Local Mobility Anchor next sends the RO Setup message to MAG1, and MAG1 sends an RO Setup Acknowledgment message back to the LMA. Once the LMA has sent and received the same messages for MAG2, the RO procedure is finished. Data packets are then delivered directly between Mobile Node 1 and Mobile Node 2 due to the effect of the RO.
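The LMA-driven handshake described above (RO Init, RO Init Ack, RO Setup, RO Setup Ack) can be summarised as an ordered message exchange. The sketch below simply replays that ordering; the function name and the printed trace are illustrative assumptions, not part of Liebsch's specification or of the NS-2 simulation.

```python
# Illustrative replay of the LMA-driven signaling order in Liebsch's RO scheme.
# Entity and message names are assumptions made for this sketch.
def liebsch_ro_signaling(lma="LMA", mag1="MAG1", mag2="MAG2"):
    exchange = [
        (lma,  mag2, "RO Init"),          # LMA asks MAG2 to prepare the RO path
        (mag2, lma,  "RO Init Ack"),
        (lma,  mag1, "RO Setup"),         # LMA then configures MAG1
        (mag1, lma,  "RO Setup Ack"),
        (lma,  mag2, "RO Setup"),         # the same setup messages are exchanged with MAG2
        (mag2, lma,  "RO Setup Ack"),
    ]
    for sender, receiver, message in exchange:
        print(f"{sender} -> {receiver}: {message}")
    print("RO path established: MN1 <-> MAG1 <-> MAG2 <-> MN2")

liebsch_ro_signaling()
```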

2. Light Weight Route Optimization Scheme (LWRO): In the Light Weight Route Optimization scheme, the Local Mobility Anchor and the Mobile Access Gateways are again used to establish the route optimization path between the mobile nodes. Here Mobile Node 1 is connected to Mobile Access Gateway 1 and Mobile Node 2 is connected to Mobile Access Gateway 2. The packets from Mobile Node 1 to Mobile Node 2 initially pass through the Local Mobility Anchor. When the Local Mobility Anchor receives a packet, it knows the path for the packets to Mobile Access Gateway 2, and at the same time it sends a Correspondent Binding Update to Mobile Access Gateway 2. Mobile Access Gateway 1 receives the Correspondent Binding Acknowledgment, and the packets are then delivered from Mobile Access Gateway 2 to Mobile Node 2. Thus, packets from MN1 destined for MN2 are intercepted by Mobile Access Gateway 1 and forwarded directly to Mobile Access Gateway 2, instead of being forwarded to the Local Mobility Anchor.
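Once the correspondent binding is in place, the essential change in LWRO is the forwarding decision taken at the sender-side MAG: if it holds a correspondent binding entry for the destination MN, it tunnels the packet straight to the peer MAG instead of to the LMA. The sketch below illustrates only that decision; the data structures and names are hypothetical, not taken from the scheme's specification.

```python
# Hypothetical sketch of the MAG forwarding decision after LWRO signaling completes.
class MobileAccessGateway:
    def __init__(self, name, lma_address):
        self.name = name
        self.lma_address = lma_address
        self.correspondent_bindings = {}   # destination MN -> peer MAG address

    def install_binding(self, dest_mn, peer_mag):
        """Installed once the Correspondent Binding Update/Acknowledgment exchange completes."""
        self.correspondent_bindings[dest_mn] = peer_mag

    def forward(self, dest_mn, packet):
        # Default path goes through the LMA; an installed binding short-cuts to the peer MAG.
        next_hop = self.correspondent_bindings.get(dest_mn, self.lma_address)
        print(f"{self.name}: forwarding {packet} for {dest_mn} to {next_hop}")
        return next_hop

mag1 = MobileAccessGateway("MAG1", lma_address="LMA")
mag1.forward("MN2", "data-1")            # before RO: goes via the LMA
mag1.install_binding("MN2", "MAG2")      # after the binding update is acknowledged
mag1.forward("MN2", "data-2")            # after RO: tunneled directly to MAG2
```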


3. Enhanced Light Weight Route Optimization Scheme (ELWRO): In the Enhanced Light Weight Route Optimization scheme, the Local Mobility Anchor and the Mobile Access Gateways are used to establish the route optimization path between the mobile nodes. In the ELWRO scheme, Correspondent Binding Information (CBI) messages are used. Suppose MN1 sends data packets to MN2. First of all, MN1 sends the data packets to Mobile Access Gateway 1, and then MAG1 sends the data packets to the Local Mobility Anchor. The LMA knows that a route optimization path can be set up, so it sends a Correspondent Binding Information (CBI) message to MAG1; the CBI message includes MN1's address, MN2's address, and MAG2's address. When MAG1 receives the CBI message, it sends a Correspondent Binding Update (CBU) message to MAG2; the CBU message includes MN1's address, MN2's address and MAG1's address. MAG2 then sends a Correspondent Binding Acknowledgment (CBA) message to MAG1 to complete the Correspondent Binding (CB). Now the packets are exchanged directly between MN1 and MN2.
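ELWRO differs mainly in the information carried by the CBI, CBU and CBA messages described above. The sketch below spells out those fields exactly as the text lists them; the class names and the three-step driver are assumptions made for illustration only.

```python
# Illustration of the CBI -> CBU -> CBA exchange in the ELWRO description above.
# Class and field names are assumptions; only the addresses listed in the text are carried.
from dataclasses import dataclass

@dataclass
class CorrespondentBindingInformation:   # LMA -> MAG1
    mn1_address: str
    mn2_address: str
    mag2_address: str

@dataclass
class CorrespondentBindingUpdate:        # MAG1 -> MAG2
    mn1_address: str
    mn2_address: str
    mag1_address: str

@dataclass
class CorrespondentBindingAck:           # MAG2 -> MAG1
    mn1_address: str
    mn2_address: str
    accepted: bool = True

def elwro_exchange():
    cbi = CorrespondentBindingInformation("MN1", "MN2", "MAG2")                 # step 1
    cbu = CorrespondentBindingUpdate(cbi.mn1_address, cbi.mn2_address, "MAG1")  # step 2
    cba = CorrespondentBindingAck(cbu.mn1_address, cbu.mn2_address)             # step 3
    print(cbi, cbu, cba, sep="\n")
    print("Correspondent binding complete: MN1 and MN2 exchange packets via MAG1 <-> MAG2")

elwro_exchange()
```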

IV PERFORMANCE METRICS
The performance metrics for the above three schemes are given by:
1) Throughput: Defined as the rate of communication per unit time.
TH = SP / PT
where SP is the number of sent packets and PT is the pause time.

2) Packet Delivery Fraction: Defined as the ratio of data packets delivered to the destination to those generated by the CBR sources.
PDF = (SPD / GPCBR) * 100
where SPD is the number of packets delivered to the destination and GPCBR is the number of packets generated by the CBR sources.

V PERFORMANCE RESULT
A) Throughput: As indicated in the graph, the Enhanced Light Weight Route Optimization scheme performs better than Liebsch's and the Light Weight Route Optimization schemes. In ELWRO the rate of communication of packets is higher with respect to pause time, and packets are transmitted between the CN and the MN faster.
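Given a packet trace from an NS-2 run, the two metrics defined above reduce to simple ratios. The helper below is a sketch under the paper's definitions (TH = SP/PT and PDF = SPD/GPCBR * 100); the function names and the example numbers are illustrative, not results from the paper.

```python
# Sketch of the two performance metrics defined above.
def throughput(sent_packets: int, pause_time: float) -> float:
    """TH = SP / PT : packets communicated per unit pause time."""
    return sent_packets / pause_time

def packet_delivery_fraction(delivered: int, generated_by_cbr: int) -> float:
    """PDF = (SPD / GPCBR) * 100 : percentage of CBR packets that reached the destination."""
    return 100.0 * delivered / generated_by_cbr

# Hypothetical example values, only to show the calculation.
print(throughput(sent_packets=900, pause_time=10.0))                     # 90.0
print(packet_delivery_fraction(delivered=870, generated_by_cbr=900))     # ~96.7 %
```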


B) Packet Delivery Fraction: As indicated in the graph, the Enhanced Light Weight Route Optimization scheme performs better than Liebsch's and the Light Weight Route Optimization schemes, and packets are transmitted between the CN and the MN faster.

VI CONCLUSION
In this paper, we have introduced the operation of three RO schemes that solve the triangle routing problem and provided the results of a performance evaluation. The results of the throughput and packet delivery fraction evaluations show that the performance of our ELWRO scheme is better than that of Liebsch's route optimization scheme and the LWRO scheme.

REFERENCES
[1] D. Johnson, Scalable Support for Transparent Mobile Host Internetworking, in Mobile Computing, edited by T. Imielinski and H. Korth.
[2] RFC 2002, IP Mobility Support.
[3] C.E. Perkins, Mobile IP: Design Principles and Practices.
[4] IETF Working Group on Mobile IP.
[5] Effect of Triangular Routing in Mixed IPv4/IPv6 Networks.
[6] Route Optimization Mechanisms Performance Evaluation in Proxy Mobile IPv6, 2009 Fourth International Conference on Systems and Networks Communications.
[7] J. Lee, et al., "A Comparative Signaling Cost Analysis of Hierarchical Mobile IPv6 and Proxy Mobile IPv6", IEEE PIMRC 2008, pp. 1-6, September 2008.
[8] Deering, S. and Hinden, R., Internet Protocol, Version 6 (IPv6) Specification, IETF, RFC 2460, December 1998.


IT-Specific SCM Practices in Indian Industries: An Investigation


Sanjay Jharkharia
Associate Professor, QMOM Area
Indian Institute of Management Kozhikode

Abstract- To capture the issues related to Information Technology and Supply Chain Management in Indian industries, a survey of manufacturing industries was conducted. The objective of the survey includes understanding the status and practices of IT-specific supply chain management in Indian manufacturing industries. Some other relevant issues which are not exclusively in the domain of IT but greatly influence the performance of supply chains are also discussed and investigated in this survey; for example, the aspects of performance measurement of supply chains are also addressed. The survey outcome has been compared with previous surveys conducted in similar areas in a global or Indian context. It is observed that the companies are using IT in their supply chain activities but the benefits of IT-enablement are not fully realized due to various reasons, which the authors have explored in this report.
Index Terms - Supply chain, Information technology, Survey methodology

I INTRODUCTION
Since the economic reforms, which started in India in 1991, the competitive environment for Indian companies has become more complex. This has led to more focus on customer service for survival in the global market. As the market is flooded with new and innovative products, cost, quality and responsiveness have to play a greater role in the survival of a company. Further, the increased requirement for greater responsiveness and the shortening of product life cycles create an uncertain environment. Many companies have identified supply chain management (SCM) as a way to effectively tackle these situations. Harland (1997) describes supply chain management as managing business activities and relationships: (i) internally within an organization, (ii) with immediate suppliers, (iii) with first and second-tier suppliers and customers along the supply chain, and (iv) with the entire supply chain. In the manufacturing sector, purchased goods and services represent a significant part of the value of the product. The firms in this sector depend on many suppliers and service providers. Therefore, a firm's competitive

advantage no longer lies within its own boundary but also depends on the links which form its supply chain. Parlar and Weng (1997) investigated the relationship between the manufacturing and supply chain functions. Through a mathematical modeling approach, they demonstrated that the two functions should be coordinated because the costs associated with a second round of supply and production, to meet unsatisfied demand, are much higher than those for the first production run.

There is a growing confidence that the adoption of supply chain management is essential for companies to compete in the global market. There is also a perception that supply chain management helps in improving flexibilities at different levels of operations. Tully (1994) found that firms are achieving volume, design, and technology flexibilities through SCM. However, successful implementation of supply chain management requires a shift in the paradigms, structures, policies and behaviors in the organizations. In order to monitor and facilitate this transition, it is important to develop an understanding of the existing scenario of supply chain management. In supply chains, information technology (IT) can play the role of a facilitator for information sharing. The advancements in IT can be used to share information on a real-time basis. Many companies are now deploying IT tools to integrate their supply chains and make them IT-enabled. Therefore, the issues related to the IT-enablement of supply chains are now more important and relevant. A questionnaire-based survey of Indian manufacturing companies was undertaken to address the related issues. Whether these companies are in line with the global practices on supply chain and IT applications or lag in adopting these advanced practices is the focal issue that lies at the core of this research.

II INDIAN MANUFACTURING INDUSTRIES AND SUPPLY CHAINS
According to the global competitiveness report (2009-10), India ranks 49th out of 133 countries in the global competitiveness index in the year 2009-10. This is an improvement from its rank of 57th in the year 2001-02. The competitiveness of the manufacturing sector of a country has a significant role in these rankings. Manufacturing in India is believed to be suffering from a neutrality syndrome (Korgaonker, 2000), which means little strategic emphasis and an overwhelming focus on decisions such as capacity planning, make or buy, etc. The stability of production is a major organizational goal for the Indian manufacturing industry. Information processing is still very much fragmented, even in the computerized applications area. The decision-making process in the companies is still based on traditional information processing, which is time-consuming and may yield insufficient or unreliable information. Departments and companies are internally managed according to their own goals rather than the goals of the whole organization or the supply chain (Sahay et al., 1997). Under such situations, organizations may not deliver superior value to their customers. In a developing country like India, where the market is diverse and fragmented, supply chain efficiency can bring remarkable benefits to the organizations (Kanungo et al., 1999). Supply chains of Indian manufacturing industries are characterized by weak infrastructure outside the organization (Kadambi, 2000) and a lack of supply chain policy (Sahay et al., 2001). However, many manufacturing companies are now in the process of integrating their supply chains to stay competitive in the market.

III ROLE OF IT IN SUPPLY CHAIN MANAGEMENT
As the business grows, the number of products and the geographical spread of the market begin to rise. At the same time, the role of information sharing becomes critical in order to manage the business. The information technology that facilitates information sharing also contributes to the reduction of lead times and shipment frequency by reducing the time and cost to process orders. The key findings of the KPMG 1997 global supply chain survey put IT as a major enabler of SCM (Freeman, 1998). Information technology plays its roles both internally and externally within the supply chain. It assists not only in the identification and fulfillment of customer needs but also enhances cross-functionality within the organization. This significantly reduces the cycle time for the planning process. The developments in information technology have made it possible for information to be available on a real-time basis to the supply chain partners. By integrating operations outside the organization, IT increases the accuracy of sales forecasts and helps manage inventories effectively. The reduction in inventory is possible because IT systems enable a company to make decisions based on real-time information rather than on guesses (Kwan, 1999). For effective utilization of the latest developments in IT, all the supply chain partners must have some minimum essential IT infrastructure. However, despite substantial investment in IT by companies, the relation between information technology investment and increase in performance has been extremely elusive. There are a number of anticipated benefits from technology investment, which include reduced costs, improved quality, increased flexibility, improved customer satisfaction, higher productivity and ultimately higher financial performance. However, in some cases traditional performance measures like return on investment and return on sales may not necessarily increase with IT usage, since the IT investment would simply be the cost of remaining competitive in the industry (Byrd and Marshall, 1997). The way IT could be deployed and maintained in a supply chain is a crucial issue (Scala and McGrawth, 1993), and this depends on many factors such as the maturity and compatibility of the IT tools that the supply chain partners use, the level of costs involved, strategic alliances among the supply chain partners, the competitiveness of the supply chain, the level of integration needed, etc.

IV REVIEW OF SURVEY PAPERS AND PROJECT OBJECTIVES
Many researchers have conducted surveys in the area of supply chain management. Some have also addressed the specific needs of manufacturing industries. From the literature review and previous empirical studies, it is observed that many researchers have investigated the use of SCM practices in organizations. These practices improve the overall deliverables of the organizations and the supply chains. However, the extent to which industries have really embraced these practices still needs to be examined. It is observed that there are some issues which are not discussed in the literature, and in a majority of cases sufficient time has elapsed since the past studies. Therefore, there is a need to take a fresh look into these issues, mainly in the Indian context. Are Indian manufacturing companies aware of and implementing the latest IT tools and supply chain management practices today? Addressing these basic questions is at the core of this research project. In the context of Indian supply chains, a survey was conducted by Kadambi (2000), but it addressed only a few supply chain issues. Further, it is based on only 32 responses. Another survey on SCM in India was conducted by Sahay et al. (2001), which was not a comprehensive one and was done about eight years before the present study undertaken by this author. Hence, the author is motivated to conduct a survey which not only assesses the status of supply chain management in India but also addresses the issues which have not been discussed in the past surveys. This study covered the following issues:
a) Supply chain strategy of the organizations
b) Application of IT tools for supply chain effectiveness
c) Adoption of supply chain practices by the organizations
d) Opinion of the organizations on certain supply chain issues like the benefits and flexibility of an IT-enabled supply chain, etc.
e) Issues relevant to supply chain performance measurement and opinion of the organizations on the indicators for supply chain performance

V RESEARCH METHODOLOGY
To address the supply chain issues in Indian manufacturing companies, a questionnaire-based survey was undertaken. The questionnaire was designed keeping in view the available literature and previous surveys. The practicing managers and academicians in the area of supply chain management were also consulted in developing the questionnaire. It was designed on a five-point Likert scale. As the response rate of such surveys is not enthusiastic and respondents are generally reluctant to spare time in filling these questionnaires, the questions were set close-ended, requiring less time and effort to fill in the questionnaire. Further, the Indian experience of mailed/postal surveys based on a random sample from an industrial database has not been encouraging. Therefore, to obtain a high response rate, convenience-randomized sampling was preferred in deriving the companies database. Accordingly, the targeted respondents of the questionnaire included: (i) working executives who are participating in IIMK's Executive Education Programmes, (ii) executives from the manufacturing industry in the National Capital Region Delhi, and (iii) other executives from the manufacturing sector who were easy to contact. The questionnaire was served in hard copy to those respondents who were approached personally (face to face) by the author; however, it was served in soft copy for the web-based survey. A total of three hundred companies/executives operating in India were approached for their responses during May-September 2009. The results of the survey are discussed in the next section.

VI SURVEY RESPONSE AND RESPONDENTS PROFILE
Out of 300 targeted executives, only 97 usable responses were received. These responses have been analyzed to get an overview of supply chain practices in Indian industry. For each question, Cronbach's coefficient (α) was calculated to test the reliability and internal consistency of the responses. A Cronbach's coefficient value of more than 0.5 is considered adequate for such exploratory work (Nunally, 1978). Barring two questions, which were later discarded from further analysis, the values of α have been found to be more than 0.5, with an average value of 0.76. This implies that there is a high degree of internal consistency in the responses to the questionnaire. In this survey, in terms of turnover, companies were divided into three categories: (i) turnover between Rs. 25-100 Crores, (ii) turnover between Rs. 100-500 Crores, and (iii) more than Rs. 500 Crores turnover. The breakup of the respondent companies in terms of turnover is given in Table 1 below.

Table 1: Annual Turnover of the respondent companies
S. No. | Annual Turnover (Crore of Rs.) | No. of companies
1 | 25-100 | 16
2 | 100-500 | 41
3 | More than 500 | 40
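For reference, the reliability check described above can be stated as Cronbach's alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of Likert items. The sketch below is illustrative only; the sample response matrix is invented and does not come from the survey data.

```python
# Illustrative computation of Cronbach's alpha for Likert-scale items.
# The response matrix below is invented; it is not the survey's data.
def cronbach_alpha(responses):
    """responses: list of respondents, each a list of item scores."""
    k = len(responses[0])                                   # number of items
    def variance(values):                                   # sample variance
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

sample = [[4, 5, 4], [3, 4, 4], [5, 5, 4], [2, 3, 3], [4, 4, 5]]   # 5 respondents x 3 items
print(round(cronbach_alpha(sample), 2))
```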

VII SUPPLY CHAIN STRATEGY
In order that companies work effectively in a supply chain, coordinated activities and planning between the linkages of the chain are necessary (Cartwright, 2000). Supply chain management focuses on how firms utilize their suppliers' processes, technology, and capability to enhance competitive advantage. It also promotes the coordination of manufacturing, logistics, and materials management functions within the organization (Lee and Billington, 1992). In the present survey, respondents were asked about their policy on supply chain collaboration with trading partners. It has been found that 54.4% of the respondent companies are strong believers in collaboration and are actively extending their supply chain. However, 30% of the respondents believe in collaboration but use a go-slow strategy. Fifteen percent of the respondents are interested in collaboration but have other priorities before entering into any such collaboration. On the other hand, Sahay et al. (2001) observed that about one-third of the companies had no supply chain policies. Despite the respondents being different in the two cases, the comparison of these two results indicates that there is a growing awareness in Indian companies of supply chain collaboration. Though the companies appear to be enthusiastic about collaboration in their supply chain, it also appears that these collaborations are more on a one-to-one basis, as the companies are not in the practice of regular joint meetings with all the partners of the supply chain. When asked about the frequency of such joint meetings on a 1 to 5 Likert scale, with one indicating never and five most often, the mean value of the responses is only 2.77. Forty-six percent of the respondents had never (or rarely) attended any such meeting. This trend is quite similar to that reported by Brabler (2001) in his survey of German companies. It is further observed in the present survey that only 38% of the companies have a separate supply chain department, and in 15% of the companies it is headed by the CEO of the company. A large number of respondents, with a mean score of 3.81 on a 5-point scale, observed that business process reengineering (BPR) is a prerequisite to supply chain integration. This is in tune with the McMullan (1996) survey, where 88% of the respondents considered it a necessity in supply chain management. There is a moderate level of agreement in support of the statement that relevant information of one
department in the organization of the supply chain is available online to the other (3.42). Here, the values in brackets indicate the mean value of the responses on the five-point Likert scale. It has been recommended in the literature that incentives should be provided to the small partners in the supply chain for information sharing (Munson et al., 2000), but this is not a common practice in Indian supply chains: on a five-point Likert scale, the mean agreement level in support of the statement is quite poor, with a score of only 1.70. When asked about the weightage of certain factors in formulating an IT-enabled supply chain strategy, it is found that cost-benefit analysis (4.23) has received the maximum weightage among the respondents (Figure 1). Respondents are of the opinion that upcoming technological developments (3.84), trading partners' IT infrastructure and willingness (3.74), and logistics-related factors (3.73) should also be given more weightage in formulating the strategy for an IT-enabled supply chain. The other factors rated in Figure 1 are human factors, availability of trained manpower, financial constraints and government regulations.
Figure 1: Factors influencing formulation of IT-enabled supply chain strategy

A practical approach to supply chain management is to have only strategically important suppliers in the value chain and to reduce the number of suppliers. This strategy strengthens buyer-supplier relations (Tan et al., 1999). The other benefits of a reduced supplier base are lower price, lower administration cost and improved

communications (Szwejczewski et al., 2001). However, in the survey it is found that there are multiple suppliers for one component or a finished product at the manufacturer's end. When asked

about the adoption of various supply chain practices in their organizations, it is observed that the interaction of business and IT staff (3.27) is the most frequently used practice in most of the organizations (Figure 2). Ross et al. (1996) also had a similar observation and reported a strong partnership between IT management and business management. Online tracking of the inventory


status (3.16) and target costing (3.00) are also used moderately. However, the rarely used practices are online tracking of electronic point of sales (EPOS) (1.72), virtual customer servicing (2.03), cross docking (2.18) and activity-based costing (2.34).

Figure 2: Adoption of various supply chain practices

VIII TYPES OF INFORMATION SHARING

The information sharing among the partners in the supply chain leads to visibility of the processes in the entire supply chain. The information sharing in the supply chain reduces uncertainties, which is the

root cause of high inventory levels in supply chains. Information sharing can also be done for forecasting and design data sharing, etc. Pandey et al. (2010), in their survey of Indian manufacturing companies, found a positive and significant correlation between various types of information sharing and competitive strength. On information sharing, Kaipia and Hartiala (2006) have done some case and empirical studies. They observed that only that information which improves supply chain performance should be shared. This survey explores the types of information which supply chain partners usually share. Eight widely used domains of information sharing were identified from the literature, and respondents were asked to indicate their level of information sharing with suppliers on a 5-point Likert scale. The most widely used areas of information sharing (Figure 3) are identified as those related to purchasing (3.67), order tracking (3.46), product development (3.30) and inventory status (3.07). The magnitude of this information sharing, as shown in the brackets, is an indicator of only a moderate level of information sharing. Therefore, it may be inferred that there is enough scope for further collaboration. The survey results indicate some involvement of suppliers in the manufacturers' processes. However, the KPMG global supply chain survey indicates a very low level of involvement of suppliers in manufacturers' processes (Freeman, 1998). This may be attributed to the time gap between these two surveys and to the increasing awareness among manufacturers about suppliers' constructive involvement in their supply chain.

Figure 3: Types of information sharing in a supply chain

IX USE OF IT TOOLS IN SUPPLY CHAIN AUTOMATION
For the automation of the supply chain, firms may use a number of IT tools such as bar-coding, electronic data interchange (EDI), intranet, extranet, internet, website, enterprise resource planning (ERP), and supply chain management (SCM) software. Regarding the use of the internet in SCM, Cagliano et al. (2005), in their survey, found that both partial adoption of the internet on a few processes and complete adoption throughout the supply chain are used by companies; however, the former is only a transition phase. Respondents were asked to indicate the use of these tools in their organization. It is observed from the survey that the Internet is the most widely used IT tool in supply chain automation, currently being used by 100% of the companies. Eighty-nine percent of the companies have either developed their own websites or have planned to develop them in the next one year. Intranet and ERP software are also emerging as favorite IT tools
among the companies. Sixty-nine percent of the companies have implemented ERP. Fifteen percent of the companies have planned to install it within the next one year, but 16% of the companies have no plan to use it in the near future. However, the penetration of extranet, bar coding, EDI and SCM software is limited to a few companies only. SCM software is the least used supply chain automation tool, being used by only 16.5% of the companies. Ten percent of the companies intend to use it within the next one year, but about 60% of the companies have no plan to use it in the near future. In KPMG's global supply chain survey (Freeman, 1998), nearly all companies expected a dramatic increase in the requirement of EDI and bar-coding by their suppliers and customers in the years ahead. However, this does not seem to be valid in the Indian context. The application of bar-coding is likely to increase in the coming years, but EDI does not seem to be taking ground in Indian companies. Though EDI is being used by 26% of the companies, more than 50% of the respondent companies have no plan to use it in the near future. ERP implementation has been reported in 69% of the surveyed companies, which is significantly higher than the 40% and 20% figures given by Sahay et al. (2001) and Saxena and Sahay (2000) respectively. However, Kadambi (2000) has reported ERP implementation in 60% of the responding companies in India, but it may also be recalled that his observation is based on only 32 responses. As far as SCM software is concerned, its implementation level is close to fifteen percent in all the Indian surveys discussed in this paper. Fodor (2000) has reported in his survey that 20% of the sample has implemented ERP software and 15% has opted to install supply chain management software. These results are almost similar to the past Indian surveys in terms of SCM software. The difference in the level of ERP implementation in India and in other countries of the globe may be attributed partially to the time gap between these surveys and partially to local conditions and some other factors. The survey explored the level of IT-based information sharing by the manufacturer with suppliers, customers, distributors, warehouse providers and logistics service providers in the supply chain. It is revealed from the survey that, compared to other constituents of the supply chain, suppliers are more frequently sharing information with manufacturers through IT (2.77). However, the level of IT-based information sharing between the manufacturer and any other supply chain constituent is only at a moderate level (Figure 4), as the maximum score on a five-point Likert scale is less than three. This indicates that the overall status of IT-based information sharing in the supply chain is still not very enthusiastic and there is enough scope for improvement in this direction.
Figure 4: Level of IT based information sharing by linkages in the supply chain

X INVESTMENT IN IT TOOLS
Respondents were asked about the degree of investment in various IT tools which support the smooth functioning of supply chain management. It is observed from the survey that the maximum investment has been made in ERP. The investments in local area network (LAN) and computer hardware closely follow the investment in ERP (Figure 5). However, the companies invested much less in supply chain software, extranet and EDI. Other tools rated in Figure 5 include office automation, bar-coding and automated storage and retrieval systems. It is also observed from the study that the investments in SCM software and extranet are likely to increase in the future, but the investment in EDI is not likely to increase significantly, as the companies are now using the Internet to send information through e-mail and attachments.

Figure 5: Investment in IT tools for supply chain automation

XI USE OF BAR-CODING
Bar-coding has the capability to track the flow of goods in a supply chain. It can yield significant savings for FMCG companies if implemented properly. Fodor (2000) has reported the results of a questionnaire which he conducted on the readers of some reputed magazines and journals. According to the survey, bar-coding was implemented by 52% of the sample. In India, bar-coding application by big companies has increased from almost nothing to 30%. It improves a company's demand

forecasting accuracy close to 20% (Anand, 2002). More importantly, it assists in tracking the consumer buying patterns, which enable companies

to price products as per market conditions, introduce new items at stock keeping units (SKUs) and keep watch on new launches in a better way. In the present survey, undertaken by the authors, only 32% of the companies were using bar-coding. Respondents observed that the maximum benefit of bar-coding is obtained in speeding up data entry, and there is almost unanimity among the respondents about this advantage, as the standard deviation for this option is the minimum among all the discussed advantages of bar-coding. The other major advantages of bar-coding are reported as verification of orders at receiving and shipping, and updating of the stock position (Figure 6).

Figure 6: Benefits of Bar-coding technology

XII BENEFITS OF IT-ENABLED SUPPLY CHAIN
The IT-enablement of the supply chain offers several advantages to the users over the

conventional supply chain where IT is not predominantly used for communication among the supply chain partners. Some of these advantages are responsiveness, reduction in manpower,

increase in turnover etc. Bal and Gundry (1999) have reported time and cost savings as the two advantages, ahead of others in the virtual team working which is a possible emerging application area of IT-enabled supply chain. Closs et al. (1996) have observed that IT capability improves

timeliness and flexibility, which influence logistics competence. The top five benefits of IT as observed by Sohal et al. (2001) are control of inventory cost, improvement of management productivity, improvement of order cycle time, improved staff productivity and improved product quality. Brabler (2001) has reported that e-business can reduce the lead time and improve general flexibility in the supply chain. Bhatt (2000) and Bhatt et al. (2001) are of the view that firms could use IT to enhance the quality of their products and services. In the present survey, the five most important benefits of an IT-enabled supply chain, in decreasing order, are identified as responsiveness (4.32), inventory reduction (4.16), order fulfillment time reduction (4.14), better customer service (4.14) and improved relations in the supply chain (4.07). For each of these observed benefits of the IT-enabled supply chain, the coefficient of variation (CV) is also calculated, which is defined as the ratio of the standard deviation (σ) to the mean value (μ). The values of CV for these benefits are found to be responsiveness (17.2%), inventory reduction (22.5%), order fulfillment time


reduction (18.8%), better customer service (21.6%) and improved relations in the supply chain (21.6%). A lower value of CV indicates a greater degree of convergence among the respondents about that parameter. The observed low values of CV for responsiveness and order fulfillment time reduction further substantiate the finding that these are the undisputed benefits of IT-enablement of supply chains (Figure 7). The present survey endorses the view that improved customer service can be achieved through IT-enablement, as there is a possibility of significant reduction in lead times. It is also observed from the survey that, in the case of manufacturing companies, IT-enablement of the supply chain does not have much impact on the quality of the product. Fawcett et al. (2008) have also reported the strategic benefits of SCM in their survey paper. Of the benefits reported in their paper, increased inventory turnover, increased revenue, and cost reduction across the supply chain are the most sought-after benefits.
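The coefficient of variation used throughout this section is simply the standard deviation divided by the mean, reported as a percentage. A minimal sketch follows; the standard deviation of 0.74 used in the worked check is back-calculated from the reported mean and CV for the responsiveness benefit and is not a figure quoted in the survey.

```python
# Coefficient of variation as used in this survey: CV = (standard deviation / mean) * 100.
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    return 100.0 * std_dev / mean

# Worked check against the responsiveness benefit reported above (mean 4.32, CV 17.2%):
# a standard deviation of about 0.74 is implied; 0.74 is an assumption, not quoted data.
print(round(coefficient_of_variation(0.74, 4.32), 1))   # ~17.1, close to the reported 17.2%
```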

Figure 7: Benefits of IT-enabled supply chain

McCormack and Kasper (2002) have reported a significant relationship between Internet usage and supply chain performance. However, from the present survey, it appears that the use of the Internet in Indian supply chains is mainly confined to communication through e-mail. Other supply chain functions, such as inventory tracking, purchasing and collaborative information sharing, and applications of the Internet like e-business, online ordering and order confirmation, online quotation and tracking of electronic point of sales (EPOS), are not widely used by the respondent companies. In this survey, respondents were also asked to rank the problems in integrating their supply chain with the Internet, and it is observed that the threat to data security (3.23), insufficient bandwidth (2.93) and lack of trained manpower (2.90) are the main problems. However, these are likely to be phased out with time, as technological advancement in Internet security and bandwidth has been quite rapid in recent years. Moreover, the mean value of the respondents' answers is around three, which represents a moderate level of barrier.

Purchasing activities have an important role in supply chain management. The survey explored the IT applications in purchasing activities, and it is observed that 91 percent of the respondent companies have provided personal computers (PCs) or terminals of the mainframe computer for their purchasing staff. Ninety percent of the companies had provided e-mail facility and 75% of the companies had provided Internet access to their purchasing staff. Fifty-three percent of the companies had a purchase performance evaluation system and 78% of the companies had their own vendor rating system. Thirty-one percent of the companies practiced the automatic release of purchase orders based on inventory level. It is also observed from the survey that, despite this high level of IT penetration in the organizations, only twenty-one percent of the companies were following the practice of online real-time supplier information tracking and only nineteen percent of the companies had installed software for automatic handling of online queries. This may be attributed mainly to the disparity in the trading partners' IT capability.

XIII USE OF INFORMATION TECHNOLOGY IN ORGANIZATIONAL ACTIVITIES
Feedback was also taken from the respondents about the use of IT in various activities of the organization. It is observed that the maximum use of IT is in the area of accounts and finance, and it is reasonably ahead of other application areas like purchasing, sales and service, logistics operations, and manufacturing scheduling. The findings of KPMG's Global Supply Chain Survey (Freeman, 1998) and the Indian survey by Saxena and Sahay (2000) also reported that IT systems are better integrated in the accounts and finance area as compared to other areas.

XIV BULLWHIP EFFECT AND ITS CAUSES
The amplification of demand variability in the upstream of supply chains is a common phenomenon, which is more visible in the consumer goods sector. This is known as the bullwhip effect. When asked about the reasons for the bullwhip effect, the respondents observed that long lead-time of material acquisition (3.34), lack of real-time

information availability at the vendor's end (3.31), price fluctuations in the market (3.15) and forecasting errors (3.13) are the root causes of the bullwhip effect (Figure 8). Batch ordering, based on demand consolidation, is considered the least important of these causes. Earlier, Lee et al. (1997) identified four major causes of the bullwhip effect: (i) demand forecast updating, (ii) order batching, (iii) price fluctuation and (iv) rationing and shortage gaming. An analysis of the results of these two studies also provides insight into the mechanism to counter it. The present survey identifies long lead-time of material acquisition as the most important cause of the bullwhip effect. In this survey, the top two reasons are almost equally important and both are related to information sharing. Using online real-time information sharing in the supply chain, the long lead-time of material acquisition can certainly be reduced to a large extent. IT-based information sharing can play an important role in reducing forecasting errors as well. However, information sharing is possible only when there is good trust and integration in the supply chain. The authors therefore suggest that, to counter the bullwhip effect in the supply chain, the following two measures should be given importance: (a) integrate the supply chain and promote trust among the linkages for information sharing, and (b) use the latest IT tools for online information sharing so that there is no confusion about the demand at various levels of the supply chain.

Figure 8: Reasons of Bullwhip effect
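The amplification the respondents describe can be reproduced with a toy multi-tier ordering model: when each tier forecasts from the orders it receives, order variability grows as it moves upstream, whereas sharing real-time end-customer demand keeps it in check. The simulation below is only a hedged illustration of that argument; the demand series, smoothing rule and lead-time parameter are assumptions, not survey data.

```python
# Toy illustration of the bullwhip effect and of demand-information sharing as a remedy.
# All parameters (smoothing weight, lead time, demand series) are assumptions.
import random

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def place_orders(seen, lead_time=2, alpha=0.3):
    """Order-up-to style rule: react to observed demand plus the forecast revision over the lead time."""
    forecast = seen[0]
    orders = []
    for observed in seen:
        previous_forecast = forecast
        forecast = alpha * observed + (1 - alpha) * forecast      # exponential smoothing
        orders.append(max(0.0, observed + lead_time * (forecast - previous_forecast)))
    return orders

def simulate(tiers=3, periods=200, share_demand=False, seed=1):
    random.seed(seed)
    demand = [100 + random.gauss(0, 10) for _ in range(periods)]  # end-customer demand
    upstream = demand
    for _ in range(tiers):
        seen = demand if share_demand else upstream               # sharing: every tier sees real demand
        upstream = place_orders(seen)
    return variance(upstream) / variance(demand)                  # upstream / end-customer variance

print("order-variance ratio without sharing:", round(simulate(share_demand=False), 2))
print("order-variance ratio with shared demand:", round(simulate(share_demand=True), 2))
```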

XV CRITICAL SUPPLY CHAIN ISSUES
In a study by McMullan (1996), conducted in the Asia-Pacific region, respondents identified issues such as information technology, inventory and infrastructure, both internal and external, as the key supply chain management issues. Fantazy et al. (2009), in a survey, found flexibility among the top eight factors necessary for successful implementation of an SCM initiative. This observation has been validated in the present survey. The role of top management in the success of supply chains was investigated in detail by Sandberg and Abrahamsson (2010). They observed that, despite its often stated importance, we know little about it. In the present survey also, respondents were asked to assign weightage to certain issues of supply chain management for judging its effectiveness. It is observed that the issues which are considered important for the functioning of supply chains, in the decreasing order of importance, are (Figure 9): commitment of top management (4.63), buyer-supplier relations (4.30), IT and decision support system (4.29), customer focus (4.20), information sharing at all levels of the chain (4.17), motivation and commitment of the personnel (4.08), and flexibility of the supply chain (4.06).

Figure 9: Criticality of various supply chain issues

Bryson and Currie (1995) observed in their survey that IT is considered strategically important but very few organizations ascribed critical importance to it. However, this survey indicates that companies have now realized the importance of information technology in the success of supply chains. Some other issues, such as a supplier being a part of more than one supply chain and the dictatorial attitude of the major stakeholders of the chain, are also discussed in the literature (Munson et al., 2000), but none of these issues are considered important by the respondents.

XVI SUPPLY CHAIN PERFORMANCE MEASUREMENT
Performance measurement provides the means by which a company can assess whether its supply chain has improved or degraded over a period of time. In the present survey, respondents were asked about the relevance of performance measurement of a supply chain, and it is observed that supply chain performance measurement has a motivational effect on performance improvement. Keebler (2001) also observed that the impact of good or bad performance of any partner of the supply chain is inevitable on the performance of the entire supply chain. Regarding supply chain performance measurement, Cassivi (2006) has noted that to identify operational performance measures of a supply chain, a good understanding of the most important research initiatives in logistics, manufacturing, and operations activity is necessary. In India, Saad and Patel (2006) conducted a study on automotive sector companies and found that supply chain performance measurement is not fully embraced by the Indian auto sector. They also highlighted the difficulties associated with supply chain performance measurement. It is observed in this survey that supply chain performance measurement is a continuous process in 28% of the organizations; it is done 3-4 times a year in 16% of the organizations and once a year in 19% of the organizations. It is not regularly measured in about 19% of the organizations. However, eighteen percent of the respondents did not say anything on the frequency of performance measurement of a supply chain in their organization.

XVII PERFORMANCE MEASUREMENT INDICATORS
The Balanced Scorecard (Kaplan and Norton, 1992) provides an excellent background for the performance measurement of a supply chain. The balanced scorecard approach can be used to classify the performance measures of a supply chain into the following four categories: (i) financial perspectives, (ii) customer service perspectives, (iii) internal business measures, and (iv) innovation and other measures. Financial results are a major criterion in determining how supply chains are performing over a period of time, yet these are not the complete drivers of success. Moreover, operations managers cannot wait long for the availability of the financial results of a quarter or a month. In a study by McMullan (1996), the most commonly used performance measures of a supply chain in the customer service category are on-time delivery, customer complaints, back orders, stock out etc. Manrodt (2001) has identified the most frequently used logistics performance measures, which in the decreasing order of usage are outbound freight costs, inventory count accuracy, order fill rate, on-time delivery, and customer complaints.
The least used measures, in the increasing order of usage, are enquiry response time, cash-to-cash time, units processed per unit time and cost to service. Inappropriate performance measures often lead managers to respond to situations incorrectly and to continue to support undesirable behavior. Therefore, it is desirable to identify the most relevant performance measures as felt by the supply chain managers of various organizations. In the present survey, performance indicators were classified into four categories, which are based on the balanced scorecard (Kaplan and Norton, 1992). A literature review was conducted to identify and shortlist the performance indicators for this purpose (McMullan, 1996; Manrodt, 2001; Beamon, 1999; Johnson and Davis, 1998; Keebler, 2001; Lapide, 2001; Mooraj, 1999; Neely et al., 1995, 2000; Pires et al., 2001). The opinion of supply chain experts from industries was also sought for deciding the appropriate measures. Respondents were asked to identify the most important performance indicators of a supply chain, relevant to their organization, from the list of given indicators. The 15 most important performance indicators, as identified from the survey, in the decreasing order of mean value are shown in Table 2.

Table 2: Main indicators for supply chain performance measurement
S. No. | Performance indicator | Standard deviation (σ) | Mean value (μ) | Coefficient of variation (CV)
1 | On-time delivery | 0.56 | 4.74 | 11.8%
2 | Responsiveness | 0.67 | 4.44 | 15.1%
3 | Order fill rate | 0.61 | 4.43 | 13.8%
4 | Inventory turnover ratio | 0.89 | 4.35 | 20.4%
5 | Ease in tracking of customer orders | 0.79 | 4.31 | 18.3%
6 | Return on investment | 0.94 | 4.26 | 22.0%
7 | Total supply chain inventory control | 0.77 | 4.21 | 18.3%
8 | Plant productivity | 0.86 | 4.15 | 20.7%
9 | Just-in-time environment | 1.05 | 4.13 | 25.4%
10 | Economic value added | 1.07 | 4.11 | 26.0%
11 | Reduced wastes | 0.97 | 4.10 | 23.6%
12 | Retention of old customers | 0.90 | 4.08 | 22.0%
13 | Cost per unit of product | 1.12 | 4.08 | 27.4%
14 | Better product quality | 1.08 | 4.08 | 26.4%
15 | Reduced throughput time | 0.86 | 4.07 | 21.2%
In the overall ranking of performance indicators, it is observed that on-time delivery (4.74), responsiveness (4.44) and order fill rate (4.43) are the three most important indicators for the performance evaluation of a supply chain. These indicators have CV values of 11.8%, 15.1% and 13.8% respectively. These values are the lowest among all the discussed performance indicators; therefore, it may be inferred that these performance indicators are consistently accepted in the industry and can be used as important parameters to measure the performance of supply chains. Among the top fifteen indicators, five each belong to customer service and internal business measures, three belong to financial measures and two belong to innovation and other measures. These findings indicate that business managers accord a very high priority to customer service and internal business measures. This result is also justified because it is the customer who is the ultimate evaluator of the supply chain, by purchasing the products derived through it, and therefore the customer's satisfaction level should figure in the performance of the supply chain.

XVIII CONCLUSION
The status of supply chain management in Indian manufacturing companies has been explored through a questionnaire-based survey. The findings indicate that Indian companies are moving steadily to adopt supply chain practices and these are in line with the practices elsewhere. The IT-enablement of supply chains is another issue examined in the paper, and the benefits observed due to IT-enablement are discussed in the report. Supply chain managers have to decide which IT tools offer the greatest strategic value to their supply chain. The financial impact of IT on the supply chain can be quantified only in certain areas like inventories, working capital and costs of communication, but its intangible impact on goodwill and the responsiveness of the company to react to situations is far greater. As more companies emphasize responsiveness, the importance of information technology in supply chain management is going to increase in the days to come. It is observed that firms have upgraded their internal capabilities in terms of computer hardware, internet, intranet, extranet, ERP, SCM software etc., but they have been less successful in utilizing these capabilities for external coordination, be it in terms of the purchase process, design data sharing or inventory control. These figures indicate that though companies have developed individual IT capability to a large extent, the integration and information sharing in the supply chain is still much lower than desired. The observation of Closs et al. (1996) is also valid in the Indian context: companies have developed their internal capabilities, but substantial improvement is needed to make supply chain integration a reality.

XIX REFERENCES
1. Anand, M. (2002), Operation streamline, Business World, 18 February, pp. 20-26, New Delhi.
2. Bal, J. and Gundry, J. (1999), Virtual teaming in the automotive supply chain, Team Performance Management: An International Journal, Vol. 5 No. 6, pp. 174-193.
3. Beamon, B. M. (1999), "Measuring supply chain performance", International Journal of Operations and Production Management, Vol. 19 No. 3, pp. 275-292.
4. Bhatt, G.D. (2000), "An empirical examination of the effects of information systems integration on business process improvement", International Journal of Operations and Production Management, Vol. 20 No. 11, pp. 1331-1359.
5. Bhatt, G.D. and Stump, R.L. (2001), "An empirically derived model of the role of IS networks in business process improvement initiatives", Omega: International Journal of Management Science, Vol. 29, pp. 29-48.
6. Brabler, A. (2001), E-Supply Chain Management: Results Of An Empirical Study, Proceedings of the


Cloud Computing
A Study of Utility Computing and Software as a Service (SaaS)
Parveen Sharma, Manav Bharti University, Solan, Himachal Pradesh
Guided by: Dr. M.K. Sharma, Associate Professor & Head, MCA Program, Department of Computer Science, Amrapali Institute, Haldwani (Uttarakhand)
Abstract- Cloud computing is today one of the most attractive technologies in IT. This paper examines how the cloud has quickly taken hold as part of everyday computing, how it is poised to become a major part of the IT strategy of organizations of all sizes, and why the cloud can be advantageous from a business intelligence standpoint over the isolated alternative that is more common today. It reviews on-demand utility computing, by analogy with utilities such as electricity and telephone service, and describes how the cloud model, and SaaS in particular, is changing traditional models for pricing and acquiring information technology, along with the challenges cloud computing still faces. SaaS, which evolved early in the field of cloud computing, delivers a single application through the browser to thousands of customers using a multi-tenant architecture, bringing down the cost and service charges of the application. SaaS buyers weigh the trade-offs between fast deployment, flexibility and cost savings on one hand and reduced reliance on internal IT resources on the other, and must judge whether SaaS makes sense for their requirements. Index Terms Supplier, Cloud computing, Client, SaaS, multi-tenancy.

I INTRODUCTION Cloud computing means Internet ('cloud') based development and use of computer technology. The word 'cloud' is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network and later to describe the Internet in computer network diagrams as an abstraction of the underlying infrastructure. Cloud computing is a phrase used today to describe the act of storing, accessing, and sharing data, applications, and computing power in cyberspace. The concept of storing data in remote locations and using tools only when we need them presents users with unprecedented opportunities and challenges. Cloud computing (CC) is a natural evolution of the widespread adoption of virtualization, service-oriented architecture and utility computing. CC describes a new supplement, consumption, and delivery model for IT services based on the Internet, and it typically involves over-the-Internet provision

of dynamically scalable and often virtualized resources. As early as 1961, John McCarthy suggested that "computation may someday be organized as a public utility." Cloud computing can be regarded as the latest wave of disruption in IT. It is best described as a highly automated, readily scalable, on-demand computing platform of virtually unlimited processing and storage, always available to carry out a task of any size and charged based on usage. Almost all the modern characteristics of CC, including the comparison of public, private, government and community deployment forms, have been thoroughly explored in the literature. The concept of clouds is not new, but they have proved a major commercial success over recent years. The primary aim of cloud computing is to provide mobility and easy deployment of web-based applications by means of easily accessible tools. CC offers three service models: 1. IaaS (Infrastructure as a Service), 2. PaaS (Platform as a Service), and 3. SaaS (Software as a Service). II HISTORY The acronym SaaS (Software as a Service) first appeared in an article called "Strategic Backgrounder: Software as a Service," published in February 2001 by the Software & Information Industry Association's eBusiness Division. Software as a service is essentially an extension of the idea of the Application Service Provider (ASP) model [1].

III WHAT IS SAAS? SaaS, often referred to as "software on demand," is software that is deployed over the Internet. This approach to application delivery is part of the utility computing model, in which all of the technology sits in the "cloud" and is accessed over the Internet as a service. SaaS is presently the most popular model of cloud computing service because of its high flexibility and scalability, high performance with better availability, wide range of services and low maintenance. A SaaS cloud delivers a single application through the browser to thousands of customers using a multi-tenant architecture.

The SaaS model offers a high-level description of how software is delivered in a distributed manner. The customer only requires a computer with Internet access to obtain the application and use the software, and the software can be licensed either for a single user or for a whole group of users. The SaaS dealer is responsible for providing the data-centre services essential to run the application, which makes SaaS a key setting for rapid development. The SaaS model makes it possible for every customer to take advantage of the provider's latest technological features without the burden of software maintenance, management, updates and upgrades.

IV SAAS AS A PLATFORM SERVICE The SaaS platform service is intended to reduce the cost of developing business-system applications. Its conceptual basis is a business standard that overcomes barriers globally and seamlessly connects all business processes, thus dynamically promoting effective business development. The SaaS platform service performs integrated management of application records and provides a business system with the functions required for the development and operation of SaaS applications. It has four components: (a) Basic functions: authentication, user management and authorization control. (b) Common components: e-mail distribution and billing data generation. (c) Service linkage: linkage with other services. (d) Development framework: the development methodology. The SaaS platform service runs on the common IT platform service; the department that provides SaaS applications using the SaaS platform service is called the SaaS provider.

V BENEFIT OF SAAS a. SaaS can be defined as a method by which an application service supplier provides applications to customers over the Internet. b. SaaS frees the customer from installing and operating the application on their own computer, and removes the heavy load of software maintenance. c. SaaS gives access to powerful technologies with a minimal financial commitment. d. A great benefit of SaaS is the ability to always run the most recent version of the application. e. SaaS helps organizations avoid capital expenditure and pay only for the functionality they use. f. SaaS removes customer concerns about application servers, loading, application desertion and related, common IT issues. g. The SaaS provider can improve the efficiency of managing SaaS applications by executing work in accordance with the standard operation flow defined by the SaaS platform service. h. Money is saved by not having to purchase servers or other software to support use. i. Faster implementation, and budgets can focus on competitive advantage rather than infrastructure. j. A monthly obligation rather than an up-front capital cost, multi-tenant efficiency, and flexibility and scalability from leveraging cloud infrastructure on a pay-as-you-go pricing structure.


VI CHARACTERISTICS OF SAAS Applications are network based, so business users are free to use the service from anywhere. Each application is offered on a pay-per-usage basis, sanctioning the business owner to plan the budget for the number of applications used according to business need. Application delivery is naturally based on a one-to-many model: an application is shared across multiple users, managing complexity while reducing software costs, and SaaS is highly efficient because of its multi-tenant structural design. Further characteristics include: network-based access to, and management of, commercially available software; activities managed from central locations rather than at each customer's site, enabling customers to access applications remotely via the Web; application delivery typically closer to a one-to-many model (single instance, multi-tenant architecture) than to a one-to-one model, including architecture, pricing, partnering, and management characteristics; centralized feature updating, which obviates the need for end-users to download patches and upgrades; and frequent integration into a larger network of communicating software, either as part of a mashup or as a plugin to a platform as a service.

VII ADVANTAGES Pay per use; anytime, anywhere accessibility; pay as you go; instant scalability; security; reliability.

VIII SAAS CLIENT SaaS is a new delivery and flexibility model in which the SaaS provider remotely manages software applications for its customers. SaaS eliminates customer worries about application servers, storage, and application development, and enables every customer to benefit from the dealer's latest technological features, making regular integration with a larger system possible. The SaaS provider's tasks and responsibilities include: managing servers, power and cooling; maintaining operating system software, databases and the installation of updates; using web based applications to easily provision software for customers on demand; typically offering a multi-tenant model of the application with room for customization for each customer; centralized, controlled software deployment that reduces support costs; providing the latest version of the application software to the customer; and ensuring the security and privacy of client data.

IX BENEFITS TO SAAS OPERATOR The operator owns the development platforms and hardware and carries out a high degree of the maintenance; the software is subscribed to for a yearly or monthly fee. The customer gains improved reliability, performance and efficiency, enhanced productivity and faster deployment, can access the on-demand application anywhere and anytime, and does not have to purchase and support the infrastructure that the application runs upon.

X CONCLUSION By integrating all of the application software, data centre, database, IT infrastructure and services together in a web-based, multi-tenant, on-demand delivery model, SaaS dealers can provide customers with economies of scale and talent that were among the biggest challenges for traditional, on-premise deployments. SaaS shifts the duty of deployment, operation, management, support and successful running of the application from the customer to the vendor. The aim of cloud computing is to apply vast computational power and storage capacity to solve problems. Resources in cloud computing are not confined to data storage and servers; they can also be complete distributed systems, especially clusters.

XI REFERENCES [1] Software as a Service: Strategic Backgrounder, Software & Information Industry Association, Feb 2001, http://www.siia.net/estore/pubs/SSB-01.pdf, retrieved May 2010.


Comparing the effect of Hard and Soft threshold techniques on speech compression using wavelet transform
Sucheta Dhir Indira Gandhi Institute of Technology, G.G.S. Indraprastha University, E-mail: dhirsucheta@yahoo.co.in
Abstract- In mobile communication systems, service providers are trying to accommodate more and more users in the limited bandwidth available to them. To accommodate more users they are continuously searching for low-bit-rate speech coders. There are many types of speech coder (vocoder) available, such as Pulse Code Modulation (PCM) based coders, the Linear Predictive vocoder (LPC), and higher quality vocoders like Residual Excited Linear Prediction (RELP) and Code Excited Linear Prediction (CELP). This paper deals with a comparatively newer concept that involves the use of the wavelet transform in speech compression. The wavelet transformation of a speech signal results in a set of wavelet coefficients which represent the speech signal in the wavelet domain. Most of the speech energy is concentrated in the high-valued coefficients, which are few, so the small-valued coefficients can be truncated or zeroed. For compression, wavelet coefficients below a threshold are truncated. There are two approaches for calculating thresholds, global threshold and level dependent threshold, and both types can be either hard or soft. The MATLAB simulation results show that the compression factor increases when the soft threshold is used in both the global and level dependent threshold techniques; however, better signal-to-noise ratio and retained signal energy values are obtained when the hard threshold is used. Index Terms- DWT, Global threshold, Level dependent threshold, Compression

I INTRODUCTION Humans use multiple ways to communicate with one another, and speech is the medium most commonly used to express thoughts. The development of telephony and mobile and satellite communication has made it possible to communicate with anyone on the globe who has access to the technology; with a bandwidth of only about 4 kHz, human speech can convey information together with emotion. Nowadays there is great emphasis on reducing the delay in transmission as well as on the clarity of the transmitted and received signal. Through speech coding a voice signal is converted into a more compact form, which can then be transmitted over a wired or wireless channel. The motivation behind speech compression is the limited bandwidth available for transmission: a speech signal is first compressed and then coded before transmission, and at the receiver the received signal is first decoded and then decompressed to recover the speech. Special stress is laid on the design and development of efficient compression techniques and speech coders for voice communication and transmission; speech coders may be used for real-time coding of speech in mobile and satellite communication, cellular telephony, and audio for videophones or video teleconferencing. Traditionally, speech coders are classified into two categories: waveform coders and analysis/synthesis vocoders. A waveform coder attempts to copy the actual shape of the signal produced by the microphone; the most commonly used waveform coding technique is Pulse Code Modulation (PCM). A vocoder attempts to reproduce a signal that is perceptually equivalent to the speech waveform; the most commonly used analysis/synthesis coding techniques are Linear Predictive Coding (LPC) [1], Residual Excited Linear Prediction (RELP) and Code Excited Linear Prediction (CELP). This paper deals with a comparatively newer concept which employs wavelets for speech compression [5]. Wavelets are mathematical functions of finite duration with an average value of zero. A signal can be represented by a set of scaled and translated versions of a basic function called the mother wavelet, and this process is known as the wavelet transformation [9]. The wavelet transformation of a signal results in a set of wavelet coefficients which represent the signal in the wavelet domain; all data operations can then be performed using just the corresponding wavelet coefficients.

II SPEECH COMPRESSION USING WAVELET TRANSFORMATION: Fig-1 shows the design flow of the wavelet based speech encoder.

4- Quantization and Encoding:


Quantization is the process of mapping a large set of input values to a smaller set. Since quantization involves a many-to-few mapping, it is a nonlinear and irreversible process. The thresholding of the wavelet coefficients gives floating point values, which are converted into integer values using a quantization table; the quantized coefficients are the indices into this table. The quantized output still contains redundant information, so the quantized coefficients are then efficiently encoded. Encoding can be performed using Huffman coding, a statistical technique which attempts to reduce the number of bits required to represent a string of symbols. Huffman coding involves computing the probabilities of occurrence of the symbols, which here are the indices into the quantization table. Symbols are arranged in descending order of probability of occurrence: the shortest code is assigned to the symbol with the maximum probability and the longest code to the symbol with the minimum probability. The actual compression takes place only in this step, because in the previous steps the length of the signal at each stage was equal to the length of the original signal; it is in this step that each symbol is represented with a variable-length code. III PERFORMANCE PARAMETERS

Fig 1: Design Flow of Wavelet based Speech Coder.

The major steps shown in the above diagram are explained in the following sections.

1- Choice of Wavelet Function


To design a high quality speech coder, the choice of an optimal mother wavelet function is of prime importance. The selected wavelet function should be capable of reducing the reconstructed error variance and maximizing signal to noise ratio (SNR). Different criteria can be used to select an optimal mother wavelet function [6]. Selection of optimal mother wavelet function can be based on the amount of energy a wavelet function can concentrate into level 1 approximation coefficients.

2- Wavelet Decomposition
A signal is decomposed into different resolutions or frequency bands. This task can be carried out by taking the discrete wavelet transform of the signal by a suitable function at appropriate decomposition level. The level of decomposition can be selected based on the value of entropy [2]. For processing a speech signal level-5 wavelet decomposition is adequate.
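As a rough illustration of this decomposition step (and of the retained-energy criterion used in Section IV to pick the mother wavelet), the following MATLAB sketch assumes the Wavelet Toolbox; the file name 'speech.wav' and the candidate list are placeholders rather than values taken from the paper.

% Choose the mother wavelet by the level-1 retained-energy criterion,
% then perform the level-5 decomposition used for compression.
[x, fs] = audioread('speech.wav');            % or wavread in older MATLAB releases
x = x(:, 1);                                  % single channel
candidates = {'db10', 'sym8', 'coif5', 'bior3.9'};
Ea = zeros(1, numel(candidates));
for k = 1:numel(candidates)
    [c1, l1] = wavedec(x, 1, candidates{k});      % level-1 decomposition
    a1 = appcoef(c1, l1, candidates{k}, 1);       % level-1 approximation coefficients
    Ea(k) = 100 * sum(a1.^2) / sum(c1.^2);        % retained energy in A1 (%)
end
[~, best] = max(Ea);
wname = candidates{best};                     % wavelet with maximum retained energy
[C, L] = wavedec(x, 5, wname);                % level-5 decomposition for compression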

1- Compression Factor: It is the ratio of


original signal to the compressed signal: CR = (size of the original signal) / (size of the compressed signal)   (1)

3- Truncation of Coefficients
Most of the speech energy is concentrated in the few high-valued coefficients, so the small-valued coefficients can be truncated or zeroed. For compression, wavelet coefficients below a threshold are truncated. There are two approaches for calculating thresholds: global threshold and level dependent threshold. A global threshold retains the largest absolute-value coefficients regardless of the level of decomposition, whereas level dependent thresholds vary depending on the level of decomposition of the signal. Both types of threshold can be either hard or soft. The hard threshold process is the usual process of setting to zero the elements whose absolute values are lower than the threshold; the soft threshold process is an extension of hard thresholding, first setting to zero the elements whose absolute values are lower than the threshold and then shrinking the nonzero coefficients toward 0.
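A minimal MATLAB sketch of global hard and soft thresholding of the coefficients produced above (Wavelet Toolbox assumed; ddencmp's default compression threshold is used only as an illustrative choice):

thr = ddencmp('cmp', 'wv', x);            % default global compression threshold
C_hard = wthresh(C, 'h', thr);            % hard: zero the coefficients below thr
C_soft = wthresh(C, 's', thr);            % soft: zero them and shrink the rest toward 0
x_hard = waverec(C_hard, L, wname);       % reconstructed speech, hard threshold
x_soft = waverec(C_soft, L, wname);       % reconstructed speech, soft threshold
% A level dependent scheme would instead apply a separate threshold to the
% detail coefficients of each of the five decomposition levels.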

2- Retained Signal Energy (PERFL2): It


indicates the amount of energy retained in the compressed signal as a percentage of the energy of the original signal: PERFL2 = 100 × (norm of the thresholded wavelet coefficients)² / (norm of the original wavelet coefficients)²   (2)

3- Percentage of Zero Coefficient (PERF0): PERF0 is defined as the number of zeros


introduced in the signal due to thresholding, which is given by the following relation: PERF0 = 100 × (number of zero-valued wavelet coefficients) / (total number of wavelet coefficients)   (3)


4- Signal to Noise Ratio (SNR): SNR gives


the quality of the reconstructed signal; a high value indicates better reconstruction: SNR = 10·log10( σx² / σe² )   (4), where σx² is the mean power of the original speech signal and σe² is the mean power of the reconstruction error. IV SIMULATION RESULTS For choosing the optimal mother wavelet, functions from five different wavelet families were used to decompose the speech sample shown in Fig 2. The retained signal energy at level-1 wavelet decomposition was calculated and is recorded in Tables 1(a)-(e).
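The measures defined in equations (1)-(4) can be computed along the following lines; this is only a sketch with illustrative variable names (C, C_hard, x, x_hard from the snippets above), and the true compression factor additionally depends on the quantization and Huffman coding stage.

perf0   = 100 * sum(C_hard == 0) / numel(C_hard);                 % (3) zeros introduced (%)
perfl2  = 100 * norm(C_hard)^2 / norm(C)^2;                       % (2) retained energy (%)
snr_db  = 10 * log10(sum(x(:).^2) / sum((x(:) - x_hard(:)).^2));  % (4) SNR in dB
cr_proxy = numel(C) / nnz(C_hard);                                % (1) crude proxy before entropy coding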

Table 1(e): Retained Signal Energy for the Biorthogonal Wavelet Family
Function | Retained Signal Energy
bior-1.1 | 91.4615
bior-1.3 | 96.3201
bior-1.5 | 92.5828
bior-2.2 | 96.7950
bior-2.4 | 96.9173
bior-2.6 | 96.9730
bior-2.8 | 97.0020
bior-3.1 | 98.3436
bior-3.3 | 98.3986
bior-3.5 | 98.4286
bior-3.7 | 98.4455
bior-3.9 | 98.4556
bior-4.4 | 95.8568
bior-5.5 | 93.5781
bior-6.8 | 96.5751

Fig 2: Speech signal sample.

Table 1(a): Retained Signal Energy for the Haar Wavelet Family
Wavelet Function | Retained Signal Energy
Haar | 91.4615

Table 1(b): Retained Signal Energy for the Daubechies Wavelet Family
Wavelet Function | Retained Signal Energy
db-1 | 91.4160
db-2 | 93.8334
db-3 | 94.8626
db-4 | 95.4728
db-5 | 95.8830
db-6 | 96.1680
db-7 | 96.2927
db-8 | 96.3349
db-9 | 92.3262
db-10 | 96.3416

Table 1(c): Retained Signal Energy for the Symlets Wavelet Family
Wavelet Function | Retained Signal Energy
sym-2 | 93.8224
sym-3 | 94.8626
sym-4 | 95.6662
sym-5 | 96.0647
sym-6 | 96.1711
sym-7 | 96.1728
sym-8 | 96.3343

Table 1(d): Retained Signal Energy for the Coiflets Wavelet Family
Wavelet Function | Retained Signal Energy
coif-1 | 93.8450
coif-2 | 95.7307
coif-3 | 96.1958
coif-4 | 96.3504
coif-5 | 96.4062

One wavelet function from each wavelet family is selected based on the maximum retained signal energy criterion at level-1 wavelet decomposition. On this criterion the bior-3.9, db-10, sym-8 and coif-5 wavelet functions are selected for level-5 wavelet decomposition for speech compression. Table 2 shows the values of Compression Factor (CR), Signal to Noise Ratio (SNR), Percentage of Zero Coefficients (PERF0), and Retained Signal Energy (PERFL2) for the selected wavelet functions for both hard and soft global thresholds. Fig 3 shows the reconstructed signal after decoding and decompression of the encoded and compressed speech signal using the global threshold approach.

Wavelet Function: bior-3.9 (Global Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.2932 | 3.4657
SNR | 23.8536 | 15.1721
PERF0 | 76.5387 | 76.5383
PERFL2 | 96.9588 | 63.5240
Table 2(a): Performance Parameter Table for Bior-3.9 wavelet function and Global Threshold Approach.


Wavelet Function: db-10 (Global Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.3852 | 3.5429
SNR | 23.8335 | 14.1865
PERF0 | 78.7307 | 78.7307
PERFL2 | 90.8521 | 41.3243
Table 2(b): Performance Parameter Table for db-10 wavelet function and Global Threshold Approach.


Wavelet Function: sym-8 (Global Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.3070 | 3.4524
SNR | 23.7707 | 14.1173
PERF0 | 78.6117 | 78.6117
PERFL2 | 90.7802 | 40.9308


Table 2(c): Performance Parameter Table for sym-8 wavelet function and Global Threshold Approach.

Wavelet Function: coif-5 (Global Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.3035 | 3.4460
SNR | 23.8271 | 14.1590
PERF0 | 78.6407 | 78.6407
PERFL2 | 90.8347 | 41.3203
Table 2(d): Performance Parameter Table for coif-5 wavelet function and Global Threshold Approach.

Fig 3(g): Reconstructed signal for coif-5 wavelet function using hard-global threshold.

Fig 3(h): Reconstructed signal for coif-5 wavelet function using soft-global threshold.

Fig 3(a): Reconstructed signal for bior-3.9 wavelet function using hard-global threshold.

Fig 3(b): Reconstructed signal for bior-3.9 wavelet function using soft-global threshold.

Similarly, Table 3 shows the values of Compression Factor (CR), Signal to Noise Ratio (SNR), Percentage of Zero Coefficients (PERF0), and Retained Signal Energy (PERFL2) for the selected wavelet functions for both hard and soft level dependent thresholds, and Fig 4 shows the reconstructed signal after decoding and decompression of the encoded and compressed speech signal using the level dependent threshold approach.

Wavelet Function: bior-3.9 (Level Dependent Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.3905 | 3.6901
SNR | 17.7406 | 9.9707
PERF0 | 78.0619 | 78.0619
PERFL2 | 92.7563 | 58.4797
Table 3(a): Performance Parameter Table for Bior-3.9 wavelet function and Level Dependent Threshold Approach.

Fig 3(c): Reconstructed signal for db-10 wavelet function using hard-global threshold.

Wavelet Function: db-10 (Level Dependent Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.3150 | 3.5871
SNR | 18.2082 | 10.5317
PERF0 | 78.0619 | 78.0619
PERFL2 | 83.9324 | 37.4042
Table 3(b): Performance Parameter Table for db-10 wavelet function and Level Dependent Threshold Approach.

Fig 3(d): Reconstructed signal for db-10 wavelet function using soft-global threshold.

Fig 3(e): Reconstructed signal for sym-8 wavelet function using hard-global threshold.

Wavelet Function: sym-8 (Level Dependent Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.2535 | 3.4300
SNR | 17.9551 | 10.1267
PERF0 | 78.0964 | 78.0964
PERFL2 | 83.4822 | 35.9319
Table 3(c): Performance Parameter Table for sym-8 wavelet function and Level Dependent Threshold Approach.

Fig 3(f): Reconstructed signal for sym-8 wavelet function using soft-global threshold.

Wavelet Function: coif-5 (Level Dependent Threshold)
Parameter | Hard Threshold | Soft Threshold
CR | 3.2273 | 3.4563
SNR | 18.3864 | 10.7569
PERF0 | 77.9466 | 77.9466
PERFL2 | 84.1851 | 38.7021

Fig 4(h): Reconstructed signal for coif-5 wavelet function using soft-level dependent threshold.

Table 3(d): Performance Parameter Table for coif-5 wavelet function and Level Dependent Threshold Approach.

VI CONCLUSION Compression of the speech signal is essential, since raw speech is highly space consuming. In this paper the wavelet transform is used for speech compression, its performance was tested on various parameters, and the following points were observed. Speech compression using the wavelet transformation involves quantization of the coefficients before the encoding step, which is an irreversible process; hence the original speech cannot be exactly retrieved from the compressed speech signal. As can be seen in Table 2, the percentage of zeros introduced (PERF0) remains essentially the same for the hard and soft global threshold techniques. Ideally, for equal values of PERF0 the CRs should also be equal, but a difference in the values of CR is observed. This discrepancy can be accounted for by the introduction of additional zeros at the quantization stage, because the coefficients are scaled down in soft thresholding. It is due to this scaling that the retained signal energy, and hence the SNR, drops to lower values, though the audibility and intelligibility of the speech were not significantly affected. Higher values of the compression factor are achieved when the db-10 wavelet function is used for speech compression, and higher signal-to-noise ratios are achieved when the bior-3.9 wavelet function is used. A similar inference can be made from the observations for the hard and soft level dependent threshold technique (Table 3). VII REFERENCES
Fig 4(e): Reconstructed signal for sym-8 wavelet function using hard-level dependent threshold.

Fig 4(a): Reconstructed signal for bior-3.9 wavelet function using hard-level dependent threshold.

Fig 4(b): Reconstructed signal for bior-3.9 wavelet function using soft-level dependent threshold.

Fig 4(c): Reconstructed signal for db-10 wavelet function using hard-level dependent threshold.

Fig 4(d): Reconstructed signal for db-10 wavelet function using soft-level dependent threshold.

Fig 4(f): Reconstructed signal for sym-8 wavelet function using soft-level dependent threshold.

Fig 4(g): Reconstructed signal for coif-5 wavelet function using hard-level dependent threshold.

[1] Shijo M Joseph, Firoz Shah A and Babu Anto P, Spoken digit compression: A Comparative Study between Discrete Wavelet Transforms and Linear Predictive Coding International Journal of Computer Applications (0975 8887) Volume 6 No.6, September 2010. [2] Wonyong Chong, Jongsoo Kim, Speech and Image Compressions by DCT, Wavelet, and Wavelet Packet International Conference on Information, Communications and Signal ProcessingICICS '97Singapore, 9-12 September 1997 [3] Wonyong Chong, Jongsoo Kim, Speech and Image Compressions by DCT, Wavelet, and Wavelet Packet International Conference on Information, Communications and Signal ProcessingICICS '97Singapore, 9-12 September 1997 [4] P.Prakasam and M.Madheswaran, Adaptive Algorithm for Speech Compression using Cosine Packet Transform IEEE 2007 proc. International

Conference on Intelligent and Advanced Systems. pp 1168-1172. [5] AbduI Mawla M,A. Najih, Abdul Rahman Ramli, Azizah Ibrahim and Syed A.R, Comparing Speech Compression Using Wavelets With Other Speech Compression Schemes IEEE 2003 proc. Students conference on research and development (SCOReD). pp 55-58. [6]. R. Polikar. The wavelet tutorial. URL: http://users.rowan.edu/polikar/WAVELETS/WTtutor ial.html, March 1999. [7]. Gonzalez, Woods and Eddins. Digital Image Processing. Gatesmark Publishing Ltd., 2009. ISBN 9780982085400 [8] K. Subramaniam, S.S. Dlay, and F.C. Rind. Wavelet transforms for use in motion detection and tracking application. IEEE Image processing and its Applications, pages 711715, 1999. [9]. P.S. Addison. The Illustrated Wavelet Transform Handbook. IOP Publishing Ltd, 2002. ISBN 0-7503-0692-0. [10]. M. Tico, P. Kuosmanen, and J. Saarinen. Wavelet domain features for fingerprint recognition. IEEE Electronic Letters, 37(1):2122, January 2001. [11] Jalal Karam, End Point Detection for Wavelet Based Speech Compression Procedings of world academy of science, engineering and technology Volume 27 February 2008 ISSN 1307-6884. [12] AbduI Mawla M,A. Najih, Abdul Rahman Ramli, Azizah Ibrahim and Syed A.R, Comparing Speech Compression Using Wavelets With Other Speech Compression Schemes IEEE 2003 proc. Students conference on research and development (SCOReD). pp 55-58. .


A Hybrid Filter for Image Enhancement


Vinod Kumar, a Kaushal Kishore, b and Dr. Priyanka a
a

Deenbandhu Chotu Ram University of Science and Technology, Murthal, Sonepat, Haryana India b Ganpati Institute of Technology and Management, Bilaspur, Yamunanagar, Haryana, India vinodspec@yahoo.co.in,a kishorenittr@gmail.com,b priyankaiit@yahoo.co.in,a

Abstract- Image filtering processes are applied to images to remove the different types of noise that are either present in the image during capture or introduced during transmission. Salt & pepper (impulse) noise is one type of noise, which occurs during transmission of images or due to bit errors or dead pixels in the image sensor. Images are also blurred due to object movement or camera displacement during capture. This paper deals with removing impulse noise and blur simultaneously from images; the hybrid filter is a combination of the Wiener filter and the median filter. Keywords: Salt & Pepper (Impulse) noise; Blur; Median filter; Wiener filter

I INTRODUCTION The basic problem in image processing is image enhancement and restoration in a noisy environment. To enhance the quality of images, various filtering techniques available in image processing can be used. There are filters that can remove noise from images while preserving image details and enhancing image quality. Hybrid filters are used to remove either Gaussian or impulse noise from the image; these include the median filter and the Wiener filter. Combination (hybrid) filters have been proposed to remove mixed types of noise from images during image processing.

II MEDIAN FILTER The median filter gives the best result when the impulse noise percentage is less than 0.1%. When the quantity of impulse noise is increased, the median filter no longer gives the best result. Median filtering is a nonlinear operation used in image processing to reduce "salt and pepper" noise. The mean filter is also used to remove impulse noise: it replaces each pixel with the mean of the neighbouring pixel values, but it does not preserve image details, and some detail is removed. In the median filter, the pixel value is replaced not with the mean of the neighbouring pixel values but with their median. The median is calculated by first sorting all the pixel values from the surrounding neighbourhood into numerical order and then replacing the pixel being considered with the middle pixel value. (If the neighbourhood contains an even number of pixels, the average of the two middle pixel values is used.) Fig.1 illustrates an example calculation.

Fig.1: Example of median filtering

III WEINER FILTER The main purpose of the Wiener filter is to filter out the noise that has corrupted a signal. The Wiener filter is based on a statistical approach. Most filters are designed for a desired frequency response; the Wiener filter approaches the filtering of an image from a different point of view. One method is to assume that we have knowledge of the spectral properties of the original signal and the noise, and to seek the linear time-invariant filter whose output comes as close to the original signal as possible [1]. Wiener filters are characterized by the following assumptions:

a. The signal and the (additive white Gaussian) noise are stationary linear random processes with known spectral characteristics. b. Requirement: the filter must be physically realizable, i.e. causal (this requirement can be dropped, resulting in a non-causal solution). c. Performance criterion: minimum mean-square error. Wiener Filter in the Fourier Domain: the Wiener filter is given by the following transfer function: G(u,v) = H*(u,v)·Ps(u,v) / ( |H(u,v)|²·Ps(u,v) + Pn(u,v) ). Dividing numerator and denominator by Ps makes its behaviour easier to explain: G(u,v) = H*(u,v) / ( |H(u,v)|² + Pn(u,v)/Ps(u,v) ), where H(u,v) is the degradation function, H*(u,v) is the complex conjugate of the degradation function, Pn(u,v) is the power spectral density of the noise, and Ps(u,v) is the power spectral density of the un-degraded image. The term Pn/Ps is the reciprocal of the signal-to-noise ratio. IV IMAGE NOISE Image noise is a degradation of the quality of the image. It is produced by random variation of the brightness or colour information in images caused by the sensors and circuitry of a scanner or digital camera; it can also originate in film grain and in the unavoidable shot noise of an ideal photon detector. Image noise is generally regarded as an undesirable by-product of image capture. The types of noise considered here are: additive white Gaussian noise, salt-and-pepper noise, and blur. Additive white Gaussian noise is independent at each pixel and of the signal intensity; in colour cameras, where more amplification is used in the blue colour channel than in the green or red channel, there can be more noise in the blue channel. An image with salt-and-pepper noise shows dark pixels in bright regions and bright pixels in dark regions [2]; this noise can be caused by dead pixels, analog-to-digital conversion errors, or bit errors in transmission, and can largely be eliminated by dark frame subtraction and by interpolating around dark/bright pixels. Blur depends on the point spread function (PSF), which may be circular or linear; the image is blurred due to camera movement or object displacement. V HYBRID FILTER The hybrid filter is the combination of the median and Wiener filters arranged in series: first the impulse noise is removed, and the result is then passed to the Wiener filter, which removes the blur and the additive white noise from the image. The result is not identical to the original image, but it is very close.
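A small MATLAB sketch of the Wiener transfer function G(u,v) given above, applied in the Fourier domain; it assumes the Image Processing Toolbox (fspecial, imfilter, psf2otf) and a known PSF and noise-to-signal ratio, which here are illustrative values rather than the paper's exact settings.

I   = im2double(imread('cameraman.tif'));
psf = fspecial('motion', 21, 11);              % linear motion blur, Len = 21, Theta = 11
g   = imfilter(I, psf, 'conv', 'circular');    % degraded (blurred) image
H   = psf2otf(psf, size(g));                   % degradation function H(u,v)
nsr = 0.001;                                   % assumed Pn/Ps (reciprocal of SNR)
G   = conj(H) ./ (abs(H).^2 + nsr);            % Wiener transfer function G(u,v)
f_hat = real(ifft2(G .* fft2(g)));             % restored image estimate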

Algorithm: the following steps are followed to filter the image: 1. If the image is coloured, convert it to a gray scale image. 2. Convert the image to double for better precision. 3. Find the median by sorting all the values of the 3×3 mask in increasing order. 4. Replace the centre pixel value with the median value. 5. Estimate the signal-to-noise ratio. 6. Apply the deconvolution function to the filtered image. VI MSE & PSNR The peak signal-to-noise ratio (PSNR) is the ratio between the maximum possible power of a signal and the power of the corrupting noise. For an m×n reference image I and filtered image K, MSE = (1/(m·n)) · ΣᵢΣⱼ [ I(i,j) − K(i,j) ]². The PSNR is defined as:


PSNR = 10·log10( MAXI² / MSE ) = 20·log10( MAXI / √MSE ), where MAXI is the maximum possible pixel value of the image. VII SIMULATION RESULT The original image is the cameraman image. Three types of noise (additive white Gaussian noise, salt & pepper noise, and blur) are added and the noisy image is passed to the hybrid filter to obtain the desired result. The result depends on the blurring angle (Theta), the blurring length (Len) and the intensity of the impulse noise. The performance is compared using the MSE and PSNR between the original image and the filter output image.
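A hedged MATLAB sketch of the overall pipeline described by the algorithm above (median filtering followed by Wiener deconvolution), assuming the Image Processing Toolbox; the noise levels and the NSR estimate are illustrative test settings.

I   = im2double(imread('cameraman.tif'));
psf = fspecial('motion', 21, 11);
J   = imfilter(I, psf, 'conv', 'circular');    % blur, Len = 21, Theta = 11
J   = imnoise(J, 'gaussian', 0, 0.001);        % additive white Gaussian noise
J   = imnoise(J, 'salt & pepper', 0.01);       % 1% impulse noise
step1 = medfilt2(J, [3 3]);                    % median filter removes the impulse noise
nsr   = 0.001 / var(I(:));                     % rough noise-to-signal power ratio
out   = deconvwnr(step1, psf, nsr);            % Wiener deconvolution removes the blur
mse     = mean((I(:) - out(:)).^2);            % mean square error
psnr_db = 10 * log10(1 / mse);                 % PSNR with MAXI = 1 for double images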

Fig.4 Blurred image with gaussian noise of mean=0, var=.001

Fig.2 Original Image

Fig.5 Blurred or impulse-noisy image; hybrid filter output

Fig.6 Hybrid filter output; Fig.3 Blurred image with Len=21, Theta=11

Table 3:
Blurred length | Blurring angle | Impulse noise (%) | Mean square error | PSNR
21 | 11 | 0.01 | 0.0087 | 68.22
15 | 09 | 0.02 | 0.0130 | 67.35
10 | 05 | 0.01 | 0.0060 | 70.08
10 | 05 | 0.03 | 0.0131 | 67.39
05 | 03 | 0.01 | 0.0052 | 71.11
05 | 03 | 0.04 | 0.0135 | 67.20

Fig.7 Hybrid filter output

VIII CONCLUSION Now we calculate the mean square error under different conditions to check the performance of our filter. Table 1 shows the results when the blur of the image varies with angle and length while the percentage of impulse noise is constant.

Table 1:
Blurred length | Blurring angle | Impulse noise (%) | Mean square error | PSNR
21 | 11 | 0.01 | 0.0087 | 69.11
15 | 09 | 0.01 | 0.0079 | 69.30
10 | 07 | 0.01 | 0.0074 | 69.49
05 | 03 | 0.01 | 0.0050 | 70.49
02 | 02 | 0.01 | 0.0040 | 71.49

Next, when the blur of the image is kept the same and the percentage of impulse noise is increased, the following results are obtained:

Table 2:
Blurred length | Blurring angle | Impulse noise (%) | Mean square error | PSNR
21 | 11 | 0.01 | 0.0087 | 68.11
21 | 11 | 0.03 | 0.0172 | 66.08
21 | 11 | 0.05 | 0.0268 | 64.15
21 | 11 | 0.07 | 0.0333 | 63.02
21 | 11 | 0.09 | 0.0398 | 62.06

When the blur and the impulse noise vary simultaneously, we get the results shown in Table 3. We used the cameraman image in .tif format, added three kinds of noise (impulse noise, Gaussian noise and blur) and applied the noisy image to the hybrid filter. The final filtered image depends on the blurring angle, the blurring length and the percentage of impulse noise; when these variables are small the filtered image is nearly equal to the original image.

IX SCOPE FOR FUTURE WORK There are a couple of areas we would like to improve. One is the de-noising along edges, as the method we used did not perform so well there; instead of the median filter, an adaptive median filter could be used. The types of noise handled could also be increased.

X REFERENCES
[1] M. Kazubek, "Wavelet domain image de-noising by thresholding and Wiener filtering", IEEE Signal Processing Letters, Vol. 10, No. 11, Nov. 2003, p. 265, Vol. 3.
[2] Shi Zhong, "Image Denoising using Wavelet Thresholding and Model Selection", Proceedings of the 2000 International Conference on Image Processing, Vol. 3, 10-13 Sept. 2000, p. 262.
[3] Shaomin Peng and Lori Lucke, "A hybrid filter for image enhancement", Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455.
[4] "Performance Comparison of Median and Wiener Filter in Image De-noising", International Journal of Computer Applications (0975-8887), Vol. 12, No. 4, November 2010.
[5] Shaomin Peng and Lori Lucke, "Multi-level Adaptive Fuzzy Filter for Mixed Noise Removal", Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455.


Comprehensive Study of Finger Print Detection Technique


Vivekta Singh [1]
Sr. Lecturer GNIT-GIT, Greater Noida[1] Vivekta.chauhan@gmail.com[1] +91-9971506661[1]
Abstract- Extraction and verification of a biometric signature are tedious tasks that are prone to errors and influenced by many factors. The fingerprint is one of the most reliable and most widely used methods of personal identification. Fingerprint detection techniques are either automated or, in some cases, the prints can be matched manually. Manual fingerprint detection is tedious, time consuming and expensive, while automated fingerprint detection is not always reliable and authentic. In this paper, a technique based on a combination of an automated system and traditional manual detection is presented; it can cope with the problems of both automated and manual methods of detection. Index Terms Finger Print Detection (FDT), Minutiae, Matching, Verification, Ridge extraction
[1]

Vanya Garg[2]
Lecturer [2] GNIT-GIT, Greater Noida[2] vanyagarg@gmail.com [2] +91-9410236413[2]
I INTRODUCTION The fingerprint has been widely used for personal identification in many countries [5]. It is much more reliable than other popular personal identification methods based on signature, face or speech. Apart from fingerprint verification for criminal identification and police work, it is nowadays used for various applications such as security control, work-hour tracking, online transaction authentication, security of PC applications and banking cash machines [7]. Conventionally, fingerprint verification is performed manually, but this has disadvantages: it is time consuming, expensive and tedious. To meet today's requirements for new applications, automatic fingerprint identification techniques are in great demand [12]. Although significant progress has been made in designing automatic fingerprint identification systems over the past 45 years, they still have drawbacks. One of the main limitations of automatic fingerprint identification techniques is that they do not follow the same guidelines as the manual approach: for example, a sweat pore present in a fingerprint may be identified as a different pattern, and thus the fingerprint may not match its counterpart in the database. An automated fingerprint identification system mainly involves three stages: 1. the pre-processing stage, 2. the minutia extraction stage, and 3. the post-processing stage. This paper tries to address the different aspects of fingerprint detection techniques and their flow; the following section, Finger Print Detection Technique, describes the details of the various stages.

II FINGER PRINT DETECTION TECHNIQUE A fingerprint is composed of many ridges and furrows, as shown in [Figure-1]. However, fingerprints are not distinguished by their ridges and furrows but by minutiae [6], which are particular points on the ridges. Among the variety of minutia types reported in the literature, two are most significant and in heaviest use: one is the termination, which is the immediate ending of a ridge; the other is the bifurcation, which is the point on the ridge from which two branches derive.


[Figure-1]: Minutiae (a valley is also referred to as a furrow, a termination is also called an ending, and a bifurcation is also called a branch)

III ACQUISITION There are two primary methods of capturing a fingerprint image: inked (off-line) and live scan (inkless). In an off-line scan, a trained professional obtains an impression of an inked finger on paper and the impression is then scanned using a flat-bed document scanner. In the live scan method, a fingerprint image is obtained directly from the finger without the intermediate step of an impression on paper; this technology is based on the optical frustrated total internal reflection (FTIR) concept.

IV THE PRE-PROCESSING STAGE The fingerprint pre-processing stage includes enhancement, binarization and segmentation. Fingerprint enhancement is done by histogram equalization and the Fourier transform, and the fingerprint image is then binarized using a locally adaptive threshold method. The image segmentation task is fulfilled by a two-step approach: block direction estimation and Region of Interest (ROI) extraction by morphological operations.

V HISTOGRAM EQUALIZATION Histogram equalization is used to enhance the fingerprint. The histeq function automatically adjusts the intensity values: it performs histogram equalization, which involves transforming the intensity values so that the histogram of the output image approximately matches a specified histogram. The syntax of the histeq function is G = histeq(l, nlevel), where l is the input image and nlevel is the number of intensity levels specified for the output image; the default value of nlevel is 64 and the maximum possible number of levels is 256. As shown in [Figure-2], the image on the left is the original fingerprint and the image on the right is the enhanced image after applying histogram equalization.

[Figure-2]: Enhanced image
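A minimal MATLAB sketch of this enhancement step; 'fingerprint.tif' is a placeholder file name and the Image Processing Toolbox is assumed.

l = imread('fingerprint.tif');
if size(l, 3) == 3
    l = rgb2gray(l);                 % work on a grayscale image
end
G = histeq(l, 64);                   % histogram equalization with the default 64 levels
figure;
subplot(1, 2, 1); imshow(l); title('Original');
subplot(1, 2, 2); imshow(G); title('Enhanced');   % cf. Figure-2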


for x = 0, 1, 2, ..., 31 and y = 0, 1, 2, ..., 31. VI FOURIER TRANSFORMATION ENHANCEMENT Fourier analysis is extremely useful for data analysis, as it breaks down a signal into constituent sinusoids of different frequencies. For sampled vector data, Fourier analysis is performed using the discrete Fourier transform (DFT). The fast Fourier transform (FFT) is an efficient algorithm for computing the DFT of a sequence; it is not a separate transform. It is particularly useful in areas such as signal and image processing, where its uses range from filtering, convolution, and frequency analysis to power spectrum estimation. The Fourier transform is a representation of an image as a sum of complex exponentials of varying magnitudes, frequencies, and phases. The Fourier transform plays a critical role in a broad range of image processing applications, including enhancement, analysis, restoration, and compression. We divide the image into small processing blocks (32 by 32 pixels) and perform the Fourier transform according to: VII BINARIZATION Fingerprint Image Binarization transforms the 8-bit Gray fingerprint image to a 1-bit image with: 0-value for ridges 1-value for furrows After this operation, fingerprint ridges are highlighted with black color while furrows are with white. In locally adaptive binarization method transform a pixel value to 1 if the value is larger than the mean intensity value of the current block (16x16) to which the pixel belongs. (1) (1) For u = 0, 1, 2, ... , 31 and v = 0, 1, 2, ..., 31. To enhance a specific block by its dominant frequencies, the FFT of the block is multiplied by its magnitude a set of times. And the magnitude of the original FFT = abs (F (u,v)) = |F(u,v)|. Get the enhanced block according to VIII SEGMENTATION Generally, only a Region of Interest (ROI) is useful image. to be recognized for each fingerprint The k in formula (2) is an experimentally determined constant, here choose k=0.45. The higher value of "k" improves the appearance of the ridges, filling up small holes in ridges, but having too high value of "k" can result in false joining of ridges. Thus a termination might become a bifurcation. The enhanced image after FFT has the improvements to connect some falsely broken points on ridges and to remove some spurious connections between ridges. The resulting image is also processed again with histogram equalization after the FFT transform.

The area of the image without effective

ridges and furrows is discarded as it holds only background information. Then ROI is extracted

using two-step method. The first step is block (2) Where F-1(F(u,v)) is done by: direction estimation, while the second is intrigued from some Morphological methods. ( (3)


IX BLOCK DIRECTION ESTIMATION Estimate the block direction for each block of the fingerprint image with vxv in size (v is 16 pixels by default). The algorithm is: I. Calculate the gradient values along xdirection (gx) and y-direction (gy) for each pixel of the block. Two Sobel filters are used to fulfill the task. II. For each block, using following formula to get the Least Square approximation of the block direction. tg2 = 2 (gx*gy)/ (gx2-gy2); expand Images and remove peaks introduced by background noise. CLOSE: The morphological CLOSE of A by B is denoted by A B, is dilation followed by erosion. The CLOSE operation can shrink images and eliminate small cavities. The bound is the subtraction of the closed area from the opened area. Then the algorithm throws away those leftmost, rightmost, uppermost and bottommost blocks out of the bound so as to get the tightly bounded region just containing the bound and inner area. (We can also construct our own structuring element with verity of shapes and size using strel function.) XI MINUTIEA EXTRACTION STAGE The tangent value of the block direction is estimated nearly same as the way illustrated by the following formula. tg2 = 2sin cos / (cos2 -sin2 ) For minutia extraction stage, three thinning

for all the pixels in each block.

algorithms are tested and the Morphological thinning operation is finally bid out with high efficiency and pretty good thinning quality. The minutia marking is a simple task as most literatures reported but one special case is found during our implementation and an additional check mechanism is enforced to avoid

After finished the estimation of each block direction, the blocks without significant

information on ridges and furrows are discarded by using the following formulas: E = {2 W*W* (gx2+gy2) (gx*gy) + (gx2-gy2)}/

such kind of oversight. Thinning is the process of reducing the thickness of each line of patterns to just a single pixel width. The requirements of a good thinning algorithm with respect to a fingerprint are a) The thinned fingerprint image obtained should be of single pixel width with no discontinuities. b) Each ridge should be thinned to its centre pixel. c) Noise and singular pixels should be eliminated. d) No further removal of pixels should be possible after completion of thinning process.
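The built-in morphological thinning discussed later in this section can be sketched as follows in MATLAB (Image Processing Toolbox assumed); bw is assumed to be the binarized fingerprint with ridge pixels equal to 1.

thin_fp  = bwmorph(bw, 'thin', Inf);   % repeat thinning until the image stops changing
ridge_ids = bwlabel(thin_fp);          % label every thinned ridge with a unique ID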

For each block, if its certainty level E is below a threshold, then the block is regarded as a background block. X ROI EXTRACTION BY MORPHOLOGICAL OPERATIONS Two Morphological operations called OPEN and CLOSE are adopted to find ROI. The OPEN and CLOSE operation is combination of Dilation and Erosion. Dilation is an operation that grows or thickens object in a binary image. Erosion Shrinks or Thins object in a binary image. OPEN: The morphological OPEN of A by B denoted by A o B, is simply erosion of A by B, followed by dilation of the result by B. The OPEN operation can

Uses an iterative, parallel thinning algorithm. In each scan of the full fingerprint image, the algorithm marks down redundant pixels in each small image window (3x3). And finally removes all those marked pixels after several scans. But it is tested that such an


iterative, parallel thinning algorithm has bad ending. Also the average inter-ridge width D is estimated at this stage. The average inter-ridge width refers to the average distance between two

efficiency although it can get an ideal thinned ridge map after enough scans. Uses a one-in-all method to extract thinned ridges from fingerprint images directly. Their method traces along the ridges having maximum gray intensity value. However,

neighboring ridges. The way to approximate the D value is simple. Scan a row of the thinned ridge image and sum up all pixels in the row whose value is one. Then divide the row length with the above summation to get an inter-ridge width. For more accuracy, such kind of row scan is performed upon several other rows and column scans are also conducted, finally all the inter-ridge widths are averaged to get the D. Together with the minutia marking, all thinned ridges in the fingerprint image are labeled with a unique ID for further operation. The labeling operation is realized by using the Morphological operation: bwlabel

binarization is implicitly enforced since only pixels with maximum gray intensity value are remained. The advancement of each trace step still has large computation complexity although it does not require the movement of Pixel by pixel as in other thinning algorithms. Thus the third method is bid out which uses the built-in Morphological thinning function bwmorph in MATLAB. Calling bwmorph with n=inf instruct bwmorph to repeat the operation until the image stops changes. Sometimes this is called repeating an operation until stability. Enhanced thinning algorithm: Step 1: Scanning the skeleton of fingerprint image row by row from top-left to bottom-right. Check if the pixel is 1. Step 2: Count its four connected neighbors. Step 3: If the sum is greater that two, mark it as an erroneous pixel. Step 4: Remove the erroneous pixel. Step 5: Repeat steps 1 4 until whole of the image is scanned and the erroneous pixels are removed.

XII THE POST-PROCESSING STAGE

A) FALSE MINUTIA REMOVAL The preprocessing stage does not totally heal the fingerprint image. For example, false ridge breaks due to insufficient amount of ink and ridge cross-connections due to over inking are not totally eliminated. Actually all the earlier stages themselves occasionally introduce some artifacts which later lead to spurious minutia. These false minutiae will significantly affect the accuracy of

XII MINUTIA MARKING After the fingerprint ridge thinning, marking minutia points is relatively easy. But it is still not a trivial task as most literatures declared because at least one special case evokes my caution during the minutia marking stage. In general, for each 3x3 window, if the central pixel is 1 and has exactly 3 one-value neighbors, then the central pixel is a ridge branch. If the central pixel is 1 and has only 1 one-value neighbor, then the central pixel is a ridge

matching if they are simply regarded as genuine minutia. So some mechanisms of removing false minutia are essential to keep the fingerprint detection system effective. Figure False Minutia Structures


m1 is a spike piercing into a valley. m2 is a spike that falsely connects two ridges. m3 has two near bifurcations located on the same ridge. m4: the two ridge break points in this case have nearly the same orientation and a short distance between them. m5 is like the m4 case, except that one part of the broken ridge is so short that another termination is generated. m6 extends the m4 case with the extra property that a third ridge is found in the middle of the two parts of the broken ridge. m7 has only one short ridge found in the threshold window.

The procedures we use for removing false minutiae are:
1. If the distance between a bifurcation and a termination is less than D and the two minutiae are on the same ridge, remove both of them (case m1). Here D is the average inter-ridge width, i.e. the average distance between two parallel neighboring ridges.
2. If the distance between two bifurcations is less than D and they are on the same ridge, remove both bifurcations (cases m2, m3).
3. If two terminations are within a distance D, their directions are coincident within a small angle variation, and no other termination is located between them, then the two terminations are regarded as false minutiae derived from a broken ridge and are removed (cases m4, m5, m6).
4. If two terminations are located on a short ridge with length less than D, remove both terminations (case m7).

Each minutia is finally characterized completely by three parameters: x-coordinate, y-coordinate and orientation. The orientation calculation for a bifurcation needs to be considered specially. All three ridges deriving from the bifurcation point have their own direction; most algorithms simply choose the minimum angle among the three anticlockwise orientations starting from the x-axis. Such methods cast the other two directions away, so some information is lost. Here we propose a novel representation that breaks a bifurcation into three terminations: the three new terminations are the three neighbor pixels of the bifurcation, and each of the three ridges previously connected to the bifurcation is now associated with one termination.
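A sketch of the simpler distance-based checks (rules 1 and 2 above), assuming each minutia record carries the ridge label produced by the labeling step; the field names are illustrative.

import math

def remove_false_minutiae(minutiae, D):
    """Drop minutia pairs that violate rules 1 and 2 above.

    Each minutia is a dict with keys 'x', 'y', 'kind' ('term' or 'bif') and
    'ridge' (the ridge label). Pairs closer than the average inter-ridge
    width D and lying on the same ridge are removed.
    """
    to_drop = set()
    for i, a in enumerate(minutiae):
        for j in range(i + 1, len(minutiae)):
            b = minutiae[j]
            if a['ridge'] != b['ridge']:
                continue
            if math.hypot(a['x'] - b['x'], a['y'] - b['y']) >= D:
                continue
            kinds = {a['kind'], b['kind']}
            # rule 1: termination + bifurcation on the same ridge, closer than D
            # rule 2: two bifurcations on the same ridge, closer than D
            if kinds == {'term', 'bif'} or kinds == {'bif'}:
                to_drop.update((i, j))
    return [m for k, m in enumerate(minutiae) if k not in to_drop]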

Fig.: Orientation of a bifurcation broken into three terminations: the three neighbors become terminations (left) and each termination has its own orientation (right).

The orientation of each termination (tx, ty) is estimated by the following method:
1. Track a ridge segment whose starting point is the termination and whose length is D. Sum up all x-coordinates of the points in the ridge segment and divide the sum by D to get sx; obtain sy from the y-coordinates in the same way.
2. Get the direction from $\theta = \arctan\big((s_y - t_y)/(s_x - t_x)\big)$.
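A small sketch of this orientation estimate, assuming the ridge segment has already been traced as a list of (x, y) pixels starting at the termination; atan2 is used so that the full direction range is preserved.

import math

def termination_orientation(tx, ty, ridge_points, D):
    """Estimate the orientation of a termination at (tx, ty) from the
    ridge segment of length D traced away from it."""
    segment = ridge_points[:int(D)]
    sx = sum(p[0] for p in segment) / float(len(segment))
    sy = sum(p[1] for p in segment) / float(len(segment))
    # direction from the termination towards the segment centroid
    return math.atan2(sy - ty, sx - tx)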
Up to now we have found all the minutiae. In the validation step we remove some of the remaining false minutiae manually: a validation GUI window opens, from which we can manually uncheck termination and bifurcation points with the help of the coordinates shown.

B) SAVE MINUTIAE
After validation, each minutia is specified by its numeric coordinate values. We save all these values (or gradients) for future use.

C) MATCHING
The detected fingerprint can now be matched using various techniques; one of them uses phase-based image matching for low-quality fingerprints [8].
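The core of such phase-based matching is the phase-only correlation (POC) of two images; the minimal NumPy sketch below illustrates the general technique behind [8] (a sharp peak indicates a match and its location gives the relative shift), not that paper's full band-limited algorithm.

import numpy as np

def phase_only_correlation(f, g):
    """Phase-only correlation surface of two equally sized grayscale images."""
    F = np.fft.fft2(f)
    G = np.fft.fft2(g)
    cross = F * np.conj(G)
    cross /= np.abs(cross) + 1e-12          # keep the phase, discard the magnitude
    return np.real(np.fft.ifft2(cross))

def match_score(f, g):
    """Score two fingerprints by the height of their POC peak."""
    return phase_only_correlation(f, g).max()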

XIII CONCLUSION
Robust extraction of fingerprint features remains a challenging problem, especially in poor-quality fingerprints, and the development of fingerprint-specific image processing techniques is necessary in order to solve some of the outstanding problems. For example, explicitly measuring (and restoring or masking) noise such as creases, cuts, dryness, smudginess and the like will be helpful in reducing feature extraction errors. Algorithms that can extract discriminative minutiae-based features in fingerprint images and integrate them with the available features and matching strategies will improve fingerprint matching accuracy. New (perhaps model-based) methods for computation (or restoration) of the orientation image in very low-quality images are also desirable to reduce feature extraction errors. Most of the fingerprint matching approaches introduced in the last four decades are minutiae-based, but recently correlation-based techniques have been receiving renewed interest. New texture-based methods have been proposed, and the integration of approaches relying on different features seems to be the most promising way to significantly improve the accuracy of fingerprint recognition systems.

XIV REFERENCES
1. C. D. Kuglin and D. C. Hines, "The phase correlation image alignment method," Proc. Int. Conf. on Cybernetics and Society, pp. 163-165, 1975.
2. A. K. Jain, L. Hong and R. Bolle, "Online Fingerprint Verification," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 4, pp. 302-314, 1997.
3. L. Hong, Y. Wan and A. K. Jain, "Fingerprint Image Enhancement: Algorithm and Performance Evaluation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 8, pp. 777-789, Aug. 1998.
4. A. K. Jain, S. Prabhakar and S. Pankanti, "Twin Test: On Discriminability of Fingerprints," Proc. 3rd International Conference on Audio- and Video-Based Person Authentication, pp. 211-216, Sweden, June 6-8, 2001.
5. A. Jain, R. Bolle and S. Pankanti, Biometrics: Personal Identification in Networked Society, Kluwer Academic Publishers, pp. 1-64, 2002.
6. D. Maltoni, D. Maio, A. K. Jain and S. Prabhakar, "4.3: Minutiae-based Methods" (extract), Handbook of Fingerprint Recognition, Springer, New York, pp. 141-144, 2003.
7. Plajh et al., "Characteristics and Application of the Fingerprint Recognition Systems," Measurement Science Review, vol. 3, sec. 2, 2003.
8. K. Ito, H. Nakajima, K. Kobayashi, T. Aoki and T. Higuchi, "A fingerprint matching algorithm using phase-only correlation," IEICE Trans. Fundamentals, vol. E87-A, no. 3, pp. 682-691, Mar. 2004.
9. M. de Almeida Oliveira et al., "Reconnection of Fingerprint Ridges Based on Morphological Operators and Multiscale Directional Information," Proc. 17th Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI '04), pp. 122-129, 2004.
10. M. A. Dabbah, W. L. Woo and S. S. Dlay, "Secure Authentication for Face Recognition," IEEE Symposium on Computational Intelligence in Image and Signal Processing (CIISP 2007), 2007.
11. A. K. Jain, P. Flynn and A. A. Ross, Handbook of Biometrics, Springer Science+Business Media, 1st edition, pp. 1-42, 2008.
12. Zia Saquib, Santosh Kumar Soni and Sweta Suhasaria, "Automated Fingerprint Identification System: Recognition Techniques & Algorithmic Approaches - A Review (Volume I)," Journal of Sci. Engg. & Tech. Mgt., vol. 2, no. 2, July 2010.

Study of Component Based Software Engineering using Machine Learning Techniques


Vivekta Singh Sr. Lecturer GNIT-GIT, Greater Noida vivekta.chauhan@gmail.com
Abstract-Improving business performance often requires improving software development performance, and this is what pushes developers and researchers towards the adoption of the latest technologies. Development from scratch is expensive and takes a long time to complete. This has led to the evolution of a new approach: Component Based Software Engineering supported by Machine Learning techniques. Component Based Development is a branch of software engineering which emphasizes separation of concerns with respect to the wide range of functionality available throughout a given software system. Machine Learning (ML), a subfield of AI, deals with the issue of how to build computer programs that improve their performance at some task through experience [5]. The aim is to develop an intelligent model based on Component Based Software Engineering that includes the features of Machine Learning techniques, in order to increase reusability, performance and quality while reducing the project cycle time. This model should be capable of overcoming the essential difficulties, such as complexity, conformity, changeability and invisibility, that are inherent in developing large software.

I INTRODUCTION
Earlier, systems were developed using the structured approach, which was very successful, but only for simple applications. Then came the object-oriented (OO) approach, based upon encapsulation, inheritance and polymorphism. Besides its several advantages, the OO approach has many drawbacks: object orientation is difficult to learn and apply for complex applications; it supports information hiding at the class level, but not beyond that; and integrity and confidentiality are other critical issues. Many information-based legacy systems contain similar or even identical elements, which do not need to be developed from scratch but can be reused. Critical applications with strict time limits may lose time to market due to delays in the development process. The new approach, called component-based development (CBD), emerged from the failure of object-oriented development to support effective reuse, since it develops software that relies on software reuse [8]. In CBSE, components are more abstract than classes and can be considered standalone service providers. Machine learning algorithms have proven to be of great practical value in a variety of application domains, and AI techniques can also play an important role in overcoming the difficulties. As a subfield of AI, machine learning (ML) deals with the issue of how to build computer programs that improve their performance at some task through experience; it is dedicated to creating and compiling verifiable knowledge related to the design and construction of artifacts. The subsequent parts of this paper describe CBSE and ML techniques and the issues addressed by them.

II ISSUES & CHALLENGES
Various issues in ML and CBSE can be listed as follows:
What types of learning methods are available.
The characteristics and underpinnings of different learning algorithms.
Which learning method is appropriate for what type of software development or maintenance task.
In what aspects of the essential difficulties of software development learning methods can be used to make headway.
When attempting a learning method to help with an SE task, what the general guidelines are and how pitfalls can be avoided.
The state of the practice in ML & SE.
Where further effort is needed to produce fruitful results.

III COMPREHENSIVE STUDY
Software development based on the combination of sets of existing components was already discussed in the 1960s. Component-Based Software Development (CBSD) emerged from the failure of object-oriented development to support effective reuse. Software reuse has been used as a tool to reduce the development cost and cycle time of software development, and the need for it has become urgent as the size and complexity of software have started to escalate exponentially. Software reuse is defined as the reuse of everything associated with a software project, including knowledge (Basili and Rombach, 1988). It is the process whereby an organization defines a set of systematic operating procedures to specify, produce, classify, retrieve and adapt software artifacts for the purpose of using them in its development activities (Mili et al., 1995). Reusability is the degree to which a component can be reused; it reduces the software development cost by enabling less coding and more integration (Wang, 2002). In CBSE, components are more abstract than classes and can be considered standalone service providers. CBSE, including the use of commercial off-the-shelf software (COTS), is widely seen as a way of reducing some of the problems so often encountered [1]. The application of reuse and the increasing adoption of this technology are considered to be more effective in the long term, at least in some respects, than development of new code [13]. Machine learning algorithms have proven to be of great practical value in a variety of application domains. Not surprisingly, the field of software engineering turns out to be a fertile ground where many software development and maintenance tasks can be formulated as learning problems and approached in terms of learning algorithms [5]. The essential difficulties of complexity, conformity, changeability and invisibility, inherent in developing large software, still hold true today. To ultimately overcome these essential difficulties, it has been recognized that both the processes and products of software development should be formalized and automated, and AI techniques can play an important role in this effort. At the same time, a small subset of machine learning algorithms, mostly inductive-learning based, applied to the KDD 1999 Cup intrusion detection dataset resulted in dismal performance for the user-to-root and remote-to-local attack categories, as reported in the recent literature. The software engineering community understands that tool building is an

essential activity of applied research; this seems especially pronounced in the areas of reverse engineering, software visualization and program comprehension. Researchers also understand that tool building is a major investment. For instance, Nierstrasz et al., who developed the well-known Moose tool, say that in the end the research process is not about building tools but about exploring ideas; in the context of reengineering research, however, one must build tools to explore ideas. Crafting a tool requires engineering expertise and effort, which consumes valuable research resources. Of the many AI techniques, machine learning (ML) methods have found their way into software development over the past twenty years. ML deals with the issue of how to build computer programs that improve their performance at some task through experience [14].

IV COMPONENT BASED SOFTWARE DEVELOPMENT
Component-based software engineering (CBSE) comprises two separate but related processes, namely component engineering and application engineering [3]. The former is concerned with the analysis of domains and the development of generic and domain-specific reusable components, while the latter involves application development using commercial off-the-shelf (COTS) components or components that have been developed in-house. Provided an effective risk management process is in place, this philosophy of constructing systems from reusable parts promises a high degree of customizability and extensibility together with increased productivity, accelerated time to market and lower development cost. During software development, early identification of critical components is of much practical significance, since it facilitates the timely allocation of adequate resources to these components and thus enhances the quality of the delivered system. The main characteristic of the CBD process is the separation of system development from component development. CBD has a significant influence on the development and maintenance process and requires considerable modification of the standard development process. As described in [2], there are three different types of CBD process: 1) architecture-driven component development, 2) product line development and 3) COTS-based development. CBSE is expected to meet the requirements of cost-effectiveness and flexibility in the development of command support systems [4]. Knowledge-based techniques, on the other hand, provide an opportunity to incorporate adaptivity and robustness into

software systems through the use of machine learning.

V MACHINE LEARNING TECHNIQUE
Machine learning deals with the issue of how to build programs that improve their performance at some task through experience. Machine learning algorithms have proven to be of great practical value in a variety of application domains [6]. Not surprisingly, the field of software engineering turns out to be a fertile ground where many software development and maintenance tasks can be formulated as learning problems and approached in terms of learning algorithms. To ultimately overcome the essential difficulties, it has been recognized that both the processes and products of software development should be formalized and automated, and AI techniques can play an important role in this effort. Of the many AI techniques, machine learning (ML) methods have found their way into software development over the past twenty years. ML deals with the issue of how to build computer programs that improve their performance at some task through experience; it is dedicated to creating and compiling verifiable knowledge related to the design and construction of artifacts. ML algorithms offer a viable alternative to the existing approaches to many software development issues. To better use ML methods as tools for solving real-world SE problems, we need a clear understanding of both the problems and of the tools and methodologies utilized. It is imperative that we know: the ML methods at our disposal, the characteristics of those methods, the circumstances under which the methods can be applied most effectively, and their theoretical underpinnings. Since solutions to a given problem can often be expressed (or approximated) as a target function, the problem-solving process (or the learning process) boils down to finding a function that best describes the known and unknown cases or phenomena for a given problem domain. Two paradigms exist in ML: inductive learning and analytical learning. Inductive learning formulates general hypotheses that fit the observed training data; it is based on statistical inference, requires no prior knowledge, and can fail if data are scarce or the inductive bias is incorrect. Analytical learning, on the other hand, formulates general hypotheses that fit a domain theory; it is based on deductive inference, can learn from scarce data, but can be misled when given an incorrect or insufficient domain theory.
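To make the inductive-learning paradigm concrete, the toy sketch below fits a decision tree to a small, invented, data-rich task (flagging fault-prone components from a few metrics); the feature names and data are hypothetical and only illustrate fitting a general hypothesis to observed training data.

from sklearn.tree import DecisionTreeClassifier

# hypothetical component metrics: [lines of code, coupling, times reused]
X_train = [
    [1200, 9, 0],
    [300, 2, 7],
    [2500, 12, 1],
    [450, 3, 5],
    [1800, 10, 0],
    [200, 1, 9],
]
y_train = [1, 0, 1, 0, 1, 0]   # 1 = fault-prone, 0 = not fault-prone

# inductive learning: fit a general hypothesis to the observed training data
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X_train, y_train)

# apply the learned hypothesis to an unseen component
print(model.predict([[1600, 8, 1]]))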

Because the availability and utilization of data and domain theory play a pivotal role in these two paradigms, we can use data and domain theory as guiding factors when considering the adoption of learning methods. When a given task is data-rich, methods of inductive learning can be considered. If there exists a well-defined model for a task, then we can adopt analytical learning methods. The two paradigms can be combined to form a hybrid inductive-analytical learning approach; hybrid methods are useful in situations where both data and domain theory are less than desirable. Methods of either paradigm are good candidates if a task has both an adequate domain theory and plenty of data.

VI CONCLUSION
Today, component and service based technologies play a central role in many aspects of enterprises. The technologies used to define, implement and assemble components have improved significantly over recent years, yet techniques for verifying the systems created from them have changed very little, and the reliability and correctness of computer-based systems still need to be checked. CBSE is also expected to meet the requirements of cost-effectiveness and flexibility in the development of command support systems. ML is dedicated to creating and compiling verifiable knowledge related to the design and construction of artifacts [15], and ML algorithms offer a viable alternative to the existing approaches to many software development issues. Machine learning deals with the issue of how to build programs that improve their performance at some task through experience; this has been demonstrated in the body of existing work, at least for narrow areas and for individual cases.

VII FUTURE SCOPE
ML methods can be used to complement existing CBSE tools and methodologies to make headway in all aspects of the essential difficulties. The strength of ML methods lies in the fact that they have sound mathematical and logical justifications and can be used to create and compile verifiable knowledge about the design and development of software artifacts. To avoid the potential pitfalls of ML techniques, the single most important factor is to avoid mismatches between the characteristics of a CBSE problem and those of an ML method. What lies ahead? For SE areas that have already witnessed ML applications, effort will be needed in developing guidelines for issues regarding

applicability, scaling-up, performance evaluation and tool integration. For those CBSE areas that have not yet witnessed ML applications, the issue is how to realize the promise and potential that ML techniques have to offer.

VIII REFERENCES
1. M. M. Lehman and J. F. Ramil, "Software evolution in the age of component-based software engineering," IEE Proc.-Softw., vol. 147, no. 6, December 2000.
2. Ivica Crnkovic, Michel Chaudron and Stig Larsson, "Component-based Development Process and Life Cycle," Proc. International Conference on Software Engineering Advances (ICSEA '06), 2006.
3. Awais Rashid, "Aspect-Oriented and Component-Based Software Engineering," IEE Proceedings, online no. 20010458.
4. D. P. J. Goodburn, T. R. Pattison and R. J. Vernik, "Component-Based Engineering of Knowledge-Enabled Systems: Research Vision and Strategy," Information Technology Division, Electronics and Surveillance Research Laboratory, DSTO-GD-0271.
5. Du Zhang and Jeffrey J. P. Tsai, "Machine Learning and Software Engineering," Proc. 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI '02).
6. Du Zhang, "Introduction to Machine Learning and Software Engineering."
7. Arun Sharma, "Design and Analysis of Metrics for Component-Based Software Systems."
8. Ian Sommerville, Software Engineering, Eighth Edition, Pearson Education Limited.
9. Tim Menzies, "Practical Machine Learning for Software Engineering and Knowledge Engineering," Handbook of Software Engineering and Knowledge Engineering, Volume 1, World Scientific Publishing Company.
10. T. Menzies, "Practical machine learning for software engineering and knowledge engineering," Handbook of Software Engineering and Knowledge Engineering, World Scientific Publishing Company, 2001.
11. C. Rich and R. Waters (eds.), Readings in Artificial Intelligence and Software Engineering, Morgan Kaufmann, 1986.
12. Maheshkumar Sabnani, "Application of Machine Learning Algorithms to KDD Intrusion Detection Dataset within Misuse Detection Context," EECS Dept., University of Toledo, Toledo, Ohio 43606, USA.
13. S. Hallsteinsen and M. Paci, Software Evolution and Reuse, Springer Verlag, Berlin, 1997.
14. T. Mitchell, Machine Learning, McGraw-Hill, 1997.
15. F. Provost and R. Kohavi, "On applied research in machine learning," Machine Learning, vol. 30, no. 2/3, pp. 127-132, 1998.

Efficient Location-Based Spatial Query (LBSQ) Processing in Wireless Broadcast Environments


K Madhavi Asst Professor of IT VR Siddhartha Engineering College (Autonomous) Kanuru, Vijayawada-7, A.P.
Abstract: Location-Based Spatial Queries (LBSQs) refer to spatial queries whose response depends on the location of the mobile user. Efficient processing of LBSQs is of significant importance with the ever-increasing deployment and use of mobile technologies. LBSQs have certain unique characteristics that are not addressed by traditional spatial query processing in centralized databases. A significant challenge is presented by wireless broadcasting environments, which have excellent scalability but often exhibit high latency in accessing the database. In this paper we present a novel query processing technique that, while maintaining high scalability and accuracy, manages to reduce the latency of processing LBSQs significantly. Existing techniques cannot be used effectively in a wireless broadcast environment, where only sequential data access is supported, and they may not scale to very large user populations. In an existing system, to communicate with the server a client most likely has to use a cellular-type network to achieve a reasonable operating range, and users must reveal their current location and send it to the server, which may be undesirable for privacy reasons. We propose a novel approach for reducing the spatial query access latency by leveraging results from nearby peers in wireless broadcast environments. The scheme allows a mobile client to locally verify whether candidate objects received from peers are indeed part of its own spatial query result set. The method exhibits great scalability: the higher the mobile peer density, the more queries are answered by peers, and the query access latency decreases as the number of mobile users increases. Our approach is based on peer-to-peer sharing, which enables us to process queries without delay at a mobile host by using query results cached in its neighboring mobile peers. We demonstrate the feasibility of our approach through a probabilistic analysis and illustrate the appeal of our technique through extensive simulation results.

Dr Narasimham Challa Professor of IT

I INTRODUCTION: Technological advances, especially those in wireless communication and mobile devices, have fueled the proliferation of location-based services (LBS). In turn, the demands from sophisticated

applications with large user populations pose new challenges for LBS research systems. As the name implies, LBSs are built on the notion of location. Mobile devices can only provide a computing environment with limited CPU, memory, network bandwidth and battery resources; as such, mobile clients must be designed to balance the utilization of these resources and the loads between the client and the server. For example, pushing more computation onto the client can reduce bandwidth consumption but increases CPU load and memory consumption. Given the rapidly increasing capability of mobile devices, mobile applications must make reasonable assumptions about the clients' computational capability and be able to adapt to it. LBSs must also be able to handle a large user population and scale up in the face of increasing demands: an LBS server must be able to process a large number of requests simultaneously. Fortunately, requests to the servers are likely to exhibit spatial locality and, perhaps to a lesser degree, temporal locality as well. Location is an important element in LBSs. A location model allows us to describe the physical space and the properties of the objects contained in it; for example, it defines the coordinate system and therefore the locations of the objects and their spatial relationships. In addition, operations such as location sensing, object counting, browsing, navigation and search are affected by the location model chosen. Existing location models can be classified into geometric and symbolic models. A geometric model represents the physical space as a Euclidean space and the objects therein as points, lines, shapes and volumes in that space; with the coordinates of each object defined in the Euclidean space, operations such as distance and shortest-path computations can be supported. A symbolic model's main objective is to capture the semantics of the entities in a physical space by expressing not only the objects contained in it but also the relationships among them in some form of graph.

II COMMON SPATIAL QUERY TYPES
There are several common spatial query types; in this section we only cover the ones related to this research.

Nearest Neighbor Queries: During the last two decades, numerous algorithms for k nearest neighbor queries have been proposed. We roughly divide these solutions into three groups: regular k nearest neighbor queries, continuous k nearest neighbor queries, and spatial network nearest neighbor queries.

Regular Nearest Neighbor Queries: A k nearest neighbor (kNN) query retrieves the k (k >= 1) data objects closest to a query point q. The R-tree and its derivatives have been the prevalent method to index spatial data and increase query performance. To find nearest neighbors, branch-and-bound algorithms have been designed that search an R-tree in a depth-first or a best-first manner. The proposed NN search algorithm is optimal, as it only visits the nodes necessary for obtaining the nearest neighbors, and incremental, i.e., it reports neighbors in ascending order of their distance to the query point. Both algorithms can be easily extended to the retrieval of k nearest neighbors.

Continuous Nearest Neighbor Queries: The NN algorithms discussed in the previous paragraph are mainly designed for searching stationary objects. With the emergence of mobile devices, attention has focused on the problem of continuously finding the k nearest neighbors of a moving query point. A naive approach would be to continuously issue kNN queries along the route of the moving object; this results in repeated server accesses and nearest neighbor computations and is therefore inefficient. Sistla et al. first pointed out the importance of continuous nearest neighbor queries, the modeling methods and the related query languages, but did not discuss processing methods. Song et al. proposed the first algorithm for continuous NN queries; their approach is based on performing several point NN queries at predefined sample points. Saltenis et al. propose the time-parameterized R-tree, an index structure for moving objects, to address continuous kNN queries for moving objects. Tao et al. present a solution for continuous NN queries via performing

one single query for the entire route, based on the time-parameterized R-tree. The main shortcoming of this solution is that it is designed for Euclidean spaces and users have to submit predefined trajectories to the database server.

Spatial Network Nearest Neighbor Queries: Initially, nearest neighbor searches were based on the Euclidean distance between the query object and the sites of interest. However, in many applications objects cannot move freely in space but are constrained by a network (e.g., cars on roads, trains on tracks). Therefore, in a realistic environment the nearest neighbor computation must be based on the spatial network distance, which is more expensive to compute. A number of techniques have been proposed to manage the complexity of this problem.

III QUERY PROCESSING: BROADCAST VS. POINT-TO-POINT
LBSs by and large assume wireless communication, since both the clients and the data (e.g., vehicles being tracked) move. Wireless communication supports two basic data dissemination methods. In periodic broadcast, data are periodically broadcast on a wireless channel accessible to all clients in range; a mobile client listens to the broadcast channel and downloads the data that match its query. In on-demand access, a mobile client establishes a point-to-point connection to the server and submits requests to and receives results from the server on the established channel. The role of periodic broadcast is crucial in LBSs, because both the number of clients and the amount of requests to be supported by the LBS can be huge, and periodic broadcast can disseminate information to all participants at very low cost. Early work on data broadcast focused on non-spatial data, but in order to be useful in LBSs, a data broadcast system must support spatial queries and spatial data. For example, in an intelligent transportation system, the locations of all monitored vehicles can be broadcast so that an on-board navigation system can decide autonomously how to navigate through a crowded downtown area. Similar needs can be envisaged in the coordination of mobile robots on a factory floor. In this section, we investigate spatial query processing in a broadcast environment, followed by location-based spatial query processing in on-demand access mode.

IV GENERAL SPATIAL QUERY PROCESSING ON BROADCAST DATA
Search algorithms and index methods for wireless broadcast channels should avoid

backtracking. This is because, in wireless broadcast, data are available "on air" in the sense that they are available on the channel only transiently; when a data item is missed, the client has to wait for the next broadcast cycle, which takes a long time. Existing database indexes were obviously not designed to meet this requirement, since data stored on disk can be randomly accessed at any time; hence they perform poorly on broadcast data. Since most spatial queries involve searching for objects that are close to each other, a space-filling curve such as the Hilbert curve can be applied to arrange objects on the broadcast so that objects in spatial proximity are close together. Algorithms can then be developed to answer window and kNN queries on the broadcast. Given a query window, we can identify the first and the last objects within the window that the Hilbert curve passes through. kNN queries can be answered in a similar manner by first estimating the bounds within which the k nearest neighbors can be found, followed by a detailed check of the Euclidean distances between the candidate objects and the query. The adoption of a space-filling curve avoids the client back-tracking the broadcast channel several times when retrieving objects for spatial queries, which is essential for saving the clients' power consumption and improving the response time. However, in on-demand access mode, the processing of location-based spatial queries raises different issues due to the mobility of the clients.

V LOCATION-BASED SPATIAL QUERIES
In contrast to conventional spatial processing, queries in LBSs are mostly concerned with objects around the user's current position; nearest neighbor queries are one example. In a non-location-based system, responses to queries can be cached at the clients and reused if the same queries are asked again. In LBSs, however, users move frequently. Since the result of a query is only valid for a particular user location, new queries have to be sent to the server whenever there is a location update, resulting in high network transfer and server processing cost. To alleviate this problem, the concept of a validity region can be used. The validity region of a query indicates the geographic area(s) within which the result of the query remains valid, and it is returned to the user together with the query result. The mobile client is then able to determine whether a new query should be issued by verifying whether it is still inside the validity region. The utilization of validity regions significantly reduces the number of new queries issued to the server and thus the communication over the wireless channel.
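Section IV above describes ordering the broadcast along a space-filling curve so that spatially close objects end up adjacent; the sketch below uses the classic Hilbert xy-to-index conversion to sort objects into such a broadcast order (the grid size and coordinates are illustrative, not from the paper).

def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve filling an n x n grid
    (n a power of two); the classic xy-to-d conversion."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate/flip the quadrant so the sub-curve is in canonical position
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# sort objects into broadcast order so spatially close objects are adjacent
objects = [(3, 5), (60, 60), (4, 4), (61, 58)]   # illustrative coordinates
broadcast_order = sorted(objects, key=lambda p: hilbert_index(64, p[0], p[1]))
print(broadcast_order)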

VI SCHEDULING SPATIAL QUERIES PROCESSING
One unmistakable challenge that LBSs have to face is the huge user population and the resulting large workload. Traditional spatial database research focuses on optimizing the I/O cost of a single query. However, in LBSs, where a large degree of spatial locality can be expected, queries received in an interval may access the same portion of the database and thus share some common result objects. For instance, users in a busy shopping area may want to display on their PDAs a downtown map with shopping malls overlaid on it. These are equivalent to a number of window queries around the users' positions; although the windows are different, they are likely to overlap, resulting in the accessing of overlapping objects from the map database. Inter-query optimization can be utilized at the server to reduce the I/O cost and response time of the queries. To achieve this objective, multiple spatial window queries can be parallelized, decomposed, scheduled and processed under a real-time workload in order to enhance system runtime performance, for example I/O cost and response time. Query locality can be used to decompose and group overlapping queries into independent jobs, which are then combined to minimize redundant I/Os; the essential idea is to eliminate duplicate I/O accesses to common index nodes and data pages. In addition, jobs may be scheduled to minimize the mean query response time. In principle, processing queries close to one another saves more I/O cost because there is a high chance that they share some MBRs in the R-tree index as well as data objects. An innovative method to quantify the closeness and degree of overlap in terms of I/O has been developed based on window query decomposition in [HZL03]. In addition, in a practical implementation where a fair amount of main memory is available, caching can be used to reduce the I/O cost; for spatial data indexed by an R-tree, high-level R-tree nodes can be cached in memory.

VII MONITORING CONTINUOUS QUERIES ON MOVING OBJECTS
Most existing LBSs assume that data objects are static (e.g., shopping malls and gas stations) and that the location-based queries are applied to these static objects. As discussed in Section 2, applications such as location-based alerts and surveillance systems require continuous monitoring of the locations of certain moving objects, such as cargo, security guards, children and so on. For instance, we may issue a window query, monitor the number of security guards in the window, and send out an alert if the number falls below a certain

threshold. Likewise, we can issue several nearest neighbor queries centered around major facilities and monitor the nearest police patrols, so that they can be dispatched quickly should any problem arise. Generally, we are interested in how the results of some spatial queries change over time; this is the problem known as monitoring continuous spatial queries. Most of the existing work assumes that clients are autonomous, in that they continuously measure their positions and report location updates to the server periodically. Thus, server-side indexing and processing methods have been devised to minimize the CPU and I/O costs of handling these updates and re-evaluating the query results. In these methods, the update frequency is crucial in determining the system performance: a high update frequency would overload both the clients and the servers in terms of communication and processing costs, while a low update frequency would introduce errors into the monitored results. An alternative to periodic location updates is to let the clients be aware of the spatial queries being monitored, so that they update their locations on the server only when the query results are about to change. The main idea is that the server maintains a safe region for each moving client. The safe region guarantees that the results of the monitoring queries remain unchanged, and therefore need not be re-evaluated, as long as the client moves within its own safe region. Once a client moves out of its safe region, it initiates a location update; the server identifies the queries affected by this update and re-evaluates them incrementally. At the same time, the server recomputes a new safe region for this client and sends it back to the client. In Fig. 3, the number of clients inside each query window is being monitored. It is clear that when a client (shown as a dot in the figure) moves within the shaded rectangle S1, it will not affect the results of the four queries; however, when it moves out of S1, its new location must be sent to the server in order to re-evaluate the queries and recompute a new safe region for the client.

Fig. 3: (a) Computing safe regions (query windows Q1-Q4 and safe region S1); (b) Mobility pattern effect (safe region S2).

For kNN queries, the exact locations of the clients must be known in order to determine which client is closer to the query point. However, since a client can move freely inside its safe region without informing the server, the safe regions can only be used to identify a number of potential candidates, and the server still needs to probe these candidates to request their exact positions. In order to reduce the communication cost between the clients and the server, we need to reduce the number of location updates and server probes during monitoring. As such, we need to find the largest possible safe regions for the clients, to reduce the number of location updates, and devise efficient algorithms for re-evaluating the query results, to reduce the number of server probes. When computing the safe regions, we also need to consider the effect of the client's mobility pattern, because a safe region does not have to occupy the largest area to be the best. For example, in Fig. 3(a), S1 is the largest safe region for the client; however, as shown in Fig. 3(b), if we know that the client is moving toward the west, S2 is a better safe region than S1, because it takes longer for the client to move out of S2, which defers the next location update.
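A minimal client-side sketch of the safe-region protocol described above, assuming rectangular safe regions; the function and field names are illustrative.

def inside_safe_region(position, region):
    """True if the client position (x, y) lies inside its rectangular
    safe region, given as (xmin, ymin, xmax, ymax)."""
    x, y = position
    xmin, ymin, xmax, ymax = region
    return xmin <= x <= xmax and ymin <= y <= ymax

def on_position_change(position, region, send_update):
    """Only contact the server when the client has left its safe region;
    the server then re-evaluates the affected queries and returns a new region."""
    if not inside_safe_region(position, region):
        region = send_update(position)   # server returns a recomputed safe region
    return region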

Algorithm 1: NNV
1: P <- peer nodes responding to the query request issued from q
2: MVR <- empty; O <- empty
3: for all p in P do
4:   MVR <- MVR ∪ p.VR and O <- O ∪ p.O
5: end for
6: sort O according to ||q, o_i||
7: compute ||q, e_s||, where edge e_s has the shortest distance to q among all the edges of MVR
8: i <- 1
9: while |H| < k and i <= |O| do
10:   if ||q, o_i|| <= ||q, e_s|| then
11:     H.verified <- H.verified ∪ o_i
12:   else
13:     H.unverified <- H.unverified ∪ o_i
14:   end if
15:   i <- i + 1
16: end while
17: return H

Algorithm 2:
1: P <- peer nodes responding to the query request issued from q
2: for all p in P do
3:   MVR <- MVR ∪ p.VR and O <- O ∪ p.O
4: end for
5: WQ <- all o in O that overlap with w
6: if w ⊆ MVR then
7:   return WQ
8: else
9:   WQ <- WQ ∪ query results returned from the on-air window query with w' {if w ⊄ MVR, utilize w' to compute the new search bounds and results}
10:   return WQ
11: end if

VIII CONCLUSION
We presented a novel approach for reducing spatial query access latency by leveraging results from nearby peers in wireless broadcast environments. Significantly, our scheme allows a mobile client to locally verify whether candidate objects received from peers are indeed part of its own spatial query result set.

IX REFERENCES
S. Acharya, R. Alonso, M. J. Franklin, and S. B. Zdonik, "Broadcast Disks: Data Management for Asymmetric Communications Environments," Proc. ACM SIGMOD '95, pp. 199-210, 1995.
D. Barbara, "Mobile Computing and Databases: A Survey," IEEE Trans. Knowledge and Data Eng., vol. 11, no. 1, pp. 108-117, Jan./Feb. 1999.
N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, "The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles," Proc. ACM SIGMOD '90, pp. 322-331, 1990.
J. Broch, D. A. Maltz, D. B. Johnson, Y.-C. Hu, and J. G. Jetcheva, "A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols," Proc. ACM MobiCom '98, pp. 85-97, 1998.
V. Bychkovsky, B. Hull, A. K. Miu, H. Balakrishnan, and S. Madden, "A Measurement Study of Vehicular Internet Access Using In Situ Wi-Fi Networks," Proc. ACM MobiCom '06, Sept. 2006.
C.-Y. Chow, H. Va Leong, and A. Chan, "Peer-to-Peer Cooperative Caching in Mobile Environments," Proc. 24th IEEE Int'l Conf. Distributed Computing Systems Workshops (ICDCSW '04), pp. 528-533, 2004.

A 3D Face Recognition using Histograms


Sarbjeet Singh1,Meenakshi sharma2, Dr. N Suresh Rao3, Dr. Zahid Ali4 Sri Sai College Of Engg. & Tech., Pathankot1,2,4 , Jammu University3 ,Mtech CSE 4th sem1 , Professor3 , SSCET2,4 sarbaish@gmail.com1
Abstract: We present a process for 3D face recognition which relies on the analysis of the three-dimensional facial surface. The process occurs in two steps: a fully automatic normalization stage followed by a histogram-based feature extraction algorithm. The tip and the root of the nose are detected during normalization, and PCA analysis is then applied to check the symmetry of the face and to perform calculations such as the curvature of the face. The face is then realigned in a coordinate system derived from the nose tip and the symmetry axis, and the result is a normalized 3D model of the face. A simple statistical method is used for the analysis of the relevant region: this area is split into disjoint horizontal subareas, and the distribution of depth values in each subarea is exploited to characterize the face surface of an individual. Our analysis of the depth value distribution is based on a straightforward histogram analysis of each subarea. When comparing the feature vectors resulting from the histogram analysis we apply three different similarity metrics.
Keywords: Face recognition, PCA, LDA, Histogram.

1. Introduction
Face recognition is one of the most active and widely used techniques [1-2] because of its reliability and accuracy in the process of recognizing and verifying a person's identity. The need is becoming important as people become increasingly aware of security and privacy. For researchers, face recognition remains a tedious task, because the human face is not a rigid object: a person's face can change considerably over short periods of time (from one day to another) and over long periods of time (a difference of months or years). One problem of face recognition is that different faces can appear very similar, so a discrimination task is needed. On the other hand, when we analyze the same face, many characteristics may have changed because of changes in various parameters: illumination, variability in facial expressions, the presence of accessories (glasses, beards, etc.), pose, age and, finally, background. We can divide face recognition techniques [7-8] into two big groups: applications that require face identification and those that need face verification. The difference is that identification matches a face against others in a database, whereas verification tries to verify a human face from a given sample of that face.

2. Histogram
A histogram, or frequency histogram, is a bar graph. The horizontal axis depicts the range and scale of the observations involved and the vertical axis shows the number of data points in the various intervals, i.e. the frequency of observations in the intervals. Histograms are popular among statisticians. Though they do not show the exact values of the data points, they give a very good idea about the spread and shape of the data. Let us try drawing a histogram of percentage scores in a test. The scores are as follows: 82.5, 78.3, 76.2, 81.2, 72.3, 73.2, 76.3, 77.3, 78.2, 78.5, 75.6, 79.2, 78.3, 80.2, 76.4, 77.9, 75.8, 76.5, 77.3, 78.2. When any data is provided to XLMiner, it decides the size and number of intervals amongst which the data should be distributed. It uses "nicing" to decide the number of intervals: five to twenty intervals are fixed for the dataset, depending on its range. Now see the histogram of the same data. The values on the horizontal axis are the upper limits of the bins (intervals) of data points, not the mid-points of the intervals, although they may appear to be so; this is in keeping with the way the Analysis ToolPak of Excel works. As an example, the bar shown against 78 has a frequency of 7, which means that 7 data points lie in the range above 76 and up to (and including) 78. As is evident, the histogram gives a fairly good idea about the shape and spread of data at a glance.

3. Face Recognition
Face recognition is one of the few biometric methods that possess the merits of both high accuracy and low intrusiveness: it has the accuracy of a physiological approach without being intrusive. For this reason, since the early 70's face recognition has drawn the attention of researchers in fields from security, psychology and image processing to computer vision, and numerous algorithms have been proposed for it. While network security and access control are its most widely discussed applications, face recognition has also proven useful in other multimedia information processing areas. Face recognition techniques [5] can be used to browse video databases to find shots of particular people, and to model face images with a compact parameterized facial model for low-bandwidth communication applications such as videophone and teleconferencing. Recently, as the technology has matured, commercial products have appeared on the market. Despite the commercial success of those face recognition products, a few research issues remain to be explored.

3.1 General face recognition system
Figure: Block Diagram for Face Recognition System.
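For illustration, the same frequency histogram can be reproduced on the score list above with NumPy; the 2-point-wide bins are chosen to mirror the bin ending at 78 discussed in the text, not XLMiner's own interval selection.

import numpy as np

scores = [82.5, 78.3, 76.2, 81.2, 72.3, 73.2, 76.3, 77.3, 78.2, 78.5,
          75.6, 79.2, 78.3, 80.2, 76.4, 77.9, 75.8, 76.5, 77.3, 78.2]

# 2-point-wide bins from 72 to 84; counts[i] is the frequency in bin i
edges = np.arange(72, 85, 2)
counts, _ = np.histogram(scores, bins=edges)
for low, high, count in zip(edges[:-1], edges[1:], counts):
    print(f"{low}-{high}: {count}")   # the 76-78 bin prints a frequency of 7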

4. Histogram Method used for Face Detection
As per [9], RGB colour space is commonly used in image processing because of its basic synthesis property and direct application in image display. According to the requirements of different image processing tasks, RGB colour space is often transformed to other colour spaces. From a visual perception point of view, hue, saturation and value are often employed to manipulate colour, for example for desaturation or a change of colourfulness. When the colour is quantized to a limited number of representative colours, one has to deal with two problems. The first is how to best match the distance [3-4] of the data representation to human perception: it is desirable that numerical colour distance is proportional to perceptual difference. The second problem is how to best quantize the colours such that the reproduction from the quantized colours is the most faithful to the original. In this work, we adopt a perceptually meaningful colour space, the HMMD colour space, and use a carefully worked out quantization scheme of the MPEG-7 standard.

Fig. 2: Schematic of the new face recognition/detection method.

5. Proposed work and Algorithm
For recognizing objects from large image databases, histogram-based methods have proved their simplicity and usefulness over the last decade. Initially, this idea was based on colour histograms. This section presents the first part of our proposed technique, named Histogram-processed Face Recognition, as compared to the detection use in [9]. Histogram techniques are well suited for face detection [6], as shown above, but in our case we apply histogram calculation for face recognition. The algorithm given below worked for face recognition with a success rate of 95%. For training, grayscale images with 256 gray levels are used. Firstly, the frequency of every gray level is computed and stored in vectors for further processing. Secondly, the mean of every nine consecutive frequencies from the stored vectors is calculated and stored in another vector for later use in the testing phase. This mean vector is used for calculating the absolute differences between the means of the trained images and the test image. Finally, the minimum difference found identifies the class matched with the test image. The recognition accuracy is 95% in our case.

6. Experimental Results
Our database contains face images of postgraduate students. There were 650 photos in the training set, and 3000 training face samples were generated by rotating the faces up to a certain degree. In all tests, we search for faces of 14 different sizes, ranging from the smallest of 18 x 23 pixels to the biggest of 280 x 350 pixels. For each of these 14 sizes, the window is scaled up or down to the standard size of 64 x 80 for detection. A face is counted as correctly detected if the two eyes and the mouth are within the detection window. If a declared detection window does not include any face feature inside it, that incident is counted as a false detection. If a declared detection window covers part of the face but does not have the eyes and mouth fully inside the window, we count it neither as a miss nor as a false detection; we call this situation partial detection. Based on this criterion, the detection results are shown in Table 1.

Table 1: Detection results
Correct Detection   False Detection   Missed Detection   Partial Detection
85%                 1%                4%                 10%

It is seen that the correct detection rate is high (85%) and only 4% of faces were completely missed. There are 27 false detections, which should be viewed in relation to the number of detection windows (> 1 million after colour detection) examined by the algorithm.

7. Algorithm Steps
Step 1: Take the input image I.
Step 2: Test the gray levels: for I1 = 1 : N (where N is the number of images).
Step 3: Compute the frequency of each gray level: for I2 = 1 : N.
Step 4: Make the frequency (mean) vector: for I3 = 1 : M (where M is the dimension of the frequency vector, taken as M = 9).
Step 5: Calculate the mean difference Md = (mean of trained image) - (mean of test image). If Md = 0 then the images are matched; go to Step 7. Else check the next image: go back to Step 4, and at the end of each loop return to Step 3 and Step 2; if no image matches, go to Step 6.
Step 6: Print "Not Matched" and stop.
Step 7: Show the mapped output in the GUI and stop.
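A compact sketch of the training and matching procedure of Sections 5 and 7, assuming grayscale images are available as NumPy arrays; M = 9 follows the description above, and the helper names are illustrative rather than the authors' code.

import numpy as np

def feature_vector(image, window=9):
    """256-bin gray-level histogram, summarized by the mean of every
    `window` consecutive frequencies (M = 9 above)."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    means = [hist[i:i + window].mean() for i in range(0, 256, window)]
    return np.array(means)

def recognize(test_image, training_features):
    """Return the label whose stored feature vector has the smallest
    absolute difference from the test image's feature vector."""
    test_vec = feature_vector(test_image)
    diffs = {label: np.abs(vec - test_vec).sum()
             for label, vec in training_features.items()}
    return min(diffs, key=diffs.get)

# training_features maps each person's label to feature_vector(face_image)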

8. Conclusion
In this paper the histogram technique is used for face recognition. Since different faces have different facial features, we have developed a multiple colour histogram technique. The LDA and PCA algorithms can also be used to further improve the results of the face recognition process, including for video sequences.

9. References
[1] A. M. Martinez and A. C. Kak, "PCA versus LDA," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228-233, 2001.
[2] A. H. Boualleg, Ch. Bencheriet and H. Tebbikh, "Automatic Face recognition using neural network-PCA," Information and Communication Technologies (ICTTA '06), vol. 1, 24-28 April 2006.
[3] Byung-Joo Oh, "Face recognition by using neural network classifiers based on PCA and LDA," IEEE International Conference on Systems, Man & Cybernetics, 2005.
[4] Francis Galton, "Personal identification and description," Nature, pp. 173-177, June 21, 1888.
[5] W. Zhao, "Robust image based 3D face recognition," Ph.D. Thesis, Maryland University, 1999.
[6] R. Chellappa, C. L. Wilson and S. Sirohey, "Human and machine recognition of faces: A survey," Proc. IEEE, vol. 83, pp. 705-741, May 1995.
[7] T. Riklin-Raviv and A. Shashua, "The Quotient image: Class based recognition and synthesis under varying illumination conditions," in CVPR, part II, pp. 566-571, 1999.
[8] G. J. Edwards, T. F. Cootes and C. J. Taylor, "Face recognition using active appearance models," in ECCV, 1998.
[9] Jianzhong Fang and Guoping Qiu, "A Colour Histogram Based Approach to Human Face Detection," School of Computer Science, The University of Nottingham.

An Application of Eigen Vector in Back Propagation Neural Network for Face expression Identification
1 Ahsan Hussain, M.Tech (Scholar), R.K.D.F.I.S.T Bhopal (M.P) 2 Prof.Shrikant Lade, I.T.Department, R.K.D.F.I.S.T Bhopal (M.P)
Abstract: This paper presents a methodology for identifying the facial expression of a human being based on the information theory approach of coding. The task consists of two major phases: 1) identifying the maximum-matching face from a database, and 2) extracting the facial expression from the matched image. Phase one consists of feature extraction using Principal Component Analysis and face recognition using a feed-forward back-propagation neural network. We have tested this approach on 1000 to 10000 images, including both colour and grayscale images of the same and of different human faces, and obtained 80 to 90% accurate results.
Index Terms: Face recognition, Principal Component Analysis (PCA), Artificial Neural Network (ANN), eigenvector, eigenface, expression evaluation.

I INTRODUCTION
The face is a primary focus of attention for human beings. A human being observes thousands of faces from birth, and the ability to identify a face anywhere is really remarkable. Yet even though humans can identify faces, it can be quite difficult to identify an expression from a face. The identification skill is quite robust, despite large changes in the visual stimulus due to viewing conditions, expression, aging, and distractions such as glasses, beards or changes in hair style. Developing a computational model of face expression identification is quite difficult, because faces are complex, multi-dimensional visual stimuli. For face expression identification, the starting step involves extraction of the relevant features from facial images.

II RELATED WORK
The facial expression identification task consists of 1) face recognition from a database and 2) extraction of the facial expression of the maximum-matching image. A face recognition task can follow two basic methods. The first method is based on extracting feature vectors from the basic parts of a face, such as the eyes, nose, mouth and chin, with the help of

deformable templates and extensive mathematics. Key information from the basic parts of the face is then gathered and converted into a feature vector. The other method is based on information theory concepts, namely the principal component analysis method, in which the information that best describes a face is derived from the entire face image. Based on the Karhunen-Loeve expansion in pattern recognition, Kirby and Sirovich [5], [6] have shown that any particular face can be represented in terms of a best coordinate system termed "eigenfaces"; these are the eigenfunctions of the average covariance of the ensemble of faces. Later, Turk and Pentland [7] proposed a face recognition method based on the eigenfaces approach. An unsupervised pattern recognition scheme that is independent of excessive geometry and computation is proposed in this paper. The recognition system is implemented based on eigenfaces, PCA and an ANN. Principal component analysis for face recognition is based on the information theory approach, in which the relevant information in a face image is extracted as efficiently as possible; an artificial neural network is then used for classification, because of its ability to learn from observed data.

III PROPOSED TECHNIQUE
The proposed technique is the coding and decoding of face images, emphasizing the significant local and global features. In the language of information theory, the relevant information in a face image is extracted, encoded and then compared with a database of models. The face recognition system is as follows:

Let the training set of face images be $\Gamma_1, \Gamma_2, \Gamma_3, \ldots, \Gamma_M$. The average face of the set is defined by

$$\Psi = \frac{1}{M}\sum_{n=1}^{M}\Gamma_n \qquad (1)$$

Each face differs from the average by the vector

$$\Phi_i = \Gamma_i - \Psi \qquad (2)$$

A) Calculating Eigen Values for the Training Data Set Images
The face library entries are normalized. Eigenfaces are calculated from the training set and stored. An individual face can be represented exactly in terms of a linear combination of eigenfaces. The face can also be approximated using only the best M eigenfaces, which have the largest eigenvalues and account for the most variance within the set of face images. The best M eigenfaces span an M-dimensional subspace, called the "face space", of all possible images. The PCA algorithm is used to calculate the eigenfaces. Let a face image I(x, y) be a two-dimensional N x N array. An image may also be considered as a vector of dimension N^2, so that a typical image of size 92 x 112 becomes a vector of dimension 10,304, or equivalently a point in 10,304-dimensional space. An ensemble of images then maps to a collection of points in this huge space. Images of faces, being similar in overall configuration, will not be randomly distributed in this huge image space and thus can be described by a relatively low-dimensional subspace. The main idea of principal component analysis is to find the vectors that best account for the distribution of face images within the entire image space. The same representation is used when calculating eigenvalues for both the training and the testing set images.

Fig 1: Block diagram of the face recognition system (Start → Face Acquisition → Pre-Processing → Face Library Formation → Training Images / Testing Images → Eigenface Formation → Select Eigenfaces → Face Descriptors of training images; for a test image: Select Image → New Face Descriptor)

Fig 2: Training Set Images

An example training set is shown in Figure 2, with the average face $\Psi$. This set of very large vectors is then subjected to principal component analysis, which seeks a set of M orthonormal vectors $u_n$ that best describe the distribution of the data. The k-th vector, $u_k$, is chosen such that

$$\lambda_k = \frac{1}{M}\sum_{n=1}^{M}\left(u_k^{T}\Phi_n\right)^2 \qquad (3)$$

is a maximum, subject to

$$u_l^{T}u_k = \delta_{lk} \qquad (4)$$

The vectors $u_k$ and scalars $\lambda_k$ are the eigenvectors and eigenvalues, respectively, of the covariance matrix

$$C = \frac{1}{M}\sum_{n=1}^{M}\Phi_n\Phi_n^{T} = AA^{T} \qquad (5)$$

where the matrix $A = [\Phi_1\ \Phi_2\ \ldots\ \Phi_M]$. The covariance matrix C, however, is an $N^2 \times N^2$ real symmetric matrix, and determining the $N^2$ eigenvectors and eigenvalues is an intractable task for typical image sizes. We need a computationally feasible method to find these eigenvectors. If the number of data points in the image space is less than the dimension of the space ($M < N^2$), there will be only $M-1$, rather than $N^2$, meaningful eigenvectors; the remaining eigenvectors will have associated eigenvalues of zero. We can solve for the $N^2$-dimensional eigenvectors in this case by first solving for the eigenvectors of an $M \times M$ matrix (for example a 16 x 16 matrix rather than a 10,304 x 10,304 matrix) and then taking appropriate linear combinations of the face images $\Phi_i$. Consider the eigenvectors $v_i$ of $A^{T}A$ such that

$$A^{T}Av_i = \mu_i v_i \qquad (6)$$

Premultiplying both sides by A, we have $AA^{T}(Av_i) = \mu_i(Av_i)$, from which we see that $Av_i$ are the eigenvectors of $C = AA^{T}$. Following this analysis, we construct the $M \times M$ matrix $L = A^{T}A$, where $L_{mn} = \Phi_m^{T}\Phi_n$, and find the M eigenvectors $v_l$ of L. These vectors determine linear combinations of the M training-set face images to form the eigenfaces $u_l$:

$$u_l = \sum_{k=1}^{M} v_{lk}\Phi_k, \qquad l = 1, \ldots, M \qquad (7)$$
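The construction above can be sketched in a few lines of NumPy. This is only an illustrative sketch of the standard eigenface procedure the paper follows, not the authors' implementation; array shapes and function names are assumptions.

    import numpy as np

    def compute_eigenfaces(faces, num_components=None):
        # faces: array of shape (M, N*N), M training images flattened to vectors
        # (e.g. 92 x 112 images become vectors of length 10,304).
        M = faces.shape[0]
        psi = faces.mean(axis=0)              # average face, eq. (1)
        A = (faces - psi).T                   # columns are Phi_i = Gamma_i - Psi, eq. (2)

        L = A.T @ A / M                       # small M x M matrix instead of N^2 x N^2
        eigvals, V = np.linalg.eigh(L)        # eigenvectors v_i of A^T A
        order = np.argsort(eigvals)[::-1]     # rank by decreasing eigenvalue
        V = V[:, order]

        U = A @ V                             # A v_i are eigenvectors of C = A A^T, eq. (7)
        U /= np.linalg.norm(U, axis=0)        # normalise each eigenface
        if num_components is not None:
            U = U[:, :num_components]
        return psi, U

    def project(face, psi, U):
        # Weights of a face in the face space; these form the face descriptor.
        return U.T @ (face - psi)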

A) EXPERIMENTAL RESULT

Fig 3: Testing Set Images

With the eigenfaces constructed in this way, the calculations become quite manageable. The associated eigenvalues allow us to rank the eigenvectors according to their usefulness in characterizing the variation among the images. The process implemented for image recognition and expression identification is shown in the following figure.


Fig 4: Result showing output of the project

Result Analysis

1) Result of Artificial Neural Network trained over Images of Same Person.
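The paper's network was built with a feed-forward back-propagation architecture (the MATLAB Neural Network Toolbox [4]). Purely as an illustrative stand-in, the sketch below shows how such a classifier might be trained on the eigenface weight vectors using scikit-learn; the layer size and training parameters are assumptions, not the authors' configuration.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def train_expression_classifier(descriptors, labels):
        # descriptors: eigenface weight vectors from project(); labels: person/expression ids
        net = MLPClassifier(hidden_layer_sizes=(32,), activation='logistic',
                            solver='adam', max_iter=2000, random_state=0)
        net.fit(np.asarray(descriptors), np.asarray(labels))
        return net

    def evaluate(net, test_descriptors, test_labels):
        # fraction of correctly classified test images
        return net.score(np.asarray(test_descriptors), np.asarray(test_labels))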

IV CONCLUSION
This paper presents a new technique for face expression identification. The overall results shown in the result section indicate that good accuracy can be achieved when the neural network is trained over several images of the same person. Table 1 shows that the accuracy rate is higher when the network is trained over images of the same person than when it is trained over images of different persons. Finally, we compared our results with the previous technique and obtained better results than the previous one.

V REFERENCES
[1] Yuille, A. L., Cohen, D. S., and Hallinan, P. W., "Feature extraction from faces using deformable templates", Proc. of CVPR, 1989.
[2] S. Makdee, C. Kimpan, S. Pansang, "Invariant range image multi-pose face recognition using fuzzy ant algorithm and membership matching score", Proceedings of the 2007 IEEE International Symposium on Signal Processing and Information Technology, 2007, pp. 252-256.
[3] Victor-Emil and Luliana-Florentina, "Face Recognition using a fuzzy Gaussian ANN", IEEE 2002 Proceedings, Aug. 2002, pp. 361-368.
[4] Howard Demuth, Mark Beale, Martin Hagan, Neural Network Toolbox.
[5] Kirby, M., and Sirovich, L., "Application of the Karhunen-Loeve procedure for the characterization of human faces", IEEE PAMI, Vol. 12, pp. 103-108, 1990.
[6] Sirovich, L., and Kirby, M., "Low-dimensional procedure for the characterization of human faces", J. Opt. Soc. Am. A, 4, 3, pp. 519-524, 1987.
[7] Turk, M., and Pentland, A., "Eigenfaces for recognition", Journal of Cognitive Neuroscience, Vol. 3, pp. 71-86, 1991.
[8] S. Gong, S. J. McKenna, and A. Psarrou, Dynamic Vision, Imperial College Press, London, 2000.
[9] Manjunath, B. S., Chellappa, R., and Malsburg, C., "A feature based approach to face recognition", Trans. of IEEE, pp. 373-378, 1992.

Next Generation Cloud Computing Architecture


Ahmad Talha Siddiqui Shahla Tarannum Tehseen Fatma
M.Tech (CS), Deptt. of Computer Science, Jamia Hamdard University
ahmadtalha2007@gmail.com, shahlatarannum@gmail.com, asad249@yahoo.co.in

Abstract- Cloud computing is fundamentally altering the expectations for how and when computing, storage and networking resources should be allocated, managed and consumed. End-users are increasingly sensitive to the latency of services they consume. Service developers want the service providers to ensure or provide the capability to dynamically allocate and manage resources in response to changing demand patterns in real-time. Ultimately, service providers are under pressure to architect their infrastructure to enable real-time end-to-end visibility and dynamic resource management with fine-grained control to reduce total cost of ownership while also improving agility. What is needed is a rethinking of the underlying operating system and management infrastructure to accommodate the ongoing transformation of the data center from the traditional server-centric architecture model to a cloud or network-centric model. This paper proposes and describes a reference model for a network-centric data center infrastructure management stack that borrows and applies key concepts that have enabled dynamism, scalability, reliability and security in the telecom industry to the computing industry.

Keywords- Cloud Computing, Datacenter, Distributed Computing, Virtualization

1. INTRODUCTION
Everyone has an opinion on what is cloud computing. It can be the ability to rent a server or a thousand servers and run a geophysical modelling application on the most powerful systems available anywhere. The unpredictable demands of the Web 2.0 era in combination with the desire to better utilize IT resources are driving the need for a more dynamic IT infrastructure that can respond to rapidly changing requirements in real-time. This need for real-time dynamism is about to fundamentally alter the data center landscape and transform the IT infrastructure as we know it [1]. The computer in the cloud ideally comprises a pool of physical compute resources, i.e. processors, memory, network bandwidth and storage, potentially distributed physically across server and geographical boundaries, which can be organized on demand into a dynamic logical entity, i.e. a cloud computer, that can grow or shrink in real-time in order to assure the desired levels of latency sensitivity, performance, scalability, reliability and security to any application that runs in it. At a fundamental level, virtualization technology enables the abstraction or decoupling of the application payload from the underlying physical resource [2]. What this typically means is that the physical resource can then be carved up into logical or virtual resources as needed. This is known as provisioning.

2. The on-demand, self-service, pay-by-use model
The on-demand, self-service, pay-by-use nature of cloud computing is also an extension of established trends. From an enterprise perspective, the on-demand nature of cloud computing helps to support the performance and capacity aspects of service-level objectives. The self-service nature of cloud computing allows organizations to create elastic environments that expand and contract based on the workload and target performance parameters. And the pay-by-use nature of cloud computing may take the form of equipment leases that guarantee a minimum level of service from a cloud provider. Virtualization is a key feature of this model. IT organizations have understood for years that virtualization allows them to quickly and easily create copies of existing environments, sometimes involving multiple virtual machines, to support test, development, and staging activities. The cost of these environments is minimal because they can coexist on the same servers as production environments because they use few resources. This lightweight deployment model has already led to a Darwinistic [3] approach to business development

where beta versions of software are made public and the market decides which applications deserve to be scaled and developed further or quietly retired. Cloud computing extends this trend through automation. Instead of negotiating with an IT organization for resources on which to deploy an application, a compute cloud is a self-service proposition where a credit card can purchase compute cycles, and a Web interface or API is used to create virtual machines and establish network relationships between them. Instead of requiring a long-term contract for services with an IT organization or a service provider, clouds work on a pay-by-use, or pay-by-the-sip, model where an application may exist to run a job for a few minutes or hours, or it may exist to provide services to customers on a long-term basis.

2.1. Cloud computing infrastructure models
There are many considerations for cloud computing architects to make when moving from a standard enterprise application deployment model to one based on cloud computing. There are public and private clouds that offer complementary benefits, there are three basic service models to consider, and there is the value of open APIs versus proprietary ones.

2.2. Architectural layers of cloud computing
The term cloud computing is an inclusive one: cloud computing can describe services being provided at any of the traditional layers from hardware to applications. In practice, cloud service providers tend to offer services that can be grouped into three categories: software as a service, platform as a service, and infrastructure as a service.

3. Software as a service (SaaS)
Software as a service features a complete application offered as a service on demand. A single instance of the software runs on the cloud and services multiple end users or client organizations. The most widely known example of SaaS is salesforce.com, though many other examples have come to market, including the Google Apps offering of basic business services including email and word processing. Although salesforce.com preceded the definition of cloud computing by a few years, it now operates by leveraging its companion force.com, which can be defined as a platform as a service.

4. Platform as a service (PaaS)
Platform as a service encapsulates a layer of software and provides it as a service that can be used to build higher-level services. There are at least two perspectives on PaaS, depending on whether one is the producer or the consumer of the services:
4.1. Someone producing PaaS might produce a platform by integrating an OS, middleware, application software, and even a development environment that is then

provided to a customer as a service. For example, someone developing a PaaS offering might base it on a set of Sun xVM hypervisor virtual machines that include a NetBeans integrated development environment, a Sun GlassFish Web stack and support for additional programming languages such as Perl or Ruby.
4.2. Someone using PaaS would see an encapsulated service that is presented to them through an API. The customer interacts with the platform through the API, and the platform does what is necessary to manage and scale itself to provide a given level of service. Virtual appliances can be classified as instances of PaaS. A content switch appliance, for example, would have all of its component software hidden from the customer, and only an API or GUI for configuring and deploying the service provided to them. Commercial examples of PaaS include the Google App Engine, which serves applications on Google's infrastructure. PaaS services such as these can provide a powerful basis on which to deploy applications; however, they may be constrained by the capabilities that the cloud provider chooses to deliver.

5. Infrastructure as a service (IaaS)
Infrastructure as a service delivers basic storage and compute capabilities as standardized services over the network. Servers, storage systems, switches, routers, and other systems are pooled and made available to handle workloads that range from application components to high-performance computing applications. Commercial examples of IaaS include Joyent, whose main product is a line of virtualized servers that provide a highly available, on-demand infrastructure.

6. Server Operating Systems and Virtualization
Whereas networks and storage resources, thanks to advances in network services management and SANs, have already been capable of being virtualized for a while, only now, with the wider adoption of server virtualization, do we have the complete basic foundation for cloud computing, i.e. all computing resources can now be virtualized. With server virtualization, we now have the ability to create complete logical (virtual) servers that are independent of the underlying physical infrastructure or their physical location. All of this has helped to radically transform the cost structure and efficiency of the data center [4]. Despite the numerous benefits that virtualization has enabled, we are yet to realize

the full potential of virtualization in terms of cloud computing. This is because:
6.1. Traditional server-centric operating systems were not designed to manage shared distributed resources: The cloud computing paradigm is all about optimally sharing a set of distributed computing resources, whereas the server-centric computing paradigm is about dedicating resources to a particular application. The server-centric paradigm of computing inherently ties the application to the server. The job of the server operating system is to dedicate and ensure availability of all available computing resources on the server to the application.
6.2. Current hypervisors do not provide adequate separation between application management and physical resource management: Today's hypervisors have just interposed themselves one level down below the operating system to enable multiple virtual servers to be hosted on one physical server [5, 6]. While this is great for consolidation, once again there is no way for applications to manage how, what and when resources are allocated to themselves without having to worry about the management of physical resources.
6.3. Server virtualization does not yet enable sharing of distributed resources: Server virtualization presently allows a single physical server to be organized into multiple logical servers. However, there is no way, for example, to create a logical or virtual server from resources that may be physically located in separate servers. It is true that, by virtue of the live migration capabilities that server virtualization technology enables, we are able to move application workloads from one physical server to another, potentially even geographically distant, physical server.

7. Storage Networks & Virtualization
Before the proliferation of server virtualization, storage networking and storage virtualization enabled many improvements in the data center. The key driver was the introduction of the Fibre Channel (FC) protocol and Fibre Channel-based Storage Area Networks (SAN), which provided high speed storage connectivity and specialized storage solutions to enable such benefits as server-less backup, point to point replication, HA/DR and performance optimization outside of the servers that run applications. However, these benefits have come with increased management complexity and costs [7].

8. Network Virtualization
The virtual networks now implemented inside the physical server to switch between all the virtual servers provide an alternative to the multiplexed, multi-pathed network channels by trunking them

directly to WAN transport, thereby simplifying the physical network infrastructure.

Systems Management Infrastructure
Present day management systems are not cut out to enable the real-time dynamic infrastructure needed for cloud computing [8].
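To make the requirement concrete, the toy sketch below models the kind of mediation layer the paper argues for: logical servers grown on demand from a distributed pool of physical capacity. All class and field names are hypothetical illustrations, not part of any existing management product.

    from dataclasses import dataclass, field

    @dataclass
    class PhysicalHost:
        name: str
        free_cpus: int
        free_mem_gb: int

    @dataclass
    class LogicalServer:
        name: str
        allocations: list = field(default_factory=list)  # (host, cpus, mem_gb) slices

    class ResourceMediator:
        # Grows a logical server from whatever physical capacity is available,
        # possibly spanning several hosts.
        def __init__(self, hosts):
            self.hosts = hosts

        def grow(self, server, cpus, mem_gb):
            # Carve the requested capacity out of one or more hosts.
            for host in self.hosts:
                if cpus == 0 and mem_gb == 0:
                    break
                c = min(cpus, host.free_cpus)
                m = min(mem_gb, host.free_mem_gb)
                if c or m:
                    host.free_cpus -= c
                    host.free_mem_gb -= m
                    server.allocations.append((host.name, c, m))
                    cpus -= c
                    mem_gb -= m
            if cpus or mem_gb:
                raise RuntimeError("not enough physical capacity to meet the request")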

Proposed Reference Architecture Model
If we distil the observations from the previous sections, a couple of key themes emerge: the next generation architecture for cloud computing must completely decouple physical resource management from virtual resource management, and it must provide the capability to mediate between applications and resources in real-time.
Distributed Services Assurance Platform: This layer will allow for creation of FCAPS-managed virtual servers that load and host the desired choice of OS to allow the loading and execution of applications. Since the virtual servers implement FCAPS management, they can provide automated mediation services to natively ensure fault management and reliability (HA/DR), performance optimization, accounting and security.
Distributed Services Delivery Platform: This is essentially a workflow engine that executes the application which, as we described in the previous section, is ideally composed as a business workflow that orchestrates a number of distributable workflow elements. This defines the services "dial tone" in our reference architecture model.
Distributed Services Creation Platform: This layer provides the tools that developers will use to create applications defined as collections of services which can be composed, decomposed and distributed on the fly to virtual servers that are automatically created and managed by the distributed services assurance platform.
Legacy Integration Services Mediation: This is a layer that provides integration and support for existing or legacy applications in our reference architecture model.

Deployment of the Reference Model
Any generic cloud service platform must address the needs of four categories of stakeholders: (1) infrastructure providers, (2) service providers, (3) service developers, and (4) end users. Below we describe how the reference model will affect, benefit and be deployed by each of these stakeholders.
1. Infrastructure providers: These are vendors who provide the underlying computing, network and storage resources that can be

carved up into logical cloud computers which will be dynamically controlled to deliver massively scalable and globally interoperable service network infrastructure. The infrastructure will be used by both service creators who develop the services and also the end users who utilize these services. 2. Service providers: With the deployment of our new reference architecture, service providers will be able to assure both service developers and service users that resources will be available on demand. They will be able to effectively measure and meter resource utilization end-to-end usage to enable a dial-tone for computing service while managing Service Levels to meet the availability, performance and security requirements for each service. The service provider will now manage the applications connection to computing, network and storage resource with appropriate SLAs. 3. Service Developers: They will be able to develop cloud based services using the management services API to configure, monitor and manage service resource allocation, availability, utilization, performance and security of their applications in real-time. Service management and service delivery will now be integrated into application development to allow application developers to be able to specify run time SLAs. 4. End Users: Their demand for choice, mobility and interactivity with intuitive user interfaces will continue to grow. The managed resources in our reference architecture will now not only allow the service developers to create and deliver services using logical servers that end users can dynamically provision in real-time to respond to changing demands, but also provide service providers the capability to charge the end-user by metering exact resource usage for the desired SLA. CONCLUSION

In this paper, we have described the requirements for implementing a truly dynamic cloud computing infrastructure. Such an infrastructure comprises a pool of physical computing resources i.e. processors, memory, network bandwidth and storage, potentially distributed physically across server and geographical boundaries which can be

organized on demand into a dynamic logical entity i.e. cloud computer, that can grow or shrink in real-time in order to assure the desired levels of latency sensitivity, performance, scalability, reliability and security to any application that runs in it. We identified some key areas of deficiency with current virtualization and management technologies. In particular we detailed the importance of separating physical resource management from virtual resource management and why current operating systems and hypervisors which were born of the server-computing era, are not designed and hence ill suited to provide this capability for the distributed shared resources typical of cloud deployment. We also highlighted the need for FCAPS-based (Fault, Configuration, Accounting, Performance and Security) service mediation to provide global management functionality for all networked physical resources that comprise a cloud irrespective of their distribution across many physical servers in different geographical locations. We then proposed a reference architecture model for a distributed cloud computing mediation (management) platform which will form the basis for enabling next generation cloud computing infrastructure. We showed how this infrastructure will affect as well as benefit key stakeholders such as the Infrastructure providers, service providers, service developers and end-users. We believe that what this paper has described is significantly different from most current cloud computing solutions that are nothing more than hosted infrastructure or applications accessed over the Internet. The proposed architecture described in this paper will dramatically change the current landscape by enabling cloud computing service providers to provide a next generation infrastructure platform which will offer service developers and end-users unprecedented control and dynamism in real-time to help assure SLAs for service latency, availability, performance and security.

REFERENCES
[1] Rao Mikkilineni, Vijay Sarathy, "Cloud Computing and Lessons from the Past", Proceedings of IEEE WETICE 2009, First International Workshop on Collaboration & Cloud Computing, June 2009.
[2] Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, James Broberg, and Ivona Brandic, "Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility", Future Generation Computer Systems, Volume 25, Issue 6, June 2009, pp. 599-616.
[3] "Introduction to Cloud Computing Architecture", White Paper, 1st Edition, June 2009.
[4] Adriana Barnoschi, "Backup and Disaster Recovery for Modern Enterprise", 5th International Scientific Conference, Business and Management 2008, Vilnius, Lithuania.
[5] Jason A. Kappel, Anthony T. Velte, Toby J. Welte, Microsoft Virtualization with Hyper-V, McGraw-Hill, New York, 2009.
[6] David Chisnall, The Definitive Guide to the Xen Hypervisor, First Edition, Prentice Hall Press, NJ, 2009.
[7] Gartner's 2008 Data Center Conference Instant Polling Results: Virtualization Summary, March 2, 2009.
[8] Graham Chen, Qinzheng Kong, Jason Etheridge and Paul Foster, "Integrated TMN Service Management", Journal of Network and Systems Management, Springer New York, Volume 7, 1999, pp. 469-493.

Virtualization Of Operating System using Xen Technology


Annu Dhankhar, Faculty of BMIET, annudhankhar13@gmail.com
Siddharth Rana, System Engineer in TCS, siddharth3013@gmail.com

Abstract- Virtualization enables installation and running of multiple virtual machines on the same computer system. The operating system that communicates directly with the hardware is known as the host operating system, whereas virtual operating systems have all the features of a real operating system but run inside the host operating system. A virtual machine is separated from the computer hardware resources and runs on emulated hardware. Performance of a virtual operating system running on the same computer system hardware depends on the performance of the host operating system. Virtualization can enhance system flexibility by enabling the concurrent execution of an application operating system and a real-time operating system on the same processor, which means we can run more than one operating system simultaneously. In this paper, we use Linux as the host operating system that provides virtualization using Xen technology. The Xen hypervisor is a small, lightweight software virtual machine monitor (VMM) for x86-compatible computers. The Xen hypervisor securely executes multiple virtual machines on one physical system.

Keywords- virtualization, VMM, Xen hypervisor.

1. Introduction

Virtualization technology provides an approach to run multiple operating systems concurrently on a single host to make full use of the ample physical resources such as CPU, I/O, memory and storage. Xen [1], VMware [2], Virtual PC [3], KVM [4] and Denali [5] are among the popular virtualization techniques, which are used in both academia and industry and applied to many areas. Each operating system environment is customized and encapsulated into a virtual machine (VM), thus creating the illusion of multiple VMs running on one real hardware platform. Users can easily manage VMs like normal files: create, copy, save, rollback, and migrate them. The virtual machine monitor (VMM) [6] plays the most important role in virtualization technology. The VMM inserts a software layer between the operating systems and the physical machine resources, and it acts as the complete controller of the real machine resources. As a monitor, the VMM can intercept and capture the operating system calls for infrastructure resources and thus control the distribution of hardware resources. Here we explain the virtualization approach using the Xen virtualization technique.

2. Operating System Virtualization


Operating system virtualization is becoming more popular for servers enabling one machine to easily host multiple operating systems (figure 1). All modern operating systems now support virtualization, the ability to run a guest operating system within the operating system. There are several different forms of operating system virtualization, but the most interesting in terms of performance is paravirtualization. Paravirtualization requires the guest operating system to be modified so that the host



operating system can gracefully manage the guest operating systems. With modern processors adding hardware support for virtualization, the modifications required for paravirtualization have reduced significantly. 2.1 why virtualize With increased server provisioning in the datacenter, several factors play a role in stifling growth. Increased power and cooling costs, physical space constraints, man power and interconnection complexity all contribute significantly to the cost and feasibility of continued expansion.Commodity hardware manufacturers have begun to address some of these concerns by shifting their design goals. Rather than focus solely on raw gigahertz performance,manufacturers have enhanced the feature sets of CPUs and chip sets to include lower wattage CPUs, multiple cores per CPU die, advanced power management, and a range of virtualization features. By employing appropriate software to enable these features,[7] several advantages are realized: a.Server Consolidation: By combining workloads from a number of physical hosts into a single host, a reduction in servers can be achieved as well as a corresponding decrease in interconnect hardware. Traditionally, these workloads would need to be specially crafted, partially isolated and well behaved, but with new virtualization techniques none of these requirements are necessary. b.Reduction of Complexity: Infrastructure costs are massively reduced by removing the need for physical hardware, and networking. Instead of having a large number of physical computers, all networked together, consuming power and administration costs, fewer computers can be used to achieve the same goal.Administration and physical setup is less time consuming and costly. c.Isolation: Virtual machines run in sand-boxed environments. They cannot access each other,

so if one virtual machine performs poorly, or crashes, it does not affect any other virtual machine. d.Platform Uniformity: In a virtualized environment, a broad, heterogeneous array of hardware components is distilled into a uniform set of virtual devices presented to each guest operating system. This reduces the impact across the IT organization: from support, to documentation, to tools engineering. e.Legacy Support: With traditional bare-metal operating system installations, when the hardware vendor replaces a component of a system, the operating system vendor is required to make a corresponding change to enable the new hardware (for example, an ethernet card). As an operating system ages, the operating system vendor may no longer provide hardware enabling updates. In a virtualized operating system, the hardware remains constant for as long as the virtual environment is in place, regardless of any changes occurring in the real hardware, including full replacement.

3. Domains, Guests and Virtual Machines


The terms domain, guest and virtual machine are often used interchangeably, but they have subtle differences. A domain is a configurable set of resources, including memory, virtual CPUs, network devices and disk devices, in which virtual machines run. A domain is granted virtual resources and can be started, stopped and rebooted independently. A guest is a virtualized operating system running within a domain. A guest operating system may be paravirtualized or hardware virtualized. Multiple guests can run on the same Oracle VM Server. A virtual machine is a guest operating system and its associated application software.Xen supports running two different types of guests. Xen guests are often called as domUs (unprivileged domains). Both guest types (PV, HVM) can be used at the same time on a single Xen system.[8]

3.1 Xen Paravirtualization (PV)
Paravirtualization is an efficient and lightweight virtualization technique introduced by Xen. Paravirtualization doesn't require virtualization extensions from the host CPU. However, paravirtualized guests require a special kernel that is ported to run natively on Xen, so the guests are aware of the hypervisor and can run efficiently without emulation or virtual emulated hardware.
3.2 Xen Full Virtualization (HVM)
Fully virtualized (Hardware Virtual Machine) guests require CPU virtualization extensions from the host CPU. Xen uses a modified version of QEMU to emulate full PC hardware, including BIOS, IDE disk controller, VGA graphics adapter, USB controller, network adapter, etc. for HVM guests. Fully virtualized guests don't require a special kernel, so for example Windows operating systems can be used as Xen HVM guests. Fully virtualized guests are usually slower than paravirtualized guests because of the required emulation.
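The practical difference between the two guest types shows up in the domain configuration. The snippets below show what typical PV and HVM configuration files might look like for the xm/xl toolstack; names, paths and devices are placeholders, and exact option names can vary between Xen releases.

    # pv-guest.cfg -- paravirtualized guest (boots a Xen-aware kernel)
    name    = "pv-guest"
    memory  = 512
    vcpus   = 1
    kernel  = "/boot/vmlinuz-xen"                 # placeholder path
    ramdisk = "/boot/initrd-xen.img"              # placeholder path
    disk    = ['phy:/dev/vg0/pv-guest,xvda,w']
    vif     = ['bridge=xenbr0']
    root    = "/dev/xvda1 ro"

    # hvm-guest.cfg -- fully virtualized guest (unmodified OS on emulated hardware)
    name    = "hvm-guest"
    builder = "hvm"
    memory  = 1024
    vcpus   = 2
    disk    = ['phy:/dev/vg0/hvm-guest,hda,w', 'file:/isos/os.iso,hdc:cdrom,r']
    vif     = ['type=ioemu, bridge=xenbr0']
    boot    = "dc"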

4 Xen Technology
The Xen hypervisor is a small, lightweight software virtual machine monitor for x86-compatible computers. The Xen hypervisor securely executes multiple virtual machines on one physical system. Each virtual machine has its own guest operating system with almost native performance. The Xen hypervisor was originally created by researchers at Cambridge University, and derived from work done on the Linux kernel.[7]
4.1 What is Xen
The Xen hypervisor, the powerful open source industry standard for virtualization, offers a powerful, efficient, and secure feature set for virtualization of x86, x86_64, IA64, ARM, and other CPU architectures. It supports a wide range of guest operating systems including Windows, Linux, Solaris, and various versions of the BSD operating systems.[9]

Source: http://www.xen.org/files/Marketing/WhatisXen.pdf


4.2 Why Xen
Here I address the benefits of the Xen hypervisor and why they matter in selecting a hypervisor. The following points make Xen preferable to other hypervisors:
a. Xen has a thin hypervisor model
b. No device drivers in the hypervisor, and domains/guests are kept isolated
c. 2 MB executable
d. Relies on service domains for functionality
4.3 What is the Xen Hypervisor

The Xen hypervisor is a layer of software running directly on computer hardware, replacing the operating system and thereby allowing the computer hardware to run multiple guest operating systems concurrently. Support for x86, x86-64, Itanium, Power PC, and ARM processors allows the Xen hypervisor to run on a wide variety of computing devices, and it currently supports Linux, NetBSD, FreeBSD, Solaris, Windows, and other common operating systems as guests running on the hypervisor.[9] A computer running the Xen hypervisor contains three components:
a. Xen Hypervisor
b. Domain 0, the Privileged Domain (Dom0): a privileged guest running on the hypervisor with direct hardware access and guest management responsibilities
c. Multiple DomainU, Unprivileged Domain Guests (DomU): unprivileged guests running on the hypervisor; they have no direct access to hardware (e.g. memory, disk, etc.)
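Day-to-day management of these components is done from Dom0 with the Xen toolstack. The commands below are a brief, illustrative session only; the guest name and config path are placeholders, and older releases use xm in place of xl.

    xl list                             # show the running domains, including Dom0
    xl create /etc/xen/pv-guest.cfg     # build and start an unprivileged guest (DomU)
    xl console pv-guest                 # attach to the guest's virtual console
    xl mem-set pv-guest 768             # adjust the guest's memory allocation at run time
    xl shutdown pv-guest                # cleanly stop the guest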

5. Conclusion
In general, traditional methods mainly focus on building high-assurance execution environments on top of operating systems, for example by introducing authentication mechanisms and encryption algorithms to do intrusion detection. Such an approach not only needs to modify system architectures, but also cannot detect some kernel rootkits and has more system overhead. This paper presents a mechanism to monitor and control the running behaviour of a guest operating system (GOS) from the hypervisor, and gives a prototype based on Xen. Xen supports running two different types of guests, often called domUs (unprivileged domains); both guest types (PV, HVM) can be used at the same time on a single Xen system.

6.References
[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization", in Proc. ACM Symposium on Operating Systems Principles, pp. 164-177, 2003.
[2] VMware: Virtualization Software, http://www.vmware.com/, 2008.
[3] Microsoft Virtual PC 2007, http://www.microsoft.com/windows.virtualpc/default.mspx, 2007.
[4] Kernel-based Virtual Machine, http://kvm.qumranet.com, 2009.
[5] A. Whitaker, M. Shaw, and S. Gribble, "Lightweight virtual machines for distributed and networked applications", in Proc. of the USENIX Technical Conference, Monterey, 2002.
[6] R. Figueiredo, P. Dinda, and J. Fortes, "Resource virtualization renaissance", IEEE Computer, vol. 38, no. 5, pp. 28-31, 2005.
[7] Oracle VM Server User's Guide, Release 2.1.1, written by Alison Holloway, Kurt Hackel, Herbert van den Bergh.
[8] http://wiki.xensource.com/xenwiki/XenOverview


[9] http://www.xen.org/files/Marketing/WhatisXen.pdf

Quality Metrics For TTCN-3 And Mobile-Web Applications


Anu Saxena#1, Kapil Saxena*2
1

Diptt of Computer Science, 2 Diptt of Computer Science SRMSCET Bareilly(U.P.),Gautaum Buddha Technical University Lucknow(U.P.)INDIA 2 KCMT BAREILLY(U.P.)INDIA
anu140686@gmail.com kapilsaxena27@gmail.com

Abstract - A Web-based application is essentially a client-server system, which combines traditional logic and functionality, usually server based. This paper has been designed to predict Web metrics for evaluating the efficiency and maintainability of hyperdocuments in terms of the Testing and Test Control Notation (TTCN-3) and mobile-wireless web applications. In the modern era of Information and Communication Technology (ICT), the Web and the Internet have brought significant changes in Information Technology (IT) and related scenarios. The quality of a web application can be measured from two perspectives, the programmer's view and the user's view; here we consider maintainability as perceived by the programmers and efficiency as experienced by the end-user.


Keywords Web-based effort estimation, Web-based

design, Web metrics, E-commerce, web application.

IX. INTRODUCTION
The diverse nature of web applications makes it difficult to measure them using existing quality measurement models. Web applications often use large numbers of reusable components, which makes traditional measurement models less relevant. Through a client Web browser, users are able to perform business operations and then change the state of business data on the server. The range of Web-based applications varies enormously, from simple Web sites (static Web sites) that are essentially hypertext document presentation applications, to sophisticated high-volume e-commerce applications often involving supply, ordering, payment, tracking and delivery of goods or the provision of services (i.e. dynamic and active Web sites). We have focused on the implementation and comparison of effort measurement models for Web-based hypermedia applications based on the implementation phase of the development life cycle. For this work, we studied various size measures at different points in the development life cycle of Web-based systems to estimate effort, and these have been compared based on several predictions. The main objective of design metrics is to provide basic feedback on the design being measured. In this paper we introduce some new metrics for quality factors in terms of TTCN-3 and mobile-wireless web applications. Here we calculate only two quality factors, maintainability and efficiency.

X. QUALITY FACTORS
Many software quality factors have already been defined, and in this paper we define quality factors for web metrics for evaluating efficiency and maintainability. It is already well established that a website should be treated as a set of components. Our interest is to consider the nature of these components and how they affect the web site's quality. We put a lot of emphasis on maintainability and efficiency in this paper, since for most of the life of a web site it is being actively maintained. Web sites differ from most software systems in a number of ways. They are changed and updated constantly after they are first developed. As a result, almost all of the effort involved in running a web site is maintenance. We will use the following criteria for estimating maintainability and efficiency (from ISO 9126):
A. Maintainability: 1. Analysability 2. Changeability 3. Stability 4. Testability
B. Efficiency: 1. Time based efficiency 2. Resource based efficiency

We now look at each of these criteria and their metrics in the following subsections which are suitable for TTCN-3 specific metrics and mobile web application.

XI. METRICS FOR TTCN-3
Experience with the Testing and Test Control Notation (TTCN-3) has shown that test suite maintenance is a non-trivial task and that its burden can be reduced by means of appropriate concepts and tool support. The test specification and test implementation language TTCN-3 has the look and feel of a typical general-purpose programming language, i.e. it is based on a textual syntax, referred to as the core notation. For assessing the overall quality of software, metrics can be used. Since this article treats quality characteristics such as maintainability of TTCN-3 test specifications, only internal product attributes are considered in the following. For assessing the quality of TTCN-3 test suites in terms of analysability and changeability, and for locating issues, an initial set of appropriate TTCN-3 metrics has been developed. To ensure that these metrics have a clear interpretation, their development was guided by the Goal Question Metric approach. First the goals to achieve were specified, e.g. Goal 1: improve changeability of TTCN-3 source code, or Goal 2: improve analysability of TTCN-3 source code. Coupling metrics are used to answer the questions of Goal 1, and counting the number of references is used for answering the questions of Goal 2. The resulting set of metrics not only uses well-known metrics for general-purpose programming languages but also defines new TTCN-3-specific metrics. As a first step, some basic size metrics and one coupling metric are used:
1. Number of lines of TTCN-3 source code, including blank lines and comments, i.e. physical lines of code
2. Number of test cases
3. Number of functions
4. Number of altsteps
5. Number of port types
6. Number of component types
7. Number of data type definitions
8. Number of templates
9. Template coupling, which is computed over the statements that reference templates, where stmt is the sequence of behaviour entities referencing templates in a test suite, n is the number of statements in stmt, and stmt(i) denotes the i-th statement in stmt. Template coupling measures the dependence of test behaviour and test data in the form of TTCN-3 template definitions.
On the basis of template coupling we have calculated the quality factors and the sub-factors of maintainability and efficiency.
A. Maintainability
Web-based software applications have a higher frequency of new releases, or update rate. Maintainability is a set of attributes that bear on the effort needed to make specified modifications (ISO 9126: 1991, 4.5); the ability to identify and fix a fault within a software component is what the maintainability characteristic addresses. Sub-characteristics of maintainability are analysability, changeability, stability and testability.
1) Analysability
Analysability is measured as the attributes of the software that have a bearing on the effort needed for diagnosis and modification of deficiencies and causes of failures. For optimal analysability most templates may be inline templates.
- metric : complexity violation :=
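Leaving the individual formulas aside, the basic size metrics listed above can be collected mechanically from a TTCN-3 module. The Python sketch below is an illustrative keyword-counting approximation, not a real TTCN-3 parser; the file path is a placeholder.

    import re

    TTCN3_PATTERNS = {
        "test cases":      r"\btestcase\s+\w+",
        "functions":       r"\bfunction\s+\w+",
        "altsteps":        r"\baltstep\s+\w+",
        "port types":      r"\btype\s+port\s+\w+",
        "component types": r"\btype\s+component\s+\w+",
        "templates":       r"\btemplate\b",
    }

    def size_metrics(path):
        # Crude size metrics for one TTCN-3 module: physical lines of code plus
        # keyword counts. A real metrics tool would parse the core notation.
        text = open(path, encoding="utf-8").read()
        metrics = {"physical lines of code": text.count("\n") + 1}
        for name, pattern in TTCN3_PATTERNS.items():
            metrics[name] = len(re.findall(pattern, text))
        return metrics

    # e.g. size_metrics("suite.ttcn")  # placeholder file name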

2) Changeability For changeability we are interested in how easily the data, formatting and program logic in the website can be changed. For good changeability a decoupling of test data and test behavior might be advantageous.

3) Stability Stability is the tolerance of the application towards unexpected effects of modifications. This metric measures the number of all component variables and timers referenced by more than one function,testcase,or altstep and relates them to the overall number of component variables and timers. - metric : global variable and timer usage :=

4) Testability
Testability is the effort needed for validating modifications. There are only a few special considerations that should be made when measuring testability for a web site, since the site can be tested through a web browser exactly like black-box testing. While most of these metrics mainly describe the overall quality of test suites (an example is the template coupling metric), some of them can also be used to improve a test suite by identifying the location of individual issues.
B. EFFICIENCY
Efficiency is a set of attributes that bear on the relationship between the level of performance of the software and the amount of resources used, under stated conditions (ISO 9126: 1991, 4.4). This characteristic is concerned with the system resources used when providing the required functionality. The amount of disk space, memory, network etc. provides a good indication of this characteristic. The sub-characteristics of efficiency are time and resource behaviour.
1) Time based efficiency
The time behaviour describes, for instance, processing times and throughput rates.
2) Resource based efficiency
Resource behaviour means the amount of resources used and the duration of use.

XII. METRICS FOR MOBILE-WIRELESS WEB APPLICATION

Mobile and wireless devices and networks enable "any place, any time" use of information systems, providing advantages such as productivity enhancement, flexibility, service improvements and information accuracy. This research develops a methodology to define and quantify the quality components of such systems. In this section we describe the metrics development process and present examples of metrics; we measure only two quality attributes, maintainability and efficiency.
A. Maintainability
Increasing the quality of the development processes and products/program code in the area of maintainability will help to lower the cost when adding a new target platform. Maintainability (ISO-9126) includes the analysability, changeability, stability, and testability sub-characteristics. This set of characteristics reflects mainly the technical stakeholders' viewpoint, such as the developers and maintenance people.
- Use of standard protocol

B. Efficiency
Efficiency (ISO-9126) includes the time behaviour and resource utilization sub-characteristics.
1) Time efficiency
The time behaviour sub-characteristic is very important in mobile-wireless applications because the price of each minute of data transfer is very high, and users will avoid expensive systems. Relevant measures include:
- Response time to get information from the server
- Response time to get information from the client
2) Resource efficiency
Mobile devices include small memory and low processing resources, so applications must be aware of these restrictions and optimize resource utilization. Relevant measures include:
- Size of the application on the mobile device
- Size of memory in the mobile device
- Device memory cleanup after completing the task
- Network throughput
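A minimal way to sample the time-efficiency measures listed above is to time repeated requests against the application's server. The sketch below assumes a plain HTTP endpoint and is illustrative only; the URL is a placeholder.

    import time
    import urllib.request

    def response_time(url, samples=5):
        # Average wall-clock time to fetch a resource: a simple proxy for the
        # "response time to get information from the server" measure.
        timings = []
        for _ in range(samples):
            start = time.perf_counter()
            with urllib.request.urlopen(url, timeout=10) as reply:
                reply.read()
            timings.append(time.perf_counter() - start)
        return sum(timings) / len(timings)

    # e.g. response_time("http://example.com/catalogue")  # placeholder URL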

Finally, the limited processing and network resources require efficient use of the available resources.
XIII. CONCLUSION & FUTURE WORK
This paper has discussed the ISO 9126 norm with respect to the development of mobile web applications and TTCN-3. The paper introduced two subjects with respect to quality attributes (ISO-9126). First, for TTCN-3 it described metrics for the quality factors (ISO-9126). Second, it considered mobile-wireless information systems, which are also used for measuring the quality factors (ISO-9126). In this paper we have calculated only two quality factors, maintainability and efficiency, in terms of TTCN-3 and mobile-wireless web applications. The research can be expanded to calculate other quality factors such as functionality, reliability, usability and portability in terms of TTCN-3 and mobile-wireless web applications.
REFERENCES
[1] A. Stefan, M. Xenos, "A model for assessing the quality of e-commerce systems", Proceedings of the PC-HCI 2001 Conference on Human Computer Interaction, 2001.
[2] Asunmaa, P., Inkinen, S., Nykanen, P., Paivarinta, S., Sormunen, T., & Suoknuuti, M. (2002). Introduction to mobile internet technical architecture. Wireless Personal Communications, 22, 253-259.
[3] Boehm, B. W., J. R. Brown, J. R. Kaspar, M. Lipow & G. Maccleod, Characteristics of Software Quality (Amsterdam: North Holland, 1978).
[4] Bache, R., Bazzana, G., Software Metrics for Product Assessment, McGraw-Hill, 1994.
[5] Calero, C., Ruiz, J., & Piattini, M. (2004). A web metrics survey using WQM. Proceedings ICWE 2004, LNCS 3140, Springer-Verlag Heidelberg, 147-160.
[6] D. Coleman, D. Ash, B. Lowther, P. Oman, "Using Metrics to Evaluate Software System Maintainability", Computer, Vol. 27, No. 8, pp. 44-49.
[7] Ejiogu, L., Software Engineering with Formal Metrics, QED Publishing, 1991.
[8] G. M. Weinberg, The Psychology of Computer Programming, 1979.
[9] Hordijk, W., & Wieringa, R. (2005). Surveying the factors that influence maintainability. Proceedings of the 10th European Software Engineering Conference held jointly with the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC-FSE'05), Lisbon, Portugal, 385-388.
[10] ISO 9000 (2000). Quality management systems - Requirements. Geneva, Switzerland: International Organization for Standardization.
[11] J. Eisenstein, J. Vanderdonckt, A. Puerta, "Applying Model-Based Techniques to the Development of UIs for Mobile Computers", Proceedings on Intelligent User Interfaces, Santa Fe, 2001.
[12] J. Offutt, "Quality Attributes of Web Software Applications", IEEE Software, IEEE, March/April 2002, pp. 25-32.
[13] M. Satyanarayanan, "Fundamental Challenges in Mobile Computing", Symposium on Principles of Distributed Computing, 1996.
[14] McCall, J. A., P. K. Richards & G. F. Walters, Factors in Software Quality, Vol. 1-2, AD/A-049-014/015/05, Springfield, VA: National Technical Information Service, 1977.
[15] S. McConnell, "Real Quality for Real Engineers", IEEE Software, March/April 2002, pp. 5-7.

Future of ICT Enable Services for Inclusive Growth in Rural Unprivileged Masses
Bikash Chandra Sahana , Lalu Ram
National Institute of Technology Patna
sahana.nitp@gmail.com, laluram_nitp@yahoo.co.in

Abstract - ICT is a cutting-edge technology which has an active role in the development of people of rural areas. A great percentage of rural people are dependent on agriculture, dairy and poultry, so agriculture is a key area behind the development of our nation. We may boost productivity by proper utilization of our resources using ICT services. Agriculture depends on components like soil type, fertiliser, pesticides, irrigation, favourable weather conditions, farm machinery, healthy farmers, farming experience, proper planning and awareness of the latest information. Human beings have the potential to do things in the proper way, but awareness of the facts is most important. In the near future ICT-based initiatives will play a crucial role in the overall development of the rural unprivileged masses. Government should take the initiative to develop a unique agriculture network using ICT. It should connect all the villages of our nation. Government should set up an information room in every village with internet facilities (3G services). The agricultural network should give information in all regional languages. Productivity depends on key agricultural parameters and the tools used. These parameters vary with location, because different regions have different soil nature, water availability, seed qualities used, fertiliser and pesticide availability, and weather conditions. There should be a provision for real-time interaction of farmers with experts; this is only possible using ICT services. The agriculture network should be connected with different banks where people can get real-time updates and schemes floated for the farmers. There should be connectivity with fertiliser suppliers, pesticide suppliers, the irrigation department, the meteorological department, soil testing organizations and transport agencies to enable the best services. Marketing is an important issue. The Agriculture Network (AN) should have a platform for buying and selling agricultural products. Farmers have to be registered in that network; then they can post their products, and any person can purchase the products directly from the farmers. Farmers will get motivation for farming as there is no uncertainty in getting agricultural support through an agricultural network backed by ICT technology.

Keywords Agriculture Network ,3G Services, Optimized Server, Expert Help Desk, User mobility

XIV. INTRODUCTION

Policy makers' aim is to develop India as a knowledge-based society, and here ICT can play a significant role. The maximum population of our country is dependent on agriculture, so the use of ICT-based support in the field of agriculture is highly solicited.

XV. SALIENT FEATURES OF ICT FOR AGRICULTURAL NETWORK (AN)
An ICT-based agricultural network should consist of reliable and scalable service-based technologies with familiar and easy-to-use interfaces. It should support user mobility (cell phone interface) and tools for proactive management help. It should include security features for protection of sensitive data. It should have an optimized server, a cost-effective architecture, 3G support and non-conventional power back-up.

XVI. PROPOSED AGRICULTURAL NETWORK BLOCK DIAGRAM

Fig. 1 Basic block diagram of agricultural network

In Fig. 1 user terminals are connected with important departments such as IMD, the BDO office, police stations, banks, market terminals, the scientist help desk, transport agencies and the land record department. So there would not be any communication gap: farmers can avail themselves of updated information from every department and can take help from every terminal. They can observe market demands and post their products online. Farmers have to register themselves for accessing the network facility, and registration should be free of cost. The Agriculture Network (AN) should have a platform for buying and selling agricultural products: registered farmers can post their products, and any person can purchase the products directly from the farmers. Farmers will get motivation for farming as there is no uncertainty in getting agricultural support through an agricultural network backed by ICT technology.
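A minimal sketch of the registration and buying/selling platform described above is given below. The entity names and fields are illustrative assumptions about how such an Agriculture Network service could be modelled, not a specification of the proposed system.

    from dataclasses import dataclass

    @dataclass
    class Farmer:
        farmer_id: str
        name: str
        village: str
        language: str            # listings shown in the farmer's regional language

    @dataclass
    class ProduceListing:
        farmer: Farmer
        product: str
        quantity_kg: float
        price_per_kg: float

    class AgricultureNetwork:
        # Free registration, then farmers post produce that any buyer can purchase directly.
        def __init__(self):
            self.farmers = {}
            self.listings = []

        def register(self, farmer):
            self.farmers[farmer.farmer_id] = farmer   # registration is free of cost

        def post_listing(self, listing):
            if listing.farmer.farmer_id not in self.farmers:
                raise ValueError("farmer must be registered before posting produce")
            self.listings.append(listing)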

Fig. 2 Proposed agricultural network structure

In Fig. 2 we have proposed a network structure for the agricultural network. Its centre block is the central server unit. A, B, C, D are the intermediate processing units; these units are responsible for all kinds of operations in different states. E, F are the units responsible for district-level data. Finally, G, H, I, J are the units responsible for block-level data. Connection of G, H, I, J with the other units may be through WLL, wired links or microwave links.

Fig. 3 Demand and supply space in AN

IV CONCLUSION
Improvement of farming strategies is possible with the help of an ICT-based agricultural network. In the age of information technology it is not so difficult to realize the practical network. If the government takes the initiative to realize the network, the benefits will be manifold: not only will farmers be benefited, but policy makers will also get real-time data from the root level.
REFERENCES

[1] Microsoft Health ICT Resource Center.
[2] M. Schneider, Implementing an information society in central Europe.


Conversion of Sequential Code to Parallel An Overview of Various Conversion Methods


Danish Ather#1, Prof. Raghuraj Singh*2
# Sr. Lecturer, Dept. of Computer Applications, Teerthanker Mahaveer Institute of Management and Technology, Delhi Road, Moradabad, India. danishather@gmail.com
* Professor, Computer Science & Engineering Department, Harcourt Butler Technological Institute, Nawabganj, Kanpur, India. raghurajsingh@rediffmail.com

ABSTRACT - With the availability of multiple processing elements (PE) on a platform, high throughput can be gained by executing parts of an application concurrently on each PE or a subset of PEs. An example of such a platform is a multiprocessor system on chip (MPSoC) where each processing element has its local
memory as well as a common shared memory which all of them share. All PEs can run independently in a decentralized fashion or can run as a centralized model where one processor acts as a master. An application specified using an imperative model of computation can run only on a single processor and is not suited for multi-processor platform. The application needs to be modified to make it suitable for running on multiprocessor platform by identification and extraction of parallel tasks from the sequential code and efficiently mapping these tasks onto a multiprocessor platform. Manual identification and extraction of parallel tasks from sequential code is very time consuming and error prone as it requires complete analysis of sequential source code which is complex if it spans thousands of lines.In this paper we have tried to discuss the various methods of transformation of sequential application to a parallel. KEYWORDS :MPSoC,PE,MatParser,DgParser,Pan
#

da I. INTRODUCTION A. PARALLEL PROCESSING BY THE BRAIN Parallel processing is the ability of the brain to simultaneously process incoming stimuli of differing quality. This becomes most important in vision, as the brain divides what it sees into four components: color, motion, shape, and depth. These are individually analyzed and then compared to stored memories, which helps the brain identify what you are viewing. The brain then combines all of these into one image that you see and comprehend. B. PARALLEL PROCESSING IN COMPUTERS The simultaneous use of more than one CPU or processor core to execute a program or multiple computational threads. Ideally, parallel processing

C. MPSoC
The multiprocessor system-on-chip (MPSoC) is a system-on-a-chip (SoC) which uses multiple processors (see multi-core), usually targeted at embedded applications. It is used by platforms that contain multiple, usually heterogeneous, processing elements with specific functionalities reflecting the needs of the expected application domain, a memory hierarchy (often using scratchpad RAM and DMA) and I/O components. All these components are linked to each other by an on-chip interconnect. These architectures meet the performance needs of multimedia applications, telecommunication architectures, network security and other application domains while limiting the power consumption through the use of specialised processing elements and architectures.

Figure 1 A Multiprocessor SoC Architecture Diagram

The problem of transforming a sequential application for mapping onto multiple processors is not new and has been studied by many researchers in different contexts: (i) selecting a model of computation for parallel execution, and (ii) automating the transformation from a sequential model of computation to a parallel model of computation.

D. SELECTING MODEL OF COMPUTATION FOR PARALLEL EXECUTION

In parallel models of computation, data is communicated between N processors in two ways: shared memory and message passing. In shared-memory communication each processor has access to a shared memory. If a processor updates the value of a variable stored in shared memory, then the other processors can read the value of that variable from shared memory. All processors can access shared memory simultaneously if the memory locations they are trying to read from or write to are different. In message-passing communication each processor has only local memory. Processors communicate by sending and receiving messages in the form of data. The communication can be point-to-point, where data is transferred between exactly two processors, or it can be broadcast, where a processor sends data to all other processors. A minimal sketch of the two styles is given below.
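The following sketch, which is not part of any of the tools discussed in this paper, contrasts the two communication styles on a single machine using Python's standard multiprocessing module: a shared-memory counter protected by a lock versus workers that exchange messages over a queue.

```python
# Sketch: shared-memory communication vs. message passing.
from multiprocessing import Process, Value, Queue

def shared_memory_worker(counter):
    # Every process sees the same memory location; the lock guards updates.
    with counter.get_lock():
        counter.value += 1

def message_passing_worker(rank, queue):
    # Each process keeps only local data and communicates by sending messages.
    queue.put((rank, rank * rank))

if __name__ == "__main__":
    # Shared memory: all workers update one shared integer.
    counter = Value("i", 0)
    workers = [Process(target=shared_memory_worker, args=(counter,)) for _ in range(4)]
    for w in workers: w.start()
    for w in workers: w.join()
    print("shared counter =", counter.value)          # 4

    # Message passing: workers send results; the parent process receives them.
    queue = Queue()
    senders = [Process(target=message_passing_worker, args=(r, queue)) for r in range(4)]
    for s in senders: s.start()
    results = dict(queue.get() for _ in range(4))
    for s in senders: s.join()
    print("received:", results)                       # {0: 0, 1: 1, 2: 4, 3: 9}
```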
E. AUTOMATION FOR TRANSFORMATION FROM SEQUENTIAL MODEL OF COMPUTATION TO PARALLEL MODEL OF COMPUTATION

Many researchers [2] [6] [11] [4] [10] have built transformation tools that convert an application specified in a sequential model of computation into various output forms like process networks, executable images, transaction-level models, etc. Transformation tools like Compaan [2] and Pn [10] extract only data parallelism and divide loops into tasks. Sprint [6] extracts only functional parallelism and divides functions into tasks. Harmonic [11] performs both functional and data partitioning but does not use the process network model for the transformation. The main transformation tools considered here are: 1. Compaan, 2. Sprint, 3. Harmonic. A minimal sketch contrasting the two kinds of parallelism these tools extract is given below.
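The sketch below is a generic illustration, not the output of Compaan, Sprint or Harmonic. It shows data parallelism as the same loop body applied to different iterations on a pool of workers, and functional parallelism as two different functions of a pipeline running concurrently and communicating over a channel.

```python
# Sketch: data parallelism (loop split across workers) vs.
# functional parallelism (pipeline stages connected by a channel).
from multiprocessing import Pool, Process, Queue

def square(x):
    return x * x

def producer(out_q):
    for i in range(5):
        out_q.put(i)          # first pipeline stage
    out_q.put(None)           # end-of-stream marker

def consumer(in_q):
    while True:
        item = in_q.get()
        if item is None:
            break
        print("consumed", item * 10)   # second pipeline stage

if __name__ == "__main__":
    # Data parallelism: the same operation applied to independent loop iterations.
    with Pool(4) as pool:
        print(pool.map(square, range(8)))     # [0, 1, 4, 9, 16, 25, 36, 49]

    # Functional parallelism: two different tasks connected by a channel.
    q = Queue()
    stages = [Process(target=producer, args=(q,)), Process(target=consumer, args=(q,))]
    for p in stages: p.start()
    for p in stages: p.join()
```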

II. RELATED WORK

A. COMPAAN
Compilation of Matlab to Process Networks (Compaan) is an effort to automatically compile a subset of imperative programs into a concurrent representation. Compaan uses the Matlab language as the imperative language and compiles programs in this language into a concurrent representation: a particular version of process networks. The Compaan work is motivated by the advent of a new kind of embedded architecture that is composed of a microprocessor, some memory, and a number of dedicated coprocessors linked together via some kind of programmable interconnect. These architectures are devised to be used in real-time, high-performance signal processing applications. They have in common that they exploit parallelism using the instruction-level parallelism offered by the microprocessor and the coarse-grained parallelism offered by the coprocessors. Given a set of applications, the hardware/software codesign problem is to determine what needs to execute on the microprocessor and what on the coprocessors and, furthermore, what each coprocessor should contain while remaining programmable enough to support the set of applications. Compaan tries to provide a tool that can help designers in answering this tough question of how to partition an application into hardware and software candidates. A motivation why process networks are a good match for these new embedded architectures can be found in Kienhuis et al. [2000]. Compaan consists of three different tools, which are shown below. [2]

Figure 2 Compilation of Matlab to Process Networks

The three tools are:
1. MatParser. MatParser is a sequential-to-parallel compiler. It extracts all available parallelism present in a Matlab description. The Matlab code, however, needs to be confined to a particular type called nested loop programs. MatParser uses a very advanced parametric integer programming (PIP) technique to find all available parallelism in terms of the parameters of the original program.
2. DgParser. DgParser converts a single assignment program, generated by MatParser, into a Polyhedral Reduced Dependence Graph (PRDG) representation. A PRDG representation is much more amenable to mathematical manipulation.
3. Panda. Panda transforms the PRDG description of an algorithm into a network of parallel running processes, the desired Kahn Process Network. This tool uses the Polylib library extensively. Using Polylib, Panda can generate Ehrhart polynomials that give a symbolic expression for the number of integer points available in arbitrary polyhedra. The conversion from the PRDG to a process network happens in three steps: domain scanning, domain reconstruction and linearization. [2]

B. SPRINT
SPRINT is part of a systematic C-based design flow targeting the implementation of advanced streaming applications on multiprocessor platforms. It benefits from pre-processing and high-level optimizations applied to the C code before partitioning, and it supports the independent development and testing, potentially by multiple team members in parallel, of parts of the application after partitioning. SPRINT has been used for several multimedia designs, including an embedded zero tree coder [3] and an MPEG-4 video encoder [7,8]. With SPRINT, the validation of the concurrent behaviour of the complete design was obtained in less than six minutes, including the generation of the model itself. This fast verification path provides the possibility to identify conceptual design errors at an early stage of the design, avoiding expensive design iterations and resulting in a significant speed-up of the design process. Moreover, the automated generation of the concurrent model enables the exploration of different parallelization alternatives, ultimately resulting in an improved implementation of the embedded system.

Figure 3 Systematic C-based design flow [6]

Figure 3 shows the applied design flow, starting from a system specification (typically reference code provided by an algorithm group or a standardization body like MPEG) that is gradually refined into the final implementation: a netlist with a set of executables. Two major phases are present: (i) a sequential phase, where the initial specification is pre-processed, analysed and high-level optimized, and (ii) a parallel phase, in which the application is divided into parallel processes and refined to register transfer level for hardware implementation or to implementation code for software implementation. The design flow consists of five logically sequenced steps. (1) Pre-processing and analysis prunes the reference code to retain only the required functionality for a given application profile. An initial complexity analysis identifies bottlenecks and points out the first candidates for optimization. This step is supported by the Atomium tool suite [1]. (2) High-level optimization reduces the overall complexity with a memory-centric focus, as data transfer and storage have a dominant impact on the implementation efficiency of multimedia applications [7, 14]. The code is reorganized into functional

modules with localized data processing and limited communication. This prepares the code for an efficient use of the communication primitives in the next step and facilitates parallelization. (3) Partitioning splits the code into concurrent tasks and inserts the appropriate communication channels. The design is partitioned into concurrent tasks for a combination of reasons, including throughput requirements, power requirements (parallel processors can run at a lower clock rate), design complexity (small tasks are easier to implement, and hardware synthesis tools may yield better results) and the exploitation of heterogeneous processors (tasks can be assigned to the best-suited processor, or an efficient custom processor can be designed for a task). (4) Software tuning and HDL translation refine each task of the partitioned system independently, either by mapping it to a target processor or by writing register transfer level HDL to be synthesized on an FPGA or as custom hardware. In addition, the communication channels between the tasks are mapped onto the available communication resources of the target platform. (5) Integration gradually combines and tests the refined components to obtain a complete working system.

C. HARMONIC
Figure 4 illustrates the key components of the Harmonic toolchain. The Harmonic toolchain is built on top of the ROSE open-source compiler framework [3], which provides support for source-level transformations, analysis and instrumentation of C/C++ code, and supports a number of additional frontends. The toolchain receives as input the complete C source project as defined by a set of .c and .h source files. There are no restrictions on the syntax or semantics of the C project. By default, all source is compiled and executed on the reference processing element (usually a GPP), which serves as the baseline for performance comparison. To improve performance, Harmonic distributes parts of the program to specialised processing elements in the system. To maximise the effectiveness of the toolchain in optimising an application, it is desirable that the C code is written in a way that is supported by as many types of processing elements as possible in order to uncover opportunities for optimisation. [11]

III. COMPARATIVE ANALYSIS
Existing tools such as Compaan [2], Sprint [6] and Harmonic [11] automate such transformations, but each has its own characteristics, advantages and drawbacks. We have focused on the above three tools and compared them, as they are representative of the major work done in automating the transformation of sequential applications for mapping on a multiprocessor

platform. Here we discuss details and characteristics of each tool and compare them on various parameters.

Figure 4 An overview of the Harmonic toolchain design flow [11]

Compaan: The Compaan tool converts a sequential Matlab application into a process network in a systematic, automatic way. It converts a Matlab application into a polyhedral reduced dependence graph, which is a compact representation of a dependence graph (DG) using parametrized polyhedra. Compaan works on the partitioning of loops into tasks. It is confined to operate on affine nested loop programs (ANLP) [9]. Sprint: Converts a sequential C application into an executable concurrent model in SystemC. It facilitates the generation of a KPN according to the designer's specification by automatically detecting and inserting the required communication channels in accordance with designer-specified task boundaries. Sprint only performs function-level partitioning and does not handle loops. Harmonic: Receives a C application as input and generates an executable image for a reference general-purpose processor. It performs both functional and data-level partitioning, treating each function as a separate task. Harmonic does not work on the KPN model of computation as it does not include a channel generation step. Task partitioning and mapping tasks to general-purpose processors (GPP) are the two phases implemented by Harmonic. In the following table we compare the three approaches to automated transformation on these parameters: Input application type: Specifies the language in which the input application is written. Final output type: Specifies the output form of the transformed application.

Type of parallelism handled: Specifies whether the parallelism extracted is functional or data parallelism. Transformation based on KPN: Specifies whether the transformation generates a KPN model of computation. Extent of automation: Specifies whether the process of transformation is completely automatic (requiring no user decision). Generation of user report: Specifies whether any report is generated for user understanding and review.

TABLE I
COMPARISON OF THE THREE TRANSFORMATION APPROACHES

Parameter                             Compaan            Sprint                      Harmonic
Input application type                Matlab             C                           C
Final output type                     Process network    Concurrent SystemC model    Executable image for a GPP
Provides data parallelism (loops)     Yes (best)         No                          Yes
Provides functional parallelism       No                 Yes                         Yes
Transformation based on KPN model     Yes                Yes                         No

IV. CONCLUSION
We suggest the following features which could be an improvement over existing approaches like Sprint, Compaan and Harmonic: Sprint does not implement data parallelism, Compaan does not implement functional parallelism, Harmonic does not use KPN as its final parallel model, and none of the above tools generates graphical output. On the basis of this outcome we would suggest that, in the case of data parallelism, the Compaan and Harmonic approaches can be adopted, while in the case of functional parallelism the Sprint approach can be adopted. Moreover, there is scope to design and develop a parallel conversion model which accommodates both data and functional parallelism together with the KPN model.

V. REFERENCES
[1] Atomium, http://www.imec.be/atomium.
[2] B. Kienhuis, E. Rijpkema, and E.F. Deprettere, "Compaan: Deriving process networks from Matlab for embedded signal processing architectures," in Proceedings of the Eighth International Workshop on Hardware/Software Codesign (CODES), 2000.
[3] B. Vanhoof, M. Peon, G. Lafruit, J. Bormans, M. Engels, and I. Bolsens, "A scalable architecture for MPEG-4 embedded zero tree coding," in Proceedings of the 21st IEEE Annual Custom Integrated Circuits Conference (CICC '99), pp. 65-68, San Diego, Calif., USA, May 1999.
[4] B. Dave, G. Lakshminarayana, and N. Jha, "COSYN: Hardware-software co-synthesis of heterogeneous distributed embedded systems," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 7, no. 1, pp. 92-104, Mar. 1999.
[5] F. Catthoor, E. de Greef, and S. Suytack, Custom Memory Management Methodology, Kluwer Academic Publishers, Boston, Mass., USA, 1998.
[6] J. Cockx, K. Denolf, B. Vanhoof, and R. Stahl, "SPRINT: A tool to generate concurrent transaction-level models from sequential code," EURASIP Journal on Applied Signal Processing, vol. 2007, no. 1, p. 213, January 2007.
[7] K. Denolf et al., "A systematic design of an MPEG-4 video encoder and decoder for FPGAs," in Proceedings of the Global Signal Processing Expo and Conference (GSPx '04), Santa Clara, Calif., USA, September 2004.
[8] K. Denolf, C. De Vleeschouwer, R. Turney, G. Lafruit, and J. Bormans, "Memory centric design of an MPEG-4 video encoder," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 5, pp. 609-619, 2005.
[9] P. Held, Functional design of data-flow networks, PhD thesis, Delft University of Technology, 2009.
[10] S. Verdoolaege, H. Nikolov, and T. Stefanov, "pn: A tool for improved derivation of process networks," EURASIP Journal on Embedded Systems, vol. 2007, no. 1, 2007.
[11] W. Luk, J.G.F. Coutinho, et al., "A high-level compilation toolchain for heterogeneous systems," in IEEE International SOC Conference, Sept. 9-11, 2009.

INNOVATION and ENTREPRENEURSHIP in INFORMATION and COMMUNICATION TECHNOLOGY


Deepak Sharma1, Nirankar Sharma2, Nimisha Shrivastava3
#1,2,3 Computer Applications Department, Subharti University
1 dp_hpr@yahoo.co.in
2 nirankarsharma2004@rediffmail.com
3 nimisrivas@yahoo.co.in

Abstract

This paper describes ICT (information and communication technology) and the new innovations related to it. ICT, as we all know, is a wider perspective of information technology. Information technology deals with unified communications (UC), the integration of telecommunications and audio-visual systems in modern IT. This paper illustrates new innovations by the Cisco Company for both larger and smaller organizations. Unified communications are an important aspect of ICT; they are integrated to optimize business processes. Unified communications integrate real-time as well as non-real-time communication with business processes and requirements. ICT is a powerful tool for the development of various business and IT activities, and it concentrates on the economic issues related to business in the digital era. Later in this paper the focus is on telecommunication. Innovations and digital entrepreneurship will provide better chances to rise in the ICT field. Telecommunication links and media play a vital role in the smooth running of any business as well as in the success of ICT. Digital entrepreneurship describes the relationship between an entrepreneur and the digital world.

New Innovations in ICT: An innovation starts as a concept that is refined and developed before application. Innovations may be inspired by reality. The innovation process, which leads to useful technology, requires research, development, production, marketing and use. In this section we describe some of the most significant recent innovations in the fields of unified communication, telecommunications and audio-visual systems.

Microsoft's Lite Green IT Project:

The Microsoft research labs in India have been working on a project, Lite Green, intended to reduce energy bills and be energy efficient. It is a very important innovation in the field of ICT because of the desktop: desktops consume close to 100-220 watts when running at full capacity and 62-80 watts when running at close to zero percent CPU usage. This project is most effective during weekends and overnight, when the energy saving is close to 80 percent [6].

Keywords— communication, technology, economy, unified, Cisco

XVII. INTRODUCTION

The world of information and communications technology is constantly changing. ICT consists of all technical means used to handle information and aid communication, including computer and network hardware and communication equipment, as well as the necessary software. In other words, ICT consists of IT as well as telephony, broadcast media, all types of audio and video processing and transmission, and network-based control and monitoring functions. The term ICT is now also used to refer to the merging of audio-visual and telephone networks with computer networks through a single cabling or link system [1]. There are large economic incentives (huge cost savings due to the elimination of the telephone network) to merge the audio-visual, building management and telephone networks with the computer network system using a single unified system of cabling, signal distribution and management.

Software Defined Radio:

Software defined radio (SDR) is a radio communication system where components that have typically been implemented in hardware (for example filters, amplifiers, modulators/demodulators, detectors, etc.) are instead implemented by means of software on a personal computer or embedded computing devices. This brings benefits to every actor involved in the telecommunication market: manufacturers, operators and users. The advantage for users is the ability to roam their communication onto other cellular systems and take advantage of worldwide mobility and coverage. A basic SDR system may consist of a personal computer equipped with a sound card, or another analog-to-digital converter, preceded by some form of RF front end. Software radios have significant utility for the military and cell phone services, both of which must serve a wide variety of changing radio protocols in real time.

IPTV:
Internet Protocol television (IPTV) is a system through which Internet television services are delivered using the architecture and networking methods of the Internet Protocol Suite over a packet-switched network infrastructure, e.g., the Internet and broadband Internet access networks, instead of being delivered through traditional radio frequency broadcast, satellite signal, and cable television (CATV) formats. Many regulations are appearing in the information and entertainment sector due to the changing technological scenario coupled with the digitization of the broadcasting industries. This changing environment has led to the growing popularity of IPTV at the international level [4]. The scope of IPTV in India is not yet highly recognized. IPTV services may be classified into three main groups: 1. Live television, with or without interactivity related to the current TV show. 2. Time-shifted programming: catch-up TV that replays a TV show broadcast hours or days ago, and start-over TV that replays the current TV show from its beginning. 3. Video on demand (VOD): browsing a catalog of videos not related to TV programming.

Digital Entrepreneurship: Entrepreneurship is the act of being an entrepreneur, which can be defined as "one who undertakes innovations, finance and business activities in an effort to transform innovations into economic goods" [4]. The term digital entrepreneurship is introduced to define an organization digitally: each and every detail of the enterprise is represented digitally.

Conclusion
In this paper the focus has been on emerging innovations in information and communication technology. We can conclude that ICT is succeeding not only in the area of wireless devices, for example mobile TV, but also in other fields of ICT such as audio-visual systems, whether wired or wireless.

References:
[1] http://foldoc.org/Information+and+Communication+Technology
[2] http://specials.ft.com/lifeonthenet/FT3NXTH03DC.html
[3] www.microsoft.com/education/MSITAcademy/curriculum/roadmap/default.mspx
[4] www.google.co.in
[5] www.wikipedia.org
[6] www.sciencedaily.com

Mobile TV:
Mobile TV is expected to see substantial growth in the Asia Pacific region. Mobile TV has already arrived in India and its future is bright. Mobile TV is considered a wireless service, and wireless services have been a great success in India, with both urban and rural usage rising steadily [5]. IPTV, when introduced in the country, was considered to be the next big technology driver in the telecom industry; however, the service did not pick up as expected. This was due to factors including low broadband penetration and slow internet access speeds.

Unified Communications 300 Series:


The Unified Communications 300 Series (UC300) is part of Cisco's Foundational Unified Communications (UC) offering, which provides basic or foundational UC features for small businesses, typically voice, data and wireless integration, plus some basic integrated messaging applications [6]. The UC300 is positioned for businesses that require more basic UC features at an affordable price. Cisco's earlier Unified Communications 500 Series (UC500), also for smaller businesses, belongs to the Advanced UC offering, which has a more advanced feature set, including video, enhanced security and mobility [4].

Cisco Unified Communications Manager Business Edition 3000


The Cisco Unified Communications Manager Business Edition 3000 is an all-in-one solution specifically designed for mid-sized businesses with 75-300 users (400 total devices) and up to 10 sites (nine remote sites) with centralized call processing. Cisco Unified Communications Manager software, the Cisco Unity Connection messaging solution (12 ports) and Cisco Unified Mobility are preinstalled on a single Cisco Media Convergence Server[6].

FUZZY CLASSIFICATION ON CUSTOMER RELATIONSHIP MANAGEMENT


Mohd. Faisal Muqtida1, Ashi Attrey2, Diwakar Upadhyay3
Computer Science,M.Tech, SRMSCET Bareilly, India
faisal.muqtida@gmail.com ashiattrey@gmail.com diwakaru9@gmail.com

Abstract— Fuzzy logic and fuzzy classification are well known in many technical disciplines like electronics and engineering. Much scientific research has been done in the field of fuzzy control over the years, and it is well known also in mathematics, statistics, informatics and data mining. But in marketing and business management, fuzzy classification is still largely unknown and rarely used in practice. This motivated us to develop a system that is beneficial for Customer Relationship Management (CRM) in business activities. In contrast to other data mining and statistical methods, fuzzy classification allows the classification of a customer into more than one class at the same time. The application of fuzzy portfolio analysis within the scope of performance measurement is especially suited for classifying, analyzing, evaluating and improving important monetary customer performance indicators like turnover, contribution margins, profit and customer equity, and non-monetary indicators such as customer value, satisfaction, loyalty and retention. To avoid misclassification, to improve the quality of customer evaluation and to exploit customer potential, it is suggested to classify all indicators fuzzily.

Keywords— Fuzzy classification, Customer Relationship Management (CRM), analytical CRM (aCRM), customer performance indicators, fuzzy customer segmentation, management tools, fuzzy portfolio analysis, fuzzy credit rating

I. INTRODUCTION

A. Motivation
Since Zadeh first published the article "Fuzzy Sets" in the journal Information and Control in 1965, much scientific research has been done in the field of fuzzy control. In basic research, many publications have been written on fuzzy logic, fuzzy sets and fuzzy classification: on different mathematical definitions of the fuzzy classification approach, on its implementation in information systems and on diverse applications in the field of engineering. In fact, fuzzy logic and fuzzy classification are well known in many rather technical disciplines like electronics or engineering, but also in mathematics, statistics, informatics and data mining. In contrast, in marketing and business management fuzzy classification is still largely unknown and rarely used, in both theory and practice. This gap motivated us to write about the potential and benefit of fuzzy classification in business activities.

B. Objective
We use the fuzzy classification approach for customer segmentation. Credit rating is an example of customer segmentation, which ends

in a rating of the subject (for instance a loan applicant) into a rating or risk class. To manage the credit business profitably, a bank has to classify loan applicants according to their real creditworthiness, that is, according to their default risk. It is every bank's concern to evaluate loan applicants and their creditworthiness as well as possible. Creditworthiness is the ability, intention and financial capability of a borrower to repay debt. A borrower is considered personally creditworthy if he deserves confidence due to his reliability, professional qualification and business acumen. Material creditworthiness is assumed if the current or expected economic circumstances of the borrower guarantee the payment of interest and the repayment of the loan. Credit scoring is a quantitative approach used to measure and evaluate the creditworthiness of a loan applicant. An aim of credit scoring is to determine the credit risk assumed for the possible non-payment of the credit extended. There are a number of criteria for the creditworthiness of a commercial or private borrower. In retail banking, the following information about the personal creditworthiness of an applicant for a consumer loan is examined: age, civil and family status, and the number and age of children; professional qualification; employer, job circumstances and duration of employment; recovery of claims and garnishment; reason for the credit. Considering material creditworthiness, the following information is evaluated: monthly net income, additional incomes and security of income; loan securities.

Account information at the internal or external bank; alimonies; rent or mortgage rate; leasing rates, other financial obligations and debt service. Credit agreements often contain the following basic points: amount of credit, interest rate, repayment conditions, credit period (duration of the credit) and loan securities.
II. BACKGROUND
A. Sharp Credit Rating
Another example of customer or market segmentation is the process of credit rating, which ends in a rating of the rating subject (for instance a loan applicant) into a rating or risk class. To manage the credit business profitably, a bank has to classify loan applicants according to their real creditworthiness, that is, according to their default risk. It is in every bank's interest to evaluate loan applicants and their creditworthiness as well as possible. Creditworthiness is the ability, intention and financial capability of a borrower to repay debt. Credit scoring is a quantitative approach used to measure and evaluate the creditworthiness of a loan applicant [Hofstrand 2006]. An aim of credit scoring is to determine the credit risk assumed for the possible non-payment of the credit extended. In the Anglo-Saxon literature on credit scoring, the Cs of credit are often mentioned, e.g. Character, Capital, Capacity, Collateral and Condition. In the German literature, a distinction is often made between personal and material creditworthiness. A borrower is considered personally creditworthy if he deserves confidence due to his reliability, professional qualification and business acumen. Material creditworthiness is assumed if the current or expected economic circumstances of the borrower guarantee the payment of interest and the repayment of the loan.

The following sections discuss the internal rating of banks. Considering conventional methods of credit rating, subjective expertise is compared to statistical methods.

B. Disadvantages of Sharp Credit Rating
All the considered methods of credit rating have certain advantages, but also disadvantages. Heuristics and checklists are subjective, and it is often discretionary how contradictory statements are brought together; different credit experts mostly evaluate and weight the same credit with identical facts quite differently. Logistic regression analyses have the drawback that non-linear correlations are difficult to model, and the time invariance and representativeness of the models and the data are often not guaranteed. Statistical methods in general do not consider specific and individual circumstances and particularities. As a result, statistical methods are often combined with subjective expertise into hybrid methods in credit rating practice. Another disadvantage of subjective expertise and statistical methods of credit rating is that they usually result in sharp classifications of loan applicants. The material or personal creditworthiness, or other criteria for the creditworthiness of loan applicants, are often rated sharply as worthy or as not worthy. In critical or border cases, if a credit applicant is rated near the cut-off score (the dividing value which separates the creditworthy borrowers from the rest), sharp credit rating may lead to incorrect decisions.
III. THEORETICAL ASPECTS

A. Fuzzy Set
The notion of fuzzy set stems from the work of Zadeh published in 1965. Zadeh observed the gap between mental representations of reality, that is, human concepts, and the usual mathematical approaches. Human concepts are

represented by natural language terms like young man or high price. Such concepts are useful to describe reality and to summarize the human perception of the world. However, they are vague, subjective and context-dependent. On the other hand, mathematical concepts are sharp and sometimes difficult to understand; such sharp representations are not adequate to describe the vagueness of human perception. As natural language terms are vague, a gradual notion is needed to model those terms. The way of defining mental representations is no longer all-or-nothing but is expressed by the notion of membership, which is represented by a value, called the membership value, in the interval [0,1]. The specificity of fuzzy sets is to capture this idea of partial membership, whereas classical crisp sets are reduced to binary membership: elements of the set have a membership value of 1 and outside elements have a membership degree of 0, so a crisp set does not need to label each element with a membership value. In a fuzzy set, an element x can, for instance, be part of the set with a membership value of 0.6; this same element x can also be part of one or more other fuzzy sets with different membership degrees. Thus the notion of membership allows an element to be part of several fuzzy sets. Each fuzzy set is represented by a function called a membership function. A membership function associates a membership value with each element in the referential of the related fuzzy set; therefore, a fuzzy set is often understood as a membership function. As figure 2.1 shows, a membership function can define the set of all young men. If a crisp set is used, only the men between 20 and 36 years of age are considered to be young men. On the other hand, the fuzzy set is not so sharply bounded: the men between 24 and 32 are young with a membership degree of 1, but the ones between 16 and 24 and between 32 and 40 are also young men, although with partial membership degrees. A small sketch of such a membership function is given below.
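The following sketch assumes a trapezoidal shape for the fuzzy set "young man" described above: full membership between 24 and 32, partial membership on the ramps 16-24 and 32-40, and zero outside. The exact shape is an assumption, as figure 2.1 is not reproduced here.

```python
# Sketch: trapezoidal membership function for the fuzzy set "young man".
def mu_young(age):
    if age <= 16 or age >= 40:
        return 0.0
    if 24 <= age <= 32:
        return 1.0
    if age < 24:                      # rising ramp from 16 to 24
        return (age - 16) / 8.0
    return (40 - age) / 8.0           # falling ramp from 32 to 40

for age in (18, 24, 30, 36, 45):
    print(age, round(mu_young(age), 2))   # e.g. 18 -> 0.25, 36 -> 0.5, 45 -> 0.0
```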

To extend the functionality of fuzzy sets, the operators of classical set theory like intersection, union, complement, inclusion and so on have been adapted to the concept of fuzzy sets. For instance, the union of two fuzzy sets can be performed by taking the pointwise maximum of their membership functions, and their intersection by the pointwise minimum of their membership functions. It is important to note that, due to the new potential of fuzzy set theory, these operations can also be defined in different ways. A small sketch of the pointwise operations is given below.
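This small sketch applies the standard max/min definitions of fuzzy union and intersection to two discretized membership functions; the two example sets and their shapes are assumptions for illustration only.

```python
# Sketch: fuzzy union (pointwise max) and intersection (pointwise min).
ages = list(range(15, 46, 5))
young  = {a: max(0.0, min(1.0, (40 - a) / 15.0)) for a in ages}   # assumed shape
senior = {a: max(0.0, min(1.0, (a - 25) / 15.0)) for a in ages}   # assumed shape

union        = {a: max(young[a], senior[a]) for a in ages}
intersection = {a: min(young[a], senior[a]) for a in ages}

for a in ages:
    print(a, round(young[a], 2), round(senior[a], 2),
          round(union[a], 2), round(intersection[a], 2))
```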

B. Context Model
In the relational data model, every attribute Aj is defined on a domain D(Aj). The enhancement of the context model consists of a context K(Aj) assigned to each attribute Aj. A context K(Aj) is a partition of D(Aj) into equivalence classes. Thus a relational database schema with contexts R(A, K) is composed of a set A = (A1, ..., An) of attributes and a set K = (K(A1), ..., K(An)) of associated contexts. If each equivalence class contains only one element, the context model has the same effect as the relational data model. Shenoi [26] introduced the context-based fuzzy equivalence, which generalizes classical equality and enhances the notion of tuple redundancy found in the relational model. In contrast to the relational model, two context-redundant tuples are not identical but equivalent. Two tuples t and t' are context-redundant if each component ti of t and the corresponding component t'i of t' belong to the same equivalence class. This means that all the context-redundant tuples are part of the same class; therefore the context model leads to a classification. Figure 2.2 shows an example of contexts for a database table containing three columns: the provider's name, the quality of service and the delay in delivering the service. The idea is to use the context model for classifying the providers according to their quality and delay.

Figure 2.2: Providers example with contexts

The contexts have been defined on the three attributes Quality, Delay and Provider. The quality values low and sufficient are equivalent and thus are in the same equivalence class. The same applies to the quality values average and high, as well as to the delay intervals [1, 5] and [6, 10]. In contrast, all the providers are included in a single equivalence class corresponding to the referential of the Provider attribute. Once the context-redundant tuples are detected, a merge operation on each group of context-redundant tuples creates the resulting classification. Figure 2.3 shows this classification for the providers example, after the merge operation. All the attribute values of tuples t1 and t3 belong to the same equivalence classes: the providers Dewag and KBA are in the same equivalence class, the qualities low and sufficient are equivalent, as are the delays 3 and 5. Therefore the two tuples are context-redundant and belong to the same class, called

C4 as shown in figure 2.4. Figure 2.4 graphically represents the four different classes in the classification space defined by the attributes Delay and Quality and generated by those contexts.
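The sketch below illustrates the crisp context model just described: each attribute domain is partitioned into equivalence classes, and tuples whose components fall into the same classes are context-redundant and merge into one class of the classification space. The partition labels and the delay threshold are assumptions, since figures 2.2-2.4 are not reproduced here.

```python
# Sketch: classifying providers via equivalence classes (context model).
QUALITY_CONTEXT = {"low": "Q1", "sufficient": "Q1", "average": "Q2", "high": "Q2"}

def delay_context(days):
    # Assumed partition of the Delay domain into the intervals [1, 5] and [6, 10].
    return "D1" if days <= 5 else "D2"

providers = [("Dewag", "low", 3), ("KBA", "sufficient", 5), ("MAM", "average", 7)]

classes = {}
for name, quality, delay in providers:
    key = (QUALITY_CONTEXT[quality], delay_context(delay))
    classes.setdefault(key, []).append(name)

print(classes)   # Dewag and KBA share the same key and therefore the same class
```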

Figure 2.3: Classification of the providers after merging

Figure 2.4: Graphical representation of the classification space

C. Fuzzy Classification
As explained in the previous subsections, fuzzy set theory allows partial membership of sets. On the other hand, a class can be seen as a fuzzy set. The main idea of fuzzy classification [10] [26] is to consider the equivalence classes generated by the context model as fuzzy sets. To create fuzzy classes, a verbal term is assigned to each class. Since verbal terms are vague, the classes become vague too. The notion of linguistic variable introduced by Zimmermann [30] is useful to perform this mapping. By definition, a linguistic variable takes a value in a set of verbal terms. As shown in figure 2.5, the linguistic variable Delay can take one of the term values acceptable or unacceptable. Each linguistic variable is related to an attribute of a database table. In the example of figure 2.5, the linguistic variable Delay is related to the attribute with the same name; often, for the sake of simplicity, the name of the linguistic variable is the same as that of its related attribute. The terms acceptable and unacceptable then each represent an equivalence class defined on the domain of the related attribute.

Figure 2.5: Using linguistic variables and terms in the context model

Since terms can be seen as fuzzy sets defined by a membership function, the membership to an equivalence class is no longer binary as in the context model but can be partial. The tuples can be part of several classes with different membership degrees. Based on the providers example of the previous subsection, figure 2.6 shows the four classes using terms and membership functions. Each class can have its own semantics. For instance, the class of providers C4 could be labeled "improve quality" as the quality is bad. The class C2 could represent the providers whose relationship has to be reconsidered because of their bad quality and unacceptable delay. The two linguistic variables are Quality and Delay and their respective sets of terms are {good, bad} and {acceptable, unacceptable}. The terms good and bad are described by their membership functions, respectively μ_good and μ_bad. Those two functions partition the domain D(Quality) into two fuzzy sets. The membership of the provider MAM (quality average and delay 7) to the fuzzy set good is given by the following formula:

M(MAM | good) = μ_good(average) = 0.67

In the same way, its membership to the fuzzy set bad is given by:

M(MAM | bad) = μ_bad(average) = 0.33

Figure 2.6: Context-based classification using fuzzy sets

The membership value of an element to a specific class is defined by the aggregation of all the terms of the linguistic variables defining the class. In the providers example, the four classes are defined by the two linguistic variables Quality and Delay. Since the class C4 is defined by an acceptable delay and a bad quality, the membership degree to the fuzzy set acceptable and the one to the fuzzy set bad have to be aggregated. Depending on the operator performing the aggregation, more importance is given to one or several particular linguistic variables. The operator chosen in fCQL, called the gamma operator, allows an effect of compensation between the different linguistic variables. The following formula shows how to calculate the aggregated membership of the provider MAM to the class C4:

M(MAM(average, 7) | C4) = f_aggregation(μ_bad(average), μ_acceptable(7))

D. Proposed System
In any CRM it is required to predict the value of a customer. We apply this strategy to credit rating in the banking sector: using fuzzy logic we classify the customer into eight (8) different classes. Each class has its own unique interest rate (IR), and through this system we consider a customer in more than one class at the same time for calculating the true interest rate (IR), which is an aggregate over all the classes. The predicted interest rate (IR) varies according to the values of the credit amount, saving and time period. We assign interest rates to the different classes in the following manner:

Class   Interest Rate (IR)
C1      8%
C2      10%
C3      12%
C4      14%
C5      5%
C6      7%
C7      8%
C8      9%

Table: Interest rates assigned to the different classes

Methods of Fuzzy Credit Rating [18,19]
Whether a loan applicant is creditworthy or not is not a question that can be answered easily with yes or no, but rather with more or less. It is in the nature of credit rating that information and data are inconstant, incomplete, imprecise and fuzzy. The fuzzy set creditworthiness is defined as a composition of the other fuzzy sets. The model includes remarkable findings: by empirically testing the discussed compensatory operators and the weights, the prediction results for the creditworthiness of test loan applicants with fuzzy classification were significantly better than with sharp classification. The approach of fuzzy classification allows working with continuous and discrete variables and the definition of colloquial terms of credit rating, like definitely creditworthy, rather creditworthy or insufficiently creditworthy.

Considering continuous variables, even more precise statements can be made: for example 30% moderately creditworthy, 60% creditworthy, 80% creditworthy, and so on. The compensatory γ (gamma) operator is defined as

μ_γ(x) = ( ∏_{i=1}^{m} μ_i(x) )^{1−γ} · ( 1 − ∏_{i=1}^{m} (1 − μ_i(x)) )^{γ},   where x ∈ X and 0 ≤ γ ≤ 1.

Here we assume that gamma is 0.5 (γ = 0.5), and the fuzzy sets being aggregated are personal creditworthiness and material creditworthiness. The aggregation operator used is the γ operator, the "compensatory and", which was suggested and empirically tested by [Zimmermann and Zysno]. A small numeric sketch of this operator is given below.
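The sketch below evaluates the gamma operator above for m = 2 and γ = 0.5; the two input membership degrees for personal and material creditworthiness are assumed values chosen only for illustration.

```python
# Sketch: Zimmermann-Zysno compensatory gamma operator for m = 2, gamma = 0.5.
def gamma_and(memberships, gamma=0.5):
    prod, prod_comp = 1.0, 1.0
    for mu in memberships:
        prod *= mu
        prod_comp *= (1.0 - mu)
    return (prod ** (1.0 - gamma)) * ((1.0 - prod_comp) ** gamma)

personal, material = 0.8, 0.6          # assumed membership degrees
print(round(gamma_and([personal, material]), 3))   # ~0.665
```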

E. Algorithm
Step 1: Initialise the IR for each class and the aggregation operator gamma (γ = 0.5).
Step 2: Input the income and expenditure of the customer.
Step 3: Calculate the saving: Saving = Income − Expenditure.
Step 4: Input the credit amount and time period.
Step 5: Check the conditions:
If saving <= 10 & amount <= 10 & time <= 24 then goto C4
Elseif saving <= 10 & amount > 10 & time <= 24 then goto C3
Elseif saving >= 10 & amount <= 10 & time <= 24 then goto C2
Elseif saving >= 10 & amount > 10 & time <= 24 then goto C1
Elseif saving <= 10 & amount <= 10 & time > 24 then goto C8
Elseif saving <= 10 & amount > 10 & time > 24 then goto C7
Elseif saving > 10 & amount <= 10 & time > 24 then goto C6
Elseif saving >= 10 & amount > 10 & time > 24 then goto C5
Step 6: Calculate the absolute membership degree:
μ_A(x) = ( ∏_{i=1}^{m} μ_i(x) )^{1−γ} · ( 1 − ∏_{i=1}^{m} (1 − μ_i(x)) )^{γ}
Step 7: Aggregate all the absolute membership degrees, where n = 8:
Aggregate = Σ_{i=1}^{n} μ_Ai(x)
Step 8: Calculate the normalized membership degree:
N.M.D. = (absolute membership value) / (aggregate of absolute values)
Step 9: Calculate the interest rate:
Aggregate IR = Σ (normalized membership degree) × (predefined IR of the class)
Step 10: Return the aggregate interest rate.

A minimal sketch of this procedure is given below.
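This sketch follows Steps 1-10 above under stated assumptions: the paper does not give explicit membership functions for saving, credit amount and time period, so simple ramp functions around the thresholds used in Step 5 are assumed, and the combination defining class C5 (high saving, high amount, long term) is an assumption, since the published Step 5 repeats the condition of C2 for C5.

```python
# Sketch of the fuzzy credit-rating procedure (membership functions assumed).
GAMMA = 0.5                                          # Step 1: gamma = 0.5
IR = {"C1": 8, "C2": 10, "C3": 12, "C4": 14,
      "C5": 5, "C6": 7, "C7": 8, "C8": 9}            # Step 1: predefined class IRs (%)

def high(value, threshold=10.0, spread=5.0):
    # Assumed ramp membership in "high" around the Step 5 threshold.
    return min(1.0, max(0.0, (value - threshold + spread) / (2 * spread)))

def low(value, threshold=10.0, spread=5.0):
    return 1.0 - high(value, threshold, spread)

def long_term(months, threshold=24.0, spread=6.0):
    return min(1.0, max(0.0, (months - threshold + spread) / (2 * spread)))

def short_term(months):
    return 1.0 - long_term(months)

CLASSES = {                                           # one term per variable: saving, amount, time
    "C1": (high, high, short_term), "C2": (high, low, short_term),
    "C3": (low, high, short_term),  "C4": (low, low, short_term),
    "C5": (high, high, long_term),  # assumed combination (see note above)
    "C6": (high, low, long_term),   "C7": (low, high, long_term),
    "C8": (low, low, long_term),
}

def gamma_membership(mus, gamma=GAMMA):
    # Step 6: compensatory gamma operator.
    prod, prod_comp = 1.0, 1.0
    for mu in mus:
        prod *= mu
        prod_comp *= (1.0 - mu)
    return (prod ** (1.0 - gamma)) * ((1.0 - prod_comp) ** gamma)

def aggregate_interest_rate(income, expenditure, amount, months):
    saving = income - expenditure                     # Steps 2-4
    degrees = {cls: gamma_membership((f_sav(saving), f_amt(amount), f_time(months)))
               for cls, (f_sav, f_amt, f_time) in CLASSES.items()}
    total = sum(degrees.values()) or 1.0              # Step 7
    # Steps 8-10: normalized degrees weight the predefined class interest rates.
    return sum((degrees[c] / total) * IR[c] for c in degrees)

print(round(aggregate_interest_rate(income=25, expenditure=12, amount=8, months=18), 2))
```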
IV. CONCLUSION
Fuzzy logic is a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than precise. Just as in fuzzy set theory the set membership values can range (inclusively) between 0 and 1, in fuzzy logic the degree of truth of a statement can range between 0 and 1

and is not constrained to the two truth values {true, false} as in classic predicate logic. When linguistic variables are used, these degrees may be managed by specific functions. Fuzzy classification, by giving more precise information about the classified elements, allows one to: 1. avoid the gaps between the classes; 2. remove inequities; 3. treat the customers according to their real value; 4. avoid the ejection of potentially good customers and better retain the top customers; 5. better determine a subset of customers for a special action; 6. run more efficient marketing campaigns; 7. monitor the customers' evolution.
V. ACKNOWLEDGEMENT
This work was partially supported by the Head of the Computer Science Department, Mr. Zubair Khan, who has continuously supported us by giving his valuable comments and attention to improve the content and by creating interest in this research. We feel highly indebted to him for helping us to improve the effectiveness of the paper.

VI. REFERENCES
[1] D. Zumstein, Customer Performance Measurement, University of Fribourg, Switzerland.
[2] A. Meier and N. Werro, 1700 Fribourg, Switzerland.
[3] M. Alberchet and M. Sarakinos, 3050 Bern, Switzerland.
[4] R. S. Pressman, Software Engineering.
[5] Mastering in Visual Basic, Rosch Publishers.
[6] R. Weber, "Customer Segmentation for Banks and Insurance Groups with Fuzzy Clustering Techniques," in [Baldwin 1996].
[7] N. Werro, H. Stormer, and A. Meier, "A Hierarchical Fuzzy Classification of Online Customers."
[8] H.-J. Zimmermann, Fuzzy Set Theory and its Applications.

[9] H.-J. Zimmermann, Practical Applications of Fuzzy Technologies, Kluwer Academic Publishers, London, 1992.
[10] D. Zumstein, N. Werro, and A. Meier, "Fuzzy Portfolio Analysis for Strategic Customer Relationship Management," Internal Working Paper, University of Fribourg, 2007.

New Technology Acceptance Model to Predict Adoption of Wireless Technology in Healthcare


Gaurav Singh
Deptt. Of Computer Science SRMSCET, Bareilly, U.P., India
Er.gauravsingh061188@gmail.com

Manisha Yadav
Deptt. Of Computer Science SRMSCET , Bareilly, U.P. ,India

Shivani Rastogi
Deptt. Of Computer Applications KCMT , Bareilly, U.P. ,India

yadavmanisha04@gmail.com

shivani.rastogi15@gmail.com


Abstract— Adoption of new technologies has been researched in the Information Systems (IS) literature for the past two decades, starting with the adoption of desktop computer technology through to the adoption of electronic commerce technology. Issues that have been researched comprise how users handle the various options available in a software environment, their perceived opinions, barriers and challenges to adopting a new technology, and IS development procedures that directly impact adoption, including interface designs and elements of human issues. However, the literature indicates that the models proposed in the IS literature, such as the Technology Acceptance Model (TAM), are not suitable in specific settings for predicting the adoption of technology. Studies in the past few years have strongly concluded that TAM is not suitable in the healthcare setting because it does not consider a myriad of factors influencing technology adoption in healthcare. This paper discusses the problems in healthcare due to poor information systems development and the factors that need to be considered while developing healthcare applications, as these are complex and different from traditional MIS applications, and derives a model that can be tested for the adoption of new technology in healthcare settings. The contribution of this paper is in terms of building theory that is not available in the combined areas of Information Systems and healthcare.

Keywords— Healthcare, Information Systems, Adoption Factors

INTRODUCTION

The Institute of Medicine (IOM) in the United States has recognized that frontier technologies such as wireless technology can improve access to information in order to achieve quality healthcare. A report released by the IOM in 2003 outlined a set of recommendations to improve patient safety and reduce errors using reporting systems based on Information Systems (IS). While it is widely accepted that IS assists health-related outcomes, how this can be efficiently achieved is an under-researched area; therefore, conflicting outcomes are reported in healthcare studies as to the successful role of IS. In essence, research is needed to investigate the role, and perhaps the use, of frontier technologies in improving information management, communication, cost and access in order to improve quality healthcare. In healthcare, specific issues relating to the failures of information management are being addressed using frontier technologies such as RF tags and wireless handheld devices. The main focus in using these technologies is to collect patient-related information in an automated manner, at the point of entry, so as to reduce any manual procedures needed to capture data. While no other discipline relies more heavily on human interactions than healthcare, it is in healthcare that technology in the form of wireless devices has the means to increase, not decrease, the benefits derived from the important function of human interaction. Essential to this is the acceptance of wireless handheld technology, as this technology enables data to be collected at the point of entry, with minimal manual intervention and a higher degree of accuracy and precision. When it comes to the management of IS, the development and implementation of a hospital IS is different from traditional IS due to the life-critical environment in hospitals. Patient lives depend upon the information collected and managed in hospitals, and hence smart use of information is crucial for many aspects of healthcare. Therefore, any investigation conducted should be multidimensional and should cover many aspects beyond the technical feasibility and functionality dictated by traditional systems. Successful implementation of a health IS includes addressing clinical processes that are efficient, effective, manageable and well integrated with other systems. While traditional IS address issues of integration with other systems, this is even more important in hospital systems because of the profound impact these systems have on the short- and long-term care of patients. Reasons for failure in information systems developed for healthcare

include lack of attention paid to the social and professional cultures of healthcare professionals, underestimation of complex clinical routines, dissonance between various stakeholders of health information, long implementation time cycles, reluctance to support projects financially once they are delivered and failures to learn from past mistakes. Therefore, any new technologies should address these reasons in order to be accepted in the healthcare setting.
Unsuitability of current technology Acceptance models to healthcare: The acceptance of new technologies has long been an area of inquiry in the MIS literature. The acceptance of personal computer applications, telemedicine, e-mail, workstations, and the WWW are some examples of technologies that have been investigated in the MIS literature. User technology acceptance is a critical success factor for IT adoption and many studies have predicted this using Technology Acceptance Model (TAM), to some extent, accurately by means of a host of factors categorized into characteristics of the individuals, characteristics of the technology and the characteristics of the organizational context. Technology Acceptance Model, specifically measures the determinants of computer usage in terms of perceived usefulness and perceived ease of use. While perceived usefulness has emerged as a consistently important attitude formation, studies have found that perceived ease of use has been inconsistent and of less significant. The literature suggests that a plausible explanation for this could be the continued prolonged users exposure to technology leading to their familiarity, and hence the ease in using the system. Therefore users could have interpreted the perceived ease of use as insignificant while determining their intention to use a technology. The strengths of TAM lies in the fact that it has been tested in IS with various sample sizes and characteristics. Results of these tests suggest that it is capable of providing adequate explanation as well predicting user acceptance of IT. Strong support can be found for the Technology Acceptance Model (TAM) to be robust in predicting user acceptance However, some studies criticize TAM for its examination of the model validity with students who have limited computing exposure, administrative and clerical staff, who do not use all IT functions found in software applications. Studies also indicate that the applicability of TAM to specific disciplines such as medicine is not yet fully established. Further, the validity and reliability of the TAM in certain professional context such as medicine and law is questioned. Only limited information is found in the healthcare related literature as to the suitability of TAM. Similarly, in the literature related to the legal field, especially where IT is referred, limited information can be found on TAM. Therefore, it appears that the model is not fully tested with various other professionals in their own professional contexts.

Therefore, it can be argued that, when it comes to emerging technology such as wireless handheld devices, TAM may not be sufficient to predict the acceptance of technology because the context becomes quite different. It should be noted that the current context in healthcare related Information Systems is not only the physical environment but also the ICT environment as wireless technology is markedly different from Desktop technology. A major notable change is the way in which information is accessed using wireless technology as the information is pushed to the users as opposed to users pulling the information from desktop computers. In the Desktop technology, users have the freedom to choose what they want to access and the usage behavior is dependent upon
their choice. On the other hand, using wireless devices, it is possible for information, whether needed or not, to reach these devices, and such devices assume significant importance because of the setting in which they are used. For example, in an operation theatre patient lives assume importance and information needs must reflect this. If wireless handheld devices do not support data management closely linked with clinical procedures, due to device restrictions such as screen size and memory, then despite their attractions users will discard these devices. Therefore, applications developed for these devices must address the complex clinical procedures that can be supported by them. Another major consideration in the domain of wireless technology is connectivity. While this is assumed to be always available in a wired network environment, it cannot be guaranteed in a wireless network due to mobility. As users carry the device and roam, the signal strength may change from strong to weak and this may interrupt user operations. Therefore, to accomplish smart information management, certain technical aspects must also be addressed. Current users of wireless technology are concerned with the security and privacy aspects associated with using this technology. This is because they need to reveal their identity in order to receive information. While privacy is concerned with the information that they provide to others, security threats fall under the categories of physical threats and data threats. Due to their infancy and hardware restrictions, handheld devices are not able to implement these features to the level expected of desktop computers. In a healthcare setting, any leak in privacy would have a potential adverse impact on the stakeholders. Further, due to other devices that may be using radio frequency or infra-red frequency in providing healthcare to patients, there may be practical implementation restrictions in the usage of wireless devices for ICT.

Our own experience in providing wireless technology solutions to a private healthcare provider in Western Australia yielded mixed responses. The wireless technology developed and implemented for the Emergency Department was successful in terms of software development and deployment, and the project was well accepted by users in that healthcare setting. However, the wireless solution provided to address problems encountered in the Operation Theatre Management System was not well received by users, despite its superiority in design, functionality and connectivity. Users were reluctant to use the application because of hardware and database connectivity restrictions, despite scoring a high level of acceptance for usefulness and ease of use.

Now, let us assume that TAM is correct in claiming that the intention to use a particular system is a very important factor in determining whether users will actually use it. Let us also assume that the wireless systems developed for the private healthcare provider in Western Australia exhibited clear intentions to use the system. However, despite a positive effect on perceived usefulness and perceived ease of use, the wireless system was not accepted by users. It should be noted that the new system mimicked the current traditional system, and yet did not yield any interest in terms of user behaviour. In searching for reasons for this hard-to-explain phenomenon, some researchers have argued, after studying TAM, that perceived usefulness should also include near-term and long-term usefulness in order to study behavioural intentions, and other studies that have examined the utilization of Internet technology have supported this view. This suggests that TAM may not be sufficient to predict the acceptance of wireless technology in a specific healthcare setting. A brief review of prior studies in healthcare indicates that a number of issues associated with the lack of acceptance of wireless handheld devices are highlighted but not researched to the full extent they warrant. For example, drawbacks of these devices in healthcare include the perceived fear of new learning among doctors, the time investment needed for such learning, the cost involved in setting up wireless networks, and the cost implications of integrating existing systems with the new wireless system. A vast majority of these studies concur that wireless handheld devices would be able to provide solutions to the information management problems encountered in healthcare. While these studies unanimously agree that information management would be smarter using wireless technology and handheld devices, they seldom provide details of the factors that enable the acceptance of wireless technology in a specific healthcare setting. MIS journals appear to be lagging behind in this area.

Therefore, it is safe to assume that current models that predict the acceptance of technology based on behavioural intentions are insufficient. This necessitates a radically new model in order to predict the acceptance of wireless handheld technology in specific professional settings.

Ingredients for a New Model to predict acceptance of new Technology: Some of the previous models measured actual use through the intention to use; inputs to these models include perceived usefulness, perceived ease of use, attitude, subjective norm, perceived behavioural control, near-term use, short-term use, experience, facilitating conditions and so on. In recent years, factors impacting technology acceptance have included job relevance, output quality and result demonstrability. In the field of electronic commerce and mobile commerce, factors such as security and trust are considered factors of adoption, and in end-user computing, factors such as user friendliness and maintainability appear to influence applications. Therefore, any new model to determine the acceptance of wireless technology would include some of the above factors. In addition, when it comes to wireless technology, acceptance factors should hinge on two dominant concepts, the hardware (or device) and the applications that run on the hardware, as the battle continues to accommodate more applications on a device that is diminishing in size but improving in power. Further, mobile telephones and PDAs appear to be accepted based on their attractiveness, hardware design, the type of keypad they provide, screen colour and resolution, the ability to be carried around, and so on. In effect, the hardware component appears to be an equally dominant factor in the adoption of wireless technology. Once the hardware and software applications are accepted, the third dominant factor in the acceptance of wireless technology appears to be telecommunication. This factor involves the various services provided by telecommunication companies, the cost of such services, the type of connectivity, roaming facilities, the ability to access the Internet, provision for Short Messaging Services (SMS), the ability to play games on the mobile device, and so on. These factors are common to both mobile telephones and emerging PDAs. Some common features that users would like to see appear to be alarm services, a calendar, a scheduler and the ability to access digital messages, both text and voice. Therefore, studies that investigate the adoption of wireless technology should aim to categorize factors based on hardware, applications and telecommunication, as these appear to be the building blocks of any adoption of this technology. Specific factors for applications could involve portability across various hardware, reliability of code, performance, ease of use, module cohesion across different common applications, clarity of code and so on. In terms of hardware, the size of the device, memory size, keypad, screen resolution, voice tones, portability, attractiveness, brand names such as Nokia, and capabilities such as alarms would be some of the factors of adoption or acceptance. In terms of service provision, plan types, costs, access, free time zones, SMS provision, the cost of local calls, the cost of Internet access and provision to share information stored between devices appear to be dominant factors. Factors such as security form a common theme, as all three dominant categories need to ensure it. The factors mentioned above are crucial in determining the development aspects of Wireless Information Systems (WIS) for healthcare, as they dictate the development methodology, choice of software language, user interface design and so on. Further, the factors of adoption, in conjunction with the methodology, would determine integration aspects such as coupling the new system with existing systems, which would then determine the implementation plans. In essence, an initial model that can determine the acceptance of wireless technology in healthcare can be portrayed as follows:

Diagram 1: Proposed Model for Technology Adoption in Healthcare Settings

In the above model, the three boxes with dark borders show the relationships between the various factors that influence the acceptance of technology. The box on the left indicates the factors influencing wireless technology in any given setting. The three categories of factors, hardware, software and telecommunication, affect the way in which wireless technology is implemented. The factors portrayed in the box are generic, and their role in a specific healthcare setting varies depending upon the level of implementation. Once the technology is implemented, it is expected to be used. In healthcare settings, it appears that usage, relevance and need are the three most important influencing factors for the continual usage of new technology. When the correct balance is established, users exhibit positive perceptions about using a new technology such as wireless handheld devices for data management purposes. This, in turn, brings about a positive attitude towards using the system, for both short-term and long-term usage. The positive usage would then determine the intentions to use, resulting in usage behaviour. The usage behaviour then determines the factors that influence the adoption of new technology in a given setting; this is shown by the arrow that flows from right to left. Based on the propositions made in the earlier paragraphs, it is suggested that any testing done to predict the acceptance of new technology in healthcare should test the following hypotheses:

1. Hardware factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management.
2. Software factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management.
3. Telecommunication factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management.
4. Factors influencing wireless technology in a healthcare setting have a direct positive effect on usage, relevance and need.
5. User perception of new technology is directly affected by usage, relevance and need.
6. User perception of new technology has a direct effect on user attitude in using such technology.
7. User attitude has a direct effect on intentions to use a new technology.
8. Usage behaviour is determined by intentions to use a new technology.

Instruments: The instruments would typically constitute two broad categories of questions. The first category of questions would relate to the adoption and usage of wireless applications in healthcare for data collection purposes. The second category would consist of demographic variables, as these variables determine the granularity of the setting. Open-ended questions can be included in the instrument to obtain unbiased and non-leading information. Prior to administering the questions, a complete peer review and a pilot study are insisted upon in order to ascertain the validity of the instrument. A two-stage approach can be used in administering the instrument, where the first stage gathers information about the key factors influencing users' decisions to use wireless applications and the second stage assesses the importance of those key factors. This approach would complement the open-ended questions so as to determine the importance of the individual factors determining the adoption and usage of wireless devices and applications.

Data Collection: In order to perform validity and reliability tests, a minimum of 250 samples is required. Any study to test the model should consider the randomness of the samples to avoid any collective bias. Similarly, about 50 participants may be required to undergo the interview process, with each interview lasting about 60 minutes.

Any instruments developed for testing the model should be able to elicit responses of 'how' and 'why'. This is essential in order to discern differences between adoption and usage decisions for wireless handheld applications. In addition, comparing responses to questions about adoption with responses to questions about use would provide evidence that respondents were reporting their adoption drivers and not simply their current behaviour. The interview questions should be semi-structured or partially structured to guide the research. There are variations in qualitative interviewing techniques, such as informal, standardized and guided. Structured and partially structured interviews can be subjected to validity checks similar to those done in quantitative studies. Participants could be asked about their usage of wireless devices, including mobile telephones and other hospital systems, during the initial stages of the interview. They could then be interviewed further to identify factors that would lead to the continual usage of these devices and any emerging challenges that they foresee, such as training. The interviews can be recorded on a digital recording system with provision for automatic transfer to a PC, to avoid transcription errors and to minimize transcription time and cost. The interview questions should be developed in such a way that both determinants and challenge factors can be identified; this enhances the research results and keeps them free of errors and bias.

Data Analysis: Data should be coded by two individuals into a computer file prior to analysis, and a file comparator technique should be used to resolve any data entry errors. A coding scheme should also be developed based on the instrument. The coders should be given sufficient instructions on the codes, anticipated responses and any other detail needed to conduct the data entry. Coders should also be given a start list that includes definitions from prior research for the categories of the constructs. Some of the categories would include utilitarian outcomes, such as applications for personal use, and barriers, such as cost and knowledge. Data should be analyzed using statistical software, applying both quantitative and qualitative analyses. Initially, a descriptive analysis should be conducted, including a frequency breakdown. This should be followed by a detailed cross-sectional analysis of the determinants of behaviour. A factor analysis should also be conducted to identify factors of adoption. Once this is completed, tests of significance can be performed between the various factors.
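As a hedged illustration of the analysis sequence just described (the paper does not prescribe a particular tool; the item names and synthetic data below are assumptions), descriptive statistics followed by an exploratory factor analysis might look like this:

    # Illustrative sketch: frequency breakdown followed by exploratory factor
    # analysis on synthetic survey data (all columns and values are made up).
    import numpy as np
    import pandas as pd
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    items = ["hw_size", "hw_memory", "app_ease", "app_reliability", "telco_cost", "telco_coverage"]
    df = pd.DataFrame({
        "role": rng.choice(["doctor", "nurse", "admin"], size=60),
        # six hypothetical Likert-scale items covering hardware, software and telecom factors
        **{item: rng.integers(1, 8, size=60) for item in items},
    })

    # Descriptive analysis: frequency breakdown of a demographic variable
    print(df["role"].value_counts())

    # Exploratory factor analysis on the item scores to identify candidate adoption factors
    fa = FactorAnalysis(n_components=3, random_state=0)
    fa.fit(df[items])
    print("Factor loadings:\n", fa.components_)

Tests of significance between the extracted factors would then follow, as outlined in the Data Analysis paragraph above.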

CONCLUSIONS
We saw in this case study that there is a necessity for a new model to accurately predict the adoption of new technologies in specific healthcare settings, because current models available in the Information Systems domain are yet to fulfil this need. Based on our experience and the available literature, we identified some initial factors that can influence and determine the acceptance of technology. We also proposed a theoretical model that can be tested using these initial factors. In order to be complete, we suggested a methodology for testing the model.
REFERENCES
Davis, F. D., Bagozzi, R. P., & Warshaw, P. R. (1989). User acceptance of computer technology: A comparison of two theoretical models. Management Science, 35(8), 982-1003.
Davis, G. B. (1985). A typology of management information systems users and its implication for user information satisfaction research. Paper presented at the 21st Computer Personnel Research Conference, Minneapolis.
Dyer, O. (2003). Patients will be reminded of appointments by text messages. British Medical Journal, 326(402), 281.
Freeman, E. H. (2003). Privacy notices under the Gramm-Leach-Bliley Act. Legally Speaking (May/June), 5-9.
Goh, E. (2001). Wireless services: China (Operational Management Report No. DPRO-94111). Gartner.
Hu, P. J., Chau, P. Y. K., & Liu Sheng, O. R. (2002). Adoption of telemedicine technology by health care organizations: An exploratory study. Journal of Organizational Computing and Electronic Commerce, 12(3), 197-222.
Hu, P. J., Chau, P. Y. K., Sheng, O. R. L., & Tam, K. Y. (1999). Examining the technology acceptance model using physician acceptance of telemedicine technology. Journal of Management Information Systems, 16(2), 91-112.
Kwon, T. J., & Zmud, R. W. (Eds.). (1987). Unifying the fragmented models of information systems implementation. New York: John Wiley.
Ortiz, E., & Clancy, C. M. (2003). Use of information technology to improve the quality of health care in the United States. Health Services Research, 38(2), 11-22.
Remenyi, D., Williams, B., Money, A., & Swartz, E. (1998). Doing Research in Business and Management. London: SAGE Publications.
Rogers, E. M. (1995). Diffusion of Innovations (4th ed.). New York: Free Press.
Rozwell, C., Harris, K., & Caldwell, F. (2002). Survey of innovative management technology (Research Note No. M-15-1388). Gartner Research.
The nature and determinants of IT acceptance, routinization, and infusion. (1994). 67-86.
Sausser, G. D. (2003). Thin is in: Web-based systems enhance security, clinical quality. Healthcare Financial Management, 57(7), 86-88.
Simpson, R. L. (2003). The patient's point of view -- IT matters. Nursing Administration Quarterly, 27(3), 254-256.
Smith, D., & Andrews, W. (2001). Exploring Instant Messaging. Gartner Research and Advisory Services.
Sparks, K., Faragher, B., & Cooper, C. L. (2001). Well-being and occupational health in the 21st century workplace. Journal of Occupational and Organizational Psychology, 74(4), 481-510.
Tyndale, P. (2002). Taxonomy of knowledge management software tools: Origins and applications. Retrieved 2002 from www.sciencedirect.com
Wiebusch, B. (2002). First response gets reengineered: Will a new sensor and the power of wireless communication make us better prepared to deal with biological attacks? Design News, 57(11), 63-68.
Wisnicki, H. J. (2002). Wireless networking transforms healthcare: Physicians' practices better able to handle workflow, increase productivity (The human connection). Ophthalmology Times, 27(21), 38-41.
Yampel, T., & Eskenazi, S. (2001). New GUI tools reduce time to migrate healthcare applications to wireless. Healthcare Review, 14(3), 15-16.


Entrepreneurship through ICT for disadvantaged communities


1. Ms. Geetu Sodhi, Assistant Professor,
JIMS, Vasant Kunj, New Delhi.
Email: geetu.sodhi@yahoo.com

2. Mr. Vijay Gupta, Assistant Professor,


JIMS, Vasant Kunj, New Delhi
Email: vijay84g@yahoo.com

ABSTRACT
Applying entrepreneurship to accelerate the pace of development and poverty eradication through innovative use of ICT is a promising approach requiring new ways of working. Entrepreneurs are innovators who build the new economy through a process of destroying the old economy by the creative use of new knowledge. Innovators also apply new knowledge, in the form of technology, in a disruptive manner, according to Christensen's theory. Creative destruction and disruptive technologies introduce major discontinuities in society, which has prompted ICT use to be likened to a digital tsunami. All countries, but more so developing countries, face formidable challenges in managing this discontinuity. Women entrepreneurs make a substantial contribution to national economies through their participation in start-ups and their growth in informal businesses. The sectors that are traditionally accessible to women often face high competition, however, and are characterized by low productivity and low profit margins. Their competitiveness is constrained by limited access to information and resources to support the development and marketing of their products. E-business, characterized by the use of the Internet to conduct business, can address this limiting factor. E-business allows process innovation by simplifying or making more efficient the way business transactions are conducted; it promotes product innovation by creating new products and even new industries; and it transforms conventional business operations by creating markets that did not previously exist. It can therefore help empower women by facilitating women's entrepreneurship. Access to Information and Communication Technology (ICT) applications and services, and systematic knowledge sharing, in disadvantaged communities and rural areas is either non-existent or very difficult. Individual and household access to ICTs remains out of reach of these disadvantaged communities and, in particular, of women. Providing access to ICTs can have a dramatic impact on poverty alleviation for rural women and on achieving socio-economic development goals. Rural women need to treat ICTs as an empowerment tool and a means to a living. The use of mobile communication devices and the Internet is changing the way agricultural activities are managed by farmers today. Rural women's lack of mobility and limited hands-on computer experience might hinder their welfare and empowerment. This paper analyses how the use of telephony (both cellular and landline), the Internet and other ICTs can benefit rural women and disadvantaged communities in the educational, business and economic sectors.

A. INTRODUCTION

Industrial society has moved into an era of advanced technological innovation, affecting the way developed countries run their businesses and institutions and lead their lives. One of the areas in which these technological advances are dramatically influencing people's lives is information technology and telecommunications, hence the claim that we are in the midst of a digital revolution that is driving us towards an information society. As during previous societal revolutions based on technological advances, there remain many countries and people largely unaffected by the changes taking place. This paper investigates the role that information and communication technologies (ICTs) have to play in developing countries, focusing particularly on those rural areas that are currently least affected by the latest advances in the digital revolution. Section one aims to look beyond the current digital divide debate, which focuses on information disparities, to assess the potential role of ICTs in the context of current rural development paradigms. This section addresses the current divergence between the technology drivers and the potential beneficiaries in rural areas in developing countries, together with the opportunities arising from the continued convergence of ICTs, old and new. The section considers some alternative approaches being pioneered to harness ICTs for development goals, including private sector, public sector and NGO-based initiatives. This leads to a discussion of changing approaches to technology transfer, drawing on lessons from agricultural extension experience to illustrate how ICTs could be harnessed for rural development. This theme is further developed in Section 2, which focuses on how ICTs can play a more strategic role in rural development. It assesses the potential for pluralistic approaches to encourage widespread adoption of ICTs. The need for flexible and decentralised models for using ICTs is discussed in the context of content and control. The challenge of achieving rural development goals by supporting knowledge and information systems is analysed through an epistemological perspective, illustrated by case studies from the literature and the authors' research on the operation of these systems at the community level. The concept of building partnerships at the community level based around information exchange is explored, using ICTs to improve systems for the exchange of information sources that already exist locally and also providing established information intermediaries with the facilities to enhance their capacity for information sharing. Responsibility for incorporating technological innovation in ICTs into development strategies has traditionally fallen to those with the mandate for infrastructure within governments and development agencies. This is largely due to the large scale and high costs of building telecommunication, electricity and, to a certain extent, broadcasting networks. As the technology becomes more powerful and more complex, with satellite-based and fibre optic cable networks encircling the globe with increasing density, the position of ICTs within this infrastructure mandate is unlikely to diminish. ICTs, however, now also consist of a wide range of equipment that can be operated individually or within small, local networks that do not require vast infrastructure investments. Long-lasting batteries, solar and wind-up power sources are now being used to enable ICTs to operate in remote areas. This paper focuses principally on the role of ICTs as flexible and powerful tools for social development through small-scale strategic interventions, linking to, and extending beyond, formal and centralised systems operating on a larger scale. It is in this role as tools for social development that much of the experimentation at the community level is currently taking place, to harness the existing capacity of many off-the-shelf ICT products to serve community development needs. This paper explores how ICTs could have a greater role in future rural development strategies through the integration of available technologies and the diverse institutional and knowledge landscapes that exist in developing countries. The paper concludes that there are numerous, well-established barriers to improving information exchange. Knowledge capture, the high cost of information access and infrastructure constraints all affect the equitable distribution of information in rural areas. However, technological advances in ICTs have reduced the cost and increased the quantity and speed of information transfer dramatically. This is set to continue, and the technologies are already being designed to accommodate a wide range of user choices. This flexibility points towards a potential for adaptation to the diverse needs of rural areas in developing countries that responds directly to the current paradigmatic emphasis on democratic decentralisation and pluralistic approaches, with participatory, demand-driven, market-based and diversified developmental change. The contradiction between the potential for ICTs to address the challenges faced by rural development and the current failure to harness them for this purpose is striking. To pursue universal access and one-size-fits-all applications to bridge the digital divide is to ignore the real potential of ICTs to be used locally, in order to enable those individuals and institutions that are the priorities of rural development strategies to access information relevant to their own multidimensional livelihoods. The need for a concerted effort to build knowledge partnerships and to engage the private sector and technology drivers in the pursuit of rural development goals is paramount if ICTs are to have a role in future strategies. This paper aims to highlight entrepreneurship through ICT for disadvantaged communities, with a case study of eSagu. IIIT Hyderabad, A.P., India developed the eSagu model of extension system and implemented it for the cotton crop in the three villages of Oorugonda, Gudeppad and Oglapur, covering 749 farmers and 1041 farms during the 2004-05 crop season. The objective of this case is to discuss how farmers are being empowered through the mobile revolution. The case discusses the concept and business model adopted, the operational viability, the business process flow, the critical success factors and the constraints and challenges for implementation. The data have been collected from both secondary and primary sources; the company's media reports, websites and other documents have proved useful for this case study.
The study has been supplemented with interviews of the stakeholders, i.e., company officials and farmers.

B. The Potential of Information and Communication Technologies (ICTs)
Today, from the time we awaken in the morning to the time before we sleep, we are surrounded by media such as newspapers, radio, television and computers. Sometimes we are not even aware that we are surrounded by these media. All these media come under the overall umbrella of what are known today as ICTs. Knowing and using ICTs is important in today's fast-changing knowledge society, but we are often confused about what these media are. Information and Communication Technologies (ICTs) are often associated with the most sophisticated and expensive computer-based technologies, but ICTs also encompass more conventional technologies such as radio, television and the telephone. While definitions of ICTs vary, it is useful to adopt the definition provided by the United Nations Development Programme (UNDP): ICTs are basically information-handling tools, a varied set of goods, applications and services that are used to produce, store, process, distribute and exchange information. They include the 'old' ICTs of radio, television and telephone, and the 'new' ICTs of computers, satellite and wireless technology and the Internet. These different tools are now able to work together and combine to form our networked world: a massive infrastructure of interconnected telephone services, standardized computing hardware, the Internet, radio and television, which reaches into every corner of the globe. ICT, in convergence with other forms of communication, has the potential to reach rural women, thereby empowering them to participate in economic and social progress and to make informed decisions on issues that affect them. Information about markets and technology is helpful to poor households in rural areas. Mobile phones and the Internet empower the rural poor through easy access to information. ICT also has great use in providing healthcare and education, and in facilitating efficient governance and state intervention, particularly by local bodies.

C. CHALLENGES
Much progress has been made in the past few years in understanding the contribution that ICT can make to fostering economic growth, combating poverty, and addressing the specific needs of the poor. The Millennium Development Goals (MDGs) provide a framework and benchmarks for poverty reduction efforts and for improvements in education, health and the environment. As each goal has a significant information and communications dimension, ICT can help their attainment through its power to create and transfer knowledge, improve the efficiency and transparency of institutions and markets, and facilitate the participation and empowerment of the poor. A variety of experiments and pilot projects have demonstrated ICT's impact in specific development sectors such as health, education, the environment and public sector reform, and its value in achieving specific development goals. Yet progress even at this first level has been uneven. Most ICT-for-development applications are still heavily dependent on the initiative of the "already converted" and are often not mainstreamed beyond their area of initiative or responsibility. These applications, furthermore, often underperform and prove unsustainable if they are not part of a broader national ICT strategy. And most mainstream development practitioners and analysts are still not aware of the full potential of ICT. Even less progress has been made on the other two levels: integrating ICT into development programmes more broadly, and improving our understanding of the development process. It is important to "stay the course" in fostering ICT as a tool of development in the face of sceptics who see ICT as an expensive distraction rather than a powerful tool for empowerment of the poor. Nor should the international community be deterred by the recent slump in the global ICT economy. The need to strengthen ICT strategies is sharpened by the fact that, economically and geopolitically, the world faces difficult times. Globalisation is proceeding at an uneven pace, and many feel left behind and disenfranchised, excluded both from the economic benefits of globalisation and from the political processes that help shape it. The vital role of information and knowledge, and the ways that ICT can help the poor create, access, share and deploy information and knowledge to improve their lives, is intellectually accepted and yet poorly understood by many development professionals. Other challenges include the grave concern over rural people's capability to use the new technology and the information procured through it. Poor rural infrastructure, including deficient telecommunication networks, low penetration of personal computers and poor Internet connectivity, is another major challenge. The cost of a telephone or Internet connection in India, as a share of household income, is considerably higher than in countries such as South Korea and China. Responding to these challenges requires a new approach based on commonalities of interest, clear focus and tangible results. International collaborative efforts such as the DOT Force and the UN ICT Task Force have helped to focus co-operative efforts on tangible objectives for realising the full potential of ICT as a tool for addressing the broader challenges represented by the MDGs. Poverty reduction strategies are a major vehicle for addressing these challenges in each country.

D. MODELS
ICT can provide the following services to rural India: m-banking, weather forecasting, agriculture assistance, pricing of agricultural produce, healthcare and education.

I. M-Banking: Rural people, especially women, can conduct basic banking functions through their phones; m-banking can enable banking for the unbanked. As a sample m-banking application, a rural customer could be allowed to keep money in a prepaid account and then convert this amount physically into cash at an establishment. This is very useful to women when the men folk are not around. Micro-finance on mobile will also help in the repayment of loans, planning of finances, and trading and selling local goods to local distributors and large retailers (e.g. Unilever). Micro-financing through an m-wallet is also useful for the government in dispersing pensions to widows directly, and the m-wallet can be used for disbursement of credits under schemes like NREGA. A minimal sketch of such a prepaid m-wallet flow follows this paragraph.
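The following is only an illustrative sketch (not an actual m-banking product) of the prepaid m-wallet flow described above; class names, phone numbers and amounts are hypothetical, and real systems would add authentication, persistence and regulatory checks.

    # Illustrative sketch of a prepaid m-wallet ledger (simplified, hypothetical).
    class MWallet:
        def __init__(self, phone_number):
            self.phone_number = phone_number
            self.balance = 0  # stored value in rupees

        def cash_in(self, amount):
            """Agent accepts physical cash and credits the prepaid account."""
            self.balance += amount

        def cash_out(self, amount):
            """Customer converts stored value back into physical cash at an establishment."""
            if amount > self.balance:
                raise ValueError("insufficient balance")
            self.balance -= amount

        def transfer(self, other, amount):
            """Person-to-person payment, e.g. a loan repayment or a pension disbursement."""
            self.cash_out(amount)
            other.cash_in(amount)

    # Example: a pension credited by a government wallet and withdrawn as cash
    govt = MWallet("100"); govt.cash_in(5000)
    widow = MWallet("9876500000")
    govt.transfer(widow, 500)
    widow.cash_out(500)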

II. Weather Forecasting: Information and communication technology helps farmers cope with drought. Organizations such as ICRISAT and IMD work with NGOs and volunteers in villages to fight drought. How it works: the organizations and volunteers conduct surveys in villages to estimate the water deposits in lakes, ponds and underground sources and the needs of the villages. This data is then compiled with the forecast rainfall for the respective areas, and the compiled data, in the form of charts, is sent in advance to computer hubs set up in villages for discussion with the villagers.

III. Agriculture Assistance: The application of Information and Communication Technology (ICT) in agriculture is increasingly important. E-Agriculture is an emerging field focusing on the enhancement of agricultural and rural development through improved information and communication processes. More specifically, e-Agriculture involves the conceptualization, design, development, evaluation and application of innovative ways to use ICT in the rural domain, with a primary focus on agriculture. E-Agriculture is a relatively new term, and its scope can be expected to change and evolve as our understanding of the area grows. E-Agriculture is one of the action lines identified in the declaration and plan of action of the World Summit on the Information Society (WSIS). The Tunis Agenda for the Information Society, published on 18 November 2005, emphasizes the leading facilitating roles that UN agencies need to play in the implementation of the Geneva Plan of Action. The Food and Agriculture Organization of the United Nations (FAO) has been assigned responsibility for organizing activities related to the action line under C.7, ICT Applications on E-Agriculture. The main phases of the agriculture industry are: crop cultivation, water management, fertilizer application, fertigation, pest management, harvesting, post-harvest handling, transport of food and food products, packaging, food preservation, food processing and value addition, food quality management, food safety, food storage and food marketing. All stakeholders of the agriculture industry need information and knowledge about these phases to manage them efficiently. Any system used for obtaining information and knowledge for decision-making in any industry should deliver accurate, complete and concise information in time or on time. The information provided by the system must be in user-friendly form, easy to access, cost-effective and well protected from unauthorized access.

Information and Communication Technology (ICT) can play a significant role in maintaining the above-mentioned properties of information, as it consists of three main technologies: computer technology, communication technology and information management technology. These technologies are applied for processing, exchanging and managing data, information and knowledge. The tools provided by ICT have the ability to: record text, drawings, photographs, audio, video, process descriptions and other information in digital formats; produce exact duplicates of such information at significantly lower cost; transfer information and knowledge rapidly over large distances through communications networks; apply standardized algorithms to large quantities of information relatively rapidly; and achieve greater interactivity in communicating, evaluating, producing and sharing useful information and knowledge.

IV. Healthcare, the Tele-clinics way: Healthcare is one of the important basic needs, and ill health affects living standards both directly and indirectly. Healthcare services in rural areas, where more than 70% of Indians live, are woefully inadequate. The Bundelkhand region in Central India, which includes districts of Madhya Pradesh and Uttar Pradesh, is among the most backward regions in India, with a lack of proper healthcare infrastructure. A majority of sickness in Bundelkhand villages is treated by untrained personnel, and this is a general phenomenon in many villages of rural Madhya Pradesh (M.P.) and Uttar Pradesh (U.P.). There is a stark contrast between the abundance of highly technology-based hospitals and dispensaries in urban areas and rural villages that do not even have basic minimum public health facilities. Even within cities the poor do not have access to high-tech healthcare facilities, mainly because of financial limitations. Many of the public healthcare services, such as Primary Health Centres (PHCs) and sub-centres in rural areas, are not equipped and staffed to provide quality healthcare to the rural poor. This points to the yawning divide between rural and urban healthcare services, and between the rural poor and the well off. The new developments in healthcare have not percolated to the rural areas, and this is a matter of great concern. While the public healthcare system in India has some of the best professionals and one of the best structures (decentralized up to the sub-centre level), there is a need to explore ways and means to bring equity in access to health professionals and institutions. Information and communication technology has a very important role to play in facilitating quality healthcare for the rural poor in a cost-effective manner. In an age of high-tech medical care, those excluded from mainstream healthcare services could be provided with the benefit of medical professionals through the use of an appropriate ICT kiosk. This needs a joint commitment from both the private and public sectors.

Telemedicine is used as a means to provide health access to people worldwide through the use of various kiosks. However, it has not become popular among the rural poor because of inadequate know-how in the use of such kiosks. Countries in Asia have less than 10% Internet users and less than 20% telephone users in their rural localities, while in India the use of the Internet in rural areas is less than 1%. In a situation where large-scale technology illiteracy exists, it is important to promote appropriate technology kiosks that are easy for the poor to use. The use of telephones could be a starting point for rural areas, although even operating a telephone is complicated for many living in these areas. The Tele-clinic Project of Christian Hospital is an example where a telephone is used to give access to quality medical care: health workers are trained to make the communication more effective, enabling the doctor to better diagnose and advise treatment.

Tele-clinics, a combination of ICT and social protection: The Tele-clinic initiated by Christian Hospital in Bundelkhand is an innovative mixture of technology and health protection. It is an attempt to introduce ICT into healthcare to improve access to specialty care for those living in remote rural areas. The communication between a doctor and a patient is enabled through the use of a telephone. The Tele-clinic is a telephone-enabled closed network of rural people, trained health workers and the medical professionals of Christian Hospital. This network enables communication between a doctor and a patient in a remote rural village with the help of a telephone; a trained health worker facilitates the communication through a WLL phone provided by BSNL (the government-owned telecommunication agency). A trained health worker is recruited at each call centre. These call centres provide services such as primary healthcare, ambulance service, telephone consultation and emergency drugs, and one call centre covers three to five surrounding villages.

V. Education: ICT for education should be concerned with the upliftment of the rural community. In this connection the vision is integrated development for education and economic empowerment of rural students. The integration should take account of rural life conditions as well as provide information about educational developments in urban areas. An ICT-for-education programme should not only provide computer education to rural students but also provide information on higher education and employment opportunities in various fields. In the school education system of Tamil Nadu, for example, there is a separate syllabus for moral or life education that includes vocational training such as farming, embroidery, tailoring and weaving, but most schools do not run these classes effectively. An ICT-for-education programme can deliver the same training and awareness effectively through computer-based education technologies.

Computer-based education will also disseminate information on new technological developments from the local to the global level. It is a good way for rural students to understand the social and technological development of the world and to connect it with their own rural life conditions. This kind of ICT-related educational programme will provide employment opportunities to computer-literate and other educated youth in rural as well as urban areas, and it will help rural school students gain computer-related training and wide knowledge about recent developments in the world. Infrastructure is one of the major challenges for ICT programmes in rural schools, especially Internet connectivity; however, in the initial period, training and information can be provided effectively through computers even without Internet connectivity. Nowadays the use of CDs (compact discs) is neither expensive nor technically demanding: developmental programmes can be written onto CDs and installed on computers, and a syllabus can then be framed by standard (class) to educate rural students. A second objective is linkage with government training institutions for the ICT programme; the same CD-based method can be followed, with the practical and theoretical work of experts from different fields collected on CDs and displayed in schools through computers. Here the challenge is the computer knowledge of the instructors working in the schools, so the selection of instructors must ensure basic knowledge of the various technologies related to development. Another major challenge is knowledge of local resources and their utilisation. Knowledge of local resources can be acquired from village elders, related research institutions, historical records and books, but it should be arranged like a syllabus and provided as information to students in a form that is simple and understandable to all. Finally, the important aspects are the involvement and interest of teachers, the education department and the end users, namely the student community in rural areas; these can be achieved through continuous motivation and better awareness of the importance of ICT programmes. Monitoring and evaluation of the overall programme is another major challenge; this has to be done by the concerned school education department. The government can appoint suitable persons to monitor the ICT programme in schools, but such persons should have good knowledge of all aspects of the programme, including computer skills, technical knowledge of various fields, and knowledge of local resources and their management.

E. THE CASE OF eSagu: A Model for Delivering Scientific Expert Advice Directly to the Farmers
eSagu ("Sagu" means cultivation in the Telugu language) is a scalable and personalized agro-advisory system that delivers personalized agricultural expert advice to each farmer in a timely manner by exploiting advances in Information Technology. It aims to improve farm productivity by delivering high-quality, personalized (farm-specific) agro-expert advice in a timely manner to each farm at the farmer's doorstep, without the farmer having to ask a question.

The advice is provided on a regular basis (typically once a week) from sowing to harvesting, which reduces the cost of cultivation and increases farm productivity as well as the quality of agri-commodities. In eSagu, developments in IT (databases, the Internet and digital photography) are applied to improve the performance of agricultural extension services. Rather than visiting the crop, the agricultural experts generate the expert advice based on the crop situation received in the form of both text and digital photographs sent by educated and experienced farmers. The development of eSagu started during the Kharif season of 2004. The eSagu system was first implemented by delivering advisories to 1051 cotton farms of farmers in three villages in Warangal district in Andhra Pradesh, and the experiment was successful. During 2005-06, the eSagu system was implemented for 5000 farms covering cotton, chilli, rice, groundnut, castor and redgram in 35 villages spread over six districts in Andhra Pradesh; in addition, the system has been extended to deliver advisories for fish farming. The results were very impressive. They indicate that it is possible for the agricultural expert to deliver advice by seeing the crop status information in the form of digital photographs and text. Most importantly, eSagu enables several agricultural experts to stay in one place and prepare more effective expert advice than that provided by visiting the crop in person. The results also show that the expert advice has helped the farmers to improve input efficiency by encouraging integrated pest management (IPM) methods and the judicious use of pesticides. Regarding benefit to the farmers, the evaluation results showed that the benefit-cost ratio ranges between 3 and 4, and farmers' knowledge levels have increased significantly.

I. Model of eSagu: The farmer can register agricultural farms into the system directly or through a coordinator. The farmer (or coordinator) visits the farm at regular intervals and sends the crop status photographs and text, either online or offline. The agricultural experts access the crop status data from the IASP portal and deliver expert advice on the steps to be taken to improve crop productivity. The advice can be downloaded from the IASP portal by both the farmer and the coordinator; it is also sent to the farmer's and coordinator's cell phones through SMS. The farmers and coordinators can use the system through mobile phones. A minimal sketch of this advisory flow is given at the end of this section.

Fig 1. Model of eSagu

II. The Role of the Coordinator: There are approximately 80 farms under one coordinator. The coordinators are selected from experienced farmers or educated young persons who have an agricultural background, and they have, or are given, training in basic data input. The farmer of the corresponding farm registers into the system by supplying the relevant information, including soil data, water resources and capital availability, through the coordinator. The coordinator also visits the farm on a daily or weekly basis and sends the crop details in the form of text and digital photographs through the communication system. By accessing the soil data, the farmer's details, the crop database and the information sent by the coordinators, the Agricultural Experts (AEs) prepare the advice, which contains the steps the farmer should take to improve crop productivity. English is the main language: the AEs prepare the advice (which is later translated to the target language) and store it in the system. The coordinators obtain the advice by accessing the system through the Internet, explain it to the farmer, get feedback and send the feedback to the AEs.

III. Communication Channels: The communication channels used are an alternative to normal two-way communication. Coordinators sit between the farmer and the agricultural experts to overcome the limitations of verbal communication, namely literacy and local languages. The arrangement provides for personal, face-to-face communication of information in the language of the farmer by the coordinator, who receives the information from the experts. English is used for the database and the local language is used by the coordinator to communicate with the farmers. The coordinator needs to be at least bilingual, whereas the experts and the farmers can work in any language. Even if the experts are not using English, their advice can be translated and stored in the database in English as the link language commonly used for scientific knowledge.

IV. Benefits: The model enables the farmer to cultivate crops with expertise comparable to that of an agricultural scientist, by disseminating crop- and location-specific expert advice in a personalized and timely manner. It is scalable and can be spread all over India; in fact, since there is no direct one-to-one communication and the system is Internet based, there is no limitation of language or location. The information database and the agricultural experts do not need to be physically located within India but can, through tie-ups with leading agricultural research labs and bodies across the world, address the problems plaguing Indian agriculture. The system can be incrementally developed and extended to cover all the farmers (crops) of India in a cost-effective manner on the available infrastructure in the country, and the lag between research efforts and practice can be reduced significantly [4]. The results of the eSagu prototype delivering personalized expert advice to 1051 cotton farms were very heartening. In an evaluative study of the pilot project, it was found that:
1. It is possible for the agricultural expert to deliver advice by seeing the crop status information in the form of digital photographs and text.
2. The expert advice of the agricultural expert can be delivered more effectively through this channel than by visiting the crop in person.
3. Such expert advice has helped the farmers to improve input efficiency by encouraging integrated pest management (IPM) methods and the judicious use of pesticides and fertilizers rather than indiscriminate usage.
4. In actual terms it was calculated that, with the help of eSagu, the total benefit flow to the farmer was about INR 3,820.00 per acre, comprising savings in fertilizers (0.76 bags per acre) of INR 229.70 per acre, savings in pesticide sprays (2.3) of INR 1,105.00 per acre, and extra yield (1.56 quintal) worth INR 2,485.00 per acre.

V. Future Potential of the System: With the individual farm-based or cluster approach, the potential of the eSagu model in providing personalized advice and solutions to the problems of farmers in remote rural areas is immense. The model can also integrate agri-insurance schemes to further benefit the agriculture sector. Agri-insurance schemes typically suffer from the problems of asymmetric information and lack of quality data in the absence of an effective monitoring system, which requires considerable manpower and infrastructure. The eSagu model can easily integrate farm-specific agro-advisory and monitoring systems. Similarly, many extension services catering to the rural sector could piggy-back on this model and serve the interests of the agricultural community in a holistic manner, by successfully bridging the gap between technical advances and wider knowledge on the one hand and the practical needs of farmers on the other.
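Purely as an illustration of the advisory flow described under "Model of eSagu" above (the actual IASP portal software is not shown here; the class and field names are hypothetical):

    # Illustrative sketch of the eSagu advisory flow: a coordinator uploads crop
    # observations, an agricultural expert attaches advice, and the advice is
    # relayed back to the farmer (e.g. by SMS). All names are hypothetical.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class CropObservation:
        farm_id: str
        notes: str                      # text description sent by the farmer/coordinator
        photo_files: List[str]          # digital photographs of the crop
        advice: Optional[str] = None    # filled in by the agricultural expert

    class AdvisoryPortal:
        def __init__(self):
            self.observations: List[CropObservation] = []

        def upload(self, obs: CropObservation):
            """Coordinator or farmer sends crop status to the portal."""
            self.observations.append(obs)

        def pending(self) -> List[CropObservation]:
            return [o for o in self.observations if o.advice is None]

        def attach_advice(self, obs: CropObservation, advice: str):
            """Agricultural expert reviews photos/text and records farm-specific advice."""
            obs.advice = advice

    # Example round trip
    portal = AdvisoryPortal()
    portal.upload(CropObservation("farm-17", "leaf curling on cotton", ["img001.jpg"]))
    for obs in portal.pending():
        portal.attach_advice(obs, "Suspected leaf curl virus; follow IPM advisory, avoid broad-spectrum spray.")
        print(f"SMS to farmer of {obs.farm_id}: {obs.advice}")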


F. CONCLUSION
To empower rural India in the true sense, the Common Services Centres (CSC) Scheme is an ambitious project aimed at bringing about socio-economic development in rural areas of India by setting up 100,000 ICT centres under one network. Social transformation at this scale calls for astute planning, the confluence of a series of socio-economic factors and successful implementation using technology as a tool. The CSC Scheme is a conscious effort to provide such a framework, built on wide stakeholder participation. The Scheme has been envisaged as a bottom-up model for delivering content, services and information that can allow like-minded public and private enterprises to come together under a collaborative framework. Unlike many other existing ICT-enabled kiosk initiatives in India, the CSC Scheme was designed keeping in mind the actual ground realities, to ensure long-standing sustainability.

References
[1] Usha Vyasulu Reddi, Role of ICT in education development, pitfalls, opportunities and challenges.
[2] Alok Bhagava, CSCs: Empowering Rural India Through ICTs.
[3] Toms K, ICT for rural healthcare: The Tele-clinics way.
[4] eSagu: An IT - www.solutionexchange-un.net.
[5] R. Sharma, Reforms in agricultural extension: New policy framework, Economic and Political Weekly, 2002 - JSTOR.

Efficient Location-Based Spatial Query (LBSQ) Processing in Wireless Broadcast Environments


K Madhavi, Asst Professor of IT, VR Siddhartha Engineering College (Autonomous), Kanuru, Vijayawada-7, A.P.
Dr Narasimham Challa, Professor of IT

Abstract: Location-Based Spatial Queries (LBSQs) are spatial queries whose responses depend on the location of the mobile user. Efficient processing of LBSQs is of significant importance given the ever-increasing deployment and use of mobile technologies. We show that LBSQs have certain unique characteristics that are not addressed by traditional spatial query processing in centralized databases. A significant challenge is presented by wireless broadcast environments, which offer excellent scalability but often exhibit high latency in accessing the database. In this paper, we present a novel query processing technique that, while maintaining high scalability and accuracy, manages to reduce the latency of LBSQ processing significantly. Existing techniques cannot be used effectively in a wireless broadcast environment, where only sequential data access is supported, and they may not scale to very large user populations. In an existing system, to communicate with the server, a client must most likely use a cellular-type network to achieve a reasonable operating range, and users must reveal their current location to the server, which may be undesirable for privacy reasons. We propose a novel approach for reducing spatial query access latency by leveraging results from nearby peers in wireless broadcast environments. The scheme allows a mobile client to verify locally whether candidate objects received from peers are indeed part of its own spatial query result set. The method exhibits great scalability: the higher the mobile peer density, the more queries are answered by peers, and the query access latency decreases as the number of mobile users increases. Our approach is based on peer-to-peer sharing, which enables us to process queries

without delay at a mobile host by using query results cached in its neighboring mobile peers. We demonstrate the feasibility of our approach through a probabilistic analysis, and illustrate the appeal of our technique through extensive simulation results.

Introduction: Technological advances, especially those in wireless communication and mobile devices, have fueled the proliferation of location-based services (LBS). In turn, the demands from sophisticated applications with large user populations pose new challenges for LBS research. As the name implies, LBSs are built on the notion of location. Mobile devices can only provide a computing environment with limited CPU, memory, network bandwidth and battery resources. As such, mobile clients must be designed to balance the utilization of these resources and the loads between the client and the server. For example, pushing more computation to the client can reduce bandwidth consumption but increase CPU load and memory consumption. Given the rapidly increasing capability of mobile devices, mobile applications must make reasonable assumptions about the client's computational capability and be able to adapt to it. LBSs must be able to handle a large user population and scale up in the face of increasing demands: an LBS server must be able to process a large number of requests simultaneously. Fortunately, requests to the servers are likely to exhibit spatial locality and, perhaps to a lesser degree, temporal locality as well. Location is an important element in LBSs. A location model allows us to describe the physical space and the properties of the objects contained in it. For example, it defines the coordinate system and therefore the locations of the objects and their spatial relationships. In addition, operations such as location sensing, object counting, browsing, navigation and search are affected by the location model chosen. Existing location models can be classified into geometric and symbolic models. A geometric model represents the physical space as a Euclidean space and the objects therein as points, lines, shapes and volumes in that space. With the coordinates of each object defined in the Euclidean space, operations such as distance and shortest-path computations can be supported. A symbolic model's main objective is to capture the semantics of the entities in a physical space by expressing not only the objects themselves but also the relationships among them, in some form of graph.
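As a hedged illustration only (not taken from the paper; the coordinates and place names are invented), the two kinds of location model just described can be contrasted in a few lines of code: a geometric model supports Euclidean distance between coordinates, while a symbolic model represents places as nodes of a graph.

    # Illustrative sketch contrasting geometric and symbolic location models.
    import math

    # Geometric model: objects are points in a Euclidean space
    cafe, library = (2.0, 3.0), (5.0, 7.0)
    print("Euclidean distance:", math.dist(cafe, library))   # supports distance/shortest-path operations

    # Symbolic model: places are nodes, spatial relationships are edges of a graph
    adjacency = {
        "Lobby": ["Corridor 1"],
        "Corridor 1": ["Lobby", "Room 101", "Cafeteria"],
        "Room 101": ["Corridor 1"],
        "Cafeteria": ["Corridor 1"],
    }
    print("Places reachable from Corridor 1:", adjacency["Corridor 1"])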

Common Spatial Query Types

There are several common spatial query types. In this section, we only cover the ones that are related to this research.

Nearest Neighbor Queries: During the last two decades, numerous algorithms for k nearest neighbor queries have been proposed. In this section these solutions are roughly divided into three groups: regular k nearest neighbor queries, continuous k nearest neighbor queries, and spatial network nearest neighbor queries.

Regular Nearest Neighbor Queries: A k nearest neighbor (kNN) query retrieves the k (k >= 1) data objects closest to a query point q. The R-tree and its derivatives have been a prevalent method to index spatial data and increase query performance. To find nearest neighbors, branch-and-bound algorithms have been designed that search an R-tree in a depth-first manner or a best-first manner. The proposed NN search algorithm is optimal, in that it only visits the nodes necessary for obtaining the nearest neighbors, and incremental, i.e., it reports neighbors in ascending order of their distance to the query point. Both algorithms can be easily extended for the retrieval of k nearest neighbors.

Continuous Nearest Neighbor Queries: The NN algorithms discussed in the previous paragraph are mainly designed for searching stationary objects. With the emergence of mobile devices, attention has focused on the problem of continuously finding k nearest neighbors for moving query points. A naive approach would be to continuously issue kNN queries along the route of a moving object. This solution results in repeated server accesses and nearest neighbor computations and is therefore inefficient. Sistla et al. first identified the importance of continuous nearest neighbor queries, the modeling methods, and related query languages; however, they did not discuss the processing methods. Song et al. proposed the first algorithm for continuous NN queries; their approach is based on performing several point NN queries at predefined sample points. Saltenis et al. propose a time-parameterized R-tree, an index structure for moving objects, to address continuous kNN queries for moving objects. Tao et al. present a solution for continuous NN queries that performs one single query for the entire route based on the time-parameterized R-tree. The main shortcoming of this solution is that it is designed for Euclidean spaces and users have to submit predefined trajectories to the database server.

Spatial Network Nearest Neighbor Queries: Initially, nearest neighbor searches were based on the Euclidean distance between the query object and the sites of interest. However, in many applications objects cannot move freely in space but are constrained by a network (e.g., cars on roads, trains on tracks). Therefore, in a realistic environment the nearest neighbor computation must be based on the spatial network distance, which is more expensive to compute. A number of techniques have been proposed to manage the complexity of this problem.

Query Processing: Broadcast vs. Point-to-Point. LBSs by and large assume wireless communication since both the clients and the data (e.g., vehicles being tracked) move. Wireless communication supports two basic data dissemination methods. In periodic broadcast, data are periodically broadcast on a wireless channel accessible to all clients in range; a mobile client listens to the broadcast channel and downloads the data that match the query. In on-demand access, a mobile client establishes a point-to-point connection to the server and submits requests to and receives results from the server on the established channel. The role of periodic broadcast is crucial in LBSs.
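As a minimal sketch of the incremental, best-first retrieval idea described above, the Python fragment below returns neighbors in ascending order of distance using a priority queue. It is simplified to operate on a flat list of points rather than an R-tree (in a real system the same queue would interleave index nodes and objects); all names here are illustrative, not from the paper.

import heapq
import math

def incremental_nn(points, q):
    """Yield (point, distance) pairs in ascending order of Euclidean
    distance to the query point q."""
    heap = [(math.dist(q, p), p) for p in points]
    heapq.heapify(heap)
    while heap:
        dist, p = heapq.heappop(heap)
        yield p, dist

# Usage: a kNN query simply takes the first k results.
sites = [(2, 3), (5, 1), (0, 0), (7, 8)]
k_nearest = [p for p, _ in list(incremental_nn(sites, q=(1, 1)))[:3]]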
This is because both the number of clients and the amount of requests to be supported by the LBS could be huge. Periodic broadcast can be used to disseminate information to all participants at very low cost. Early work on data broadcast focused on non-spatial data, but in order to be useful in LBSs, a data broadcast system must support spatial queries and spatial data. For example, in an intelligent transportation system, the locations of all monitored vehicles can be broadcast so that an on board navigation system can decide autonomously how to navigate through a crowded downtown area. Similar needs can be envisaged in the coordination of mobile robots on a factory floor. In

this section, we investigate the spatial query processing technique in a broadcast environment followed by location-based spatial queries processing in a demand access mode. General Spatial Query Processing on Broadcast Data Search algorithms and index methods for wireless broadcast channels should avoid back-tracking. This is because, in wireless broadcast, data are available on air in the sense that they are available on the channel transiently. When a data item is missed, the client has to wait for the next broadcast cycle, which takes a lot of time. Existing database indexes were obviously not designed to meet this requirement because data are stored on disks and can be randomly accessed any time. Hence, they perform poorly on broadcast data. Since most spatial queries involve searching on objects that are close to each other, a space-filling curve such as a Hilbert Curve can be applied to arrange objects on the broadcast so that the objects in proximity are close together. Algorithms can be developed to answer window and kNN queries on the broadcast . Given a query window, we can identify the first and the last objects within the window that the Hilbert Curve passes through. kNN queries can be answered in a similar manner by first estimating the bounds within which the k nearest neighbors can be found followed by a detailed checking of the Euclidean distances between the candidate objects and the query. The adoption of a space-filling curve can avoid the clients back-tracking the broadcast channel several times when retrieving objects for spatial queries. This is of essential importance to save the power consumption of the clients and improve the response time. However, in on-demand access mode, the processing of location-based spatial queries raises different issues due to the mobility of the clients. Location-based Spatial Queries In contrast to conventional spatial processing, queries in LBSs are mostly concerned about objects around the users current position. Nearest neighbor queries are one example. In a non-location-based system, responses to queries can be cached at the clients and are reusable if the same queries are asked again. In LBSs, however, users are moving frequently. Since the result of a query is only valid for a particular user location, new queries have to be sent to the server whenever there is a location update, resulting in high network transfer and server processing cost. To alleviate this problem, the concept of validity region can be used. The validity region of a query indicates the geographic area(s) within which the result of the query remains valid and is returned to the user together with the query result. The mobile client is then able to determine whether a new query should be issued by verifying whether it is still inside the validity region. The utilization of validity region reduces significantly the number of new queries issued to the server and thus the communication via the wireless channel.
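To make the validity-region idea concrete, the sketch below (Python; a rectangular validity region is assumed purely for illustration, and the function names are hypothetical) shows the client-side reuse test: a cached result is reused as long as the user stays inside the region that was returned with it, and a new server query is issued only after the user leaves it.

def query_with_validity_region(position, cache, issue_server_query):
    """cache holds (result, region) from the last server round trip, where
    region is an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    if cache is not None:
        result, (xmin, ymin, xmax, ymax) = cache[0], cache[1]
        if xmin <= position[0] <= xmax and ymin <= position[1] <= ymax:
            return result, cache              # still valid: no wireless traffic
    result, region = issue_server_query(position)
    return result, (result, region)           # refresh the cache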

Scheduling Spatial Queries Processing One unmistakable challenge that LBSs have to face is the huge user population and the resulting large workload generated. Traditional spatial database research focuses on optimizing the I/O cost for a single query. However, in LBSs where a large spatial locality could be expected, queries received in an interval may access the same portion of the database and thus share some common result objects. For instance, users in a busy shopping area may want to display on their PDAs a downtown map with shopping malls overlaid on it. These are equivalent to a number of window queries around the users' positions. Although the windows are different, they likely overlap, resulting in the accessing of overlapping objects from the map database. Inter-query optimization can be utilized at the server to reduce the I/O cost and response time of the queries. To achieve this objective, multiple spatial window queries can be parallelized, decomposed, scheduled, and processed under a real-time workload in order to enhance system runtime performance; for example, I/O cost and response time. Query locality can be used to decompose and group overlapping queries into independent jobs, which are then combined to minimize redundant I/Os. The essential idea is to eliminate duplicate I/O accesses to common index nodes and data pages. In addition, jobs may be scheduled to minimize the mean query response time. In principle, processing queries close to one another will save more I/O cost because there is a high chance that they can share some MBRs in the R-tree index as well as data objects. An innovative method to quantify the closeness and degree of overlapping in terms of I/O has been developed based on window query decomposition in [HZL03]. In addition, in a practical implementation where a fair amount of main memory is available, caching can be used to reduce the I/O cost. For spatial data indexed by an R-tree, high-level R-tree nodes can be cached in memory.
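A minimal sketch of the query-grouping step described above is given below (Python). Windows whose rectangles overlap, directly or transitively, end up in the same job, so their shared index and data pages can be fetched once per job; the grouping criterion is a simplification of the decomposition method cited above, and all names are illustrative.

def overlaps(a, b):
    # Rectangles as (xmin, ymin, xmax, ymax).
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def group_overlapping_queries(windows):
    """Union-find grouping of overlapping window queries into independent jobs."""
    parent = list(range(len(windows)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    def union(i, j):
        parent[find(i)] = find(j)
    for i in range(len(windows)):
        for j in range(i + 1, len(windows)):
            if overlaps(windows[i], windows[j]):
                union(i, j)
    jobs = {}
    for i, w in enumerate(windows):
        jobs.setdefault(find(i), []).append(w)
    return list(jobs.values())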

Monitoring Continuous Queries on Moving Objects
Most of the existing LBSs assume that data objects are static (e.g., shopping malls and gas stations) and that location-based queries are applied to these static objects. As discussed in Section 2, applications such as location-based alerts and surveillance systems require continuous monitoring of the locations of certain moving objects, such as cargos, security guards, children, and so on. For instance, we may issue a window query, monitor the number of security guards in the window, and send out an alert if the number falls below a certain threshold. Likewise, we can issue several nearest neighbor queries centered around the major facilities and monitor the nearest police patrols so that they can be dispatched quickly should any problem arise. Generally, we are interested in how the results of certain spatial queries change over time. This is the problem known as monitoring continuous spatial queries. Most of the existing work assumes that clients are autonomous in that they continuously measure their positions and report location updates to the server periodically. Thus, server-side indexing and processing methods have been devised to minimize the CPU and I/O costs for handling these updates and reevaluating the query results. In these methods, the update frequency is crucial in determining the system performance: a high update frequency would overload both the clients and servers in terms of communication and processing costs, while a low update frequency would introduce errors into the monitored results. An alternative to periodic location updates is to let the clients be aware of the spatial queries that are being monitored, so that they update their locations on the server only when the query results are about to change. The main idea is that the server maintains a safe region for each moving client. The safe region is created to guarantee that the results of the monitoring queries remain unchanged, and as such need not be reevaluated, as long as the clients are moving within their own safe regions. Once a client moves out of its safe region, it initiates a

location update. The server identifies the queries being affected by this update and reevaluates them incrementally. At the same time, the server re-computes a new safe region of this client and sends it back to the client. The number of clients inside each window is being monitored. It is clear that when a client (shown as a dot in the figure) moves within the shaded rectangle S1, it will not affect the results of the four queries. However, when it moves out of S1, its new location must be sent to the server in order to re-evaluate the queries and recompute a new safe region for the client accordingly.

Fig. 3: (a) Computing safe regions; (b) effect of the mobility pattern (query windows Q1-Q4, safe regions S1 and S2).

For kNN queries, the exact locations of the clients must be known in order to determine which client is closer to the query point. However, since a client can move freely inside a safe region without informing the server, the safe regions can only be used to identify a number of potential candidates, and the server still needs to probe these candidates to request their exact positions. In order to reduce the communication cost between the clients and the server, we need to reduce the number of location updates and server probes during monitoring. As such, we need to find the largest possible safe regions for the clients to reduce the number of location updates and

devise efficient algorithms for reevaluating the query results to reduce the number of server probes. When computing the safe regions, we also need to consider the effect of the client's mobility pattern. This is because a safe region does not have to occupy the largest area to be the best. For example, in Fig. 3(a), S1 is the largest safe region for the client. However, as shown in Fig. 3(b), if we know that the client is moving toward the west, S2 is a better safe region than S1 because it takes longer for the client to move out of S2, which defers the next location update.
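As a minimal illustration of the client-side logic just described, the sketch below (Python, with a rectangular safe region assumed only for simplicity; class and function names are illustrative, not from the paper) reports a location update only when the client leaves its current safe region.

class SafeRegion:
    def __init__(self, xmin, ymin, xmax, ymax):
        self.xmin, self.ymin, self.xmax, self.ymax = xmin, ymin, xmax, ymax
    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

def on_client_move(x, y, safe_region, send_location_update):
    """Stay silent while inside the safe region; otherwise report the new
    position so the server can reevaluate the affected queries and return
    a freshly computed safe region."""
    if safe_region.contains(x, y):
        return safe_region                    # no communication needed
    return send_location_update(x, y)         # server returns a new region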

Algorithm 1: Nearest-Neighbor Verification (NNV)
1: P <- peer nodes responding to the query request issued from q
2: MVR <- empty; O <- empty
3: for all p in P do
4:   MVR <- MVR U p.VR and O <- O U p.O
5: end for
6: sort O according to ||q, o_i||
7: compute ||q, e_s||, where edge e_s has the shortest distance to q among all the edges of MVR
8: i <- 1
9: while |H| < k and i <= |O| do
10:   if ||q, o_i|| <= ||q, e_s|| then
11:     H.verified <- o_i
12:   else
13:     H.unverified <- o_i
14:   end if
15:   i <- i + 1
16: end while
17: return H

Algorithm 2: Window-Query Verification
1: P <- peer nodes responding to the query request issued from q
2: for all p in P do
3:   MVR <- MVR U p.VR and O <- O U p.O
4: end for
5: WQ <- all o in O that overlap with w
6: if w is contained in MVR then
7:   return WQ
8: else
9:   WQ <- WQ U query results returned from the on-air window query with w'  {if w is not contained in MVR, utilize w' to compute the new search bounds and results}
10:   return WQ
11: end if
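A compact Python rendering of the verification idea in Algorithm 1 is sketched below. It assumes, as a simplification, that the merged verified region (MVR) is given by the list of its boundary edges (point pairs), and that candidate objects are 2-D points; helper names are illustrative rather than taken from the paper.

import math

def dist_point_segment(p, a, b):
    """Distance from point p to segment a-b (an edge of the MVR boundary)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def nn_verify(q, peer_objects, mvr_edges, k):
    """Split the k candidate neighbors collected from peers into objects that
    are provably correct (closer than the nearest MVR boundary edge) and
    objects that still need verification on the broadcast channel."""
    candidates = sorted(set(peer_objects), key=lambda o: math.dist(q, o))
    d_edge = min(dist_point_segment(q, a, b) for a, b in mvr_edges)
    verified, unverified = [], []
    for o in candidates[:k]:
        (verified if math.dist(q, o) <= d_edge else unverified).append(o)
    return verified, unverified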

Conclusion: We have presented a novel approach for reducing the spatial query access latency by leveraging results from nearby peers in wireless broadcast environments. Significantly, our scheme allows a mobile client to locally verify whether candidate objects received from peers are indeed part of its own spatial query result set.

REFERENCES
S. Acharya, R. Alonso, M.J. Franklin, and S.B. Zdonik, Broadcast Disks: Data Management for Asymmetric Communications Environments, Proc. ACM SIGMOD 95, pp. 199-210, 1995. D. Barbara, Mobile Computing and Databases: A Survey, IEEE Trans. Knowledge and Data Eng., vol. 11, no. 1, pp. 108-117, Jan./Feb. 1999. N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles, Proc. ACM SIGMOD 90, pp. 322-331, 1990.

J. Broch, D.A. Maltz, D.B. Johnson, Y.-C. Hu, and J.G. Jetcheva, A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols, Proc. ACM MobiCom 98, pp. 85-97, 1998. Bychkovsky, B. Hull, A.K. Miu, H. Balakrishnan, and S. Madden, A Measurement Study of Vehicular Internet Access Using In Situ Wi-Fi

Networks, Proc. ACM MobiCom 06, Sept. 2006. C.-Y. Chow, H. Va Leong, and A. Chan, Peer-toPeer Cooperative Caching in Mobile Environment, Proc. 24th IEEE Intl Conf. Distributed Computing Systems Workshops (ICDCSW 04), pp. 528-533, 2004.

K means Clustering Algorithm with High Performance using large data


Vikas Chaudhary 1, Vikas Mishra2, Kapil3
1,2,3

Department of MCA, Krishna Engineering College, Mohan Nagar, Ghaziabad-201007 1 (Email: vikas.jnvu@rediffmail.com) 2 (Email : vikkymishra1984@gmail.com) 3 (Email : kapilteotia332@rediffmail.com)

Abstract: In this paper we consider K-means clustering, the most widely used non-hierarchical approach to forming good clusters: given a desired number of clusters, say k, each case (object) is assigned to one of the k clusters so as to minimize a measure of dispersion within the clusters. However, the efficiency of the K-means algorithm needs to be enhanced when it faces a large amount of data. The improved algorithm avoids unnecessary calculations by using the triangle inequality. We apply the improved algorithm to customer classification. Experiments show that the optimized algorithm has a lower time overhead than the standard K-means algorithm. Keywords: K-means clustering, triangle inequality, heuristic algorithms, NP-hard, Euclidean space
I. INTRODUCTION

Recently, data mining has become popular not only in research but also in commercial applications, a notable example being k-means clustering [1]. Data mining can help organizations discover meaningful trends, patterns and correlations in their customer, product, or other data, to drive improved customer relationships and decrease the risk of business operations [2]. The basic data mining techniques include association rules, classification, clustering, regression analysis, sequence analysis, etc. Classification and clustering are the two main techniques in customer segmentation. Cluster analysis is a statistical technique that is used to identify a set of groups that both minimize within-group variation and maximize between-group variation based on a distance or dissimilarity function, and its aim is to find an optimal set of clusters [3]. It can be very time consuming when the amount of data or the dimensionality increases, because the clustering procedure consists of several iterations, and in each iteration all data are employed to calculate the distances used to form a new set of centroids. Today, a successful company does well in keeping and managing its customers by providing a number of attractive, personalized and comprehensive services that satisfy customer needs [4]. Cluster analysis, or clustering, is a multivariate statistical technique which identifies groupings of the data objects based on the inter-object similarities computed by a chosen distance metric. Clustering algorithms mainly fall into two categories: hierarchical clustering and partitional clustering [5]. The partitional clustering algorithms, which differ from the

hierarchical clustering algorithms, usually create an initial set of clusters and then partition the data into similar groups after each iteration. Partitional clustering is more general than hierarchical clustering because the groups can be divided into more than two subgroups in one step (a hierarchical method always merges or divides into two subgroups), and it does not need to build a complete dendrogram. The most distinct characteristic of data mining is that it deals with very large and complex data sets (gigabytes or even terabytes). The data sets to be mined often contain millions of objects described by tens, hundreds or even thousands of attributes or variables of various types (interval, ratio, binary, ordinal, nominal, etc.). This requires the data mining operations and algorithms to be scalable and capable of dealing with different types of attributes. However, most algorithms currently used in data mining do not scale well when applied to very large data sets because they were initially developed for applications other than data mining which involve small data sets. In terms of clustering, we are interested in algorithms which can efficiently cluster large data sets containing both numeric and categorical values, because such data sets are frequently encountered in data mining applications. Most existing clustering algorithms either can handle both data types but are not efficient when clustering large data sets, or can handle large data sets efficiently but are limited to numeric attributes. Few algorithms can do both well [6].
II. K-MEANS ALGORITHM

The k-means algorithm [7] clusters n objects, based on their attributes, into k partitions, k < n. It is similar to the expectation-maximization algorithm for mixtures of Gaussians in that they both attempt to find the centers of natural clusters in the data. It assumes that the object attributes form a vector space [8]. The objective it tries to achieve is to minimize the total intra-cluster variance, i.e., the squared error function E = Σ_{j=1..k} Σ_{x ∈ C_j} ||x − m_j||², where m_j is the mean (center) of cluster C_j.
The standard K-means algorithm. The processing flow of the K-means algorithm is as follows: first, select K objects randomly; each object initially represents the mean value or center of a cluster. For the remaining objects, we assign each object to the nearest cluster according to its distance from the cluster center. Then we re-calculate the mean value of each cluster. This process is repeated until the criterion function converges. The K-means algorithm usually uses the squared error criterion defined above.

The general algorithm was introduced by Cox (1957), and (Ball and Hall, 1967; MacQueen, 1967) first named it k-means. Since then it has become widely popular and is classified as a partitional or non-hierarchical clustering method [6] (Jain and Dubes, 1988). It is defined as follows: Given a set D = {X1, . . . , Xn} of n numerical data objects, a natural number k <= n, and a distance measure d, the k-means algorithm aims at finding a partition C of D into k non-empty disjoint clusters C1, . . . , Ck with Ci ∩ Cj = ∅ (i ≠ j) and C1 ∪ . . . ∪ Ck = D such that the overall sum of the squared distances between data objects and their cluster centers is minimized. Mathematically, if we use indicator variables w_il which take value 1 if object Xi is in cluster Cl and 0 otherwise, then the problem can be stated as the constrained non-linear optimization problem given below. Minimize
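The displayed problem is missing in the source; a reconstruction consistent with the surrounding definitions (and with the formulation in [6], taking d as the squared Euclidean distance on numeric data and Q_l as the center of cluster C_l) is, in LaTeX:

\begin{align*}
\text{minimize}\quad & P(W,Q) = \sum_{l=1}^{k}\sum_{i=1}^{n} w_{il}\, d(X_i, Q_l) \\
\text{subject to}\quad & \sum_{l=1}^{k} w_{il} = 1, \quad 1 \le i \le n, \\
& w_{il} \in \{0,1\}, \quad 1 \le i \le n,\ 1 \le l \le k.
\end{align*}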

To deal with the problem of not well-defined boundaries between clusters, the notion of fuzzy partitions has been applied successfully to the clustering problem resulting in the so-called fuzzy clustering

(Fig-1)

(Fig-2)

As is well known, the usual method for the optimization of P in (1) subject to the constraint (2) is partial optimization for Q and W. That is, we first fix Q and find necessary conditions for W to minimize P; then we fix W and minimize P with respect to Q. Basically, the k-means algorithm iterates through a three-step process until P(W, Q) converges to some local minimum (Selim and Ismail, 1984):
1. Select an initial Q(0) = {Q1(0), . . . , Qk(0)} and set t = 0.
2. Keep Q(t) fixed and solve P(W, Q(t)) to obtain W(t), i.e., regarding Q(t) as the cluster centers, assign each object to the cluster of its nearest cluster center.
3. Keep W(t) fixed and generate Q(t+1) such that P(W(t), Q(t+1)) is minimized, i.e., construct new cluster centers according to the current distribution of objects.
4. In the case of convergence, or if a given stopping criterion is fulfilled, output the result and stop. Otherwise, set t = t + 1 and go to Step 2.
In the setting of numerical data clustering, the Euclidean norm

is often chosen as a natural distance measure in the k-means algorithm. With this distance measure, the computation of the mean of a cluster's objects returns the cluster's center, fulfilling the minimization condition of Step 3 above; namely, the center of cluster Cl is the componentwise mean Ql = (1/|Cl|) Σ_{Xi ∈ Cl} Xi.

The K-means algorithm has well-known advantages: it is easy to implement and it converges in a finite number of iterations. However, when the samples contain noise (outliers) that is sufficiently far removed from the rest of the data (see Fig. 2), the outliers will influence the results.
Process: The dataset is partitioned into K clusters and the data points are randomly assigned to the clusters, resulting in clusters that have roughly the same number of data points. For each data point: calculate the distance from the data point to each cluster; if the data point is closest to its own cluster, leave it where it is; if not, move it into the closest cluster. Repeat this step until a complete pass through all the data points results in no data point moving from one cluster to another. At this point the clusters are stable and the clustering process ends. The choice of initial partition can greatly affect the final clusters that result, in terms of inter-cluster and intra-cluster distances and cohesion.
Advantages of K-Means: With a large number of variables, K-means may be computationally faster than hierarchical clustering (if K is small). K-means may produce tighter clusters than hierarchical clustering, especially if the clusters are globular.

Disadvantage of K Means

Difficulty in comparing the quality of the clusters produced (e.g., different initial partitions or values of K affect the outcome). A fixed number of clusters can make it difficult to predict what K should be. It does not work well with non-globular clusters. Different initial partitions can result in different final clusters; it is helpful to rerun the program using the same as well as different K values to compare the results achieved.

Applications of K-Means
K-means has been applied to data detection for burst-mode optical receivers and to the recognition of musical genres. High-speed optical multi-access network applications, such as optical bus networks and WDMA optical star networks, can use burst-mode receivers (Zhao, 1492); an efficient burst-mode signal detection scheme utilizes a two-step data clustering method based on a K-means algorithm. Combined with radial basis functions, it can achieve human-level accuracy with fast training and classification, and it tends to find the outliers in the data set. Creating such a network without gradient descent takes seconds, whereas applying gradient descent takes hours. K-means has also been used for clustering effects in image synthesis.

Data-sensitive applications with K-means clustering
From the algorithmic framework, we can see that the algorithm needs to adjust the sample classification continuously and calculate the new cluster centers constantly. Therefore, the time consumption is fairly considerable when the dataset is large, and it is necessary to improve the efficiency of the algorithm in applications. If a data point is far away from a center, it is not necessary to calculate the exact distance between the point and the center in order to know that the point should not be assigned to this center. So most distance calculations in standard K-means are redundant. In this paper, we use the triangle inequality to reduce these redundant calculations, and in this way we improve the efficiency of the algorithm to a large extent.

III. DEMONSTRATION OF THE STANDARD ALGORITHM
The algorithm is deemed to have converged when the assignments no longer change. Commonly used initialization methods are Forgy and Random Partition. The Forgy method randomly chooses k observations from the data set and uses these as the initial means. The Random Partition method first randomly assigns a cluster to each observation and then proceeds to the Update step, thus computing the initial means to be the centroid of the cluster's randomly assigned points. The Forgy method tends to spread the initial means out, while Random Partition places all of them close to the center of the data set. According to Hamerly et al., the Random Partition method is generally preferable.

The standard algorithm proceeds as illustrated in panels (1)-(4) of the accompanying figure:
1. K initial "means" (in this case k = 3) are randomly selected from the data set (shown in color).
2. K clusters are created by associating every observation with the nearest mean. The partitions here represent the Voronoi diagram generated by the means.
3. The centroid of each of the k clusters becomes the new mean.
4. Steps 2 and 3 are repeated until convergence has been reached.
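A minimal, self-contained Python sketch of the standard algorithm described above is given below (Forgy initialization, plain lists and tuples rather than any particular library; names are illustrative only):

import math
import random

def kmeans(points, k, max_iter=100):
    """Standard K-means: Forgy initialization, then alternate assignment
    and mean-update steps until the assignments stop changing."""
    centers = random.sample(points, k)
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_assignment == assignment:      # converged
            break
        assignment = new_assignment
        # Update step: each center becomes the mean of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment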

IV. VARIANTS OF K-MEANS

i.) Optimized K-Means algorithm


As is generally acknowledged, the sum of two sides of a triangle is greater than the third side. The Euclidean distance satisfies the triangle inequality, and this extends to multi-dimensional Euclidean space. For any three points x, a, b in Euclidean space:
d(x, b) <= d(x, a) + d(a, b)        (1)
d(x, b) >= d(x, a) - d(a, b)        (2)
d(x, b) >= d(a, b) - d(x, a)        (3)
if d(a, b) >= 2 d(x, a), then d(x, b) >= d(x, a)        (4)
(Inequality (4) follows from (3): d(x, b) >= d(a, b) - d(x, a) >= 2 d(x, a) - d(x, a) = d(x, a).)
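A sketch of how bound (4) is used inside the assignment step is shown below (Python; simplified to a single pass with illustrative names, whereas the K-Means_new bookkeeping of lower bounds y(x, f) described next is more elaborate):

import math

def assign_with_triangle_inequality(points, centers):
    """For each point, skip computing d(x, c) whenever d(best, c) >= 2*d(x, best):
    by inequality (4), d(x, c) >= d(x, best), so c cannot be closer."""
    # Pairwise center-to-center distances, computed once per iteration.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    assignment, skipped = [], 0
    for x in points:
        best = 0
        d_best = math.dist(x, centers[0])
        for j in range(1, len(centers)):
            if cc[best][j] >= 2.0 * d_best:
                skipped += 1               # exact distance provably unnecessary
                continue
            d = math.dist(x, centers[j])
            if d < d_best:
                best, d_best = j, d
        assignment.append(best)
    return assignment, skipped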

Thus we can improve K-Means based on the theory above; for convenience we call the improved algorithm K-Means_new. The general process of the K-Means_new algorithm is as follows. First, select initial cluster centers and set the lower bound y(x, f) = 0 for each data point x and cluster center f. Second, assign each data point to its nearest initial cluster; we use the results obtained previously to avoid unnecessary distance calculations in this process. Each time d(x, f) is computed, set y(x, f) = d(x, f). At the beginning of each iteration, for most data points and cluster centers, the lower bounds and the upper bounds are tight. If they are tight at the beginning of one iteration, the upper bounds are inclined to be tight at the beginning of the next iteration, as most cluster centers change only slightly, so the bounds change only slightly.

ii.) The k-modes algorithm
In principle the formulation of problem P is also valid for categorical and mixed-type objects. The reason the k-means algorithm cannot cluster categorical objects is its dissimilarity measure and the method used to solve problem P2. These barriers can be removed by making the following modifications to the k-means algorithm: 1. using a simple matching dissimilarity measure for categorical objects, 2. replacing the means of clusters by modes, and 3. using a frequency-based method to find the modes to solve problem P2. This section discusses these modifications. Steps: 1. Calculate the frequencies of all categories for all attributes and store them in a category array in descending order of frequency, as shown in figure 1. Here, c(i,j) denotes category i of attribute j and f(c(i,j)) >= f(c(i+1,j)), where f(c(i,j)) is the frequency of category c(i,j). 2. Assign the most frequent categories equally to the initial k modes. For example, in figure 1, assume k = 3. We assign Q1 = [q1,1 = c1,1, q1,2 = c2,2, q1,3 = c3,3, q1,4 = c1,4],

Q2 = [q2,1 = c2,1, q2,2 = c1,2, q2,3 = c4,3, q2,4 = c2,4] and Q3 = [q3,1 = c3,1, q3,2 = c2,2, q3,3 = c1,3, q3,4 = c3,4]. 3. Start with Q1. Select the record most similar to Q1 and replace Q1 with that record as the first initial mode. Then select the record most similar to Q2 and replace Q2 with that record as the second initial mode. Continue this process until Qk is replaced. In these selections, Ql ≠ Qt for l ≠ t. Step 3 is taken to avoid the occurrence of empty clusters. The purpose of this selection method is to make the initial modes diverse, which can lead to better clustering results.

Dissimilarity measure: Let X, Y be two categorical objects described by m categorical attributes. The dissimilarity measure between X and Y can be defined by the total number of mismatches of the corresponding attribute categories of the two objects. The smaller the number of mismatches, the more similar the two objects. This measure is often referred to as simple matching (Kaufman and Rousseeuw, 1990). Formally,

d(X, Y) = Σ_{j=1..m} δ(x_j, y_j),

where δ(x_j, y_j) = 0 if x_j = y_j and δ(x_j, y_j) = 1 if x_j ≠ y_j.

iii.) The k-prototypes algorithm
It is straightforward to integrate the k-means and k-modes algorithms into the k-prototypes algorithm, which is used to cluster mixed-type objects. The k-prototypes algorithm is practically more useful because frequently encountered objects in real-world databases are mixed-type objects. The dissimilarity between two mixed-type objects X and Y, which are described by attributes A1^r, A2^r, . . . , Ap^r, A(p+1)^c, . . . , Am^c, can be measured by

d(X, Y) = Σ_{j=1..p} (x_j − y_j)² + γ Σ_{j=p+1..m} δ(x_j, y_j),

where the first term is the squared Euclidean distance measure on the numeric attributes and the second term is the simple matching dissimilarity measure on the categorical attributes. The weight γ is used to avoid favoring either type of attribute. The influence of γ in the clustering process is discussed in (Huang, 1997a).
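The mixed dissimilarity above translates directly into code; the short sketch below (Python, with illustrative names and gamma supplied by the caller) combines the numeric and categorical parts as described:

def k_prototypes_dissimilarity(x, y, p, gamma):
    """x and y are records whose first p attributes are numeric and whose
    remaining attributes are categorical; gamma weights the categorical part."""
    numeric = sum((x[j] - y[j]) ** 2 for j in range(p))
    categorical = sum(1 for j in range(p, len(x)) if x[j] != y[j])
    return numeric + gamma * categorical

# Usage: dissimilarity between a customer record and a cluster prototype.
d = k_prototypes_dissimilarity((1.8, 75.0, "urban", "female"),
                               (1.6, 60.0, "rural", "female"), p=2, gamma=0.5)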

Let P(W, Q) = Σ_{l=1..k} [ Σ_{i=1..n} w_il Σ_{j=1..p} (x_ij − q_lj)² + γ Σ_{i=1..n} w_il Σ_{j=p+1..m} δ(x_ij, q_lj) ], and write P_l^r = Σ_{i=1..n} w_il Σ_{j=1..p} (x_ij − q_lj)² and P_l^c = γ Σ_{i=1..n} w_il Σ_{j=p+1..m} δ(x_ij, q_lj).

We can rewrite it as P(W, Q) = Σ_{l=1..k} (P_l^r + P_l^c).

Since both P_l^r and P_l^c are nonnegative, minimizing P(W, Q) is equivalent to minimizing P_l^r and P_l^c for 1 <= l <= k. We can still use the same algorithm as in Section 3 to find a locally optimal (Q, W), because nothing has changed except for d(., .). Given a fixed Q, we use (9) to calculate W in the same way as in the k-means algorithm. Given a fixed W, we find Q by minimizing P_l^r

and P_l^c for 1 <= l <= k. P_l^r is minimized if q_lj is calculated by (4). From Section 4 we know that P_l^c can be minimized by selecting q_lj for p+1 <= j <= m according to Theorem 1. A practical implementation of the k-prototypes algorithm is given in (Huang, 1997a).

iv.) The global k-means algorithm
The global k-means [9] algorithm performs the following steps.
Step 1. (Initialization) Compute the centroid x1 of the set A and set k = 1.
Step 2. Set k = k + 1 and consider the centers x1, x2, . . . , x(k-1) from the previous iteration.
Step 3. Consider each point a of A as a starting point for the k-th cluster center, thus obtaining m initial solutions with k points (x1, . . . , x(k-1), a); apply the k-means algorithm to each of them; keep the best k-partition obtained and its centers x1, x2, . . . , xk.
Step 4. (Stopping criterion) If k = q then stop, otherwise go to Step 2.
This version of the algorithm is not applicable for clustering middle-sized and large data sets. Two procedures were introduced to reduce its complexity (see Likas et al., 2003). We mention here only one of them, because the second procedure is applicable only to low-dimensional data sets. Let d_i^(k-1) be the squared distance between a_i ∈ A and the closest cluster center among the k-1 cluster centers obtained so far. For each a_i ∈ A we calculate a quantity r_i from these distances, and we take the data point a_l ∈ A for which l = arg min_{i=1,...,m} r_i as a starting point for the k-th cluster center. Then the k-means algorithm is applied starting from the points x1, x2, . . . , x(k-1), a_l to find k cluster centers. In our numerical experiments we use this procedure. It should be noted that the k-means algorithm and its variants tend to produce only spherical clusters and are not always appropriate for solving clustering problems; in applying k-means algorithms we assume that the clusters in a data set can be approximated by n-dimensional balls.

V. CONCLUSION
As is well known, the standard K-means algorithm is used in many fields, but its efficiency is unsatisfactory when it is faced with large-scale datasets. In this paper, we improved the standard K-means algorithm using the triangle inequality. We ran the optimized algorithm on the Wine Recognition Data and compared the efficiency of the two algorithms. The experimental results show that our method is more efficient than the standard K-means algorithm, especially when the number of clusters is large. We also used the new method to carry out customer classification on a customer dataset taken from a company in the insurance industry. In future research, we want to employ other kinds of datasets in the experiments, such as from the financial or education industries. Besides, we will consider customer segmentation models other than RFM. Overall, we are glad to see that the proposed method can be helpful for others in many fields.

REFERENCES
[1] J. B. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1967, vol. 1, pp. 281-297.
[2] Ching-Hsue Cheng, 2009. Classifying the segmentation of customer value via RFM model and RS theory. Expert Systems with Applications, 36 (2), 4176-4184.
[3] Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd ed.). USA: Morgan Kaufmann Publishers.
[4] J. Peppard, Customer relationship management (CRM) in financial services, European Management Journal 18 (2000) 312-327.
[5] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, San Francisco, CA, 2001.
[6] Zhexue Huang, Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values, Data Mining and Knowledge Discovery 2, 283-304 (1998), Kluwer Academic Publishers.
[7] Xiaoping Qin, Shijue Zheng, Tingting He, Ming Zou, Ying Huang, Optimized K-means algorithm and application in CRM system, 2010 International Symposium on Computer, Communication, Control and Automation.
[8] P. S. Bradley and U. M. Fayyad, Refining initial points for K-means clustering, Proceedings of the Fifteenth International Conference on Machine Learning (ICML 98), 1998, pp. 91-99.
[9] Adil M. Bagirov, Karim Mardaneh, Modified global k-means algorithm for clustering in gene expression, Centre for Informatics and Applied Optimization, School of Information Technology and Mathematical Sciences, University of Ballarat, Victoria, 3353, Australia.

Performance Evaluation of Route optimization Schemes Using NS2 Simulation


MANOJ MATHUR1, SUNITA MALIK2, VIKAS3
Electronics & Communication Engineering Department D.C.R.U.S.T., MURTHAL (Haryana)
mnjmathur03@gmail.com 2 snt mlk@yahoo.co.in 3 vikaspanchal93@gmail.com
1

Abstract - The main problem in MIPv4 (Mobile Internet Protocol version 4) is triangular routing. A mobile node is able to deliver packets to a correspondent node directly through its foreign agent, but when the correspondent node sends a packet to the mobile node, the packet reaches the foreign agent via the home agent and only then reaches the mobile node. This asymmetry is called triangle routing. It leads to many problems, such as load on the network and delay in delivering packets. The next-generation IPv6 is designed to overcome this kind of problem. To solve the triangle routing problem, three different route optimization schemes are used which exclude the inefficient routing paths by creating the shortest routing path: Liebsch's route optimization scheme, the Light Weight route optimization scheme, and the Enhanced Light Weight route optimization scheme. I have taken Throughput and Packet Delivery Fraction as the performance metrics to compare these three schemes using NS-2 simulations. Throughput is the rate of communication per unit time. Packet Delivery Fraction (PDF) is the ratio of the data packets delivered to the destinations to those generated by the CBR sources. Using these parameters, I have found that the Enhanced Light Weight route optimization scheme performs better than Liebsch's route optimization scheme and the Light Weight route optimization scheme. Keywords: Route Optimization Schemes, Performance Result

I. INTRODUCTION
With the growth of wireless network technology, the demand for accessing mobile networks has increased dramatically. Mobile Internet Protocol version 6 is a mobility protocol standardized by the Internet Engineering Task Force (IETF). In Mobile Internet Protocol version 6, communications are maintained even though the mobile node (MN) moves from its home network to a foreign network. This is because the MN sends a Binding Update (BU) message to its Home Agent (HA), located in the home network, to report its location information whenever the MN hands off (moves) to another network. Supporting mobile nodes in the Internet requires that the MNs maintain mobility-related information and create their own mobility signaling messages, even though the MNs have limited processing power, battery, and memory resources [2]. To overcome such limitations, the IETF has proposed the Proxy Mobile IPv6 (PMIPv6) protocol. In PMIPv6, the MN's mobility is guaranteed by newly proposed network entities, namely the local mobility anchor (LMA) and the mobile access gateway (MAG) [6]. PMIPv6, however, suffers from the triangle routing problem, which causes inefficient routing paths [5]. In order to establish efficient routing paths, three different Route Optimization (RO) schemes have been introduced. To solve the triangle routing problem, these route optimization schemes exclude the inefficient routing paths by creating the shortest routing path, using correspondent information (CI) messages. They are Liebsch's route optimization scheme, the Light Weight route optimization scheme, and the Enhanced Light Weight route optimization scheme. In this paper I compare these three schemes using NS-2 simulations.

II. THE RO SCHEMES
Liebsch's RO Scheme
In Liebsch's route optimization, the Local Mobility Anchor and the Mobile Access Gateways exchange the RO messages that establish the RO path for the mobile nodes. It is the LMA which makes packet delivery between MN1 and MN2 possible. When MN1 sends packets to MN2, the Local Mobility Anchor enables the RO trigger for data packets sent from MN1 to MN2 [7]; this is because the LMA has all the network topology information in the LMD. At the beginning of the route optimization procedure, the LMA sends the RO Init message to Mobile Access Gateway 2 (MAG2). MAG2 then sends the RO Init Acknowledgement to the LMA. The LMA sends the RO Setup message to MAG1, and MAG1 sends the RO Setup Acknowledgment message back to the LMA. Once the LMA has sent and received the same messages for MAG2, the RO procedure is finished. Data packets are then delivered directly between Mobile Node 1 and Mobile Node 2 as a result of the RO. The signaling is sketched below.
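A toy rendering of that message exchange is given below (Python). The message names follow the description above; the classes and functions are purely illustrative and are not part of any PMIPv6 implementation.

class Node:
    def __init__(self, name):
        self.name = name
        self.log = []
    def receive(self, msg, sender):
        self.log.append((sender.name, msg))

def liebsch_ro_handshake(lma, mag1, mag2):
    """LMA-driven setup: RO Init towards MAG2 and RO Setup towards MAG1,
    each acknowledged, after which MAG1 and MAG2 forward packets directly."""
    mag2.receive("RO Init", lma)
    lma.receive("RO Init Ack", mag2)
    mag1.receive("RO Setup", lma)
    lma.receive("RO Setup Ack", mag1)
    return True   # direct MAG1 <-> MAG2 path is now usable

lma, mag1, mag2 = Node("LMA"), Node("MAG1"), Node("MAG2")
liebsch_ro_handshake(lma, mag1, mag2)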

Fig 1 Data flow in Liebsch's RO Scheme

Light Weight Route Optimization Scheme (LWRO)
In the Light Weight route optimization scheme, the Local Mobility Anchor and the Mobile Access Gateways are used to establish the route optimization path between the mobile nodes. MN1 is connected to Mobile Access Gateway 1 and MN2 is connected to Mobile Access Gateway 2. The packets from MN1 to MN2 initially pass through the Local Mobility Anchor [7]. When the Local Mobility Anchor receives a packet, it knows the path for the packets to MAG2, and at the same time it sends a Correspondent Binding Update to MAG2. MAG1 then receives the Correspondent Binding Acknowledgment, and packets are delivered via MAG2 to the destination mobile node. Thereafter, packets from MN1 destined for MN2 are intercepted by MAG1 and forwarded directly to MAG2, instead of being forwarded to the Local Mobility Anchor.

Fig 2 Data flow in Light Weight RO Scheme

Enhanced Light Weight Route Optimization Scheme (ELWRO)
In the Enhanced Light Weight route optimization scheme, the Local Mobility Anchor and the Mobile Access Gateways are again used to establish the route optimization path between the mobile nodes, and Correspondent Binding Information (CBI) messages are used. Suppose MN1 sends data packets to MN2. First, MN1 sends the data packets to Mobile Access Gateway 1, and then MAG1 sends the data packets to the Local Mobility Anchor. The LMA knows the possible setup with RO. The LMA sends a Correspondent Binding Information (CBI) message to MAG1 [1]. The CBI message includes MN1's address, MN2's address, and MAG2's address. When MAG1 receives the CBI message, MAG1 sends a Correspondent Binding Update message to MAG2; this message includes MN1's address, MN2's address, and MAG1's address. MAG2 sends a Correspondent Binding Acknowledgment (CBA) message to MAG1 for the Correspondent Binding (CB). The packets are then exchanged directly between MN1 and MN2.

Fig 3 Data flow in Enhanced Light Weight RO Scheme

III. PERFORMANCE METRICS
The performance metrics for the three schemes are defined as follows.
Throughput: the rate of communication per unit time, TH = SP / PT, where SP is the number of sent packets and PT is the pause time.
Packet Delivery Fraction: the ratio of the data packets delivered to the destination to those generated by the CBR source, PDF = (SPD / GPCBR) * 100, where SPD is the number of packets delivered to the destination and GPCBR is the number of packets generated by the CBR source.
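These two formulas translate directly into code. The sketch below (Python) assumes the packet counters have already been extracted from the simulation output; the counter names and the example numbers are illustrative only.

def throughput(sent_packets, pause_time):
    """TH = SP / PT."""
    return sent_packets / pause_time

def packet_delivery_fraction(delivered_packets, generated_cbr_packets):
    """PDF = (SPD / GPCBR) * 100."""
    return 100.0 * delivered_packets / generated_cbr_packets

# Usage with counters taken from one simulation run.
th = throughput(sent_packets=950, pause_time=10.0)
pdf = packet_delivery_fraction(delivered_packets=940, generated_cbr_packets=1000)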

IV. PERFORMANCE RESULT
Throughput: As indicated in the graph, the Enhanced Light Weight route optimization scheme performs better than Liebsch's and the Light Weight route optimization schemes. In ELWRO the rate of communication of packets is higher with respect to pause time, and packets are transmitted between the CN and the MN faster.

Fig 5 Packet delivery fraction comparison study graph

V. CONCLUSION In this paper, we have introduced the operation of three RO schemes that solve the triangle routing problem and provided the results of performance evaluation. The results of Throughput and Packet Delivery Fraction performance evaluation show that performance of our ELWRO scheme is better than Liebschs route optimization scheme & LWRO scheme. REFERENCES
[1] Choi, Young-Hyun and Chung, Tai Myoung Enhanced light weight route optimization in proxy mobile IPv6 2009 fifth international joint conference on INC,IMS and IDC at internet management technology laboratory. [2] IETF Working Group on Mobile IP. [3] Choi, Young-Hyun and Chung, Tai Myoung.et-al Route Optimization Mechanisms Performance Evaluation in Proxy Mobile IPv6 2009 Fourth International Conference on Systems and Networks Communications. [4] Dutta, A.Das,S et.al Proxy MIP extension for inter-MAG route optimization, Jan2009. [5] RFC-2002 IP Mobility Support. [6] C.E. Perkins, Mobile IP: Design Principles and Practises. [7] Route Optimization Mechanisms Performance Evaluation in Proxy Mobile IPv6 2009 Fourth International Conference on Systems and Networks Communications. [8] J. Lee, et al., "A Comparative Signaling Cost Analysis of Hierarchical Mobile IPv6 and Proxy Mobile IPv6", IEEE PIMRC 2008, pp.1-6, September 2008.

Fig 4 Throughput comparison study graph

Packet Delivery Fraction: As indicated in the graph, the Enhanced Light Weight route optimization scheme performs better than Liebsch's and the Light Weight route optimization schemes; packets are transmitted between the CN and the MN faster.

Image Tracking and Activity Recognition


Navneet Sharma1,Divya Dixit2, Ankur Saxena3 1,2 Students, M.Tech, Galgotias College of Engineering and Technology 3 Guide:- ITS College of Engineering and Technology
1

navneet1979@gmail.com

2 3

divyadivyadixit@rediffmail.com

ankursaxena19jan@rediffmail.com

Abstract:- Motion is a primary visual cue for


humans. Motion alerts our attention mechanisms towards potentially interesting or dangerous moving objects. In order to scrutinize a moving object, we need to follow the object as it moves through the image sequence; this is called visual tracking. In video analysis, the motion of a specified target signal can be tracked using the concept of stochastic gradient systems. Tracking a target object (in the foreground) of the video in this way helps in the following tasks: 1. modeling the velocity of the target as a combination of known and unknown parameters; 2. prediction of future motion; 3. simulation of such movements; 4. summarizing the movements of the target.

I. INTRODUCTION
In recent years, there has been increasing interest in monocular human tracking and activity recognition systems, due to the large number of applications where those features can be used. Standard algorithms are not practical to employ for human tracking because of the computational cost that arises from the high number of degrees of freedom of the human body and from the ambiguity of the images obtained from a single camera. Image tracking and activity recognition are receiving increasing attention among computer scientists due to the wide spectrum of applications where they can be used, ranging from athletic performance analysis to video surveillance. By image tracking we refer to the ability of a computer to recover the position and orientation of an object from a sequence of images. There have been several different approaches to allow computers to automatically derive the kinematic pose and activity from image sequences. This paper focuses on monocular (i.e., single-camera) tracking and activity recognition systems. Due to the complexity of the problem space and the ambiguity of the images obtained from the camera, standard algorithms are extremely computationally expensive.

Constraints on the configuration of the human body can be used to reduce this complexity. The constraints can be deduced from demonstration, based on human performance of different activities. A human tracking system is developed using these constraints and then evaluated. The fact that the constraints are based on activities allows the activity the human is performing to be inferred while the tracking is being done. In this paper, a methodology is presented for tracking a foreground object in a suitably selected video. The motion of this foreground is assumed to be non-linear in time and is hence modeled by stochastic differential equations [1]. These equations include the following: 1. a known parameter (which is a non-linear function of the 2-D drift velocity); 2. an unknown parameter, which appropriately models the velocity of the target as a random process.

The solution then calculates a diffusion matrix, which is specific to the target concerned and hence can be used to identify it.

II. PRELIMINARIES
The major motivation for this approach to tracking comes from the potential function approach employed in sports and in the tracking of animals. Following is a brief overview of the application of potential functions to tracking. Consider a typical soccer match: the motion of the ball on the plane of the field may be considered Brownian [3][4]. Figure 1 shows the entire path traced by the ball during a typical play of a football or hockey match.

(r(t_{i+1}) - r(t_i)) / (t_{i+1} - t_i) = μ(r(t_i)) + ε_i
This expression makes it clear that μ has the interpretation of the average velocity at r. Regarding the noise values ε_i, it will be assumed, for the moment, that the x- and y-components are independent normals with mean 0 and variance σ². Residual plots will be employed to assess whether the variance appears constant. Later, this expression will be used for simulation purposes. As the expression is linear in the parameters, so too is its gradient, with the implication that simple least squares may be used to estimate the parameters. The variance may be estimated by the standardized sum of squared residuals of the fit of the last equation. The distribution of the potential function so calculated is shown in the figure.
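A minimal sketch of that least-squares fit is given below (Python with numpy). The drift is modeled as linear in the parameters through a quadratic polynomial basis in (x, y); this particular basis is chosen purely for illustration and is an assumption, not the paper's parameterization.

import numpy as np

def estimate_drift(times, positions):
    """Fit (r(t_{i+1}) - r(t_i)) / (t_{i+1} - t_i) = mu(r(t_i)) + eps_i
    by ordinary least squares, with mu linear in the parameters."""
    r = np.asarray(positions, dtype=float)              # shape (n, 2)
    t = np.asarray(times, dtype=float)
    v = (r[1:] - r[:-1]) / (t[1:] - t[:-1])[:, None]    # empirical velocities
    x, y = r[:-1, 0], r[:-1, 1]
    X = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    beta, *_ = np.linalg.lstsq(X, v, rcond=None)        # one column per coordinate
    residuals = v - X @ beta
    sigma2 = (residuals**2).sum() / (residuals.size - beta.size)
    return beta, sigma2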

Fig 1 . The trajectory of a typical play.

The path in the figure may be viewed as a realization of a stochastic process described by the time ti at which the i-th pass was initiated and the location (x(ti), y(ti)) where the pass was started on the field, for i = 1, ..., I. A statistical question is how to describe such a trajectory, that is, one involving points connected by straight lines. The approach employed here involves potential functions motivated by classical mechanics and advanced calculus [1][2]. It lets one describe the instantaneous velocity at an arbitrary place and time: where will the particle head next, and at what speed? This method has proven helpful in describing the motion of a broad variety of objects. The potential-plus-statistical-model approach allows simulation of future paths: take the fitted potential of the play, symmetrize it about the middle, and use (different) ends of the field for each team; with additional data, one can then simulate the flow of a game with the ball changing sides. The classical potential function of Newtonian gravity is given by -1/|r| with r = (x, y), r denoting a 2-D row vector and |r| = sqrt(x² + y²). This particular potential goes to negative infinity as |r| goes to 0; a particle moving in its field will be attracted to the origin, (0, 0). A class of potential functions of which the Newtonian is a member is provided by powers of |r|. Let r(t) denote the location (x(t), y(t)) and consider the model

Fig 2:- The redder/darker the color is, the smaller the value of the potential estimate H. The ball's track has been superposed. The dashed vertical line is midfield.

III. STOCHASTIC DIFFERENTIAL EQUATION Stochastic differential equations[1][2][3] have been used to model and track the motion of mountain elk. Location of an elk at time t can be given as r(t) = {X(t),Y(t)}. If dX(t) and dY(t) are the incremental step sizes in x and y directions, respectively, then the following model is employed

r(t_{i+1}) - r(t_i) = μ(r(t_i)) (t_{i+1} - t_i) + ε_i
for the changes of position, where the drift μ is determined by the gradient of a specified parametric potential function and the ti are the times of pass initiation. In the fitting and conceptualization it proves more convenient to write the model in continuous time as

dr(t) = μ{r(t), t} dt + D{r(t), t} dB(t),

where μ is the drift vector (resolved in the x and y directions), dt is the incremental time, D{r(t), t} is the diffusion matrix (the unknown), which is a function of location r(t) and time, and dB(t) is a random process which introduces some variability into the deterministic motion. The function H{r(t), t} that describes this force field at location r and time t is referred to as a potential function. When a potential function exists,

the relationship between the function and the drift term is given by μ(r, t) = −∇_r H{r(t), t}. If animals are attracted to or repelled from grassland foraging areas or other habitat features at certain times of the day, then H(r) might be assumed to depend on distances to the habitat feature [2]. For example, if the shortest distance from an animal at r to a foraging area is d(r), then H{r(t), t} = h{d(r), t} for some function h(.). The estimation method to be presented can be motivated by stochastic gradient systems, that is, systems that can be written in the time-invariant case as

dr(t) = −∇V(r(t)) dt + Σ dB(t)

A.Determination of background The background of the image frames extracted from the video is determined, where background updation should take place every t second in the video . The frames are captured at a uniform interval of t = 0.5seconds (500ms) . Thus the background is regularly updated. B.Determination of foreground Foreground identification is performed by considering two consecutive frames and applying clusterization techniques, followed by change detection by subtraction. The centroid at time ti as centroid (xi, yi) of the resulting foreground image is calculated. C.Determination of attractor and repellor for foreground in the frameAttractor and repellor are identified by again considering consecutive set of frames. D.Determination of the difference between the two consecutive framesThe calculation of difference image is performed for every consecutive pair of frames. E.Determine the speed of the moving target foreground as i = ((xi+1-xi )2 + (yi+1-yi)2)1/2/(ti+1-ti). The Instantaneous velocities are calculated at every time instant using the centroid results. F.Setting up a model r(ti+1)-r(ti) = (r(ti))(ti+1-ti) + i (r(ti+1)-r(ti))/ (ti+1-ti) = (r(ti)) + i where i = average velocity at r(ti). The model is formed by calculating the difference between successive centroid locations and considering the previously calculated drift value.

for some differentiable V, with B(t) a p-dimensional Brownian motion and sigma a p by p matrix. This expression is a particular case of the stochastic differential equation (SDE). What distinguishes the traditional SDE work from the present study is that the drift term here has the special form of a gradient of some real-valued function. It will be seen that the modeling situation is simplified when such a gradient is assumed to exist.
IV. THE METHODOLOGY
In this paper, a methodology is presented for tracking a foreground object in a suitably selected video, using the model of stochastic gradient systems and Brownian motion described in the previous sections. The motion of this foreground is assumed to be nonlinear in time and is hence modeled by stochastic differential equations. These equations include a known parameter (which is a non-linear function of the 2-D drift velocity) and an unknown parameter, which appropriately models the velocity of the target as a random process. The solution then calculates a diffusion matrix, which is specific to the target concerned. The motion of the target is categorized into the following: translation (left movement, right movement) and scaling (scaling towards the camera, scaling away from the camera). Tracking a specified target in a video signal can be accomplished in the steps A-F listed above.
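A minimal Euler-Maruyama simulation of a stochastic gradient system of the form just described is sketched below, assuming an invented quadratic potential and noise level; it is meant only to show how future paths could be simulated from a fitted potential, not to reproduce the paper's estimates.

import numpy as np

def grad_V(r, a=0.05):
    # Gradient of an assumed quadratic potential V(r) = a*|r|^2 / 2.
    return a * r

def simulate(r0, sigma=0.8, dt=0.1, n_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    path = [np.asarray(r0, dtype=float)]
    for _ in range(n_steps):
        r = path[-1]
        drift = -grad_V(r) * dt                                # deterministic pull
        noise = sigma * np.sqrt(dt) * rng.standard_normal(2)   # Brownian increment
        path.append(r + drift + noise)
    return np.array(path)

path = simulate(r0=(40.0, 20.0))
print("start:", path[0], "end:", path[-1])   # the particle drifts toward the origin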

REFERENCES
[1] E. Alos, O. Mazet, and D. Nualart. Stochastic calculus with respect to Gaussian processes. Ann. Probab., 29(2):766-801, 2001.
[2] P. Caithamer. The stochastic wave equation driven by fractional Brownian noise and temporally correlated smooth noise. Stoch. Dyn., 5(1):45-64, 2005.
[3] P. Yvergniaux and J. Chollet, Particle Trajectories Modelling Based on a Lagrangian Memory Effect. XXIIIth IAHR Conference, Hydraulics and the Environment, Ottawa, Canada, Aug. 1989.
[4] Y. Saad, Iterative Methods for Sparse Linear Systems. Boston, MA: PWS, 1996.

An Innovative Digital Watermarking Process: A Critical Analysis


Sangeeta Shukla#1, Preeti Pandey*2, Jitendra Singh#3
#1,2,3

Student, M Tech II Sem, SRMS College of Engineering & Technology, Bareilly, India
sangeeta.t17@gmail.com preeti1986pandey@gmail.com jitendravictor@gmail.com

Abstract- The seemingly ambiguous title of this paper, using the terms criticism and innovation in concord, signifies the imperative of every organisation's need for security within the competitive domain. Where organisational security criticism and innovativeness were traditionally considered antonymous, the assimilation of these two seemingly contradictory notions is fundamental to the assurance of long-term organisational prosperity. Organisations are required, now more than ever, to grow and to be secured, with their innovation capability rendering consistent innovative outputs. This paper describes research conducted to consolidate the principles of digital watermarking and identify the fundamental components that constitute organisational security capability. The process of conducting a critical analysis is presented here. A brief description of the field of digital watermarking is provided, followed by a description of the advantages and disadvantages that were considered to evaluate the process. The paper concludes with a summary of the analysis and potential findings for future research.
Keywords- Digital Watermarking, Intellectual property protection, Steganography, Copyright protection, Covert communication

In digital watermarking, the signal may be audio, pictures, or video. If the signal is copied, then the information is also carried in the copy. A signal may carry several different watermarks at the same time.

Fig 1. A digital watermarked picture
In visible digital watermarking, the information is visible in the picture or video. Typically, the information is text or a logo which identifies the owner of the media. The image on the right has a visible watermark. When a television broadcaster adds its logo to the corner of transmitted video, this also is a visible watermark.

Introduction
Digital watermarking is the process of embedding information into a digital signal in a way that is difficult to remove.

Fig 2. General Digital Watermarking Process
In invisible digital watermarking, information is added as digital data to audio, picture, or video, but it cannot be perceived as such (although it may be possible to detect that some amount of information is hidden in the signal). The watermark may be intended for widespread use and is thus made easy to retrieve, or it may be a form of steganography, where a party communicates a secret message embedded in the digital signal. In both cases, as in visible watermarking, the objective is to attach ownership or other descriptive information to the signal in a way that is difficult to remove. It is also possible to use hidden embedded information as a means of covert communication between individuals.

The information to be embedded in a signal is called a digital watermark, although in some contexts the phrase digital watermark means the difference between the watermarked signal and the cover signal. The signal where the watermark is to be embedded is called the host signal. A watermarking system is usually divided into three distinct steps: embedding, attack, and detection. In embedding, an algorithm accepts the host and the data to be embedded, and produces a watermarked signal. The watermarked digital signal is then transmitted or stored, usually transmitted to another person. If this person makes a modification, this is called an attack. While the modification may not be malicious, the term attack arises from copyright protection applications, where pirates attempt to remove the digital watermark through modification. There are many possible modifications, for example lossy compression of the data (in which resolution is diminished), cropping an image or video, or intentionally adding noise. Detection (often called extraction) is an algorithm which is applied to the attacked signal to attempt to extract the watermark from it. If the signal was unmodified during transmission, then the watermark is still present and it may be extracted. In robust digital watermarking applications, the extraction algorithm should be able to produce the watermark correctly even if the modifications were strong. In fragile digital watermarking, the extraction algorithm should fail if any change is made to the signal.
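To make the embed-attack-detect pipeline concrete, here is a toy Python sketch using a least-significant-bit (LSB) scheme on a random grayscale image; LSB is only one (fragile) embedding algorithm chosen for brevity, not a method advocated by the papers reviewed here, and the payload size and noise attack are invented.

import numpy as np

def embed(host, bits):
    wm = host.copy().ravel()
    wm[:len(bits)] = (wm[:len(bits)] & 0xFE) | bits    # overwrite LSBs with payload
    return wm.reshape(host.shape)

def extract(signal, n_bits):
    return signal.ravel()[:n_bits] & 1                 # read the LSBs back

rng = np.random.default_rng(1)
host = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)   # host signal
payload = rng.integers(0, 2, size=128, dtype=np.uint8)       # watermark bits

marked = embed(host, payload)
print("clean recovery :", np.array_equal(extract(marked, 128), payload))

# A mild "attack": additive noise. A fragile scheme like LSB usually breaks here,
# which is exactly the robust-versus-fragile distinction discussed in the text.
attacked = np.clip(marked.astype(int) + rng.integers(-2, 3, marked.shape), 0, 255)
print("after attack   :", np.array_equal(extract(attacked.astype(np.uint8), 128), payload))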

Applications
Digital watermarking may be used for a wide range of applications, such as:
- Copyright protection
- Source tracking (different recipients get differently watermarked content)
- Broadcast monitoring (television news often contains watermarked video from international agencies)
- Covert communication

Digital watermarking life-cycle phases

Classification:
A digital watermark is called robust with respect to transformations if the embedded information may be detected reliably from the marked signal, even if degraded by any number of transformations. A digital watermark is called robust if it resists a designated class of transformations. Robust watermarks may be used in copy protection applications to carry copy and no access control information.

Fig 3. Digital watermarking life-cycle phases, with embedding, attacking, and detection and retrieval functions.

Typical image degradations are JPEG compression, rotation, cropping, additive noise, and quantization. For video content, temporal modifications and MPEG compression often are added to this list. A digital watermarking method is said to be of quantization type if the marked signal is obtained by quantization. Quantization watermarks suffer from low robustness, but have a high information capacity due to rejection of host interference. A digital watermark is called imperceptible if the watermarked content is perceptually equivalent to the original, unwatermarked content [1]. A digital watermark is called perceptible if its presence in the marked signal is noticeable but non-intrusive. A digital watermarking method is referred to as spread-spectrum if the marked signal is obtained by an additive modification. Spread-spectrum watermarks are known to be modestly robust, but also to have a low information capacity due to host interference. A digital watermarking method is referred to as amplitude modulation if the marked signal is embedded by an additive modification similar to the spread-spectrum method, but embedded specifically in the spatial domain. Reversible data hiding is a technique which enables images to be authenticated and then restored to their original form by removing the digital watermark and replacing the image data that had been overwritten. Digital watermarking for relational databases has emerged as a candidate solution to provide copyright protection, tamper detection, traitor tracing, and maintaining the integrity of relational data.
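A hedged sketch of the spread-spectrum idea mentioned above: a key-generated pseudo-random +/-1 sequence is added to the host with a small amplitude, and detection is by correlation. The amplitude, signal length and decision threshold are illustrative assumptions, not values from any of the reviewed papers.

import numpy as np

rng = np.random.default_rng(7)
host = rng.normal(0.0, 10.0, size=4096)    # host signal coefficients
key = 1234                                 # secret key -> spreading sequence
alpha = 1.0                                # embedding strength

def spreading_sequence(key, n):
    # Pseudo-random +/-1 sequence reproducible from the key.
    return np.where(np.random.default_rng(key).random(n) < 0.5, -1.0, 1.0)

marked = host + alpha * spreading_sequence(key, host.size)   # additive embedding

def detect(signal, key, alpha):
    # Correlation detector: roughly alpha if the mark is present, near 0 otherwise.
    corr = np.mean(signal * spreading_sequence(key, signal.size))
    return corr, bool(corr > 0.5 * alpha)

print("marked   :", detect(marked, key, alpha))
print("unmarked :", detect(host, key, alpha))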

set, some documents were identified as core, directly addressing the subject of digital watermarking. These documents were sourced from many locations, including peer-reviewed journals, conference proceedings, white papers, electronic books, etc.
The core documents were further subdivided into 9 groups. The topics were tabulated and were used to perform critical analysis. The first step was a detailed manual analysis and interpretation of the documents thus extracted (supplementing the initial literature study) and the second step was a critical approach towards every document for analysis.

Critical analysis
The analysis was done on the basis of the topics indexed; the present work, the future scope, the methodology used and the conclusion were tabulated to conclude the report on the analysis of digital watermarking, considering the potential advantages and disadvantages of this technology.

Advantages:
Content Verification: Invisible digital watermarks allow the recipient to verify the author's identity. This might be very important with certain scientific visualizations, where a maliciously altered image could lead to costly mistakes. For example, an oil company might check for an invisible watermark in a map of oil deposits to ensure that the information is trustworthy. Watermarks provide a secure electronic signature. Determine rightful ownership: Scientific visualizations are not just graphs of data; they are often artistic creations. It is therefore entirely appropriate to copyright these images. If an author is damaged by unauthorized use of such an image, the author is first obligated to prove rightful ownership. Invisible digital watermarking provides another method of proving ownership (in addition to posting a copyright notice and registering the image). Track unlawful use: This technology might allow an author to track how his or her images are being used. Automated software would scan randomly selected images on the Internet

Literature Review
The literature surveyed prior to the research process, and throughout the duration of this project, comprised several hundred documents. From this large literature

(or any digital network) and flag those images which contain the author's watermark. This covert surveillance of network traffic would detect copyright violations, thereby reducing piracy. Avoid malicious removal: The problem with copyright notices is that they are easily removed by pirates. However, an invisible digital watermark is well hidden and therefore very difficult to remove. Hence, it could foil a pirate's attack.

TABLE I Critical Analysis of the Literature Reviewed


Topic: Digital Watermarking for 3D Polygons using Multiresolution Wavelet Decomposition
Present work: The proposed watermarking method is based on the wavelet transform (WT) and multiresolution representation (MRR) of the polygonal model.
Future work: In future the watermarking should be made more robust to possible geometric operations, noise imposition and intentional attack. The embedding capacity should also be increased, and the processing time decreased. A method to extract the watermark without the original polygon should be proposed. It must be expanded to the free-form surface model or solid model, which has to be more secret than the polygonal model in the CAD/CAM area.
Method: First the requirements and features of the proposed watermarking method are discussed. Second the mathematical formulations of WT and MRR of the polygonal model are shown. Third the algorithm for embedding and extracting the watermark is proposed.
Conclusion: Finally, the effectiveness of the proposed watermarking method is shown through several simulation results.

Disadvantages
Degrade image quality: Even an invisible watermark will slightly alter the image during embedding. Therefore, they may not be appropriate for images which contain raw data from an experiment. For example, embedding an invisible watermark in a medical scan might alter the image enough to lead to false diagnosis. May lead to unlawful ownership claims for images not yet watermarked: While invisible digital watermarks are intended to reduce piracy, their widespread acceptance as a means of legal proof of ownership may actually have the opposite effect. This is because a pirate could embed their watermark in older images not yet containing watermarks and make a malicious claim of ownership. Such claims might be difficult to challenge. No standard system in place: While many watermarking techniques have been proposed, none of them have become the standard method. Furthermore, none of these schemes have yet been tested by a trial case in the courts. Therefore, they do not yet offer any real copyright protection. May become obsolete: This technology only works if the watermarks cannot be extracted from an image. However, technological advances might allow future pirates to remove the watermarks of today. It is very difficult to ensure that a cryptographic method will remain secure for all time.

TABLE I (continued)

Topic: A Practical Method for Watermarking Java Programs
Present work: A practical method that discourages program theft by embedding Java programs with a digital watermark. Embedding a program developer's copyright notation as a watermark in Java class files will ensure the legal ownership of class files.
Future work: In the future, in order to make watermarks more tamper-resistant, we are to apply error correcting code to our watermarking method.
Method: The embedding method is indiscernible by program users, yet enables us to identify an illegal program that contains stolen class files.
Conclusion: The result of the experiment to evaluate our method showed most of the watermarks (20 out of 23) embedded in class files survived two kinds of attacks that attempt to erase watermarks: an obfuscator attack and a decompile-recompile attack.

Topic: A Digital Audio Watermark Embedding Algorithm
Present work: Proposed an audio digital watermarking algorithm based on the wavelet transform and the complex cepstrum transform (CCT), combined with a human auditory model and using the masking effect of human ears.
Future work: This algorithm does not compromise the robustness and inaudibility of the watermark effectively.
Method: This algorithm is realized to embed a binary image watermark into the audio signal and improved the imperceptibility of the watermarks.
Conclusion: Experimental results show that this algorithm has better robustness against common signal processing such as noise, filtering, resampling and lossy compression.

Topic: Soft IP Protection: Watermarking HDL Codes
Present work: Leverage the unique features of Verilog HDL design to develop watermarking techniques. These techniques can protect both new and existing Verilog designs.
Future work: We are currently collecting and building more Verilog and VHDL circuits to test our approach. We are also planning to develop CAD tools for HDL protection.
Method: We watermark SCU-RTL & ISCAS benchmark Verilog circuits, as well as an MP3 decoder. Both original and watermarked designs are implemented on ASICs & FPGAs.
Conclusion: The results show that the proposed techniques survive the commercial synthesis tools and cause little design overhead in terms of area/resources, delay and power.

Topic: Secure Spread Spectrum Watermarking for Multimedia
Present work: This paper presents a secure (tamper-resistant) algorithm for watermarking images, and a methodology for digital watermarking that may be generalized to audio, video, and multimedia data.
Future work: The experiments presented are preliminary, and should be expanded in order to validate the results. We are conducting ongoing work in this area. Further, the degree of precision of the registration procedures used in undoing affine transforms must be characterized precisely across a large test set of images.
Method: Further, the use of Gaussian noise ensures strong resilience to multiple-document, or collusional, attacks.
Conclusion: Experimental results are provided to support these claims, along with an exposition of pending open problems.

Topic: Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise
Present work: A communications perspective on digital watermarking is used to compute upper performance limits on blind digital watermarking for simple AWGN attacks and attacks by amplitude scaling and additive white noise.
Future work: An important result is that the practical ST-SCS watermarking scheme achieves at least 40% of the capacity of ICS, which can still be improved by further research.
Method: We show that this case can be translated into effective AWGN attacks, which enables a straightforward capacity analysis based on the previously obtained watermark capacities for AWGN attacks. Watermark capacity for different theoretical and practical blind watermarking schemes is analyzed.
Conclusion: Analysis shows that the practical ST-SCS watermarking achieves at least 40% of the capacity of an ideal blind watermarking scheme.

Topic: Digital Watermark Mobile Agent
Present work: A digital watermark agent travels from host to host on a network and acts like a detective that detects watermarks and collects evidence of any misuse. Furthermore, we developed an active watermark method which allows the watermarked documents themselves to report their own usage to an authority if detected.
Future work: The second component that we are developing is a data-mining and data-fusion module to intelligently select the next migration hosts based on multiple sources of information such as related business categories and results of web search engines.
Method: This system enables an agency to dispatch digital watermark agents to agent servers, and an agent can perform various tasks on the server. Once all the actions have been taken, a report will be sent to the agency's database and the agent can continue to travel to another agent server.
Conclusion: Development of an active watermark method which allows the watermarked documents themselves to report their own usage to an authority if detected.

Topic: Analysis of Watermarking Techniques for Graph Coloring Problem
Present work: A theoretical framework to evaluate watermarking techniques for intellectual property protection (IPP). Based on this framework, we analyze two watermarking techniques for the graph coloring (GC) problem.
Future work: __
Method: Since credibility and overhead are the most important criteria for any efficient watermarking technique, formulae are derived that illustrate the tradeoff between credibility and overhead.
Conclusion: Asymptotically we prove that arbitrarily high credibility can be achieved with at most 1-color overhead for both proposed watermarking techniques.

Topic: Practical capacity of digital watermark as constrained by reliability
Present work: A simplified watermark scheme is postulated. In the scheme, detection yields a multidimensional vector, in which each dimension is assumed to be i.i.d. (independent and identically distributed) and to follow the Gaussian distribution.
Future work: Some more experiments can be performed.
Method: Theoretical analysis of watermark capacity. Reliability is represented by three kinds of error rates: the false positive error rate, the false negative error rate, and the bit error rate.
Conclusion: Experiments were performed to verify the theoretic analysis, and it was shown that this approach yields a good estimate of the capacity of a watermark.

Conclusion
This paper concludes with a discussion on the relevance and applicability of innovative digital watermarking to digital piracy and on potential further research. The first point pertains to the requirement and the need for digital watermarking as an answer to digital piracy. The economic impact of digital piracy on the media industry is a credible threat to the sustainability of the industry. The advantages of digital watermarking are the following: content verification, determination of rightful ownership, tracking of unlawful use, and avoidance of malicious removal. The disadvantages are: degraded image quality, possible unlawful ownership claims for images not yet watermarked, no standard system in place, and possible obsolescence.

References
[24] A Practical Method for Watermarking Java Programs, The 24th Computer Software and Applications Conference (COMPSAC 2000), Taipei, Taiwan, Oct. 2000.
[25] A Digital Audio Watermark Embedding Algorithm, School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China.
[26] Yue Sun, Hong Sun, and Tian-ren Yao, Digital audio watermarking algorithm based on quantization in wavelet domain. Journal of Huazhong University of Science and Technology, 2002, Vol. 30(5), pp. 12-15.
[27] Hong-yi Zhao, Chang-nian Zhang, Digital signal processing and realization in MATLAB. Publishing company of chemical industry, Beijing, 2001, pp. 129-131.
[28] Secure and Robust Digital Watermarking on Grey Level Images, SERC Journals IJAST, Vol. 11, 1.
[29] Secure Spread Spectrum Watermarking for Multimedia, IEEE Transactions on Image Processing, Vol. 6, No. 12, December 1997.
[30] Digital Watermarking, Chaelynne M. Wolak (wolakcha@scsi.nova.edu), DISS 780, School of Computer and Information Sciences, Nova Southeastern University, July 2000.
[31] Digital watermarking: a technology overview, Hebah H.O. Nasereddin, Middle East University, P.O. Box 144378, Code 11814, Amman, Jordan. IJRRAS 6(1), January 2011, www.arpapress.com/Volumes/Vol6Issue1/IJRRAS_6_1_10.pdf
[32] Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise, Joachim J. Eggers, Bernd Girod, Robert Bauml, 4th Intl. ITG Conference on Source and Channel Coding, Berlin, Jan. 28-30, 2002.
[33] Digital Watermark Mobile Agents, Jian Zhao and Chenghui Luo, Fraunhofer Center for Research in Computer Graphics, Inc., 321 South Main Street, Providence.
[34] Digital Watermarks in Scientific Visualization, Wayne Pafko, Term Paper (Final Version), SciC 8011, Paul Morin, May 8, 2000.

SURVEY ON DECISION TREE ALGORITHM


Shweta Rana shweta_rana1@yahoo.co.in
Guided By:

Jyoti Shukla jyoti.shukla@gmail.com


DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY, AMITY UNIVERSITY,

SECTOR-125, NOIDA(U.P)

Abstract
This paper aims to study the various classification algorithms used in data mining. All these algorithms are based on constructing a decision tree for classifying the data but basically differ from each other in the methods employed for selecting splitting attribute and splitting conditions. The various algorithms which will be studied are: CART (Classification and regression tree), ID3 and C4.5.

systems as they are brought on-line. Data mining tools can analyze massive databases to deliver answers to questions such as "Which clients are most likely to respond to my next promotional mailing, and why?", when implemented on high performance client/server or parallel processing computers.

CLASSIFICATION ALGORITHMS
A data mining function that assigns items in a collection to target categories or classes is known as classification. The goal of classification is to accurately predict the target class for each case in the data. The classification task begins with a data set in which the class assignments are known for each case. The classes are the values of the target. The classes are distinct and do not exist in an ordered relationship to each other; ordered values would indicate a numerical, rather than a categorical, target. A predictive model with a numerical target uses a regression algorithm, not a classification algorithm. For example, customers might be classified as either users or non-users of a loyalty card. The predictors would be the attributes of the customers: age, gender, address, products purchased, and so on. The target would be yes or no (whether or not the customer used a loyalty card). In the model build (training) process, a classification algorithm finds relationships between the values of the predictors and the values of the target. Different classification algorithms use different techniques for finding relationships. These relationships are summarized in a model, which can then be applied to a different data set in which the class assignments are unknown. Definition: Given a database D = {t1, t2, t3, ..., tn} of tuples (data records) and a set of classes C = {c1, c2, c3, ..., cm}, the classification problem is to define a mapping f: D -> C where each ti is assigned to one class. A class cj contains

Introduction
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems. Data mining tools can answer business questions that

were traditionally too time-consuming to resolve. They search databases for hidden patterns, finding predictive information that experts may have missed because it lies outside their expectations. Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and

precisely those tuples mapped to it, that is, cj = { ti : f(ti) = cj, 1 <= i <= n, and ti belongs to D } [4].
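A small illustration of the mapping f: D -> C on loyalty-card-style data follows, assuming the scikit-learn library is available; the toy tuples and attribute choices are invented for the example, not taken from any cited study.

from sklearn.tree import DecisionTreeClassifier

# Each tuple t_i: (age, products_purchased_per_month); class: 1 = card user, 0 = not.
D = [[23, 2], [35, 8], [45, 10], [19, 1], [52, 12], [31, 3]]
c = [0, 1, 1, 0, 1, 0]

model = DecisionTreeClassifier(criterion="entropy", random_state=0)
model.fit(D, c)                      # learn the mapping f from labelled tuples

new_customers = [[40, 9], [22, 2]]
print(model.predict(new_customers))  # assigns each new tuple to a class in C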

Different types of classification algorithms


5. Statistical-based algorithms
Regression
As with all regression techniques, we assume the existence of a single output variable and one or more input variables. The output variable is numerical. The general regression tree building methodology allows input variables to be a mixture of continuous and categorical variables. A decision tree is generated where each decision node in the tree contains a test on some input variable's value. The terminal nodes of the tree contain the predicted output variable values. A regression tree may be considered a variant of decision trees, designed to approximate real-valued functions instead of being used for classification tasks.

also the desired classification for each item. As a result, the training data become the model. When a classification is to be made for a new item, its distance to each item in the training set is determined, and only the K closest entries in the training set are considered; the new item is then placed in the class that contains the most items from this set of K closest items.
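A bare-bones rendering of the K-nearest-neighbour rule just described, with invented training points and K; the training data itself serves as the model and a new item takes the majority class of its K closest items.

from collections import Counter
import math

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((4.0, 4.2), "B"), ((3.8, 4.0), "B")]

def knn_classify(item, train, k=3):
    # Sort training items by Euclidean distance and take a majority vote of the K closest.
    nearest = sorted(train, key=lambda tc: math.dist(item, tc[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(knn_classify((1.1, 0.9), train))   # 'A'
print(knn_classify((3.9, 4.1), train))   # 'B'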

7. Decision Tree-Based Algorithm


The decision tree approach is most useful in classification problems. Here in this technique a tree is constructed to model the classification process. Once the tree is built , it is applied to each tuple in the database and results in classification for that tuple . It involves two basic steps : building the tree and applying the tree to the database.

ID3
The ID3 technique is based on information theory and attempts to minimize the expected number of comparisons. Its basic strategy is to choose splitting attributes with the highest gain first. The amount of information associated with an attribute value is related to the probability of its occurrence. The concept used here to quantify information is called entropy.

Bayesian Classification
The Naive Bayes classifier assumes that the effect of a variable value on a given class is independent of the values of the other variables. This assumption is called class conditional independence. It is made to simplify the computation and in this sense is considered to be "naive". This is a fairly strong assumption and is often not applicable. However, bias in estimating the probabilities often may not matter in practice; it is the order of the probabilities, not their exact values, that determines the classifications.
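A tiny categorical Naive Bayes sketch of the class-conditional independence assumption, with an invented weather-style data set and Laplace smoothing added for numerical stability; it is illustrative only.

from collections import Counter, defaultdict

data = [(("sunny", "hot"), "no"), (("sunny", "mild"), "no"),
        (("rain", "mild"), "yes"), (("rain", "cool"), "yes"),
        (("overcast", "hot"), "yes"), (("overcast", "cool"), "yes")]

def train_nb(data):
    class_counts = Counter(c for _, c in data)
    cond = defaultdict(Counter)                  # (feature index, class) -> value counts
    for x, c in data:
        for j, v in enumerate(x):
            cond[(j, c)][v] += 1
    return class_counts, cond

def predict_nb(x, class_counts, cond, alpha=1.0):
    # P(class | x) is proportional to P(class) * product over j of P(x_j | class).
    n = sum(class_counts.values())
    best = None
    for c, cc in class_counts.items():
        p = cc / n
        for j, v in enumerate(x):
            p *= (cond[(j, c)][v] + alpha) / (cc + alpha * 3)   # Laplace smoothing
        best = max(best, (p, c)) if best else (p, c)
    return best[1]

model = train_nb(data)
print(predict_nb(("rain", "hot"), *model))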

C4.5
It is an improvement of ID3. Here classification is performed via either decision trees or rules generated from them. For splitting purposes, it uses the largest gain ratio that ensures a larger than average information gain. This is to compensate for the fact that the gain ratio value is skewed towards splits where the size of one subset is close to that of the starting one.

6. Distance-based algorithms
Simple approach


In this approach, if we have a representation of each class, we can perform classification by assigning each tuple to the class to which it is most similar. A simple classification technique is to place each item in the class whose center it is most similar to; a predefined pattern can be used to represent the class. Once the similarity measure is defined, each item to be classified is compared to the predefined pattern, and the item is then placed in the class with the largest similarity value.

CART
Classification and Regression Tree (CART) is a technique that generates a binary decision tree. As with ID3, entropy is used as a measure to choose the best splitting attribute and criterion. Unlike ID3, however, where a child is created for each subcategory, only two children are created. The splitting is performed around what is found to be the best split point. The tree stops growing when no split will improve the performance.

8. Neural Network-Based algorithms
A model representing how to classify any given database is constructed with neural networks, just as with decision trees. The activation functions typically are sigmoid. When a tuple must be classified, certain attribute values from that tuple are input into the directed graph at the corresponding source nodes.
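A one-neuron illustration of the idea: attribute values enter at the source nodes, are combined with weights, and a sigmoid activation yields the class score. The weights, bias and inputs are invented for the example.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify(tuple_values, weights, bias, threshold=0.5):
    # Weighted sum of the tuple's attribute values followed by a sigmoid activation.
    z = sum(w * x for w, x in zip(weights, tuple_values)) + bias
    return ("class 1" if sigmoid(z) >= threshold else "class 0"), sigmoid(z)

print(classify((0.4, 0.9), weights=(1.5, -2.0), bias=0.3))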

K Nearest Neighbors
The KNN technique assumes that the entire training set includes not only the data in the set but

The ID3 Algorithm


ID3 is a non incremental algorithm, meaning it derives its classes from a fixed set of training instances. An incremental algorithm revises the current concept definition, if necessary, with a new sample. The classes created by ID3 are inductive, that is, given a small set of training instances, the specific classes created by ID3 are expected to work for all future instances. The distribution of the unknowns must be the same as the test cases. Induction classes cannot be proven to work in every case since they may classify an infinite number of instances. Note that ID3 (or any inductive algorithm) may misclassify data.

Entropy measures the amount of information in an attribute. Given a collection S of examples belonging to c classes, Entropy(S) = sum of -p(I) log2 p(I), where p(I) is the proportion of S belonging to class I, the sum is taken over the c classes, and log2 is log base 2. Note that S is not an attribute but the entire sample set.
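A direct transcription of the entropy formula above into Python, applied to an invented sample set of nine positive and five negative examples.

import math
from collections import Counter

def entropy(S):
    # Entropy(S) = sum over classes of -p(I) * log2 p(I).
    n = len(S)
    return -sum((count / n) * math.log2(count / n) for count in Counter(S).values())

S = ["+"] * 9 + ["-"] * 5          # 9 positive and 5 negative examples
print(round(entropy(S), 4))        # about 0.9403 bits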

Algorithm
ID3 (Examples, Target_Attribute, Attributes)

Data Description
The sample data used by ID3 has certain requirements, which are:

- Attribute-value description: the same attributes must describe each example and have a fixed number of values.
- Predefined classes: an example's attributes must already be defined, that is, they are not learned by ID3.
- Discrete classes: classes must be sharply delineated; continuous classes broken up into vague categories such as a metal being "hard, quite hard, flexible, soft, quite soft" are suspect.
- Sufficient examples: since inductive generalization is used (i.e. not provable), there must be enough test cases to distinguish valid patterns from chance occurrences.

Create a root node for the tree.
If all examples are positive, return the single-node tree Root, with label = +.
If all examples are negative, return the single-node tree Root, with label = -.
If the number of predicting attributes is empty, then return the single-node tree Root, with label = the most common value of the target attribute in the examples.
Otherwise begin:
A = the attribute that best classifies the examples.
Decision tree attribute for Root = A.
For each possible value vi of A:
Add a new tree branch below Root, corresponding to the test A = vi.
Let Examples(vi) be the subset of examples that have the value vi for A.

Attribute Selection
How does ID3 decide which attribute is the best? A statistical property, called information gain, is used. Gain measures how well a given attribute separates the training examples into the targeted classes. The attribute with the highest information gain (information being the most useful for classification) is selected. In order to define gain, we first borrow an idea from information theory called entropy.

If Examples(vi) is empty, then below this new branch add a leaf node with label = the most common target value in the examples.
Else below this new branch add the subtree ID3(Examples(vi), Target_Attribute, Attributes - {A}).
End.
Return Root.
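A compact, runnable rendering of the ID3 pseudocode above (a sketch rather than Quinlan's original code); the small play-tennis style data set is invented, and empty branches do not arise in this tiny example.

import math
from collections import Counter

def entropy(examples, target):
    n = len(examples)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(e[target] for e in examples).values())

def info_gain(examples, attr, target):
    # Entropy of the whole set minus the weighted entropy of the value subsets.
    n = len(examples)
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e for e in examples if e[attr] == value]
        remainder += len(subset) / n * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, target, attributes):
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:                      # all examples in one class
        return labels[0]
    if not attributes:                             # no attributes left to test
        return Counter(labels).most_common(1)[0][0]
    A = max(attributes, key=lambda a: info_gain(examples, a, target))
    tree = {A: {}}
    for value in {e[A] for e in examples}:
        subset = [e for e in examples if e[A] == value]
        tree[A][value] = id3(subset, target, [a for a in attributes if a != A])
    return tree

data = [
    {"outlook": "sunny",    "windy": False, "play": "no"},
    {"outlook": "sunny",    "windy": True,  "play": "no"},
    {"outlook": "overcast", "windy": False, "play": "yes"},
    {"outlook": "rain",     "windy": False, "play": "yes"},
    {"outlook": "rain",     "windy": True,  "play": "no"},
]
print(id3(data, "play", ["outlook", "windy"]))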

C4.5 ALGORITHM

C4.5 is an algorithm used to generate a decision tree, developed by Ross Quinlan. C4.5 is an extension of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason C4.5 is often referred to as a statistical classifier. This section explains one of the algorithms used to create univariate DTs. This one, called C4.5, is based on the ID3 algorithm and tries to find small (or simple) DTs. We start by presenting some premises on which this algorithm is based, and afterwards we discuss the inference of the weights and tests in the nodes of the trees.

At each node of the tree, C4.5 chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. Its criterion is the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data. The attribute with the highest normalized information gain is chosen to make the decision. The C4.5 algorithm then recurses on the smaller sublists. This algorithm has a few base cases:
- All the samples in the list belong to the same class. When this happens, it simply creates a leaf node for the decision tree saying to choose that class.
- None of the features provide any information gain. In this case, C4.5 creates a decision node higher up the tree using the expected value of the class.
- An instance of a previously unseen class is encountered. Again, C4.5 creates a decision node higher up the tree using the expected value.
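The normalized criterion can be sketched as follows: information gain divided by the split information of the candidate attribute (the "gain ratio"); the labels and the partition used in the example are invented.

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(labels, partition):
    # partition: list of label lists, one per value of the candidate attribute.
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in partition)
    gain = entropy(labels) - remainder
    split_info = -sum((len(part) / n) * math.log2(len(part) / n)
                      for part in partition if part)
    return gain / split_info if split_info > 0 else 0.0

labels = ["yes"] * 9 + ["no"] * 5
partition = [["yes"] * 2 + ["no"] * 3, ["yes"] * 4, ["yes"] * 3 + ["no"] * 2]
print(round(gain_ratio(labels, partition), 3))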

Construction
Some premises guide this algorithm, such as the following: if all cases are of the same class, the tree is a leaf and so the leaf is returned labeled with this class; for each attribute, calculate the potential information provided by a test on the attribute (based on the probabilities of each case having a particular value for the attribute). Also calculate the gain in information that would result from a test on the attribute (based on the probabilities of each case with a particular value for the attribute being of a particular class); depending on the current selection criterion, find the best attribute to branch on.

CART ALGORITHM
Classification and regression trees (CART) is a nonparametric decision tree learning technique that produces either classification or regression trees, depending on whether the dependent variable is categorical or numeric, respectively. Trees are formed by a collection of rules based on the values of certain variables in the modeling data set:
- Rules are selected based on how well splits based on the variables' values can differentiate observations with respect to the dependent variable.
- Once a rule is selected and splits a node into two, the same logic is applied to each child node (i.e. it is a recursive procedure).
- Splitting stops when CART detects that no further gain can be made, or some pre-set stopping rules are met.
- Each branch of the tree ends in a terminal node.
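A minimal sketch of CART-style binary splitting on a continuous predictor follows. It uses the Gini index, the impurity measure most commonly associated with CART (the survey text above mentions entropy; either can serve as the splitting criterion), and the one-dimensional data set is invented.

from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    # Try every midpoint between sorted values; return the split minimizing
    # the weighted Gini impurity of the two child nodes.
    pairs = sorted(zip(xs, ys))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        v = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= v]
        right = [y for x, y in pairs if x > v]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (v, score)
    return best

xs = [2.1, 2.9, 3.4, 5.6, 6.0, 6.8]
ys = ["A", "A", "A", "B", "B", "B"]
print(best_split(xs, ys))   # split point near 4.5 with weighted impurity 0.0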

Algorithm
C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set S = s1, s2, ... of already classified samples. Each sample si is a vector (x_i1, x_i2, ..., x_ip), where the x_ij represent attributes or features of the sample. The training data is augmented with a vector C = c1, c2, ..., where the ci represent the class to which each sample belongs. [1]

Each observation falls into one and exactly one terminal node, and each terminal node is uniquely defined by a set of rules.

Tree Growing Process


The basic idea of tree growing is to choose a split among all the possible splits at each node so that the resulting child nodes are the purest. In this algorithm, only univariate splits are considered; that is, each split depends on the value of only one predictor variable. All possible splits consist of the possible splits of each predictor:
- For each continuous and ordinal predictor, sort its values from the smallest to the largest. For the sorted predictor, go through each value to examine each candidate split point (call it v; if x <= v, the case goes to the left child node, otherwise it goes to the right) to determine the best one. The best split point is the one that maximizes the splitting criterion when the node is split according to it. The definition of the splitting criterion is given below.
- For each nominal predictor, examine each possible subset of categories (call it A; if x is in A, the case goes to the left child node, otherwise it goes to the right) to find the best split.
Then find the node's best split: among the best splits found in the first step, choose the one that maximizes the splitting criterion, and split the node using this best split if the stopping rules are not satisfied.

Splitting criteria and impurity measures: At node t, the best split s is chosen to maximize a splitting criterion delta_i(s, t). When the impurity measure for a node can be defined, the splitting criterion corresponds to a decrease in impurity. In SPSS products, delta_I(s, t) = p(t) delta_i(s, t) is referred to as the improvement.

Stopping rules: Stopping rules control whether the tree growing process should be stopped or not. The following stopping rules are used:
- If a node becomes pure, that is, all cases in a node have identical values of the dependent variable, the node will not be split.
- If all cases in a node have identical values for each predictor, the node will not be split.
- If the current tree depth reaches the user-specified maximum tree depth limit value, the tree growing process will stop.
- If the size of a node is less than the user-specified minimum node size value, the node will not be split.
- If the split of a node results in a child node whose node size is less than the user-specified minimum child node size value, the node will not be split.

Conclusion and Future Work
Decision tree induction is one of the classification techniques used in decision support systems and machine learning. With the decision tree technique the training data set is recursively partitioned using a depth-first (Hunt's method) or breadth-first greedy technique (Shafer et al., 1996) until each partition is pure or belongs to the same class/leaf node (Hunt et al., 1966 and Shafer et al., 1996). The decision tree model is preferred among other classification algorithms because it is an eager learning algorithm and easy to implement. Decision tree algorithms can be implemented serially or in parallel. Regardless of the implementation method adopted, most decision tree algorithms in the literature are constructed in two phases: a tree growth phase and a tree pruning phase. Tree pruning is an important part of decision tree construction as it is used to improve the classification/prediction accuracy by ensuring that the constructed tree model does not overfit the data set (Mehta et al., 1996). In this study we focused on serial implementations of decision tree algorithms, which are memory resident, fast and easy to implement compared to parallel implementations, which are complex to implement. The disadvantages of serial decision tree implementations are that they are not scalable (disk resident) and that they cannot exploit the underlying parallel architecture of computer system processors. Our experimental analysis of the performance of the commonly used decision tree algorithms using the Statlog data sets (Michie et al., 1994) shows that there is a direct relationship between the execution time in building the tree model and the volume of data records, and an indirect relationship between the execution time in building the model and the attribute size of the data sets. The experimental analysis shows that the C4.5 algorithm has good classification accuracy compared to the other algorithms used in the study. The variation of the data sets' class size, number of attributes and volume of data records is used to determine which of the ID3 and CART algorithms has better classification accuracy. In future we will perform an experimental analysis of commonly used parallel implementations of tree algorithms and then compare them with the serial implementations of decision tree algorithms to determine which is better, based on practical implementation.

References
[1] Baik, S., Bala, J. (2004). A Decision Tree Algorithm for Distributed Data Mining: Towards Network Intrusion Detection. Lecture Notes in Computer Science.
[2] McSherry, D. (1999). Strategic induction of decision trees. Knowledge-Based Systems.
[3] Agrawal, R. and Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. Proc. 20th Int. Conf. Very Large Data Bases (VLDB 94).
[4] Jeffrey W. Seifert, Data Mining: An Overview. Analyst in Information Science and Technology Policy, Resources, Science, and Industry Division.
[5] Alcala, J., Sanchez, L., Garcia, S., del Jesus, M. et al. KEEL: A Software Tool to Assess Evolutionary Algorithms to Data Mining Problems. Soft Computing, 2007.
[6] Clark, P., Niblett, T. The CN2 Induction Algorithm. Machine Learning, 1989, 3(4).
[7] Hamalainen, W., Vinni, M. Comparison of machine learning methods for intelligent tutoring systems. Conference on Intelligent Tutoring Systems, Taiwan, 2006.
[8] Jovanoski, V., Lavrac, N. Classification Rule Learning with APRIORI-C. In: Progress in Artificial Intelligence, Knowledge Extraction, Multi-agent Systems, Logic Programming and Constraint Solving, 2001.
[9] Romero, C., Ventura, S. Educational Data Mining: a Survey from 1995 to 2005. Expert Systems with Applications, 2007, 33(1).
[10] Yudelson, M.V., Medvedeva, O., Legowski, E., Castine, M., Jukic, D., Rebecca, C. Mining Student Learning Data to Develop High Level Pedagogic Strategy in a Medical ITS. AAAI Workshop on Educational Data Mining, 2006.

Comparing the effect of Hard and Soft threshold techniques on speech compression using wavelet transform
Sucheta Dhir Indira Gandhi Institute of Technology, G.G.S. Indraprastha University, E-mail: dhirsucheta@yahoo.co.in
Abstract- In mobile communication systems, service providers are trying to accommodate more and more users in the limited bandwidth available to them. To accommodate more users they are continuously searching for low bit rate speech coders. There are many types of speech coders (vocoders) available, such as Pulse Code Modulation (PCM) based vocoders, Linear Predictive Coding (LPC) vocoders, and some higher quality vocoders like Residual Excited Linear Prediction (RELP) and Code Excited Linear Prediction (CELP). This paper deals with a slightly newer concept that involves the use of the wavelet transform in speech compression. The wavelet transformation of a speech signal results in a set of wavelet coefficients which represent the speech signal in the wavelet domain. Most of the speech energy is concentrated in the high valued coefficients, which are few; thus the small valued coefficients can be truncated or zeroed. For compression, wavelet coefficients are truncated below a threshold. There are two approaches for calculating thresholds: global threshold and level dependent threshold. Both types of thresholds can be either hard or soft. The results of MATLAB simulation show that the compression factor increases when a soft threshold is used in both global and level dependent threshold techniques. However, better signal to noise ratio and retained signal energy values are obtained when a hard threshold is used.
Index Terms- DWT, Global threshold, Level dependent threshold, Compression

INTRODUCTION
Humans use multiple ways to communicate with one another. Speech is the medium most commonly used by people to express their thoughts. The development of telephones, mobile and satellite communication, etc. has helped us to communicate with anyone on the globe who has access to mobile technology. With a bandwidth of only 4 kHz, human speech can convey information along with emotion. Nowadays there is great emphasis on reducing the delay in transmission as well as on the sound clarity of the transmitted and received signal. Through speech coding a voice signal is converted into a more compact

form, which can then be transmitted over a wired or wireless channel. The motivation behind the compression of speech is that the bandwidth available for transmission is limited. Thus a speech signal is first compressed and then coded before its transmission, and at the receiver end the received signal is first decoded and then decompressed to get back the speech signal. Special stress is laid on the design and development of efficient compression techniques and speech coders for voice communication and transmission. Speech coders may be used for real time coding of speech for use in mobile satellite communication, cellular telephony, and audio for videophones or video teleconferencing. Traditionally speech coders are classified into two main categories: waveform coders and analysis/synthesis vocoders. A waveform coder attempts to copy the actual shape of the signal produced by a microphone. The most commonly used waveform coding technique is Pulse Code Modulation (PCM). A vocoder attempts to reproduce a signal that is perceptually equivalent to the speech waveform. Among the most commonly used techniques for analysis/synthesis coding are Linear Predictive Coding (LPC) [1], Residual Excited Linear Prediction (RELP) and Code Excited Linear Prediction (CELP). This paper deals with a slightly newer concept which employs the use of wavelets for speech compression [5]. Wavelets are mathematical functions of finite duration with an average value of zero. A signal can be represented by a set of scaled and translated versions of a basic function called the mother wavelet, and this process is known as the wavelet transformation [9]. The wavelet transformation of a signal results in a set of wavelet coefficients which represent the signal in the wavelet domain. All data operations can then be performed using just the corresponding wavelet coefficients.

SPEECH COMPRESSION USING WAVELET TRANSFORMATION:


Fig-1 shows the Design Flow of Wavelet based Speech Encoder.

thresholds can be either hard or soft. Hard threshold process can be described as the usual process of setting to zero the elements whose absolute values are lower than the threshold. Soft threshold process is an extension of hard threshold, first setting to zero the elements whose absolute values are lower than the threshold, and then shrinking the nonzero coefficients toward 0.
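A sketch of hard versus soft thresholding of wavelet detail coefficients follows, assuming the PyWavelets (pywt) package for a level-5 db-10 decomposition; the synthetic signal and threshold value are illustrative, not the paper's MATLAB setup.

import numpy as np
import pywt

def hard_threshold(c, t):
    return np.where(np.abs(c) < t, 0.0, c)                 # zero small coefficients

def soft_threshold(c, t):
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)     # zero and shrink the rest

t = 0.1
signal = np.sin(np.linspace(0, 8 * np.pi, 2048)) + 0.05 * np.random.randn(2048)
coeffs = pywt.wavedec(signal, "db10", level=5)             # [cA5, cD5, ..., cD1]

hard = [coeffs[0]] + [hard_threshold(d, t) for d in coeffs[1:]]
soft = [coeffs[0]] + [soft_threshold(d, t) for d in coeffs[1:]]

rec_hard = pywt.waverec(hard, "db10")
rec_soft = pywt.waverec(soft, "db10")
print("zeroed (hard):", sum(int(np.count_nonzero(d == 0)) for d in hard[1:]))
print("zeroed (soft):", sum(int(np.count_nonzero(d == 0)) for d in soft[1:]))
print("max reconstruction error (hard):", float(np.max(np.abs(signal - rec_hard[:signal.size]))))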

4- Quantization and Encoding:


Quantization is the process of mapping a large set of input values to a smaller set. Since quantization involves a many-to-few mapping, it is a nonlinear and irreversible process. The thresholding of wavelet coefficients gives floating point values. These floating point values are converted into integer values using a quantization table; the quantized coefficients are the indices into the quantization table. The quantized table contains redundant information. To remove the redundant information the quantized coefficients are then efficiently encoded. Encoding can be performed using Huffman coding. Huffman coding is a statistical technique which attempts to reduce the number of bits required to represent a string of symbols. It is a type of encoding technique which involves computation of the probabilities of occurrence of symbols. These symbols are the indices into the quantization table. Symbols are arranged in descending order according to their probability of occurrence. The shortest code is assigned to the symbol having the maximum probability of occurrence and the longest code is assigned to the symbol having the minimum occurrence. The actual compression takes place in this step only, because in the previous steps the length of the signal at each stage was equal to the length of the original signal. It is in this step that each symbol is represented with a variable-length code.
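A sketch of the quantize-then-Huffman-encode step: thresholded coefficients are mapped to integer indices and shorter codes are assigned to the more frequent indices. The bin width and toy coefficient vector are invented assumptions; this is not the paper's implementation.

import heapq
from collections import Counter
import numpy as np

def quantize(coeffs, step=0.05):
    return np.round(coeffs / step).astype(int)     # many-to-few integer mapping

def huffman_code(symbols):
    # Return {symbol: bitstring} built from symbol frequencies.
    heap = [[freq, i, {sym: ""}] for i, (sym, freq) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        lo[2] = {s: "0" + c for s, c in lo[2].items()}
        hi[2] = {s: "1" + c for s, c in hi[2].items()}
        heapq.heappush(heap, [lo[0] + hi[0], i, {**lo[2], **hi[2]}])
        i += 1
    return heap[0][2]

coeffs = np.array([0.0, 0.0, 0.12, 0.0, -0.07, 0.0, 0.0, 0.31, 0.0, 0.11])
indices = quantize(coeffs)
codes = huffman_code(indices.tolist())
bitstream = "".join(codes[s] for s in indices.tolist())
print(codes, len(bitstream), "bits")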

Fig 1: Design Flow of Wavelet based Speech Coder.

The major steps shown in the above diagram are explained in the following sections.

1- Choice of Wavelet Function


To design a high quality speech coder, the choice of an optimal mother wavelet function is of prime importance. The selected wavelet function should be capable of reducing the reconstructed error variance and maximizing signal to noise ratio (SNR). Different criteria can be used to select an optimal mother wavelet function [6]. Selection of optimal mother wavelet function can be based on the amount of energy a wavelet function can concentrate into level 1 approximation coefficients.

2- Wavelet Decomposition
A signal is decomposed into different resolutions or frequency bands. This task can be carried out by taking the discrete wavelet transform of the signal by a suitable function at appropriate decomposition level. The level of decomposition can be selected based on the value of entropy [2]. For processing a speech signal level-5 wavelet decomposition is adequate.

PERFORMANCE PARAMETERS
1- Compression Factor (CR): It is the ratio of the size of the original signal to that of the compressed signal. (1)

3- Truncation of Coefficients
Most of the speech energy is concentrated in the high valued coefficients, which are few. Thus the small valued coefficients can be truncated or zeroed. For compression wavelet coefficients are truncated below a threshold. There are two approaches for calculating thresholds: Global Threshold and Level Dependent Threshold. Global threshold is used to retain largest absolute value coefficients, regardless of level of decomposition. Unlike Global threshold, Level Dependent threshold vary depending upon the level of decomposition of the signal. Both types of

2- Retained Signal Energy (PERFL2): It indicates the amount of energy retained in the compressed signal as a percentage of the energy of the original signal. (2)

3- Percentage of Zero Coefficients (PERF0): PERF0 is defined as the percentage of zeros introduced in the signal due to thresholding, given by the following relation. (3)

4- Signal to Noise Ratio (SNR): SNR gives the quality of the reconstructed signal. A high value indicates better reconstruction. (4)
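Since the equations referenced as (1)-(4) are not reproduced above, the following Python sketch implements the formulas commonly used for these four parameters in wavelet-compression work; treat the exact definitions as assumptions rather than the paper's own equations.

import numpy as np

def compression_ratio(original_bits, compressed_bits):
    return original_bits / compressed_bits                               # CR, eq. (1)

def retained_energy(orig_coeffs, comp_coeffs):
    return 100.0 * np.sum(comp_coeffs ** 2) / np.sum(orig_coeffs ** 2)   # PERFL2, eq. (2)

def percent_zeros(comp_coeffs):
    return 100.0 * np.count_nonzero(comp_coeffs == 0) / comp_coeffs.size # PERF0, eq. (3)

def snr_db(original, reconstructed):
    noise = original - reconstructed
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))   # SNR, eq. (4)

x = np.sin(np.linspace(0, 4 * np.pi, 1000))
x_rec = x + 0.01 * np.random.randn(1000)
print(round(snr_db(x, x_rec), 2), "dB")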

SIMULATION RESULTS
For choosing the optimal mother wavelet, functions from five different wavelet families were used to decompose a speech sample, shown in Fig 2. The retained signal energy at level-1 wavelet decomposition was calculated and recorded in Tables 1(a) to 1(e).

Fig 2: Speech signal sample.

Wavelet Function    Retained Signal Energy
Haar                91.4615
Table 1(a): Retained Signal Energy for Haar Wavelet Family.

Wavelet Function    Retained Signal Energy
db-1                91.4160
db-2                93.8334
db-3                94.8626
db-4                95.4728
db-5                95.8830
db-6                96.1680
db-7                96.2927
db-8                96.3349
db-9                92.3262
db-10               96.3416
Table 1(b): Retained Signal Energy for Daubechies Wavelet Family.

Wavelet Function    Retained Signal Energy
sym-2               93.8224
sym-3               94.8626
sym-4               95.6662
sym-5               96.0647
sym-6               96.1711
sym-7               96.1728
sym-8               96.3343
Table 1(c): Retained Signal Energy for Symlets Wavelet Family.

Wavelet Function    Retained Signal Energy
coif-1              93.8450
coif-2              95.7307
coif-3              96.1958
coif-4              96.3504
coif-5              96.4062
Table 1(d): Retained Signal Energy for Coiflets Wavelet Family.

Wavelet Function    Retained Signal Energy
bior-1.1            91.4615
bior-1.3            96.3201
bior-1.5            92.5828
bior-2.2            96.7950
bior-2.4            96.9173
bior-2.6            96.9730
bior-2.8            97.0020
bior-3.1            98.3436
bior-3.3            98.3986
bior-3.5            98.4286
bior-3.7            98.4455
bior-3.9            98.4556
bior-4.4            95.8568
bior-5.5            93.5781
bior-6.8            96.5751
Table 1(e): Retained Signal Energy for Biorthogonal Wavelet Family.

One wavelet function out of each wavelet family is selected based on the maximum retained signal energy criterion at level-1 wavelet decomposition. Based on this criterion, the bior-3.9, db-10, sym-8 and coif-5 wavelet functions are selected for level-5 wavelet decomposition for speech compression. Table 2 shows the values of Compression Factor (CR), Signal to Noise Ratio (SNR), Percentage of Zero Coefficients (PERF0), and Retained Signal Energy (PERFL2) for the selected wavelet functions for both hard and soft global thresholds. Fig 3 shows the reconstructed signal after decoding and decompression of the encoded and compressed speech signal using the global threshold approach.

Wavelet Function: bior-3.9
Parameter    Hard Threshold    Soft Threshold
CR           3.2932            3.4657
SNR          23.8536           15.1721
PERF0        76.5387           76.5383
PERFL2       96.9588           63.5240
Table 2(a): Performance Parameter Table for bior-3.9 wavelet function and Global Threshold Approach.

Wavelet Function: db-10
Parameter    Hard Threshold    Soft Threshold
CR           3.3852            3.5429
SNR          23.8335           14.1865
PERF0        78.7307           78.7307
PERFL2       90.8521           41.3243
Table 2(b): Performance Parameter Table for db-10 wavelet function and Global Threshold Approach.

Wavelet Function: sym-8
Parameter    Hard Threshold    Soft Threshold
CR           3.3070            3.4524
SNR          23.7707           14.1173
PERF0        78.6117           78.6117
PERFL2       90.7802           40.9308
Table 2(c): Performance Parameter Table for sym-8 wavelet function and Global Threshold Approach.

Fig 3(g): Reconstructed signal for coif-5 wavelet function using hard-global threshold.

Wavelet Function: coif-5
Parameter    Hard Threshold    Soft Threshold
CR           3.3035            3.4460
SNR          23.8271           14.1590
PERF0        78.6407           78.6407
PERFL2       90.8347           41.3203
Table 2(d): Performance Parameter Table for coif-5 wavelet function and Global Threshold Approach.

Fig 3(h): Reconstructed signal for coif-5 wavelet function using soft-global threshold.

Fig 3(a): Reconstructed signal for bior-3.9 wavelet function using hard-global threshold.

Fig 3(b): Reconstructed signal for bior-3.9 wavelet function using soft-global threshold.

Similarly, Table 3 shows the values of Compression Factor (CR), Signal to Noise Ratio (SNR), Percentage of Zero Coefficients (PERF0), and Retained Signal Energy (PERFL2) for the selected wavelet functions for both hard and soft level dependent thresholds. Fig 4 shows the reconstructed signal after decoding and decompression of the encoded and compressed speech signal using the level dependent threshold approach.

Wavelet Function: bior-3.9
Parameter    Hard Threshold    Soft Threshold
CR           3.3905            3.6901
SNR          17.7406           9.9707
PERF0        78.0619           78.0619
PERFL2       92.7563           58.4797
Table 3(a): Performance Parameter Table for Bior-3.9 wavelet function and Level Dependent Threshold Approach.

Fig 3(c): Reconstructed signal for db-10 wavelet function using hard-global threshold.

Wavelet Function: db-10
Parameter    Hard Threshold    Soft Threshold
CR           3.3150            3.5871
SNR          18.2082           10.5317
PERF0        78.0619           78.0619
PERFL2       83.9324           37.4042
Table 3(b): Performance Parameter Table for db-10 wavelet function and Level Dependent Threshold Approach.

Fig 3(d): Reconstructed signal for db-10 wavelet function using soft-global threshold.

Fig 3(e): Reconstructed signal for sym-8 wavelet function using hard-global threshold.

Wavelet Function: sym-8
Parameter    Hard Threshold    Soft Threshold
CR           3.2535            3.4300
SNR          17.9551           10.1267
PERF0        78.0964           78.0964
PERFL2       83.4822           35.9319
Table 3(c): Performance Parameter Table for sym-8 wavelet function and Level Dependent Threshold Approach.

Fig 3(f): Reconstructed signal for sym-8 wavelet function using soft-global threshold.

Wavelet Function: coif-5
Parameter    Hard Threshold    Soft Threshold
CR           3.2273            3.4563
SNR          18.3864           10.7569
PERF0        77.9466           77.9466
PERFL2       84.1851           38.7021

Table 3(d): Performance Parameter Table for coif-5 wavelet function and Level Dependent Threshold Approach.

Fig 4(h): Reconstructed signal for coif-5 wavelet function using soft-level dependent threshold.

CONCLUSION
Compression of the speech signal is essential, since raw speech is highly space consuming. In this paper the wavelet transform is used for speech compression, its performance is tested on various parameters, and the following points are observed. Speech compression using the wavelet transform involves quantization of coefficients before the encoding step, which is an irreversible process; hence the original speech cannot be exactly retrieved from the compressed speech signal. As can be seen in Table 2, the percentage of zeros introduced (PERF0) remains exactly the same for the hard and soft global threshold technique. Ideally, for equal values of PERF0 the CRs should also be equal, but a difference in the values of CR is observed. This discrepancy can be accounted for by the introduction of additional zeros at the quantization stage, because the coefficients are scaled down in soft thresholding. It is due to this scaling that the retained signal energy, and hence the SNR, drops to lower values, though the audibility and understandability of the speech were not significantly affected. Higher values of compression factor are achieved when the db-10 wavelet function is used for speech compression, and higher signal-to-noise ratios are achieved when the bior-3.9 wavelet function is used. Similar inferences can be made from the observations for the hard and soft level-dependent threshold technique (Table 3).
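The decompose-threshold-reconstruct flow described above can be illustrated with a short PyWavelets sketch. The wavelet, decomposition level and threshold value here are illustrative assumptions, and the quantization and encoding stages are omitted.

# A minimal sketch of global-threshold wavelet compression using PyWavelets.
# The wavelet, level and threshold value are illustrative only.
import numpy as np
import pywt

def wavelet_compress(speech, wavelet='db10', level=5, thr=0.05, mode='hard'):
    coeffs = pywt.wavedec(speech, wavelet, level=level)       # level-5 decomposition
    flat = np.concatenate(coeffs)
    thresholded = [pywt.threshold(c, thr, mode=mode) for c in coeffs]
    flat_thr = np.concatenate(thresholded)

    perf0 = 100.0 * np.sum(flat_thr == 0) / flat_thr.size      # % of zero coefficients
    perfl2 = 100.0 * np.sum(flat_thr**2) / np.sum(flat**2)     # retained signal energy
    reconstructed = pywt.waverec(thresholded, wavelet)
    return reconstructed, perf0, perfl2

# Example: compare hard and soft global thresholding on a dummy signal.
if __name__ == '__main__':
    x = np.random.randn(8000)
    for m in ('hard', 'soft'):
        _, p0, pl2 = wavelet_compress(x, mode=m)
        print(m, round(p0, 2), round(pl2, 2))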

Fig 4(a): Reconstructed signal for bior-3.9 wavelet function using hard-level dependent threshold.

Fig 4(b): Reconstructed signal for bior-3.9 wavelet function using soft-level dependent threshold.

Fig 4(c): Reconstructed signal for db-10 wavelet function using hard-level dependent threshold.

Fig 4(d): Reconstructed signal for db-10 wavelet function using soft-level dependent threshold.

Fig 4(e): Reconstructed signal for sym-8 wavelet function using hard-level dependent threshold.

REFERENCES
[1] Shijo M Joseph, Firoz Shah A and Babu Anto P, "Spoken Digit Compression: A Comparative Study between Discrete Wavelet Transforms and Linear Predictive Coding", International Journal of Computer Applications (0975-8887), Volume 6, No. 6, September 2010.
[2] Wonyong Chong, Jongsoo Kim, "Speech and Image Compressions by DCT, Wavelet, and Wavelet Packet", International Conference on Information, Communications and Signal Processing (ICICS '97), Singapore, 9-12 September 1997.
[3] Wonyong Chong, Jongsoo Kim, "Speech and Image Compressions by DCT, Wavelet, and Wavelet Packet", International Conference on Information, Communications and Signal Processing (ICICS '97), Singapore, 9-12 September 1997.
[4] P. Prakasam and M. Madheswaran, "Adaptive Algorithm for Speech Compression using Cosine Packet Transform", Proc. IEEE 2007 International

Fig 4(f): Reconstructed signal for sym-8 wavelet function using soft-level dependent threshold.

Fig 4(g): Reconstructed signal for coif-5 wavelet function using hard-level dependent threshold.

Conference on Intelligent and Advanced Systems, pp. 1168-1172.
[5] Abdul Mawla M. A. Najih, Abdul Rahman Ramli, Azizah Ibrahim and Syed A. R., "Comparing Speech Compression Using Wavelets With Other Speech Compression Schemes", Proc. IEEE 2003 Students Conference on Research and Development (SCOReD), pp. 55-58.
[6] R. Polikar, "The Wavelet Tutorial", URL: http://users.rowan.edu/polikar/WAVELETS/WTtutorial.html, March 1999.
[7] Gonzalez, Woods and Eddins, Digital Image Processing, Gatesmark Publishing Ltd., 2009, ISBN 9780982085400.
[8] K. Subramaniam, S.S. Dlay, and F.C. Rind, "Wavelet transforms for use in motion detection and tracking application", IEEE Image Processing and its Applications, pp. 711-715, 1999.

[9] P.S. Addison, The Illustrated Wavelet Transform Handbook, IOP Publishing Ltd, 2002, ISBN 0-7503-0692-0.
[10] M. Tico, P. Kuosmanen, and J. Saarinen, "Wavelet domain features for fingerprint recognition", Electronics Letters, 37(1):21-22, January 2001.
[11] Jalal Karam, "End Point Detection for Wavelet Based Speech Compression", Proceedings of World Academy of Science, Engineering and Technology, Volume 27, February 2008, ISSN 1307-6884.
[12] Abdul Mawla M. A. Najih, Abdul Rahman Ramli, Azizah Ibrahim and Syed A. R., "Comparing Speech Compression Using Wavelets With Other Speech Compression Schemes", Proc. IEEE 2003 Students Conference on Research and Development (SCOReD), pp. 55-58.

IMPROVING THE PERFORMANCE OF WEB LOG MINING BY USING K-MEAN CLUSTERING WITH NEURAL NETWORK
Abstract
The World Wide Web has evolved in less than two decades into the major source of data and information for all domains. The web has become today not only an accessible and searchable information source but also one of the most important communication channels, almost a virtual society. Web mining is a challenging activity that aims to discover new, relevant and reliable information and knowledge by investigating the web structure, its content and its usage. Though the web mining process is similar to data mining, the techniques, algorithms, and methodologies used to mine the web extend beyond those specific to data mining, mainly because the web has a great amount of unstructured data and its changes are frequent and rapid. In the present work, we propose a new technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model uses a multi-layered network architecture with a back propagation learning mechanism to discover and analyse useful knowledge from the available Web log data.

Key words: Clustering algorithms, data mining, unsupervised learning algorithm, online learning algorithm, neural network, k-means clustering, web usage mining.


1. Introduction

Web mining is the application of machine learning techniques to web-based data for the purpose of learning or extracting knowledge. Web mining encompasses a wide variety of techniques, including soft computing. Web mining methodologies can generally be classified into one of three distinct categories: web usage mining, web structure mining, and web content mining. In web usage mining the goal is to examine web page usage patterns in order to learn about a web system's users or the relationships between the documents. For example, one tool creates association rules from web access logs, which store the identity of pages accessed by users along with other information such as when the pages were accessed and by whom; these logs, rather than the actual web pages themselves, are the focus of the data mining effort. Rules created by such a method could include, for example, "70% of the users that visited page A also visited page B". Web usage mining is useful for providing personalized web services, an area of web mining research that has lately become active. It promises to help tailor web services, such as web search engines, to the preferences of each individual user. In the second category of web mining methodologies, web structure mining, we examine only the relationships between web documents by utilizing the information conveyed by each document's hyperlinks. Data mining is the set of techniques and tools used in the non-trivial process of extracting and presenting implicit, previously unknown and potentially useful knowledge from large sets of data, with the objective of automatically describing models and detecting tendencies and patterns [1,2]. Web mining is the set of data mining techniques applied to the Web [7]. Web usage mining is the process of applying these techniques to detect usage patterns of web pages [3,5]. Web usage mining uses the data stored in the log files of the web server as its first resource; in these files the web server registers every access to each resource on the server by the users [4,6].

2. Neural Network

An Artificial Neural Network (ANN) is an information-processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process [9].

2.1 Architecture of neural networks


2.1.1 Feed-forward networks
Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organization is also referred to as bottom-up or top-down.

2.1.2 Feedback networks
Feedback networks can have signals traveling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' is changing continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organizations.
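To make the feed-forward idea concrete, here is a minimal NumPy sketch of a single-hidden-layer forward pass; the layer sizes, random weights and sigmoid activation are illustrative assumptions, not taken from the paper.

# A minimal feed-forward pass: signals travel one way only, input -> hidden -> output.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, w_hidden, b_hidden, w_out, b_out):
    h = sigmoid(w_hidden @ x + b_hidden)   # hidden layer activations
    return sigmoid(w_out @ h + b_out)      # output layer activations

rng = np.random.default_rng(0)
x = rng.random(4)                          # 4 input features (arbitrary example)
y = feed_forward(x, rng.random((3, 4)), rng.random(3),
                 rng.random((1, 3)), rng.random(1))
print(y)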

Fig 1: General Neural Network


3. MINING WEB USAGE DATA

In Web mining, data can be collected at the server side, the client side and proxy servers. The information provided by these data sources can be used to construct several data abstractions, namely users, page-views, click-streams, and server sessions. A user is defined as a single individual accessing files from web servers through a browser. In practice, it is very difficult to uniquely and repeatedly identify users. A page-view consists of every file that contributes to the display on a user's browser at one time and is usually associated with a single user action such as a mouse-click. A click-stream is a sequential series of page-view requests. A server session (or visit) is the click-stream for a single user for a particular Web site. The end of a server session is defined as the point when the user's browsing session at that site has ended [3, 10]. The process of Web usage mining can be divided into three phases: pre-processing, pattern discovery, and pattern analysis [3, 8]. Pre-processing consists of converting the usage information contained in the various available data sources into the data abstractions necessary for pattern discovery. Another task is the treatment of outliers, errors, and incomplete data that can easily occur due to reasons inherent to web browsing. The data recorded in server logs reflects the (possibly concurrent) access of a Web site by multiple users, and only the IP address, agent, and server-side click-stream are available to identify users and server sessions. The Web server can also store other kinds of usage information such as cookies, which are markers generated by the Web server for individual client browsers to automatically track the site visitors [3, 4]. After each user has been identified (through cookies, logins, or IP/agent analysis), the click-stream for each user must be divided into sessions. As we cannot know when the user has left the Web site, a timeout is often used as the default method of breaking a user's click-stream into sessions [2]. The next phase is the pattern discovery phase. Methods and algorithms used in this phase have been developed from several fields such as statistics, machine learning, and databases. This phase of Web usage mining has three main operations of interest: association (i.e. which pages tend to be accessed together), clustering (i.e. finding groups of users, transactions, pages, etc.), and sequential analysis (the order in which web pages tend to be accessed) [3, 5]. The first two are the focus of our ongoing work. Pattern analysis is the last phase in the overall process of Web usage mining. In this phase the motivation is to filter out uninteresting rules or patterns found in the previous phase. Visualization techniques are useful to help application domain experts analyze the discovered patterns.
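The timeout-based sessionization step described above can be sketched as follows; the 30-minute timeout and the (ip, agent, timestamp, url) record layout are illustrative assumptions rather than the paper's exact pre-processing.

# A minimal sketch of timeout-based sessionization of a web server log.
from collections import defaultdict

TIMEOUT = 30 * 60  # seconds; a commonly used default, assumed here

def sessionize(records):
    """records: iterable of (ip, agent, timestamp_seconds, url), time-ordered."""
    sessions = defaultdict(list)          # (ip, agent) -> list of sessions
    last_seen = {}
    for ip, agent, ts, url in records:
        user = (ip, agent)                # identify users by IP + agent
        if user not in last_seen or ts - last_seen[user] > TIMEOUT:
            sessions[user].append([])     # a long gap starts a new session
        sessions[user][-1].append(url)
        last_seen[user] = ts
    return sessions

log = [('1.2.3.4', 'Mozilla', 0, '/a'), ('1.2.3.4', 'Mozilla', 120, '/b'),
       ('1.2.3.4', 'Mozilla', 5000, '/c')]
print(sessionize(log))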
4. Conventional Method Used In Web Mining

4.1 Clustering
Clustering is the process of partitioning a set of data into a set of meaningful subclasses known as clusters. It helps users understand the natural grouping or structure in a data set. Clustering is an unsupervised learning technique whose aim is to find structure in a collection of unlabeled data. It is used in many fields such as data mining, knowledge discovery, pattern recognition and classification [3].


A good clustering method will produce high-quality clusters in which intra-class similarity is high and inter-class similarity is low. The quality of clustering depends both on the similarity measure used by the method and on its implementation, and it is also measured by its ability to discover hidden patterns. Generally speaking, clustering techniques can be divided into two categories: pairwise clustering and central clustering. The former, also called similarity-based clustering, groups similar data instances together based on a data-pairwise proximity measure; examples of this category include graph-partitioning-type methods. The latter, also called centroid-based or model-based clustering, represents each cluster by a model, i.e. its centroid. Central clustering algorithms [4] are often more efficient than similarity-based clustering algorithms. We choose centroid-based clustering over similarity-based clustering because with similarity-based algorithms we could not efficiently obtain a desired number of clusters, e.g. 100 as set by users. Similarity-based algorithms usually have a complexity of at least O(N^2) (for computing the data-pairwise proximity measures), where N is the number of data instances. In contrast, centroid-based algorithms are more scalable, with a complexity of O(NKM), where K is the number of clusters and M the number of batch iterations. In addition, all these centroid-based clustering techniques have an online version, which can be suitably used for adaptive attack detection in a data environment.

4.2 K-Means Algorithm
The K-Means algorithm is one of a group of algorithms called partitioning clustering algorithms [4]. The most commonly used partitional clustering strategy is based on the square-error criterion. The general objective is to obtain the partition that, for a fixed number of clusters, minimizes the total square error. Suppose that the given set of N samples in an n-dimensional space has somehow been partitioned into K clusters {C1, C2, C3, ..., CK}. Each CK has nK samples and each sample is in exactly one cluster, so that sum_{k=1}^{K} nk = N. The mean vector Mk of cluster CK is defined as the centroid of the cluster:

Mk = (1/nk) * sum_{i=1}^{nk} xik

where xik is the ith sample belonging to cluster CK. The square error for cluster CK is the sum of the squared Euclidean distances between each sample in CK and its centroid. This error is also called the within-cluster variation [5]:

ek^2 = sum_{i=1}^{nk} (xik - Mk)^2

The square error for the entire clustering space containing K clusters is the sum of the within-cluster variations:

E_K^2 = sum_{k=1}^{K} ek^2

The basic steps of the K-means algorithm are:
1. Select an initial partition with K clusters containing randomly chosen samples, and compute the centroids of the clusters.
2. Generate a new partition by assigning each sample to the closest cluster centre.
3. Compute the new cluster centres as the centroids of the clusters.
4. Repeat steps 2 and 3 until an optimum value of the criterion function is found or until the cluster membership stabilizes.
(A short sketch of these steps is given at the end of this section.)

4.3 Problem identification: problems with k-means
In k-means, the free parameter is k and the results depend on the value of k. Unfortunately, there is no general theoretical solution for finding an optimal value of k for any given data set. The algorithm can take considerable time on large data sets, it can only handle numerical data, and the result depends on the metric used to measure ||x - mi||.
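A minimal NumPy sketch of the four K-means steps listed in Section 4.2; the value of k, the iteration limit and the sample data are illustrative, and empty clusters are not handled.

import numpy as np

def kmeans(samples, k, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = samples[rng.choice(len(samples), k, replace=False)]   # step 1
    for _ in range(iterations):
        # step 2: assign each sample to the closest cluster centre
        labels = np.argmin(np.linalg.norm(samples[:, None] - centroids, axis=2), axis=1)
        # step 3: recompute centroids as the mean of each cluster
        new_centroids = np.array([samples[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):                     # step 4: stop when stable
            break
        centroids = new_centroids
    return centroids, labels

data = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centres, assignment = kmeans(data, k=2)
print(centres)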
5. Proposed Approach:-


In the present work, the role of the k-means algorithm is to reduce the computation intensity of the neural network by reducing the input set of samples to be learned. This can be achieved by clustering the input dataset using the k-means algorithm, and then taking only discriminant samples from the resulting clustering schema to perform the learning process. The number of fixed clusters can be varied to specify the coverage repartition of the samples. The number of selected samples for each class is also a parameter of the selection algorithm. Then, for each class, we specify the number of samples to be selected according to the class size. When the clustering is achieved, samples are taken from the different obtained clusters according to their relative intra-class variance and their density. The two measurements are combined to compute a coverage factor for each cluster. The number of samples taken from a given cluster is proportional to the computed coverage factor. Let A be a given class, to which we want to apply the proposed approach to extract S samples. Let k be the number of clusters fixed to be used during the k-means clustering phase. For each generated cluster cli (i: 1..k), the relative variance is computed using the following expression:


In this expression Card(X) gives the cardinality of a given set X, and dist(x,y) gives the distance between the two points x and y. The most commonly used distance measure is the Euclidean metric, which defines the distance between two points x = (p1, ..., pN) and y = (q1, ..., qN) from R^N as

dist(x, y) = sqrt( (p1 - q1)^2 + ... + (pN - qN)^2 )

The density value Den(cli) of the same cluster cli is computed from the cluster membership, and the coverage factor Cov(cli) is then obtained by combining the relative variance and the density. We can clearly see that 0 <= Vr(cli) <= 1 and 0 <= Den(cli) <= 1 for any cluster cli, so the coverage factor Cov(cli) also belongs to the [0,1] interval. Hence, the number of samples selected from each cluster is determined using the expression Num_samples(cli) = Round(S * Cov(cli)).

Let A be the input class; k: the number of clusters; S: the number of samples to be selected (S >= k); Sam(i): the resulting selected set of samples for cluster i; Out_sam: the output set of samples selected from class A; Candidates: a temporary array that contains the cluster points and their respective distances from the centroid; i, j, min, x: intermediate variables; eps: neighbourhood parameter.

The proposed selection algorithm is:
1- Cluster the class A using the k-means algorithm into k clusters.
2- For each cluster cli (i: 1..k) do {
     Sam(i) := {centroid(cli)}; j := 1;
     For each x from cli do {
       Candidates[j].point := x;
       Candidates[j].location := dist(x, centroid(cli));
       j := j + 1; };
     Sort the array Candidates in descending order with respect to the values of the location field; j := 1;
     While ((card(Sam(i)) < Num_samples(cli)) and (j < card(cli))) do {
       min := 100000;
       For each x from Sam(i) do {
         if dist(Candidates[j].point, x) < min then min := dist(Candidates[j].point, x); }
       if (min > eps) then Sam(i) := Sam(i) U {Candidates[j].point};
       j := j + 1; }
     if card(Sam(i)) < Num_samples(cli) then
       repeat { Sam(i) := Sam(i) U {Candidates[random].point} } until (card(Sam(i)) = Num_samples(cli)); }
3- For i = 1 to k do Out_sam := Out_sam U Sam(i);
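A minimal NumPy sketch of the per-cluster selection step (step 2 of the algorithm above); the clustering itself and the computation of Num_samples from the coverage factor are assumed to have been done beforehand, and all names are illustrative.

import numpy as np

def select_samples(cluster_points, centroid, num_samples, eps):
    """Keep the centroid, then consider points farthest from the centroid first,
    skipping any candidate closer than eps to an already selected point;
    top up randomly if the neighbourhood test was too strict."""
    d = np.linalg.norm(cluster_points - centroid, axis=1)   # distance from centroid
    order = np.argsort(d)[::-1]                             # descending, as in the sort above
    selected = [centroid]
    for idx in order:
        if len(selected) >= num_samples:
            break
        cand = cluster_points[idx]
        if min(np.linalg.norm(cand - s) for s in selected) > eps:
            selected.append(cand)
    while len(selected) < num_samples:
        selected.append(cluster_points[np.random.randint(len(cluster_points))])
    return np.array(selected)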
6. Conclusion

In this work, we study the possible use of the neural network's learning capabilities to classify web traffic data. The discovery of useful knowledge, user information and server access patterns allows web-based organizations to mine user access patterns, which helps in future development, maintenance planning, and in targeting more rigorous advertising campaigns aimed at groups of users. Previous studies have indicated that the size of a website and its traffic often impose a serious constraint on the scalability of the methods. As the popularity of the web continues to increase, there is a growing need to develop tools and techniques that will help improve its overall usefulness.

REFERENCES:

[1] W.J. Frawley, G. Piatetsky-Shapiro, and C.J. Matheus, "Knowledge Discovery in Databases: An Overview", in Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W.J. Frawley, eds., Cambridge, Mass.: AAAI/MIT Press, pp. 1-27, 1991.
[2] Mika Klemettinen, Heikki Mannila, Hannu Toivonen, "A Data Mining Methodology and Its Application to Semi-automatic Knowledge Acquisition", DEXA Workshop 1997, pp. 670-677.
[3] R. Kosala and H. Blockeel, "Web Mining Research: A Survey", SIGKDD Explorations, vol. 2(1), July 2000.
[4] Borges and Levene, "An average linear time algorithm for web usage mining", 2000.
[5] J. Srivastava, R. Cooley, M. Deshpande, P.-N. Tan, "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data", SIGKDD Explorations, vol. 1, Jan 2000.
[6] P. Batista, M. J. Silva, "Mining web access logs of an on-line newspaper", 2002, http://www.ectrl.itc.it/rpec/RPEC-apers/11batista.pdf.


[7] Cernuzzi, L., Molas, M.L. (2004), "Integrando diferentes técnicas de Data Mining en procesos de Web Usage Mining", Universidad Católica "Nuestra Señora de la Asunción", Asunción, Paraguay.
[8] R. Iváncsy, I. Vajk, "Different Aspects of Web Log Mining", 6th International Symposium of Hungarian Researchers on Computational Intelligence, Budapest, Nov. 2005.
[9] Chau, M., Chen, H., "Incorporating Web Analysis Into Neural Networks: An Example in Hopfield Net Searching", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Volume 37, Issue 3, May 2007, pp. 352-358.

[10] Raju, G.T., Satyanarayana, P. S., "Knowledge Discovery from Web Usage Data: Extraction of Sequential Patterns through ART1 Neural Network Based Clustering Algorithm", International Conference on Computational Intelligence and Multimedia Applications, 2007, Volume 2, 13-15 Dec. 2007, pp. 88-92.
[11] Mehrdad Jalali, Norwati Mustapha, Ali Mamat, Md. Nasir B. Sulaiman, "A new classification model for online predicting users' future movements", International Symposium on Information Technology (ITSim 2008), 26-28 Aug. 2008, Volume 4, pp. 1-7, Kuala Lumpur, Malaysia.


A Hybrid Filter for Image Enhancement


Vinod Kumar (a), Kaushal Kishore (b) and Dr. Priyanka (a)
(a) Deenbandhu Chotu Ram University of Science and Technology, Murthal, Sonepat, Haryana, India
(b) Ganpati Institute of Technology and Management, Bilaspur, Yamunanagar, Haryana, India
vinodspec@yahoo.co.in, kishorenittr@gmail.com, priyankaiit@yahoo.co.in

Abstract
Image filtering processes are applied to images to remove the different types of noise that are either present in the image during capture or introduced into the image during transmission. Salt & pepper (impulse) noise is one type of noise which occurs during transmission of images, or due to bit errors or dead pixels in the image contents. Images are blurred due to object movement or camera displacement when the image is captured. This paper deals with removing the impulse noise and the blurredness simultaneously from images. The hybrid filter is a combination of the Wiener filter and the median filter.

Keywords: Salt & Pepper (Impulse) noise; Blurredness; Median filter; Wiener filter

Introduction
The basic problem in image processing is image enhancement and restoration in a noisy environment. If we want to enhance the quality of images, we can use the various filtering techniques available in image processing. There are various filters which can remove noise from images while preserving image details and enhancing image quality. Individual filters such as the median filter and the Wiener filter are used to remove either Gaussian or impulsive noise from an image. Combination or hybrid filters have been proposed to remove mixed types of noise from images during image processing.

Median filter
The median filter gives the best results when the impulse noise percentage is less than 0.1%. When the quantity of impulse noise is increased, the median filter no longer gives the best result. Median filtering is a nonlinear operation used in image processing to reduce "salt and pepper" noise. The mean filter is also used to remove impulse noise: it replaces a pixel with the mean of the neighbouring pixel values, but it does not preserve image details, and some details are removed. In the median filter we do not replace the pixel value with the mean of the neighbouring pixel values; we replace it with the median of those values. The median is calculated by first sorting all the pixel values from the surrounding neighbourhood into numerical order and then replacing the pixel being considered with the middle pixel value. (If the neighbourhood being considered contains an even number of pixels, the average of the two middle pixel values is used.) Fig.1 illustrates an example calculation.
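A small NumPy illustration of the 3x3 median calculation just described; the pixel values are arbitrary.

import numpy as np

neighbourhood = np.array([[123, 125, 126],
                          [122, 255, 124],   # 255 is an impulse ("salt") pixel
                          [119, 121, 120]])

# sort the nine values and take the middle one
median_value = np.median(neighbourhood)      # 123.0: the impulse is rejected
print(median_value)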



Fig.1: Example of median filtering

Wiener filter
The main purpose of the Wiener filter is to filter out the noise that has corrupted a signal. The Wiener filter is based on a statistical approach. Most filters are designed for a desired frequency response; the Wiener filter approaches filtering from a different point of view. One method is to assume that we have knowledge of the spectral properties of the original signal and the noise, and to seek the linear time-invariant filter whose output comes as close to the original signal as possible [1]. Wiener filters are characterized by the following assumptions: (a) signal and (additive white Gaussian) noise are stationary linear random processes with known spectral characteristics; (b) the filter must be physically realizable, i.e. causal (this requirement can be dropped, resulting in a non-causal solution); (c) the performance criterion is minimum mean-square error.

Wiener Filter in the Fourier Domain
The Wiener filter is given by the following transfer function:

G(u,v) = H*(u,v) / ( |H(u,v)|^2 + Pn(u,v)/Ps(u,v) )

where H(u,v) is the degradation function, H*(u,v) is the complex conjugate of the degradation function, Pn(u,v) is the power spectral density of the noise and Ps(u,v) is the power spectral density of the undegraded image. The term Pn/Ps is the reciprocal of the signal-to-noise ratio; dividing the equation through by Ps makes its behaviour easier to explain.
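A minimal frequency-domain sketch of this transfer function, with the ratio Pn/Ps approximated by a single constant K (an assumption often made when the spectra are unknown).

import numpy as np

def wiener_deconvolve(blurred, psf, K=0.01):
    H = np.fft.fft2(psf, s=blurred.shape)          # degradation function H(u,v)
    G = np.conj(H) / (np.abs(H) ** 2 + K)          # H*(u,v) / (|H|^2 + Pn/Ps)
    restored = np.fft.ifft2(np.fft.fft2(blurred) * G)
    return np.real(restored)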

Image Noise
Image noise is a degradation of the quality of the image. It is produced by random variation of the brightness or colour information in images, caused by the sensors and circuitry of a scanner or digital camera. Image noise can also originate in film grain and in the unavoidable shot noise of an ideal photon detector. Image noise is generally regarded as an undesirable by-product of image capture. The types of noise considered here are additive white Gaussian noise, salt-and-pepper noise and blurredness.

Additive white Gaussian noise
Additive white Gaussian noise is present in images independently at each pixel and independently of the signal intensity. In colour cameras, where more amplification is used in the blue colour channel than in the green or red channel, there can be more noise in the blue channel.

Salt-and-pepper noise
An image containing salt-and-pepper noise will show dark pixels in bright regions and bright pixels in dark regions [2]. Salt & pepper noise in images can be caused by dead pixels, analog-to-digital conversion errors, bit errors in transmission, etc.



Much of this can be eliminated by using dark-frame subtraction and by interpolating around dark/bright pixels.

Blurredness
The blurredness of an image depends on the point spread function (PSF). The PSF may be circular or linear. The image is blurred due to camera movement or object displacement.


Hybrid Filter
This hybrid filter is the combination of the median and Wiener filters. When we arrange these filters in series we get the desired output: first we remove the impulse noise, and then pass the result to the Wiener filter. The Wiener filter removes the blurredness and the additive white noise from the image. The result is not identical to the original image, but it is very close to it.


Simulation result
The original image is the cameraman image. We add three types of noise (additive white Gaussian noise, salt & pepper noise and blurredness) and pass the noisy image to our hybrid filter to get the desired result. The result depends upon the blurring angle (Theta), the blurring length (Len) and the intensity of the impulse noise. The performance is evaluated by comparing the MSE and PSNR of the original image and the filter output image.

Algorithm
The following steps are followed when we filter the image:
1. If the image is coloured, convert it to a gray-scale image.
2. Convert the image to double precision for better accuracy.
3. Find the median by sorting all the values of the 3x3 mask in increasing order.
4. Replace the centre pixel value with the median value.
5. Estimate the signal-to-noise ratio.
6. Apply the deconvolution function to filter the image.
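A sketch of these steps using SciPy and scikit-image (assumed to be available); the 3x3 window, the example PSF and the noise-balance parameter are illustrative assumptions rather than the authors' exact implementation.

import numpy as np
from scipy.ndimage import median_filter
from skimage import color, img_as_float
from skimage.restoration import wiener

def hybrid_filter(image, psf, balance=0.01):
    if image.ndim == 3:                         # if the image is coloured,
        image = color.rgb2gray(image)           # convert it to gray scale
    image = img_as_float(image)                 # double precision
    despeckled = median_filter(image, size=3)   # 3x3 median removes impulse noise
    return wiener(despeckled, psf, balance)     # Wiener deconvolution removes blur

# Example with a simple horizontal motion-blur PSF of length 5 (an assumption).
psf = np.zeros((5, 5)); psf[2, :] = 1.0 / 5.0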


MSE & PSNR


The term peak signal-to-noise ratio (PSNR) is the ratio between the maximum possible power of a signal and the power of the corrupting noise. The mean square error between the original image I and the filtered image K, both of size m x n, is

MSE = (1/(m*n)) * sum_{i,j} [ I(i,j) - K(i,j) ]^2

and the PSNR is defined as

PSNR = 10 * log10( MAXI^2 / MSE ) = 20 * log10( MAXI / sqrt(MSE) )

where MAXI is the maximum possible pixel value of the image.
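A short sketch of the MSE and PSNR computation defined above, assuming images scaled to [0, 1] so that MAXI = 1.0.

import numpy as np

def mse(original, filtered):
    return np.mean((original - filtered) ** 2)

def psnr(original, filtered, max_i=1.0):
    m = mse(original, filtered)
    return 10 * np.log10(max_i ** 2 / m)   # = 20*log10(max_i / sqrt(m))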

Fig.2 Original Image



Fig.3 Blurred image Len=21, Theta=11



Fig.6 Hybrid filter output (Len=11, Theta=5, impulse noise=0.009)

Fig.4 Blurred image with gaussian noise of mean=0, var=.001

Fig.7 Hybrid filter output

Now we calculate the mean square error under different conditions to check the performance of our filter. Table 1 shows the results obtained when the blurredness of the image is varied through the blurring angle and length while the percentage of impulse noise is held constant.

Fig.5 Blurred + Gaussian + impulse noisy image

Table 1:
Blurred length   Blurring angle   Impulse noise (%)   Mean square error   PSNR
21               11               0.01                0.0087              69.11
15               09               0.01                0.0079              69.30
10               07               0.01                0.0074              69.49
05               03               0.01                0.0050              70.49
02               02               0.01                0.0040              71.49

Next, when the blurredness of the image is kept the same and the percentage of impulse noise is increased, the following results are obtained:


Table 2:
Blurred length   Blurring angle   Impulse noise (%)   Mean square error   PSNR
21               11               0.01                0.0087              68.11
21               11               0.03                0.0172              66.08
21               11               0.05                0.0268              64.15
21               11               0.07                0.0333              63.02
21               11               0.09                0.0398              62.06


When the blurredness and the impulse noise vary simultaneously, we get the following results:

Table 3:
Blurred length   Blurring angle   Impulse noise (%)   Mean square error   PSNR
21               11               0.01                0.0087              68.22
15               09               0.02                0.0130              67.35
10               05               0.01                0.0060              70.08
10               05               0.03                0.0131              67.39
05               03               0.01                0.0052              71.11
05               03               0.04                0.0135              67.20

Conclusion
We used the cameraman image in .tif format, added three types of noise (impulse noise, Gaussian noise and blurredness) and applied the noisy image to the hybrid filter. The quality of the final filtered image depends upon the blurring angle, the blurring length and the percentage of impulse noise. When these variables are small, the filtered image is nearly equal to the original image.

Scope for future work
There are a couple of areas we would like to improve. One is de-noising along the edges, as the method we used did not perform so well there. Instead of the median filter we could use an adaptive median filter, and the range of noise types handled could be increased.

References:
[1] M. Kazubek, "Wavelet domain image de-noising by thresholding and Wiener filtering", IEEE Signal Processing Letters, Volume 10, Issue 11, Nov. 2003, p. 265.
[2] Shi Zhong, "Image Denoising using Wavelet Thresholding and Model Selection", Proceedings of the 2000 International Conference on Image Processing, Volume 3, 10-13 Sept. 2000, p. 262.
[3] Shaomin Peng and Lori Lucke, "A hybrid filter for image enhancement", Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455.
[4] "Performance Comparison of Median and Wiener Filter in Image De-noising", International Journal of Computer Applications (0975-8887), Volume 12, No. 4, November 2010.
[5] Shaomin Peng and Lori Lucke, "Multi-level Adaptive Fuzzy Filter for Mixed Noise Removal", Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455.

Trends in ICT Track: Software development & Deployment (AGILE METHODOLOGIES)


Shubh Senior Lecturer, Department of IT Jagan Institute of Management & Sciences, Rohini shubh.aneja@gmail.com Priyanka Gandhi Senior Lecturer, Department of IT Jagan Institute of Management & Sciences, Rohini talk2priyankap@gmail.com Manju Arora Lecturer, Department of IT Jagan Institute of Management & Sciences, Rohini rr_manju@yahoo.co.in

Abstract: This paper looks at software engineering. We do not aim to provide a summary of the overall state of the art in software engineering; our objective is to bring out the changes that are taking place in software engineering methods and processes, as well as to highlight how technology has changed in the last 25 years. In today's unpredictable markets companies have to achieve more with fewer resources, in the shortest period of time and with controlled operational cost, and ICT is also expected to increase the value of information to make the business more profitable. So the ability to complete and develop projects with changeable requirements, to manage risk easily, and to adapt to changing market requirements has become an undeniable principle of each organization's approach. Hence, using innovative methods for building projects has become important in recent years. Lightweight (agile) methodologies evolve to meet changing technologies and new demands from users in a dynamic business environment. The low success rate of medium-to-large software development projects highlights the need for businesses to adopt development practices that effectively address change management, rapid application development, component reuse, visual programming, patterns and frameworks. A vision of the future of software engineering suggests a setting in which developers are able to wire together distributed components and services established at an early stage, since the traditional waterfall methodology of software development, where each stage is tightly controlled in a linear progression, is more likely to fail under uncertainty. The conditions that exist in the real world reflect those of the agile approach, and therefore the agile process model is a good strategy for meeting modern business challenges. Agile software development challenges the traditional ways of delivering software by providing a very different approach. Agile methods aim to be faster, lighter and more efficient than other rigorous methods, and to develop and support the customer's business without being chaotic. Agile software development methods claim to be people-oriented rather than process-oriented, and adaptive rather than predictive. Solid determination and dedicated effort are required in agile development to overcome the disadvantages of a predefined set of steps and changing requirements, to see the desirable outcome and to avoid predictable results. The focus of this research paper is to study agile methodologies, identify the practical difficulties in agile software development, and propose ten ways to improve the software development process.

Introduction
Today's Information Technology (IT) manager is under ever-increasing pressure to deliver results in the form of applications that drive improvements to the bottom line, even while IT budgets are being significantly slashed. Meanwhile, despite the fall of the Internet economy, business environments continue to change at a rapid pace, leaving many IT shops struggling to keep up with the pace of change. These changes have led to an increased interest in agile software development methodologies, with their promise of rapid delivery and flexibility while maintaining quality. There are several software development methodologies in use today. Some companies have their own customized methodology for developing their software, but most methodologies fall into one of two categories: heavyweight and lightweight. Heavyweight methodologies, also considered the traditional way to develop software, claim their support for comprehensive planning, detailed documentation, and expansive design. They are also known as planned methodologies. Planned methodologies try to solve the problems of software development by imposing a disciplined process, with the aim of making software development more predictable and more efficient. To be more predictable, planned methodologies focus on creating a comprehensive up-front design, from which detailed construction plans are formulated. The waterfall and incremental models are the two most well-known models of the planned approach. The lightweight methodologies, also known as agile modeling, have gained significant attention from the software engineering community in the last few years. Unlike traditional methods, agile methodologies employ short iterative cycles and flexibility, and rely on tacit knowledge within a team as opposed to documentation. In software development, the most challenging task is to develop projects under the pressure of a dynamic market, where time-to-market (TTM) pressure and requirements instability can cause the development process to fail. Therefore project management should choose a development methodology that can control the problems associated with a dynamic market.

1. Heavyweight Methodologies
1.1 Waterfall
The methodology that has dominated software development projects for decades is called waterfall.

The Waterfall Process Model is a predictive method for software development projects. The project process is based on documentation: the activities of requirement elicitation, analysis, design, coding and testing have to be well defined in a document in the early phase, and the project must then follow those activities strictly, step by step. Thus the Waterfall Process Model is said to fit large projects better, since they are usually more complicated and it is more organized to have a concrete plan to follow through. However, the Waterfall Process Model does not allow requirements to change; thus, it is also seen as slow and illogical.

As Fig. 1.1 shows, "waterfall suggests a systematic, sequential approach to software development that begins with customer specification of requirements and progresses through planning, modeling, construction and deployment, culminating in ongoing support of the completed software".

1.2 Incremental model


This model combines elements of the waterfall model applied in an iterative fashion. As Fig. 1.2 shows, the incremental model applies linear sequences in a staggered fashion as calendar time progresses. Each linear sequence produces a deliverable increment of the software. Using the incremental model, the development team can start working on the known increments and clarify the rest later. Problems may arise later if the project is not well defined or if the definition changes much later. A rule of thumb is that 80% of the requirements should be known in the beginning. The development team should make a project priority chart and plan the increments accordingly.

Figure 1.2: The iterative enhancement model for software development

1.3 Spiral Model


Another heavyweight software development model is the spiral model, which combines elements of both design and prototyping-in-stages, in an effort to combine advantages of top-down and bottom-up


concepts. The spiral model was defined by Barry Boehm, based on experience with various refinements of the waterfall model as applied to large software projects. There are four main phases of the spiral model:

Objective setting - Specific objectives for the project phase are identified.
Risk assessment and reduction - Key risks are identified and analyzed, and information is obtained to reduce these risks.
Development and validation - An appropriate model is chosen for the next phase of development.
Planning - The project is reviewed and plans are drawn up for the next round of the spiral.
The trend of planned methodologies does not serve projects that have a compelling need to get software to market quickly, such as web applications. Such applications can exhibit a time-to-market of a few days or weeks, while giving high consideration to maximizing product value and customer satisfaction. For that reason more flexibility and more customer involvement in the development process are needed. Many software projects using the planned methodology fail to meet their objectives. Organizations tried to cut the failure rate by insisting on more detail in the requirements and design phases. This process of requiring extensive, even exhaustive, documentation culminated in 1988 with the publication of the Department of Defense standard for software development.

1.4 Heavyweight Characteristics

Heavyweight methodologies impose a disciplined process upon software development that makes software development more predictable and more efficient. They have not been noted to be very successful, and are even less noted for being popular. Fowler criticizes these methodologies as bureaucratic: there is so much to follow in the methodology that the whole pace of development slows down. The heavyweight methodologies share the following characteristics.

Predictive approach - Heavyweight methodologies have a tendency to first plan out a large part of the software process in great detail for a long span of time. This approach follows an engineering discipline where the development is predictive and repeatable. A lot of emphasis is put on the drawings, focusing on the needs of the system and how to resolve those needs efficiently. The drawings are then handed over to another group who are responsible for building the system. It is predicted that the building process will follow the drawings. The drawings specify how the system must be built; they act as the foundation of the construction process. The plan also predicts the task delegation for the construction team and reasonably predicts the schedule and budget for construction.

Comprehensive Documentation - Traditional software development views the requirements document as the key piece of documentation. A main process in heavyweight methodologies is the big design up front (BDUF) process, built on a belief that it is possible to gather all of a customer's requirements up front, prior to writing any code. Again, this approach is a success in engineering disciplines, which makes it attractive to the software industry. Gathering all the requirements, getting a sign-off from the customer, and then ordering the procedures (more documentation) to limit and control all changes does give the project a degree of predictability. Predictability is very important in software projects that are life critical.

Process Oriented - The goal of heavyweight methodologies is to define a process that will work well for whoever happens to be using it. The process consists of certain tasks that must be performed by the managers, designers, coders, testers etc. For each of these tasks there is a well-defined procedure.

Tool Oriented - Project management tools, code editors, compilers, etc. must be in use for the completion and delivery of each task.

Plan-driven methods work best when developers can determine the requirements in advance . . . and when the requirements remain relatively stable, with change rates on the order of one percent per month. - Barry Boehm

2. Lightweight Methodologies / Agile Modeling (Agile Software Development)


Software development projects are highly unpredictable, and that is why so many software development projects fail to meet expectations. The reasons why Agile Software Development is needed can be explained as below:

"If a predictive method like Waterfall is to avoid the pain of problems, then we can say that Agile was developed to solve the pain of problems."
- M. Scott Peck, The Road Less Traveled

2.1 What is Agile Software Development?


Agile Software Development is a group of software process methodologies, mainly for small or medium organizations. It is not a set of tools or a single methodology. A recent survey conducted by Dr. Dobb's Journal shows that 41 percent of development projects have now adopted an agile methodology, and agile techniques are being used on 65 percent of such projects.

2.2 Agile model


Agile was a significant departure from the heavyweight, document-driven software development methodologies, such as waterfall, in general use at the time. Regular review and feedback are obtained from the end user; thus the agile approach helps minimize risk and lets the project adapt to changes quickly. Agile addresses many problems related to the dynamic market because it achieves higher flexibility and better satisfies actual customer requirements. Agile achieves this by developing and delivering the software product in an incremental fashion. Agile methodologies try to avoid any

development overheads and minimize unnecessary effort. The lightness of the agile methodologies gives better responses to the different problems related to the dynamic market, and studies show that agile minimizes the cost of requirement changes during the development process. An agile framework is advantageous as it helps increase business efficiency and reduce the project budget, since there are fewer defects in the final product. But there are certain drawbacks attached to the agile framework as well. As changes are unanticipated and come quickly, it is difficult to avoid resistance from clients to end-user training. Also, a lack of documentation is often a distinctive feature, as agile methods are not process-oriented.


Manifesto for Agile Software Development


We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

2.3 The Agile Project Lifecycle


As mentioned above, the Agile Development Framework is an iterative, incremental, and collaborative methodology for software development projects. Figure 2.3.1 shows that the Agile Software Development Lifecycle (ASDL) starts from an initial plan and ends with deployment. Each iteration consists of planning, requirements, analysis and design, implementation, deployment, testing, and evaluation.

Figure 2.3.2 The Agile SDLC.


If we look into the ASDL in detail, it comprises six phases: Iteration -1, Iteration 0 (Warm Up), Construction, Release (End Game), Production, and Retirement (see Figure 2.3.2). Each phase is described as follows:
Iteration -1: Iteration -1 is devoted to pre-project planning. Activities such as defining the business opportunity and identifying and assessing the feasibility of the project belong to this phase.
Iteration 0: After completing Iteration -1, the environment setup, team formation, and obtaining support and funding are taken up in Iteration 0, also known as project initiation.
Construction iterations: During the construction iterations, a high degree of collaboration between team members and customers is required. Work on prioritized functionality implementation, system analysis and design, regular working-software delivery, and verification is also embraced in this phase.
Release: In this phase, activities such as testing, documentation finalization, and deployment of the system into production have to be done.
Production: When a project turns into the production phase, the goal is to keep the system useful and productive after deploying it to the user community. In other words, the system should be kept running and users helped to use it.
Retirement: At the end of the life cycle, the iteration goes to the retirement phase, which is to update enterprise models and remove the final version of the system, including data conversion, if the system has become obsolete or can be completely replaced.

2.4 The Agile Principles

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
Deliver working software frequently, from a couple of weeks to a couple of months, with a preference for the shorter timescale.
Business people and developers must work together daily throughout the project.
Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
Working software is the primary measure of progress.
Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
Continuous attention to technical excellence and good design enhances agility.
Simplicity - the art of maximizing the amount of work not done - is essential.
The best architectures, requirements, and designs emerge from self-organizing teams.
At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

2.5 General Understanding of the Agile Software Development

Generally, agility is defined as the ability to be flexible and adaptable to change. The idea of the Agile Development Framework is to create a pain-free working environment for small, collocated, self-organized teams in order to help companies take full advantage of the customer value of the delivered software product. On an Agile project, developers work closely with their customers to understand their needs; they are placed in pairs to implement and test their solution, and the solution is shown to the customers for quick feedback. Therefore, the business contract does not become a barrier between customers and developers, but a platform to help customers and developers work together (see Figure 2.5.1).


Figure 2.5.1 Comparison of contracts between traditional & the Agile Development Framework. (Source: Koch, 2005)

In addition, the requirements can be reprioritized at any time. Once the priority changes, the new higher-priority requirement is pulled up to the top of the stack (see Figure 2.5.2).

Figure 2.5.2: Difference of requirement change between the traditional process model and the Agile Software Development. (Source: Koch, 2005)

Agile Software Development is the work of energizing, empowering, and enabling project teams to rapidly and reliably deliver business value by engaging customers and continuously learning and adapting to their changing needs and environments. The focus of agile software development has been on methods and practices.

The focus of agile software development has been on methods and practices 2.6 Characteristics of Agile Methodologies
According to Highsmith and Cockburn, what are new about agile methods is not the practices they use, but their recognition of people as the primary drivers of project success, coupled with an intense focus on effectiveness and maneuverability. This yields a new combination of values and principles that define an agile world view.. Agility... is a comprehensive response to the business challenges of profiting from rapidly changing, continually fragmenting, global markets for high-quality, high-performance, customer-configured goods and services. People Oriented- Agile methodologies consider people customers, developers, stakeholders, and end

395

that any set of predefined steps will lead to a desirable, predictable outcome because requirements change technology changes, people are added and taken off the team, and so on. Decentralized Approach Integrating a decentralized management style can severely impact a software project because it could save a lot of time than an autocratic management process. Agile software development spreads out the decision making to the developers. This does not mean that the developers take on the role of management. Management is still needed to remove roadblocks standing in the way of progress. However management recognizes the expertise of the technical team to make technical decisions without their permission. Simplicity Agile teams always take the simplest path that is consistent with their goals. Fowler states, They (agile teams) dont anticipate tomorrows problems and try to defend against them today .The reason for simplicity is so that it will be easy to change the design if needed on a later date. Never produce more than what is necessary and never produce documents attempting to predict the future as documents will become outdated. The larger the amount of documentation becomes, the more effort is needed to find the required information, and the more effort is needed to keep the information up to date. Collaboration Agile methods involve customer feedback on a regular and frequent basis. The customer of the software works closely with the development team, providing frequent feedback on their efforts. As well, constant collaboration between agile team members is essential. Due to the decentralized approach of the agile methods, collaboration encourages

discussion. As Martin Fowler describes, "Agile teams cannot exist with occasional communication. They need continuous access to business expertise."
Small Self-organizing Teams
An agile team is a self-organizing team. Responsibilities are communicated to the team as a whole, and the team determines the best way to fulfill them. Agile teams discuss and communicate together on all aspects of the project. That is why agility works well in small teams. As Alistair Cockburn and Jim Highsmith note, "Agile development is more difficult with larger teams. The average project has only nine people, within the reach of most basic agile processes. Nevertheless, it is interesting to occasionally find successful agile projects with 120 or even 250 people."

3. Comparison of Agile and Heavyweight


Traditional development approaches have been around for a very long time. Since its introduction, the waterfall model has been widely used in both large and small software projects and has been reported to be successful for many projects. Despite these successes, it has several drawbacks, such as linearity, inflexibility in the face of changing requirements, and highly formal processes irrespective of the size of the project. Kent Beck took these drawbacks into account and introduced Extreme Programming, the first agile methodology produced. Agile methods deal with unstable and volatile requirements by using a number of techniques, focusing on collaboration between developers and customers, and supporting early product delivery. A summary of the differences between agile and heavyweight methodologies is shown in the table below.


4. Agile Methodologies

Table 4.1: A summary of three agile software development methodologies


Several key ideas about agile methodologies


The keys to successfully choosing and using agile methodologies are summarized in Table 4.2.

5. We Propose Ten Ways to Improve the Software Development Process


1) Fewer people in meetings
Stop having those meetings where 10 people try to resolve a problem or make a decision. If it is a technical problem, then get a maximum of two relevant people at a Developer's desk, let the Developer show you a few options, pick an option, and let the Developer get on with it. The same goes for a business problem: just get an end user, an expert in the business subject, the project owner, and a Developer, and let them quickly decide on the best solution.

5) Mistakes are a way of learning


Remove your blame culture. Let people make mistakes quickly. These so-called mistakes are a great discovery process and will lead to the eventual solution. If you start blaming people, they will clam up, they will suppress their creativity, they will become clock watchers, and they will start looking for work elsewhere.

6) Smaller teams
Linked with smaller projects are smaller teams. Small teams make communication far easier. Decisions are easier to make. Let the small teams work in a fashion that suits them. Do not impose a 'one size fits all'. Success does not come from being constrained to walking down the same path as everyone else. Let teams wander off the path and take responsibility.

2) No need for minuscule details


Stop trying to document every minuscule detail in an effort to prevent failure. You are guaranteed failure if you try to create documents that describe everything. In a lot of cases, a high-level description in just a few paragraphs is sufficient and gives you the flexibility to change your requirements before too much time is spent on the actual detail.

7) Let the system evolve


The end users of the software do not know every little detail that they want. They have a high-level idea about the main features that they need. Stop trying to get the users to specify every little bit of detail; it is very hard to do and brings very little benefit. Just gather the top-level requirements and let the Developers build something that the users can experiment with. Quite often the users will then think of new features of greater benefit to the business, and they will also discard other ideas that they had initially thought were of high importance.

3) Obtain user feedback


Keep the end users constantly in the picture. Show them early versions of the software. Let them use it. Let them give you feedback, and above all, let them change their minds without being punished. Trust your end users; they know what works and what does not. Keep them involved.

8) Something has to give


Price, Features, Quality, Time. Something has to give. You decide which. But please do not add more features and then expect Price, Quality and Time to remain the same. You need to give and take.

4) Smaller projects
Big projects turn into NASA-type projects. You probably do not have the size of team and infrastructure to support that sort of project, yet you persist in trying to do it and are then surprised when the project fails. Make your projects smaller. Instead of the big-bang 10,000,000 projects that usually go wrong, concentrate on smaller projects. Obviously these smaller projects are done with the bigger picture in mind. These small projects can then be completed in three to six months, from concept to completion. A small project means that you will probably complete it satisfactorily and on time. The application can then start giving you a return on your investment as soon as possible. Your teams get a great deal of satisfaction, and this enthusiasm feeds into the next project.

9) Implement the top twenty percent of features


Work on the top 20% of features. Quite often these features will give 80% of what the end users want. They can start using the system far more quickly and start adding to the profitability of your company.

10) Empower your users


Stop being obsessed with preventing users from requesting or suggesting changes. Do not put a change control mechanism in place that stops users from even daring to think of ways of improving your products.

Conclusion
In this paper, I described the different approaches to software development through heavyweight and agile


methodologies. Furthermore, I first critiqued both heavyweight and agile methodologies and then compared them. My conclusion is that the Agile Development Framework is recommendable for the software development process; nevertheless, it is not suitable for all software development projects. For example, a globally distributed agile project is not suited to a tightly budgeted team, since it requires a lot of business trips between countries. Moreover, the Agile Development Framework accepts changes of requirements; however, a change of requirements always involves many team members in a large software project team. Thus, using the Agile Development Framework may delay project progress if a project manager does not assess the feasibility of implementing it in advance. Agile methods have proven their effectiveness and are transforming the software industry. As mentioned earlier, the need for businesses to respond rapidly to their environment in an innovative, cost-effective and efficient way is compelling the use of agile methods for developing software. Agile methodologies look set to become dominant. In general, some aspects of a software development project can benefit from an agile approach and others can benefit from a more predictive, traditional approach. When it comes to methodologies, each project is different, and one thing is clear: there is no one-size-fits-all solution. That is why we propose Ten Ways to Improve the Software Development Process.

References
1. Esther Jyothi and K. Nageswara Rao, "Effective Implementation of Agile Practices: A Collaborative and Innovative Framework," CiiT International Journal of Software Engineering and Technology, Vol. 2, No. 9, September 2010.
2. K. Conboy and B. Fitzgerald, "Agile Drivers, Capabilities, and Value: An Over-Arching Assessment Framework for Systems Development," in K. C. Desouza (Ed.), Agile Information Systems: Conceptualization, Construction, and Management, Butterworth-Heinemann, Burlington, 2007, pp. 207-221.
3. Roger S. Pressman, Software Engineering: A Practitioner's Approach, McGraw-Hill International Edition, 2005.
4. D. Marks, "Development Methodologies Compared," N CYCLES Software Solutions, December 2002, www.ncycles.com, accessed 2/2/2005.
5. J. Highsmith, Agile Project Management: Creating Innovative Products, Addison-Wesley, Boston, 2004.
6. P. Abrahamsson et al., "New Directions on Agile Methods: A Comparative Analysis," in Proceedings of the 25th International Conference on Software Engineering, IEEE, pp. 244-256, Portland, Oregon, May 2003. Available at: http://csdl.computer.org/comp/proceedings/icse/2003/1877/00/18770244abs.htm
7. G. B. Alleman and M. Henderson, "Making Agile Development Work in a Government Contracting Environment," in Proceedings of the Agile Development Conference (ADC'03), IEEE, pp. 114-120, Salt Lake City, Utah, June 2003. Available at: http://csdl.computer.org/comp/proceedings/adc/2003/2013/00/20130114abs.htm
8. J. Armitage, "Are Agile Methods Good for Design?" Interactions, ACM, 11(1), pp. 14-23, January 2004. Available at: http://portal.acm.org/citation.cfm?id=962342.962352
9. K. Beck et al., "Manifesto for Agile Software Development." Last accessed: 02-7-2005. Available at: http://www.agilemanifesto.org/
10. B. Boehm and R. Turner, "Using Risk to Balance Agile and Plan-Driven Methods," IEEE Computer, 36(6), pp. 57-66, June 2003. Available at: http://csdl.computer.org/comp/mags/co/2003/06/r6057abs.htm
11. K. Conboy and B. Fitzgerald, "Toward a Conceptual Framework of Agile Methods: A Study of Agility in Different Disciplines," in Proceedings of the 2004 ACM Workshop on Interdisciplinary Software Engineering Research, ACM, pp. 37-44, Newport Beach, CA, November 2004. Available at: http://portal.acm.org/citation.cfm?id=1029997.1030005
12. G. Derbier, "Agile Development in the Old Economy," in Proceedings of the Agile Development Conference (ADC'03), IEEE, pp. 125-132, Salt Lake City, Utah, June 2003. Available at: http://csdl.computer.org/comp/proceedings/adc/2003/2013/00/20130125abs.htm
13. Robert C. Martin, Agile Software Development: Principles, Patterns, and Practices, Prentice Hall, 2002.
14. B. Green, "Agile Methods Applied to Embedded Firmware Development," in Proceedings of the Agile Development Conference (ADC'04), IEEE, pp. 71-77, Salt Lake City, Utah, June 2004. Available at: http://csdl.computer.org/comp/proceedings/adc/2004/2248/00/22480071abs.htm
15. A. Harriman, P. Hodgetts and M. Leo, "Emergent Database Design: Liberating Database Development with Agile Practices," in Proceedings of the Agile Development Conference (ADC'04), IEEE, pp. 100-105, Salt Lake City, Utah, June 2004. Available at: http://csdl.computer.org/comp/proceedings/adc/2004/2248/00/22480100abs.htm
16. Agile Alliance, 2002. Agile Manifesto. http://www.agilealliance.org/


Vulnerabilities in WEP Security and Their Countermeasures


Akhilesh Arora DKES-School of Computer Science, Lodhi Estate, New Delhi, India akhileshgee@yahoo.com
Abstract: This paper reveals vulnerabilities and weaknesses of the WEP protocol. IEEE 802.11-based WLANs have become more prevalent and are widely deployed in many places. Wireless networks offer handiness and mobility, and in many cases can even be less expensive to put into practice than wired networks. The rapid growth of wireless networks has driven both industry and the relevant standards body to respond to the flaws and security problems of WLANs, which is very important for applications hosting valuable information, as WLANs broadcast radio-frequency data for the client stations to receive. This paper presents the authentication and the algorithm used in WEP at the transmitter and receiver sides, along with the main security flaws of WEP as used in Wi-Fi. It also discusses practical attacks on wireless networks with a view to making them more secure. Finally, it explains improvements and solutions to enhance the level of security for WLANs using WEP.
Index Terms: IEEE 802.11, Countermeasures, Vulnerabilities, WEP, WLAN, Wi-Fi.
I. INTRODUCTION

is susceptible to eavesdropping. The goal of WEP was to raise the level of security for WEP-enabled wireless devices. Data protected by WEP is encrypted to provide:
Confidentiality and data privacy: it prevents casual eavesdropping.
Access control: it protects access to a wireless network infrastructure and has a feature for discarding all packets that are not properly encrypted using WEP.
Data integrity: it prevents tampering with transmitted messages; the integrity checksum field is included for this purpose.
Unfortunately, several serious security flaws were identified in the WEP mechanism after it went into operation. This paper focuses on the working of WEP and the security issues of the WEP algorithm, and proposes solutions and improvements to the same.
II. WEP AUTHENTICATION
WEP uses two ways to authenticate Wi-Fi users:

Wi-Fi (Wireless Fidelity) is one of today's leading wireless technologies, with Wi-Fi support being integrated into more and more devices: laptops, PDAs, and mobile phones, with which one can stay online anywhere a wireless signal is available. The basic theory behind wireless technology is that signals can be carried by electromagnetic waves that are then transmitted to a signal receiver. The growing popularity of wireless local area networks (WLANs) over the past few years has led many enterprises to realize the inherent security issues, which often go unnoticed [2]. Wireless LANs are open to hackers who try to access sensitive information or disrupt the operation of the network. To avoid such attacks, security algorithms such as Wired Equivalent Privacy (WEP) have been adopted. In a wireless network, transmitted messages are broadcast using radio; hence it


1) Open authentication: This authentication mechanism is used to provide quick and easy access for Wi-Fi users. An AP operating in this mode accepts any authentication request and responds with an authentication success message. The disadvantage of this mechanism is that it allows any wireless station to access the network.

WEP (Wired Equivalent Privacy) was the default encryption protocol introduced in the first IEEE 802.11 standard back in 1999. The Wired Equivalent Privacy (WEP) was designed to provide the security of a wired LAN by encryption through use of the RC4 algorithm [3].

Fig 1: Open Authentication


2) Shared key authentication or WEP authentication: This is a cryptographic technique used to ensure that only legitimate, authenticated users can access the AP, i.e. the supplicant device knows a secret key shared between them. In the WEP authentication process, the wireless station first sends an authentication request; the AP then responds with a challenge text (an arbitrary 128-bit number). The wireless station encrypts the challenge using its WEP key and sends the result back to the AP. The AP decrypts the response using its WEP key and compares the result with the challenge that was sent to the wireless station. If the decrypted response matches the challenge sent, access is granted; otherwise access is denied.

Fig 3: WEP encryption

1. At the transmitting end: There are two processes in WEP encryption. The first computes a 32-bit integrity check value (ICV) using the CRC-32 algorithm over the message plaintext, to protect against unauthorized data modification; the other encrypts the plaintext. For the second process, the secret key of 40 bits or 104 bits is concatenated with a 24-bit initialization vector (IV), resulting in a 64-bit total key size. WEP inputs the resulting key, the so-called seed, into the PRNG, which yields a key sequence equal in length to the plaintext plus the ICV. The resulting sequence is used to encrypt the expanded plaintext by a bitwise XOR. The final encrypted message is made by attaching the IV in front of the ciphertext [1]. The initialization vector is key to WEP security, so to maintain a decent level of security and minimize disclosure, the IV should be incremented for each packet so that subsequent packets are encrypted with different keys. For a 128-bit key, the only difference is that the secret key size becomes 104 bits; the IV remains 24 bits. The encrypted message C is therefore determined using the following formula: C = [M || ICV(M)] XOR RC4(K || IV), where || is the concatenation operator and XOR denotes the bitwise exclusive-or.
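A minimal Python sketch of this per-packet process is shown below. It assumes the conventional WEP seed ordering (the 3-byte IV prepended to the secret key; the formula above writes the concatenation the other way round) and packs the ICV little-endian, so it is illustrative rather than a byte-exact implementation of the standard.

import struct, zlib

def rc4(key: bytes, length: int) -> bytes:
    # RC4 key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA)
    out, i, j = bytearray(), 0, 0
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(plaintext: bytes, secret_key: bytes, iv: bytes) -> bytes:
    # ICV: CRC-32 over the plaintext, appended to form the expanded plaintext
    icv = struct.pack("<I", zlib.crc32(plaintext) & 0xFFFFFFFF)
    expanded = plaintext + icv
    # Per-packet RC4 seed: 24-bit IV prepended to the shared secret key
    keystream = rc4(iv + secret_key, len(expanded))
    cipher = bytes(p ^ k for p, k in zip(expanded, keystream))
    # Transmitted frame body: IV in the clear, followed by the ciphertext
    return iv + cipher

frame = wep_encrypt(b"hello, world", b"\x0b\xad\xc0\xde\x42", b"\x01\x02\x03")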

Fig 2: WEP Authentication
III. WEP ALGORITHM

2. At the receiving end:


Fig 4: WEP decryption
For decryption at the receiver's side, the pre-shared key and the IV are concatenated to make a secret key sequence. The ciphertext and this secret key are fed into the RC4 algorithm, and a plaintext and ICV are produced as a result. The plaintext is passed through the integrity algorithm to compute a new ICV, i.e. ICV' [3]. Decryption can be verified by running the CRC-32 algorithm on the recovered plaintext and comparing the output ICV' with the received ICV. If ICV' is not equal to ICV, there is an error in the received message, and an indication is sent back to the sending station informing it about the error.
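The receiver-side check can be sketched in the same style, reusing the rc4 helper from the encryption sketch above; again this is a simplified illustration rather than a byte-exact implementation of the standard.

import struct, zlib

def wep_decrypt(frame: bytes, secret_key: bytes):
    iv, cipher = frame[:3], frame[3:]              # the 24-bit IV travels in the clear
    keystream = rc4(iv + secret_key, len(cipher))  # rc4() as defined in the earlier sketch
    expanded = bytes(c ^ k for c, k in zip(cipher, keystream))
    plaintext, icv = expanded[:-4], expanded[-4:]
    # Recompute the ICV over the recovered plaintext (ICV') and compare it
    icv_check = struct.pack("<I", zlib.crc32(plaintext) & 0xFFFFFFFF)
    return plaintext, icv_check == icv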
IV. SECURITY FLAWS IN WEP

As the key size increases, the security of a cryptographic technique increases. For key sizes greater than 80 bits, brute force is extremely difficult. The standard key in use today is 64 bits. The WEP protocol was not created by experts in security or cryptography, so it quickly proved vulnerable to various issues [1]. Determined hackers can crack WEP keys in a busy network within a relatively short period of time.
1. Size of the IV is too short and can be reused frequently
IVs are too short (24 bits; fewer than 5,000 packets are required for a 50% chance of collision) and IV reuse is allowed (there is no protection against message replay), hence WEP is vulnerable. Subsequent packets are encrypted with the same, repeated keys. So, to maintain a decent level of security and minimize disclosure, the IV should be incremented for each packet so that subsequent packets are encrypted with different keys. A 24-bit IV is not long enough to ensure this on a busy network. As is well known, a 24-bit IV provides for 16,777,216 different RC4 cipher streams for a given WEP key, for any key size. Such a small space of IVs guarantees the reuse of the same keystream. The RC4 cipher stream is XOR-ed with the original packet to give the encrypted packet that is transmitted, with the IV attached to each packet. If a hacker collects enough frames based on the same IV, the individual can determine the shared values among them, i.e., the keystream or the shared secret key. XORing two ciphertexts that use the same keystream causes the keystream to cancel out, and the result is the XOR of the two plaintexts. Encrypted plaintext can then be recovered through various techniques if packets with the same IV are discovered. Worse, the reuse of a single key by many users also helps make the attacks more practical, since it increases the chances of IV collision.
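The "fewer than 5,000 packets" figure can be checked with a quick birthday-bound calculation over the 2^24 IV space, assuming IVs are drawn at random:

import math

IV_SPACE = 2 ** 24   # number of distinct 24-bit IVs

def collision_probability(n_packets: int) -> float:
    # Birthday-bound approximation for at least one repeated IV among n packets
    return 1.0 - math.exp(-n_packets * (n_packets - 1) / (2 * IV_SPACE))

n = 1
while collision_probability(n) < 0.5:
    n += 1
print(n)   # just under 5,000 packets for a 50% chance of an IV collision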

2. Lack of key management and no built-in method of updating keys
In most wireless networks that use shared-key authentication, there is one single WEP key shared between every node on the network, i.e. the access points and the wireless stations have the same WEP key [3]. Since synchronizing a change of keys is tedious and difficult, network administrators must personally visit each wireless device in use and manually enter the appropriate WEP key. This may be acceptable at the installation stage of a WLAN, or when a new client joins the network, but any time the key becomes compromised or there is a loss of security, the key must be changed. This may not be a huge issue in a small organization with only a few users, but it can be impractical in large corporations, which typically have hundreds of users. As a consequence, potentially hundreds of users and devices could be using the same key for long periods of time. All wireless network traffic from all users will be encrypted using the same key; this makes it a lot easier for someone listening to the traffic to crack the key, as all the packets are transmitted using the same key. Hence this practice harms security [1].
3. RC4 algorithm weaknesses
The initialization vector (IV), a 3-byte random number generated by the computer, is combined with a key chosen by the user to generate the final key for WEP encryption/decryption in the RC4 algorithm. The IV is transmitted with the encrypted data in the packet without any protection, so that the receiving end knows how to decrypt the traffic, as the receiver has a copy of the same secret key. This mode of operation makes stream ciphers vulnerable to several attacks. When the wireless network uses the same user-chosen key and a duplicate IV on multiple packets, the attacker knows that all those packets with the same IV are encrypted with the same key, and can then build a dictionary based on the packets collected. Knowing that the RC4 cryptography system uses the XOR operation to encrypt the



plaintext (user data) with the key, the attacker can find the possible values of the packets. The attacker can do this because the XOR of two different plaintext values is the same as the XOR of the two ciphertext values encrypted with the same key. If the attacker can guess one of the plaintext packets, they can decrypt the other packet encrypted with the same key [6]. A weakness in the key scheduling algorithm of RC4 as used in WEP can permit an attacker to collect enough packets with IVs that match certain patterns to recover the user-chosen key from the IVs. To avoid encrypting two ciphertexts with the same key stream, an initialization vector (IV) is used to augment the shared secret key and produce a different RC4 key for each packet. Generally, to get enough "interesting" IVs to recover the key, millions of packets have to be sniffed, so it could take days, if not weeks, to crack a moderately used wireless network. Several tools are available to do the sniffing and decoding; AirSnort is the best known: it runs on Linux and tries to break the key once enough useful packets have been collected.
4. Bit manipulation
WEP does not protect the integrity of the encrypted data. The RC4 cryptography system performs the XOR operation bit by bit, making WEP-protected packets vulnerable to bit-manipulation attacks. Such an attack requires only the modification of single bits of the traffic to disrupt the communication or cause other problems [7].
5. Easy forging of authentication messages
There are currently two ways to authenticate users before they can establish an association with the wireless network: Open System and Shared Key authentication. Open System authentication usually means you only need to provide the SSID or use the correct WEP key for the AP. Shared Key authentication involves demonstrating knowledge of the shared WEP key by encrypting a challenge. The insecurity here is that the challenge is transmitted in clear text to the STA, so if someone captures both the challenge and the response, they can figure out the keystream used to encrypt it, and use that stream to encrypt any challenge they receive in the future [7]. So, by monitoring a successful authentication, the attacker can later forge an authentication. The only advantage of Shared Key authentication is that it reduces the ability of an attacker to create a denial-of-service attack by

sending garbage packets (encrypted with the wrong WEP key) into the network. To handle the task of properly authenticating wireless users, turn off Shared Key authentication and rely on other authentication protocols, such as 802.1X.
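The shared-key weakness described in point 5 can be illustrated in a few lines of Python; random bytes stand in for the RC4(IV || key) keystream and the ICV is omitted, so this shows only the XOR relationship, not the real frame format.

import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(140)            # stands in for the RC4(IV || key) output
challenge = os.urandom(128)            # cleartext challenge sent by the AP
response  = xor(challenge, keystream)  # the station's encrypted reply (ICV omitted)

# A passive observer who captured both frames recovers the keystream prefix...
assert xor(challenge, response) == keystream[:128]
# ...and can answer any future challenge issued under the same IV without the key.
forged_response = xor(os.urandom(128), keystream)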

6. Lack of proper integrity (the CRC-32 algorithm is not cryptographically secure)
The integrity check stage also suffers from a serious weakness due to the CRC-32 algorithm used for this task. CRC-32 is commonly used for error detection, but it was never considered cryptographically secure, because of its linearity. CRC-32 is linear, which means that it is possible to compute the bit difference of two CRCs based on the bit difference of the messages over which they are taken. In other words, flipping bit n in the message results in a deterministic set of bits in the CRC that must be flipped to produce a correct checksum on the modified message [1]. Because flipping bits carries through after an RC4 decryption, this allows an attacker to flip arbitrary bits in an encrypted message and correctly adjust the checksum so that the resulting message appears valid.
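The bit-flipping attack follows directly from this linearity. In the sketch below, random bytes stand in for the RC4 keystream and the ICV is packed little-endian as an assumption; the attacker flips chosen plaintext bits and adjusts the encrypted checksum so that the tampered frame still verifies.

import os, struct, zlib

def crc(data: bytes) -> int:
    return zlib.crc32(data) & 0xFFFFFFFF

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

msg = os.urandom(64)                      # plaintext the victim transmits
keystream = os.urandom(68)                # stands in for RC4(IV || key)
frame = xor(msg + struct.pack("<I", crc(msg)), keystream)   # [M || ICV] XOR keystream

# The attacker picks which bits to flip without knowing msg or the keystream
delta = bytes([0x80]) + bytes(63)         # flip the top bit of the first byte
icv_fix = crc(delta) ^ crc(bytes(64))     # CRC-32 linearity gives the ICV adjustment
tampered = xor(frame, delta + struct.pack("<I", icv_fix))

# The receiver decrypts the tampered frame and the integrity check still passes
plain = xor(tampered, keystream)
assert plain[:64] == xor(msg, delta)
assert struct.unpack("<I", plain[64:])[0] == crc(plain[:64])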
V. WEP KEY CRACKING USING AIRCRACK


Practical WEP cracking can easily be demonstrated using tools such as Aircrack [4] (created by French security researcher Christophe Devine). Aircrack contains three main utilities, used in the three attack phases required to recover the key:
airodump: a wireless sniffing tool used to discover WEP-enabled networks;
aireplay: an injection tool used to increase traffic;
aircrack: a WEP key cracker that uses the collected unique-IV packets.
Currently aireplay only supports injection on specific wireless chipsets, and support for injection in monitor mode requires the latest patched drivers. Monitor mode is the equivalent of promiscuous mode in wired networks: it prevents the rejection of packets not intended for the monitoring host (which is usually done in the physical layer of the OSI stack) and thus allows all packets to be captured. The main goal of the attack is to generate traffic in order to capture unique IVs used between a legitimate client and an access point. Some encrypted data is easily recognizable because it has a fixed length, a fixed destination address, etc. This is the case with ARP request packets, which are sent to the broadcast address (FF:FF:FF:FF:FF:FF) and have a fixed length of 68 octets. ARP requests can be


replayed to generate new ARP responses from a legitimate host, resulting in the same wireless messages being encrypted with new IVs. In this example, 00:13:10:1F:9A:72 is the MAC address of the access point (BSSID) on channel 1 with the SSID hakin9demo, and 00:09:5B:EB:C5:2B is the MAC address of a wireless client. Executing the sniffing commands requires root privileges. We proceed in the following way:

4. Using IVs and decrypting the packet Finally, aircrack is used to recover the WEP key. Using the pcap file makes it possible to launch this final step while airodump is still capturing data:
Syntax: #aircrack [options] <input pcap file> # aircrack -x -0 wep-crk.cap
VI. OTHER TYPES OF AIRCRACK ATTACKS

1. Activating monitor mode


The first step is to activate monitor mode on our wireless card, so we can capture all the traffic by the command: # airmon.sh start ath0

Aircrack also makes it possible to conduct other interesting types of attack. Let's have a look at some of them.
Attack 1: De-authentication
This attack can be used to recover a hidden SSID (i.e. one that isn't broadcast), capture a WPA 4-way handshake, or force a denial of service. The aim of the attack is to force the client to re-authenticate, which, coupled with the lack of authentication for control frames (used for authentication, association, etc.), makes it possible for the attacker to spoof MAC addresses. A wireless client can be de-authenticated using the following command, which causes de-authentication packets to be sent from the BSSID to the client MAC by spoofing the BSSID:
# aireplay -0 5 -a 00:13:10:1F:9A:72 -c 00:0C:F1:19:77:5C ath0

2. Discovering nearby networks and their clients The next step is to discover nearby networks and their clients by scanning all 14 channels that Wi-Fi networks can use:
Syntax: # airodump <interface> <output prefix> [channel] [IVs flag]
Interface is the wireless interface to use (required). Output prefix is the filename it will prepend (required). Channel is the specific channel to scan; leave it blank or use 0 to channel-hop. The IVs flag is either 0 or 1, depending on whether you want all packets logged or just the IVs.
# airodump ath0 wep-crk 0
Once the target network has been located, capture should be started on the correct channel to avoid missing packets while scanning other channels.

Mass de-authentication is also possible (though not always reliable); it involves the attacker continuously spoofing the BSSID and resending the de-authentication packet to the broadcast address:
# aireplay -0 0 -a 00:13:10:1F:9A:72 ath0
Attack 2: Decrypting arbitrary WEP data packets without knowing the key
This attack is based on the KoreK proof-of-concept tool called chopchop, which can decrypt WEP-encrypted packets without knowledge of the key. The integrity check implemented in the WEP protocol allows an attacker to modify both an encrypted packet and its corresponding CRC. Moreover, the use of the XOR operator in the WEP protocol means that a selected byte in the encrypted message always depends on the same byte of the plaintext message. Chopping off the last byte of the encrypted message corrupts it, but also makes it possible to guess the value of the corresponding plaintext byte and correct the encrypted message accordingly.

3. ARP Injection Next, we can use previously gathered information to inject traffic using aireplay. Injection will begin when a captured ARP request associated with the targeted BSSID appears on the wireless network:
Syntax: # aireplay -3 -b <AP MAC Address> -h <Client MAC Address> ath0
The -3 specifies the type of attack (3 = ARP replay).
# aireplay -3 -b 00:13:10:1F:9A:72 -h 00:0C:F1:19:77:5C -x 600 ath0


If the corrected packet is then re-injected into the network, it will be dropped by the access point if the guess was incorrect (in which case a new guess has to be made); for a correct guess it will be relayed as usual. Repeating the attack for all message bytes makes it possible to decrypt a WEP packet and recover the keystream. Remember that incrementing the IV is not mandatory in the WEP protocol, so it is possible to reuse this keystream to spoof subsequent packets (reusing the same IV). The wireless card must be switched to monitor mode on the right channel.
Decrypting WEP packets without knowing the key

but the access point will drop any packets not encrypted with the correct WEP key.
Syntax: # aireplay -1 30 -e '<ESSID>' -a <BSSID> -h <Fake MAC> ath0
# aireplay -1 0 -e hakin9demo -a 00:13:10:1F:9A:72 -h 0:1:2:3:4:5 ath0
In this example, aireplay is used to fake an authentication and association request for the SSID hakin9demo (BSSID: 00:13:10:1F:9A:72) with the spoofed MAC address 0:1:2:3:4:5. Some access points require clients to re-associate every 30 seconds. This behavior can be mimicked in aireplay by replacing the second option (0) with 30.
VII. SOME ANTICIPATED PROBLEMS

The attack must be launched against a legitimate client (still 00:0C:F1:19:77:5C in our case), and aireplay will prompt the attacker to accept each encrypted packet:
# aireplay -4 -h 00:0C:F1:19:77:5C ath0

There are lots of problems that can come up that will make the above fail, or work very slowly [3].
No traffic: No traffic is being passed; therefore you cannot capture any IVs. In this case we need to inject some special packets to trick the AP into broadcasting.
MAC address filtering: The AP is only responding to connected clients, probably because MAC address filtering is on. In this case, use the airodump screen to find the MAC address of an authenticated user and change our MAC to theirs and continue, or use the -m option to tell aircrack to filter packets by MAC address, e.g. -m 00:12:5B:4C:23:27.
Can't crack even with tons of IVs: Some of the statistical attacks can create false positives and lead you in the wrong direction. In this case, try using -k N (where N = 1..17) or -y to vary the attack method, or increase the fudge factor. By default it is 2; specifying -f N (where N >= 2) will increase your chances of a crack, but take much longer. Doubling the previous fudge factor is a good option.
VIII. IMPROVEMENTS IN WEP

Reading a pcap file from the attack

# tcpdump -s 0 -n -e -r replay_dec-0916-114019.cap
Two pcap files are created: one for the unencrypted packet and another for its related keystream. The resulting file can be made human-readable.
Replaying a forged packet

# aireplay -2 -r forge-arp.cap ath0
Finally, aireplay is used to replay this packet. This method is less automated than Aircrack's own ARP request spoofing (the -1 option), but it is more scalable: the attacker can use the discovered keystream to forge any packet that is no longer than the keystream (otherwise the keystream has to be expanded).
Attack 3: Fake authentication
The WEP key cracking method described earlier (Attacks 1 and 3) requires a legitimate client associated with the access point, to ensure that the access point does not discard packets due to a non-associated destination address. If open authentication is used, any client can be authenticated and associated with the access point,

WEP does not protect the data well enough, so to configure WEP as securely as possible, follow these suggestions:
Use the highest security available: If your devices support 128-bit WEP, then use it; it is extremely hard to brute-force a 128-bit WEP key. If you cannot dynamically change WEP keys, then set the 128-bit policy and change the key periodically [1]. The length of this period depends on how busy your wireless network traffic will be.
Use MIC when available: We also


mentioned that the WEP key does not protect the integrity of the packet. For data integrity, the Message Integrity Code (MIC) for TKIP is computed by a new algorithm named Michael [9]. The Message Integrity Code (MIC) is computed to detect errors in the data contents, whether due to transfer errors or purposeful alterations. A 64-bit MIC is added to the data and the ICV. Currently the IEEE is working on 802.11i both to fix the problems in WEP and to implement 802.1X and a Message Integrity Checksum (MIC) for data confidentiality and integrity. The MIC generates checksums for the encrypted data to ensure its integrity, where the ICV is the CRC of the data and the MIC.
Use improved data encryption (TKIP): The Temporal Key Integrity Protocol (TKIP) uses a hashing algorithm and, by adding an integrity-checking feature and a 128-bit secret key, is an alternative to WEP that fixes the known security problems and does not require new hardware. Like WEP, TKIP uses the RC4 stream cipher for the encryption and decryption processes, and all involved parties must share the same secret key. This secret key must be 128 bits and is called the Temporal Key (TK). TKIP also uses a 48-bit initialization vector (IV) and uses it as a counter. Even if the TK is shared, all involved parties generate a different RC4 key stream, since the communication participants perform a two-phase generation of a unique Per-Packet Key (PPK) that is used as the key for the RC4 key stream [1]. TKIP is TGi's response to the need to improve security for equipment already deployed in 802.11. TGi has proposed TKIP as a mandatory-to-implement security enhancement for 802.11, and patches implementing it will likely be available for most equipment. TKIP is a suite of algorithms wrapping WEP, designed to achieve the best security that can be obtained given the design constraints of the problem. TKIP adds four new algorithms to WEP: a cryptographic message integrity code, or MIC, called Michael, to defeat forgeries; a new IV sequencing discipline, to remove replay attacks from the attacker's arsenal; a per-packet key mixing function, to decorrelate the public IVs from weak keys; and a re-keying mechanism, to provide fresh encryption and integrity keys, undoing the threat of attacks stemming from key reuse.

The TKIP re-keying mechanism updates what are called temporal keys, which are consumed by the WEP encryption engine and by the Michael integrity function.
Use dynamic WEP keys: Dynamic keying changes WEP keys dynamically. When using 802.1X, dynamic WEP keys should be used if possible. A dynamic key is generated from the static shared key and a random number: a hash function is used to generate the dynamic key, with the static shared key and the random number as its inputs (a minimal illustrative sketch of this idea appears at the end of this section). This uses the same frame size as the WEP protocol, but it provides a dynamic key for encryption at each authentication session, and based on this key another temporary dynamic key is generated on a per-data-frame basis. The authentication mechanism used is an improved version compared with WEP. Using dynamic WEP keys will provide a higher level of security and will counter some of the known WEP insecurities. To use dynamic WEP, an access point that can provide such keys has to be purchased, and clients that support them must be used [10].
Use 802.1X (secure authentication through EAP): The IEEE 802.1X authentication protocol (also known as Port-Based Network Access Control) is a framework originally developed for wired networks, providing authentication, authorization and key distribution mechanisms, and implementing access control for users joining the network. It also manages the WEP key by periodically and automatically sending a new key, to avoid some of the known WEP key vulnerabilities. Data confidentiality is protected by these dynamic WEP keys. The IEEE 802.1X architecture is made up of three functional entities: the supplicant joining the network, the authenticator providing access control, and the authentication server making authorization decisions [11]. In wireless networks, the access point serves as the authenticator. Each physical port (virtual port in wireless networks) is divided into two logical ports making up the PAE (Port Access Entity). The authentication PAE is always open and allows authentication frames through, while the service PAE is only opened upon successful authentication (i.e. in an authorized state) for a limited time (3600 seconds by default). The decision to allow access is usually made by the third entity, namely the authentication


server (which can either be a dedicated RADIUS server or, for example in home networks, a simple process running on the access point). The 802.11i standard makes small modifications to IEEE 802.1X for wireless networks to account for the possibility of identity theft. Message authentication has been incorporated to ensure that both the supplicant and the authenticator calculate their secret keys and enable encryption before accessing the network [1]. The supplicant and the authenticator communicate using an EAP-based protocol. The role of the authenticator is essentially passive; it may simply forward all messages to the authentication server. EAP is a framework for the transport of various authentication methods, allowing only a limited number of messages (Request, Response, Success, and Failure), while other intermediate messages depend on the selected authentication method: EAP-TLS, EAP-TTLS, PEAP, Kerberos V5, EAP-SIM, etc. When the whole process is complete, both entities share a secret master key. The supplicant and the authenticator exchange EAP data using the EAPOL (EAP over LAN) protocol, while communication between the authenticator and the authentication server is carried over higher-layer protocols such as RADIUS.
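The dynamic-key idea mentioned under "Use dynamic WEP keys" above can be sketched as follows; the choice of SHA-256 and a 128-bit output are assumptions for illustration only, not the algorithm used by any particular vendor's dynamic-WEP implementation.

import hashlib, os

def derive_session_key(static_key: bytes, nonce: bytes, length: int = 16) -> bytes:
    # Illustration only: hash the static shared key together with a fresh random
    # number to obtain a per-session key, as described in the text above.
    return hashlib.sha256(static_key + nonce).digest()[:length]

static_key = b"shared-static-key"   # hypothetical pre-shared secret
nonce = os.urandom(16)              # random number exchanged at authentication time
session_key = derive_session_key(static_key, nonce)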
IX. CONCLUSION

Wireless networks such as Wi-Fi, being among the most widespread technologies in the world, are vulnerable to the threat of hacking. It is very important to protect a network from hackers in order to prevent the exploitation of confidential data. In this research, we explained the authentication and the algorithm used in WEP at the transmitter and receiver sides. We then discussed all the major problems in WEP and walked through the whole process of practical attacks on the WEP encryption of Wi-Fi, in order to help make systems better protected. Finally, we explained improvements and solutions using advanced security policies and more secure data encryption techniques. The experience with WEP shows that it is difficult to get security right, owing to security flaws rooted in the protocol's design.

X. REFERENCES
[1] Songhe Zhao and Charles A. Shoniregun, Critical Review of Unsecured WEP, 2007 IEEE Congress on Services (SERVICES 2007).
[2] Garcia, R. H. A. M., An Analysis of Wireless Security, CCSC: South Central Conference, 2006.
[3] Arash Habibi Lashkari, F. Towhidi, R. S. Hoseini, Wired Equivalent Privacy (WEP), ICFCC Kuala Lumpur Conference, 2009.
[4] S. Vinjosh Reddy, K. Rijutha, K. SaiRamani and Sk. Mohammad Ali, Wireless Hacking - A WiFi Hack by Cracking WEP, 2010 2nd International Conference on Education Technology and Computer (ICETC), 2010.
[5] Min-kyu Choi, Rosslin John Robles, Chang-hwa Hong, Tai-hoon Kim, Wireless Network Security, International Journal of Multimedia and Ubiquitous Engineering, Vol. 3, No. 3, July 2008.

[6] Fluhrer, S. R., Mantin, I. and Shamir, A., Weaknesses in the Key Scheduling Algorithm of RC4, Selected Areas in Cryptography, 2001, pp. 1-24.
[7] Curran, K. and Smyth, E., Demonstrating the Wired Equivalent Privacy (WEP) Weaknesses Inherent in Wi-Fi Networks, Information Systems Security, Vol. 15, Issue 4, Sep/Oct 2006, pp. 17-38.
[8] Karygiannis, T. and Owens, L., Wireless Network Security: 802.11, Bluetooth and Handheld Devices, 2003. Available from: http://csrc.nist.gov/publications/drafts/draft-sp800-48.pdf
[9] Ferguson, N., Michael: An Improved MIC for 802.11 WEP, IEEE 802.11-02/020r0, 2002.
[10] Manjula Sandirigama, Rasika Idamekorala, Security Weaknesses of WEP Protocol IEEE 802.11b and Enhancing the Security with Dynamic Keys, TIC-STH 2009, pp. 433-438.
[11] Arunesh Mishra and William A. Arbaugh, An Initial Security Analysis of the IEEE 802.1X Standard, UMIACS-TR-2002-10, pp. 1-12.





Implementation of Ethernet Protocol and DDS in Virtex-5 FPGA for Radar Applications
Garima Chaturvedi*1, Dr. Preeta Sharan*2, Peeyush Sahay#3
*ECE Dept., The Oxford College of Engineering, Bangalore, India
garima2107@gmail.com preeta50sharan@rediffmail.com peeyush.sahay@lrde.com
#

Electronics and Radar Development Establishment (LRDE), DRDO, India

Abstract: This paper presents a Field Programmable Gate Array (FPGA) based implementation of the EnDat protocol as well as the use of a direct digital synthesizer in radar applications. The paper explains the use of an FPGA for the implementation, providing all the design models and supporting results, and brings out the features and advantages of the design. Key Words: Field Programmable Gate Array (FPGA); Direct Digital Synthesizer (DDS); EnDat Protocol

In our design we have chosen the AD9910 DDS device. The AD9910 employs an advanced, proprietary DDS technology that provides a significant reduction in power consumption without sacrificing performance. The user has access to the three signal control parameters that control the DDS: frequency, phase, and amplitude.

I. INTRODUCTION: This paper presents Field Programmable Gate Array (FPGA) based designs that are used for communication purposes. Both the DDS and the EnDat interface have been implemented on a single Xilinx Virtex V Pro FPGA-based hardware platform. The paper follows a systematic approach through the interface stages of the Direct Digital Synthesizer (DDS) with the FPGA as well as the EnDat interface, and explains all these stages of the design in detail. Virtex V devices are user programmable, with various configurable elements and embedded cores optimized for high-performance system design. These devices are used for high-performance logic with low-power serial connectivity. High-performance system design includes high-speed data transfer applications, which are a major concern of modern communication systems. This design has been done mainly for data transfers in radar systems. The paper is organized as follows: section (II) gives a brief description of the DDS; the EnDat protocol is discussed in section (III); hardware implementation and results are discussed in section (IV); and the paper concludes in section (V).
II. DIRECT DIGITAL SYNTHESIZER
This module synthesizes frequencies from a fixed-frequency reference clock. A basic direct digital synthesizer consists of a frequency reference, a numerically controlled oscillator and a digital-to-analog converter, as shown in Figure 1.

Fig 1: Direct Digital Synthesizer block diagram
The DDS also enables fast phase and amplitude switching capability. The AD9910 is controlled by programming its internal control registers via a serial I/O port. The AD9910 includes an integrated static RAM to support various combinations of frequency, phase, and/or amplitude modulation. It is specified to operate over the extended industrial temperature range.
The AD9910 has four modes of operation: single tone, RAM modulation, digital ramp modulation, and parallel data port modulation. In our design we are using the parallel data port modulation. In this mode, the modulated DDS signal control parameters are supplied directly from the 18-bit parallel data port. The data port is partitioned into two sections: the 16 MSBs make up a 16-bit data-word and the two LSBs make up a 2-bit

destination word. The AD9910 generates a clock signal on the PDCLK pin that runs at a fixed fraction of the DAC sample rate. It serves as a data clock for the parallel port. Each rising edge of PDCLK is used to latch the 18 bits of user-supplied data into the data port. The AD9910 also accepts a user-generated signal applied to the TxENABLE pin that acts as a gate for the user-supplied data. By default, TxENABLE is considered true for logic 1 and false for logic 0. When TxENABLE is true, the device latches data on the expected edge of PDCLK. When TxENABLE is false, even though PDCLK may continue to operate, the device ignores the data supplied to the port.
1. DDS Core
The direct digital synthesizer (DDS) block generates a reference signal (sine or cosine based). The parameters of the reference signal (frequency, phase, and amplitude) are applied to the DDS at its frequency, phase offset, and amplitude control inputs, as shown in Figure 2. The output frequency (fOUT) of the AD9910 is controlled by the frequency tuning word (FTW) at the frequency control input. A logic high on the CSB pin followed by a logic low resets the SPI port to its initial state and defines the start of the instruction cycle. From this point, the next eight rising SCLK edges define the eight bits of the instruction byte for the current communication cycle.

The AD9910 serial port is a flexible, synchronous serial communications port allowing easy interface to many industry-standard microcontrollers and microprocessors. The interface allows read and write access to all registers that configure the AD9910. Single or multiple byte transfers are supported as well as MSB-first or LSB-first transfer formats. Serial data input/output can be accomplished through a single bidirectional pin (SDIO) or through two unidirectional pins (SDIO/SDO). There are two phases to any communication cycle with the AD9910: Phase 1 and Phase 2. Phase 1 is the instruction cycle, which writes an instruction byte into the device. This byte provides the serial port controller with information regarding Phase 2 of the communication cycle: the data transfer cycle. The Phase 1 instruction byte defines whether the upcoming data transfer is read or write, the number of bytes in the data transfer, and a reference register address for the first byte of the data transfer.

Input data is latched on the rising edge of SCLK, whereas output data is always valid after the falling edge of SCLK. Register contents change immediately upon writing to the last bit of each transfer byte. The instruction byte contains the information shown in the following bit map:

Fig 3: Instruction Bit Map
Bit 7, R/W, determines whether a read or a write data transfer occurs after the instruction byte is written: logic high indicates a read operation, and logic 0 indicates a write operation. Bits <6:5>, N1 and N0, determine the number of bytes to be transferred during the data transfer cycle; the bits decode as shown in the table below.
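A small Python sketch of how such an instruction byte could be assembled is given below; it is only illustrative, since the paper's decode table is not reproduced here, so the byte-count encoding (00 for one byte up to 11 for four bytes) and the use of the remaining five bits for the register address are assumptions.

def ad9910_instruction_byte(read: bool, n_bytes: int, address: int) -> int:
    # Bit 7 = R/W (1 = read), bits 6:5 = N1/N0 byte count (assumed 00 -> 1 byte
    # ... 11 -> 4 bytes), remaining bits assumed to hold the register address.
    assert 1 <= n_bytes <= 4 and 0 <= address < 32
    return ((1 if read else 0) << 7) | ((n_bytes - 1) << 5) | address

instr = ad9910_instruction_byte(read=False, n_bytes=2, address=0x07)
print(f"{instr:08b}")   # -> 00100111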

Fig 2: DDS Block Diagram
The output frequency is given by fOUT = (FTW / 2^32) x fSYSCLK, where FTW is a 32-bit integer ranging in value from 0 to 2,147,483,647 (2^31 - 1), which represents the lower half of the full 32-bit range. This range constitutes frequencies from dc to Nyquist (that is, 1/2 fSYSCLK).
2. Serial programming


Data is always written into the device on the SDIO pin. However, SDIO can also function as a bidirectional data output line. Data is read from the SDO pin for protocols that use separate lines for transmitting and receiving data.
Fig 4: Byte Transfer Count
CSB is an active-low input that starts and gates a communication cycle. It allows more than one device to be used on the same serial communication lines. CSB must stay low during the entire communication cycle; incomplete data transfers are aborted any time the CSB pin goes high. The serial clock pin (SCLK) is used to synchronize data to and from the device and to run the internal state machines.
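As a rough numerical illustration of the tuning-word relation fOUT = (FTW / 2^32) x fSYSCLK given in the DDS core description above, the short Python sketch below computes an FTW for a requested output frequency; the 1 GHz system clock and the helper name ftw_for are assumed example values, not figures taken from the design.

F_SYSCLK = 1_000_000_000          # Hz; assumed example system clock

def ftw_for(f_out_hz: float) -> int:
    # Invert f_out = (FTW / 2**32) * f_sysclk to get the tuning word
    return round(f_out_hz * 2 ** 32 / F_SYSCLK)

ftw = ftw_for(70_000_000)         # request a 70 MHz output
f_actual = ftw * F_SYSCLK / 2 ** 32
print(hex(ftw), f_actual)         # frequency resolution is F_SYSCLK / 2**32, about 0.23 Hz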

Fig 5: MSB First Format
component via its parallel interface (address and data bus). At a system clock frequency of 64 MHz, the serial interface can be operated at a clock rate of up to 8 MHz (EnDat only). At 100 MHz, EnDat can be operated at a clock rate of up to 16 MHz. The soft macro has a serial interface to the encoder as well as a parallel (or serial) microprocessor interface to the application.


Fig 6: LSB First Format

III. ENDAT INTERFACE
The EnDat interface is a digital, bidirectional interface for encoders. It is capable of transmitting position values from incremental and absolute encoders as well as transmitting, updating, or saving information stored in the encoder. Transmission is serial, and the data are transmitted in synchronism with the clock signal from the subsequent electronics. The type of transmission (position values, parameters, diagnostics, etc.) is selected by mode commands that the subsequent electronics send to the encoder. The EnDat 2.2 interface, a purely serial interface, is also suited to safety-related applications. The soft macro (EnDat master) serves as an interface component for position encoders with an EnDat interface and is used in the subsequent electronics. The application (the software of a microcontroller) can access the registers of the interface

Fig 7: Encoder and subsequent electronics with parallel interface
Another parallel microprocessor interface has been implemented for safety-related applications. The protocol computer, consisting of the sequencing controller, the clock generator, and the send and receive module, autonomously executes the interface protocol (EnDat or SSI) to the encoder. Component configuration, writing to the send register, and reading of the receive and control registers are done by register accesses from the application (FPGA). EnDat is a bidirectional serial interface. The interface component starts transmission by

sending a clock frequency with a defined length. Synchronously with the clock, a mode command (and other data, if required) is transmitted to the encoder. The encoder responds by sending the requested data to the subsequent electronics. The transmission direction can change multiple times during data transfer.


IV. HARDWARE IMPLEMENTATION AND RESULTS
A Field Programmable Gate Array (FPGA) has multiple slices, which are programmable as digital units such as gates. An FPGA is used for digital designs using millions of gates in specific ways for a specific application. An Application Specific Integrated Circuit (ASIC) is very specific to the design of that particular application, which makes it very difficult to modify the existing design after manufacturing. Other disadvantages of an ASIC over an FPGA are its development cost and long design time. With FPGAs, design time is comparatively short and most ICs can be replaced, which makes the approach more cost-effective and reusable. An FPGA is used in our design because a complex digital design can be implemented and used for communication purposes. We are using the Virtex V Pro because these devices are user programmable and are used for high-performance logic with low-power serial connectivity. The design uses the Integrated Software Environment (ISE) Foundation from Xilinx. This section explains the simulation and hardware results of our design. These designs have been evaluated both for logic and timing in the simulation environment using the ModelSim simulator. In Figure 9, the spi_interface clk is the link clock coming from the signal processor card. We can also see that CSB is low during the entire communication cycle. After one set of data is transferred it goes high for one clock cycle and then goes low again for the next set of data transfer. SCLK is used to synchronize data to and from the device and to run the internal state machine. All data is registered on the rising edge of SCLK. The reset signal resets the design. Dataout is 32-bit data which becomes available after eight nibbles of data have been received; Dataout goes with the data instruction to the interface. After each communication cycle the control is shifted to the next channel. Flag is also

Fig 8: The complete sequence is controlled by the application

Transmitted data are identified as either position values, position values with additional information, or parameters. The type of information to be transmitted is selected by mode commands; mode commands define the content of the transmitted information. Every mode command consists of three bits. To ensure reliable transmission, every bit is transmitted redundantly (once normally and once inverted). If the encoder detects an incorrect mode transmission, it transmits an error message. The EnDat 2.2 interface can also transfer parameter values in the additional information together with the position value. This makes the current position values constantly available for the control loop, even during a parameter request.
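The three-bit mode command with redundant (inverted) bits can be sketched as follows; the exact on-wire bit ordering is not given in the text, so the interleaving used here is an assumption for illustration only.

def encode_mode_command(mode_bits: str) -> str:
    # Three mode bits, each followed here by its inverted (redundant) copy
    assert len(mode_bits) == 3 and set(mode_bits) <= {"0", "1"}
    return "".join(b + ("1" if b == "0" else "0") for b in mode_bits)

def mode_command_ok(received: str) -> bool:
    # The receiver can flag an error whenever a bit and its redundant copy agree
    return all(received[i] != received[i + 1] for i in range(0, 6, 2))

print(encode_mode_command("111"), mode_command_ok("101010"))   # -> 101010 True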


generated after sending frames of the same fixed data in the present channel. The responses obtained were in agreement with the required specifications. Post-map and post-route simulations, followed by hardware co-simulation, have been performed to confirm performance prior to downloading the final circuit to the target board.

Fig 9: Simulation Result
V. CONCLUSION
The work presented in this paper explains the implementation of a reliable data-transfer technique, the EnDat protocol, in FPGAs. This design has been validated, is in use in different ground-based and ship-based radar systems, and has been found to work reliably. The digital methods used for the implementation provide high flexibility, simplicity, and reliability.
VI. REFERENCES
[1] Merrill Skolnik, Introduction to Radar Systems, 2nd ed., McGraw-Hill, 1980.
[2] Jouko Vankka, DDS: Theory, Design & Applications.
[3] Xilinx IP Cores and System Generator User's Guide.
[4] User Handbook, Analog Devices SHARC DSP.
[5] EnDat data sheets.
[6] Nadav Levanon and Eli Mozeson, Radar Signals, IEEE Press, Wiley Interscience, 2004.


CCK Coding Implementation in IEEE802.11b Standard


Mohd. Imran Ali
University School of Information & Technology (USIT), Guru Gobind Singh Indraprastha University, Sec-16, Dwarka, New Delhi.
Email: imran.ali32@gmail.com
1. Abstract
The IEEE 802.11 committee, in order to implement Wireless Local Area Networks (WLANs), has adopted a new modulation, CCK, for 11 Mbps rates in the 2.4 GHz ISM band. This paper discusses the new CCK modulation scheme and the considerations that led to the adoption of this technique for the standard, and also covers the implementation of the technique. Intersil and the IEEE initiated trade studies to identify compatible modulation methods that would build on the 802.11 one and two Mbps modes but achieve higher data rates. Complementary Code Keying (CCK), a variation of M-ary Orthogonal Keying, was finally picked as the modulation. CCK modulation has the same chip rate, and therefore the same null-to-null bandwidth, as the lower rates. This makes it interoperable with existing 802.11 networks by incorporating the same preamble and header, which already has a rate change mechanism.

The IEEE 802.11 standards board has approved a higher-rate extension to the physical layer of the 802.11 WLAN standard with the intention of delivering Ethernet-like speeds over existing 802.11 WLAN systems. This effort was directed at the 2.4 GHz ISM band, which is available almost worldwide and offers 83.5 MHz of spectrum into which up to 3 channels can be implemented. IEEE 802.11 is not the only group setting standards for wireless LANs; there are other standards efforts, such as Bluetooth, the HomeRF Working Group and Personal Area Networks, that seek to define WLANs for various activities, but 802.11 is the only one addressing high data rates for building-wide networks.

3.CCK Background
Complementary codes were originally conceived by M. J. E Golay for infrared multislit spectrometry. However, their properties make them good codes for radar and communications applications.This publication defines a complementary series as a pair of equally long sequences elements with the same composed of two types of elements which have the property that the number of pairs of like elements with any given separation in one series is equal to the number of pairs of unlike separation in the other series. Figure shows this property for This property is more easily expressed in terms of the autocorrelation function which is discussed in the following section.

2.Introduction
Complementary Code Keying (CCK) is a modulation scheme used with wireless networks (WLANs) that employ the IEEE 802.11b specification. In 1999, CCK was adopted to replace the Barker code in wireless digital networks. There is increasing market demand for higher datarate wireless local-area-networks (WLANs). This demand motivates the search for new signaling waveforms and receiver architectures. This proect report presents a new signaling waveform for use with RAKE receivers operating in the indoor high multipath environment. Complementary Code Keying (CCK) was developed by Intersil and Lucent Technologies for use at 2.4 GHz for the IEEE 802.11 standard (which has now been approved). The CCK waveform is based on complementary codes which have their origins in RADAR and multislit spectrometry applications. In this paper the background and properties of CCK will be explained. Furthermore, it will be shown that this code provides an increased tolerance to multipath distortion and is attractive for use in highdata rate WLAN applications. Intersil is currently shipping a second generation WLAN PRISM( chipset which uses CCK.
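As a small numerical illustration of the complementary property (this example is not from the paper; the length-4 pair below is a standard Golay pair), the following Python sketch computes the aperiodic autocorrelations of the two sequences and shows that their sum vanishes at every non-zero shift:

import numpy as np

# A standard length-4 Golay complementary pair (illustrative choice).
a = np.array([1, 1, 1, -1])
b = np.array([1, 1, -1, 1])

# Aperiodic autocorrelation of each sequence over all shifts.
r_a = np.correlate(a, a, mode="full")
r_b = np.correlate(b, b, mode="full")

print(r_a)        # [-1  0  1  4  1  0 -1]
print(r_b)        # [ 1  0 -1  4 -1  0  1]
print(r_a + r_b)  # [ 0  0  0  8  0  0  0]  -> zero everywhere except zero shift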


3.1 Autocorrelation Properties

Good code sets for communications applications require good auto- and cross-correlation properties. "Good" means that the codes have a large correlation only at zero offset and a small or zero correlation otherwise. Multipath causes the received signal to contain multiple echoes of the signal, which produce both inter-chip and inter-symbol interference. This is especially damaging if the codes have poor auto- or cross-correlation.

3.2 Binary Complementary Codes

A binary complementary code is a subset of the more general class of codes known as polyphase codes; the IEEE 802.11 CCK codes are polyphase complementary codes. The following definition for binary complementary codes is borrowed intact from R. Sivaswamy's "Multiphase Complementary Codes": complementary codes, also referred to as binary complementary sequences or series, comprise a pair of equal, finite-length sequences having the property that the number of pairs of like elements with any given separation in one series is equal to the number of pairs of unlike elements with the same separation in the other.

4. Codeword Rules

For a RAKE receiver to work well, a proper transmit signal structure is necessary. The transmit symbols need a DSSS structure in which the transmitted bandwidth is larger than the information bandwidth. Codewords are formed from N chips; the term codeword is used here because the term symbol may cause confusion between chips and codewords. The chips are sent with a simple signaling element such as QPSK. The codeword's chips may be fixed, as in a signature sequence, or pseudo-random. Some of the information is imposed through a phase modulation of the codeword: for two bits of information per codeword, the whole codeword can be rotated in a quadri-phase fashion (0, 90, 180 or 270 degrees). To carry higher data payloads per codeword, the codeword's N chips are selected from a codeword set, and extra information bits select the particular codeword out of the set. An example of this is where Walsh (Hadamard) codes are used for the codeword set: with a 16-chip codeword, 16-ary Walsh codewords could be used, and an extra two bits of information can specify the rotational phase of the codeword (0, 90, 180 or 270 degrees). For the 2.4 GHz ISM band, IEEE 802.11 uses a codeword set that contains 64 complex codewords together with quadriphase modulation, which establishes 8 information bits per transmitted codeword; the 8 chips are QPSK. The optimum receiver correlates the received signal with the codeword set. Based on this analysis, the minimum distance between two different complementary codes is N/2 symbols, so it is possible to correct N/4 - 1 symbol errors. If M phases are possible, this yields a minimum Euclidean distance of

5. Mathematical Description

Let us use an example to see how a typical codeword is generated. Assume the 11 Mbps mode and a data bit stream given as d7, d6, d5, ..., d0 = 1 0 1 1 0 1 0 1. Reading the corresponding phase parameters from the table, substituting them into the codeword formula and applying Euler's formula, the resulting complex codeword is c = {1, 1, j, j, j, j, 1, 1}.
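For readers who want to generate such codewords in software, the sketch below builds an 8-chip CCK codeword from four QPSK phases using the codeword formula of the 802.11b high-rate PHY, c = (e^{j(phi1+phi2+phi3+phi4)}, e^{j(phi1+phi3+phi4)}, e^{j(phi1+phi2+phi4)}, -e^{j(phi1+phi4)}, e^{j(phi1+phi2+phi3)}, e^{j(phi1+phi3)}, -e^{j(phi1+phi2)}, e^{jphi1}). The dibit-to-phase table used here is only one common convention and the differential encoding of phi1 is omitted, so this is a sketch rather than a standard-conformant encoder, and its output will not in general match the worked example above.

import numpy as np

# 802.11b CCK codeword structure: four QPSK phases spread onto 8 complex chips,
# with sign flips on the 4th and 7th chips.
def cck_codeword(phi1, phi2, phi3, phi4):
    e = lambda x: np.exp(1j * x)
    return np.array([
        e(phi1 + phi2 + phi3 + phi4),
        e(phi1 + phi3 + phi4),
        e(phi1 + phi2 + phi4),
        -e(phi1 + phi4),
        e(phi1 + phi2 + phi3),
        e(phi1 + phi3),
        -e(phi1 + phi2),
        e(phi1),
    ])

# Illustrative dibit-to-phase mapping (an assumption; the exact table, including
# the differential encoding of phi1, is defined in the 802.11b standard).
DIBIT_PHASE = {(0, 0): 0.0, (0, 1): np.pi / 2, (1, 0): np.pi, (1, 1): 3 * np.pi / 2}

def encode_byte(bits):
    # bits = (d0, d1, ..., d7); each dibit selects one of the four phases.
    phases = [DIBIT_PHASE[(bits[i], bits[i + 1])] for i in (0, 2, 4, 6)]
    return cck_codeword(*phases)

print(np.round(encode_byte((0, 1, 1, 0, 1, 1, 0, 0)), 3))   # arbitrary example byte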


6. Conclusions

This paper has described a method to improve the performance of block codes in general multipath radio environments. We showed that CCK codes have robust performance in a variety of multipath environments. DSSS using the Barker code is more stable over long range and large transmitter-receiver separations than CCK, but the multipath performance of CCK is better than that of MBOK, and the CCK waveform has better Eb/N0 performance than DPSK. The transmitter part of a CCK transceiver is low-cost, but the receiver part is more complex.

7. References

[1] M. J. E. Golay, "Complementary Series," IRE Transactions on Information Theory, April 1961, pp. 82-87.
[2] C.-C. Tseng and C. L. Liu, "Complementary Sets of Sequences," IEEE Transactions on Information Theory, September 1972.
[3] R. Sivaswamy, "Multiphase Complementary Codes," IEEE Transactions on Information Theory, September 1978, pp. 546-552.
[4] R. Sivaswamy, "Digital and Analog Subcomplementary Sequences for Pulse Compression," IEEE Transactions on Aerospace and Electronic Systems, March 1978.
[5] F. F. Kretschmer, Jr., and B. L. Lewis, "Doppler Properties of Polyphase Coded Pulse Compression Waveforms," IEEE Transactions on Aerospace and Electronic Systems, July 1983, pp. 521-531.
[6] M. Webster et al., "Intersil/Lucent TGb Compromise CCK (11 Mbps) Proposal," doc. IEEE P802.11-98/246, July 1998.
[7] A. F. Molisch, Wireless Communications, IEEE Press, 2005, 668 p., ISBN 9780470848883.

Fig. 3: IEEE 802.11b 11 Mbps CCK generation.


Cognitive Radio and Management of Spectrum


Prof. Rinkoo Bhatia, Associate Professor (CTM), ra.bhatia.et@itmuniverse.in
Narendra Singh Thakur Khaira, M.Tech Scholar (CTM), thakurnarendra2009@gmail.com
Prateek Bhadauria, M.Tech Scholar (CTM), bhadauria.prateek@gmail.com
Nishant Dev, M.Tech Scholar, nishantkhaira@gmail.com
Institute of Technology and Management, Gwalior

Abstract

In recent years, demand for wireless communication services has grown far beyond earlier predictions, raising serious concerns about future radio spectrum shortages. Current spectrum management policy is characterised by static spectrum allocation, where radio spectrum is allocated on a long-term basis for large geographical regions on an exclusive basis. This is an effective way to prevent interference, but it leads to highly inefficient use of the radio spectrum. This paper investigates possibilities of a policy change towards dynamic spectrum access. Cognitive radio technology is proposed as a key technology enabling this policy change while ensuring compatibility with legacy wireless systems. Based on a hierarchical overlay model of co-existence between primary licensed users and secondary cognitive users, interference avoidance constraints are presented.

1. INTRODUCTION

Current wireless networks are characterized by a static spectrum allocation policy, where governmental agencies assign wireless spectrum to license holders on a long-term basis for large geographical regions. Recently, because of the increase in spectrum demand, this policy faces spectrum scarcity in particular spectrum bands. In contrast, a large portion of the assigned spectrum is used sporadically, leading to underutilization of a significant amount of spectrum. Hence, dynamic spectrum access techniques are used to solve these spectrum inefficiency problems. The key enabling technology of dynamic spectrum access techniques is cognitive radio (CR) technology, which provides the capability to share the wireless channel with licensed users in an opportunistic manner. CR networks are envisioned to provide high bandwidth to mobile users via heterogeneous wireless architectures and dynamic spectrum access techniques. This goal can be realized only through dynamic and efficient spectrum management techniques. CR networks, however, impose unique challenges due to the high fluctuation in the available spectrum, as well as the diverse quality of service (QoS) requirements of various applications. In order to address these challenges, each CR user in the CR network must:
Determine which portions of the spectrum are available.
Select the best available channel.
Coordinate access to this channel with other users.
Vacate the channel when a licensed user is detected.

2. COGNITIVE RADIO TECHNOLOGY

Cognitive radio techniques provide the capability to share the spectrum in an opportunistic manner. CR is defined as a radio that can change its transmitter parameters based on interaction with its environment. From this definition, two main characteristics of cognitive radio can be defined.

2.1 Cognitive Capability: Through real-time interaction with the radio environment, the portions of the spectrum that are unused at a specific time or location can be identified. CR enables the usage of this temporally unused spectrum, referred to as a spectrum hole or white space. Consequently, the best spectrum can be selected, shared with other users, and exploited without interference to the licensed user.

Figure 2. Spectrum hole concept

2.2 Reconfigurability: A CR can be programmed to transmit and receive on a variety of frequencies and to use different access technologies supported by its hardware. Through this capability, the best spectrum band and the most appropriate operating parameters can be selected and reconfigured.

These capabilities can be realized through spectrum management functions that address four main challenges: spectrum sensing, spectrum decision, spectrum sharing, and spectrum mobility.

In order to provide these capabilities, CR requires a novel radio frequency (RF) transceiver architecture. The main components of a CR transceiver are the RF front-end and the baseband processing unit. In the RF front-end the received signal is amplified, mixed, and analog-to-digital (A/D) converted; in the baseband processing unit the signal is modulated/demodulated. Each component can be reconfigured via a control bus to adapt to the time-varying RF environment.

Figure 1. Cognitive radio transceiver architecture

The novel characteristic of the CR transceiver is the wideband RF front-end, which is capable of simultaneous sensing over a wide frequency range. This functionality is related mainly to RF hardware technologies such as the wideband antenna, power amplifier, and adaptive filter. The RF hardware of a CR should be capable of being tuned to any part of a large range of spectrum. However, because the CR transceiver receives signals from various transmitters operating at different power levels, bandwidths, and locations, the RF front-end should have the capability to detect a weak signal over a large dynamic range.

3. COGNITIVE RADIO NETWORK ARCHITECTURE


A comprehensive description of the CR network architecture is essential for the development of communication protocols that address the dynamic spectrum challenges.
3.1 NETWORK COMPONENTS

The components of the CR network architecture can be classified into two groups: the primary network and the CR network. The primary network (or licensed network) is an existing network in which the primary users have a license to operate in a certain spectrum band. If a primary network has an infrastructure, primary user activities are controlled through primary base stations. Due to their priority in spectrum access, the operations of primary users should not be affected by unlicensed users. The CR network (also called the dynamic spectrum access network, secondary network, or unlicensed network) does not have a license to operate in the desired band; hence, additional functionality is required for CR users to share the licensed spectrum band. CR networks can also be equipped with CR base stations that provide single-hop connections to CR users. Finally, CR networks may include spectrum brokers that play a role in distributing spectrum resources among different CR networks.
3.2 SPECTRUM HETEROGENEITY

CR users are capable of accessing both the licensed portions of the spectrum used by primary users and the unlicensed portions of the spectrum through wideband access technology. Accordingly, CR network operation can be classified as licensed band operation or unlicensed band operation.
Licensed band operation: The licensed band is primarily used by the primary network, so CR networks focus mainly on the detection of primary users in this case. If primary users appear in the spectrum band occupied by CR users, the CR users should vacate that spectrum band and move to available spectrum immediately.
Unlicensed band operation: In the absence of primary users, CR users have the same right to access the spectrum.

3.3 NETWORK HETEROGENEITY

CR users have the opportunity to perform three different access types:
CR network access: CR users can access their own CR base station, on both licensed and unlicensed spectrum bands. Because all interactions occur inside the CR network, their spectrum sharing policy can be independent of that of the primary network.
CR ad hoc access: CR users can communicate with other CR users through an ad hoc connection on both licensed and unlicensed spectrum bands.
Primary network access: CR users can also access the primary base station through the licensed band. Unlike the other access types, this requires an adaptive medium access control (MAC) protocol, which enables roaming over multiple primary networks with different access technologies.

4. SPECTRUM MANAGEMENT
CR networks impose unique challenges due to their coexistence with primary networks as well as diverse QoS requirements. Thus, new spectrum management functions are required for CR networks, with the following critical design challenges:
Interference avoidance: CR networks should avoid interference with primary networks.
QoS awareness: To decide on an appropriate spectrum band, CR networks should support QoS-aware communication, considering the dynamic and heterogeneous spectrum environment.
Seamless communication: CR networks should provide seamless communication regardless of the appearance of primary users.
To address these challenges, the spectrum management process consists of four major steps:
Spectrum sensing: A CR user can allocate only an unused portion of the spectrum. Therefore, a CR user should monitor the available spectrum bands, capture their information, and then detect spectrum holes.
Spectrum decision: Based on the spectrum availability, CR users can allocate a channel. This allocation not only depends on spectrum availability, but is also determined based on internal (and possibly external) policies.
Spectrum sharing: Because there may be multiple CR users trying to access the spectrum, CR network access should be coordinated to prevent multiple users colliding in overlapping portions of the spectrum.
Spectrum mobility: CR users are regarded as visitors to the spectrum. Hence, if the specific portion of the spectrum in use is required by a primary user, the communication must be continued in another vacant portion of the spectrum.
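As a toy illustration of the spectrum sensing step (not part of the original paper; the threshold factor, sample count and noise calibration below are illustrative assumptions), a basic energy detector can be sketched in Python as follows:

import numpy as np

def looks_vacant(samples, noise_power, threshold_factor=2.0):
    # Energy detector: average power of the candidate band compared with a
    # margin above the calibrated noise floor (threshold_factor is illustrative).
    energy = np.mean(np.abs(samples) ** 2)
    return energy < threshold_factor * noise_power

rng = np.random.default_rng(0)
noise = (rng.standard_normal(1024) + 1j * rng.standard_normal(1024)) / np.sqrt(2)
primary = 3 * np.exp(2j * np.pi * 0.1 * np.arange(1024))
print(looks_vacant(noise, noise_power=1.0))             # True  -> spectrum hole
print(looks_vacant(noise + primary, noise_power=1.0))   # False -> primary user present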

5. CONCLUSION
Today's spectrum management is still based on the same principles as were set out at the time of the crystal radio. This results in highly ineffective use of spectrum. New, innovative technologies such as cognitive radio systems offer a huge potential to increase spectrum efficiency, thereby facilitating dynamic spectrum access. To make dynamic spectrum access possible, RF spectrum regulations will need to be adapted. Cognitive radio networks provide high bandwidth to mobile users via heterogeneous wireless architectures and dynamic spectrum access techniques. Spectrum management functions address the challenges of realizing this new network paradigm: spectrum sensing, spectrum decision, spectrum sharing, and spectrum mobility. Spectrum management nevertheless presents a dilemma with no optimal solution: science, technology and even markets all have significant drawbacks.

References:
[1] I. F. Akyildiz, W.-Y. Lee, M. C. Vuran, and S. Mohanty (Georgia Institute of Technology), IEEE Communications Magazine, April 2008.
[2] I. F. Akyildiz et al., "NeXt Generation/Dynamic Spectrum Access/Cognitive Radio Wireless Networks: A Survey," Computer Networks Journal, vol. 50, Sept. 2006.
[3] S. Haykin, "Cognitive Radio: Brain-Empowered Wireless Communications," IEEE JSAC, vol. 23, no. 2, Feb. 2005.
[4] F. K. Jondral, "Software-Defined Radio: Basics and Evolution to Cognitive Radio," EURASIP Journal on Wireless Communications and Networking, 2005.


Vulnerabilities in WEP Security and Their Countermeasures


Akhilesh Arora
DKES-School of Computer Science, Lodhi Estate, New Delhi, India
akhileshgee@yahoo.com

Abstract - This paper reveals vulnerabilities and weaknesses of the WEP protocol. IEEE 802.11-based WLANs have become more prevalent and are widely deployed in many places. Wireless networks offer handiness, mobility, and can even be less expensive to put into practice than wired networks in many cases. The rapid growth of wireless networks has driven both industry and the relevant standards body to respond to the flaws and security problems of WLANs, which is very important for applications hosting valuable information, as WLANs broadcast radio-frequency data for the client stations to receive. In this paper the authentication and the algorithm used in WEP at the transmitter and receiver side, and the main security flaws of WEP as used in Wi-Fi, are presented. It also discusses practical attacks on the wireless network in order to make it more secure. Finally, it explains improvements and solutions to enhance the level of security for WLANs using WEP.

Index Terms - IEEE 802.11, Countermeasures, Vulnerabilities, WEP, WLAN, Wi-Fi.

I. INTRODUCTION

Wi-Fi (Wireless Fidelity) is one of today's leading wireless technologies, with Wi-Fi support being integrated into more and more devices: laptops, PDAs and mobile phones, with which one can stay online anywhere a wireless signal is available. The basic theory behind wireless technology is that signals can be carried by electromagnetic waves that are then transmitted to a signal receiver. The growing popularity of wireless local area networks (WLANs) over the past few years has led many enterprises to realize the inherent security issues, which often go unnoticed [2]. Wireless LANs are open to hackers who are trying to access sensitive information or disrupt the operation of the network. To avoid such attacks, a security algorithm such as Wired Equivalent Privacy (WEP) has been adopted. In a wireless network, transmitted messages are broadcast over radio and are hence susceptible to eavesdropping. The goal of WEP was to raise the level of security for WEP-enabled wireless devices. Data protected by WEP is encrypted to provide:

Confidentiality and data privacy: it prevents casual eavesdropping.
Access control: it protects access to a wireless network infrastructure and has a feature for discarding all packets that are not properly encrypted using WEP.
Data integrity: it prevents tampering with transmitted messages; the integrity checksum field is included for this purpose.

Unfortunately, several serious security flaws have been identified in the WEP mechanism after it went into operation. This paper focuses on the working of WEP and the security issues of the WEP algorithm, and proposes solutions and improvements to the same.

II. WEP AUTHENTICATION

WEP uses two ways to authenticate Wi-Fi users:


1) Open authentication: This authentication mechanism is used to provide quick and easy access for Wi-Fi users. AP operating in this mode accepts an authentication request and responds with an authentication success message. The disadvantage of this mechanism is that it allows any wireless station to access the network.

Fig 1: Open Authentication

2) Shared key authentication (WEP authentication): This is a cryptographic technique used to ensure that only legitimate, authenticated users access the AP, i.e. the supplicant device knows a secret key shared between them. In the WEP authentication process, the wireless station first sends an authentication request, then the AP responds with a challenge text (an arbitrary 128-bit number). The wireless station encrypts the challenge using its WEP key and sends the result back to the AP. The AP decrypts the response using its own WEP key and compares the result with the challenge that was sent to the wireless station. If the decrypted response matches the challenge, access is granted; otherwise access is denied.

Fig 2: WEP Authentication

III. WEP ALGORITHM

WEP (Wired Equivalent Privacy) was the default encryption protocol introduced in the first IEEE 802.11 standard back in 1999. WEP was designed to provide the security of a wired LAN by encryption through use of the RC4 algorithm [3].

Fig 3: WEP encryption

1. At the transmitting end: There are two processes in WEP encryption. The first computes a 32-bit integrity check value (ICV) using the CRC-32 algorithm over the message plaintext, to protect against unauthorized data modification; the second encrypts the plaintext. For the second process, the secret key of 40 or 104 bits is concatenated with a 24-bit initialisation vector (IV), resulting in a 64-bit total key size. WEP inputs the resulting key, the so-called seed, into the PRNG, which yields a key sequence equal in length to the plaintext plus the ICV. The resulting sequence is used to encrypt the expanded plaintext by a bitwise XOR. The final encrypted message is made by attaching the IV in front of the ciphertext [1]. The initialisation vector is the key to WEP security, so to maintain a decent level of security and minimise disclosure, the IV should be incremented for each packet so that subsequent packets are encrypted with different keys. For a 128-bit key, the only difference is that the secret key size becomes 104 bits while the IV remains 24 bits. The encrypted message C is therefore determined using the following formula:

C = [M || ICV(M)] + [RC4(K || IV)]

where || is the concatenation operator and + is the XOR operator.
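To make this formula concrete, here is a minimal Python sketch of WEP-style encapsulation (a textbook RC4 plus CRC-32; the helper names, key sizes, IV value and ICV byte order are illustrative assumptions, and this is of course not a secure construction):

import zlib

def rc4_keystream(key, length):
    # Textbook RC4: key scheduling followed by the pseudo-random generation loop.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(length):
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) & 0xFF])
    return bytes(out)

def wep_encrypt(plaintext, secret_key, iv):
    # ICV = CRC-32 over the plaintext (byte order chosen for illustration).
    icv = zlib.crc32(plaintext).to_bytes(4, "little")
    expanded = plaintext + icv
    # Per-packet RC4 seed: the 24-bit IV is prepended to the 40/104-bit secret key.
    keystream = rc4_keystream(iv + secret_key, len(expanded))
    cipher = bytes(p ^ k for p, k in zip(expanded, keystream))
    return iv + cipher      # the IV is sent in the clear, in front of the ciphertext

# Illustrative call with a made-up 40-bit key and 24-bit IV.
frame = wep_encrypt(b"hello wlan", b"\x01\x02\x03\x04\x05", b"\xaa\xbb\xcc")
print(frame.hex())

Decryption is simply the same XOR with the keystream regenerated from the received IV and the shared key, followed by recomputing the CRC-32 and comparing it with the received ICV, as described next.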

2. At the receiving end:

Fig 4: WEP decryption

For decryption at the receiver's side, the pre-shared key and the IV are concatenated to form the secret key sequence. The ciphertext and this secret key are fed into the RC4 algorithm, and a plaintext plus ICV come out as a result. The plaintext is passed to the integrity algorithm to compute a new ICV, denoted ICV'. By running the CRC-32 algorithm on the recovered plaintext and comparing the output ICV' with the received ICV, the decryption can be verified [3]. If ICV' is not equal to ICV, there is an error in the received message and an indication is sent back to the sending station informing it of the error.

IV. SECURITY FLAWS IN WEP

As the key size increases, the security of a cryptographic technique increases; for key sizes greater than 80 bits, brute force is extremely difficult, yet the standard key in use today is 64 bits. The WEP protocol was not created by experts in security or cryptography, so it quickly proved vulnerable to various issues [1]. Determined hackers can crack WEP keys in a busy network within a relatively short period of time.

1. The IV is too short and can be reused frequently
IVs are too short (24 bits: fewer than 5,000 packets are required for a 50% chance of collision) and IV reuse is allowed (there is no protection against message replay), hence WEP is vulnerable. Subsequent packets end up encrypted with the same, repeated keys. To maintain a decent level of security and minimize disclosure, the IV should be incremented for each packet so that subsequent packets are encrypted with different keys, but a 24-bit IV is not long enough to ensure this on a busy network. A 24-bit IV provides only 16,777,216 different RC4 cipher streams for a given WEP key, for any key size; such a small space of IVs guarantees reuse of the same keystream. The RC4 cipher stream is XOR-ed with the original packet to give the encrypted packet that is transmitted, with the IV attached to each packet. If a hacker collects enough frames based on the same IV, the individual can determine the shared values among them, i.e. the keystream or the shared secret key, because XORing two ciphertexts that use the same keystream causes the keystream to cancel out, leaving the XOR of the two plaintexts. Encrypted plaintext can thus be recovered through various techniques if packets with the same IV are discovered. Worse, the reuse of a single key by many users also makes the attacks more practical, since it increases the chance of IV collision.

2. Lack of key management and no built-in method of updating keys
In most wireless networks that use shared key authentication there is one single WEP key shared between every node on the network, i.e. access points and wireless stations have the same WEP key [3]. Since synchronizing a change of keys is tedious and difficult, network administrators must personally visit each wireless device in use and manually enter the appropriate WEP key. This may be acceptable at the installation stage of a WLAN or when a new client joins the network, but any time the key becomes compromised or there is a loss of security, the key must be changed. This may not be a huge issue in a small organization with only a few users, but it can be impractical in large corporations, which typically have hundreds of users. As a consequence, potentially hundreds of users and devices could be using the same key for long periods of time. All wireless network traffic from all users will be encrypted using the same key, which makes it much easier for someone listening to the traffic to crack the key, as all packets are transmitted using the same key. This practice therefore impacts security [1].

3. RC4 algorithm weaknesses
The initialisation vector (IV), a 3-byte random number generated by the computer, is combined with a key chosen by the user to generate the final key for WEP encryption/decryption in the RC4 algorithm. The IV is transmitted with the encrypted data in the packet without any protection, so that the receiving end (which has a copy of the same secret key) knows how to decrypt the traffic. This mode of operation makes stream ciphers vulnerable to several attacks. When the wireless network uses the same user-chosen key and a duplicate IV on multiple packets, an attacker knows that all packets with the same IV are encrypted with the same key, and can build a dictionary based on the packets collected. Knowing that the RC4 cryptography system uses the XOR operation to encrypt the plaintext (user data) with the key, the attacker can find the possible values of the packets, because the XOR of two different plaintext values is the same as the XOR of the two ciphertext values encrypted with the same key. If the attacker can guess one of the plaintext packets, they can decrypt the other packet encrypted with the same key [6]. A weakness in the key scheduling of the RC4 algorithm used in WEP can permit an attacker to collect enough packets with IVs that match certain patterns to recover the user-chosen key from the IVs. To avoid encrypting two ciphertexts with the same keystream, an initialization vector (IV) is used to augment the shared secret key and produce a different RC4 key for each packet. Generally, to get enough "interesting" IVs to recover the key, millions of packets have to be sniffed, so it could take days, if not weeks, to crack a moderately used wireless network. Several tools are available to do the sniffing and decoding; AirSnort is a famous one: it runs on Linux and tries to break the key once enough useful packets have been collected.

4. Bit manipulation
WEP does not protect the integrity of the encrypted data. The RC4 cryptography system performs the XOR operation bit by bit, making WEP-protected packets vulnerable to bit-manipulation attacks. Such an attack requires modification of only a single bit of the traffic to disrupt the communication or cause other problems [7].

5. Easy forging of authentication messages
There are currently two ways to authenticate users before they can establish an association with the wireless network: Open System and Shared Key authentication. Open System authentication usually means you only need to provide the SSID or use the correct WEP key for the AP. Shared Key authentication involves demonstrating knowledge of the shared WEP key by encrypting a challenge. The insecurity here is that the challenge is transmitted in clear text to the STA, so if someone captures both challenge and response, they can figure out the keystream used to encrypt it and use that stream to encrypt any challenge they receive in the future [7]. So by monitoring a successful authentication, the attacker can later forge an authentication. The only advantage of Shared Key authentication is that it reduces the ability of an attacker to create a denial-of-service attack by sending garbage packets (encrypted with the wrong WEP key) into the network. To handle the task of properly authenticating wireless users, turn off Shared Key authentication and depend on other authentication protocols, such as 802.1x.

6. Lack of proper integrity (the CRC-32 algorithm is not cryptographically secure)
The integrity check stage also suffers from a serious weakness due to the CRC-32 algorithm used for this task. CRC-32 is commonly used for error detection, but was never considered cryptographically secure, due to its linearity: it is possible to compute the bit difference of two CRCs based on the bit difference of the messages over which they are taken. In other words, flipping bit n in the message results in a deterministic set of bits in the CRC that must be flipped to produce a correct checksum on the modified message [1]. Because flipped bits carry through after an RC4 decryption, this allows an attacker to flip arbitrary bits in an encrypted message and correctly adjust the checksum so that the resulting message appears valid.
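This linearity weakness can be demonstrated in a few lines. In the sketch below (illustrative only; a random byte string stands in for the RC4 keystream, the function names are hypothetical, and the ICV byte order is an assumption), an attacker who never sees the key flips chosen plaintext bits in a captured frame and repairs the encrypted ICV so that the modified frame still passes the CRC-32 check:

import os, zlib

def encapsulate(plaintext, keystream):
    # Sender side: append the CRC-32 ICV, then XOR with the (stand-in) keystream.
    icv = zlib.crc32(plaintext).to_bytes(4, "little")
    return bytes(x ^ k for x, k in zip(plaintext + icv, keystream))

def bit_flip(ciphertext, delta):
    # Attacker side: XOR the chosen plaintext difference into the data bytes and
    # the matching CRC difference into the encrypted ICV (CRC-32 is linear).
    zeros = bytes(len(delta))
    icv_delta = (zlib.crc32(delta) ^ zlib.crc32(zeros)).to_bytes(4, "little")
    return bytes(c ^ d for c, d in zip(ciphertext, delta + icv_delta))

def icv_ok(ciphertext, keystream):
    # Receiver side: strip the keystream and verify the CRC-32.
    data = bytes(c ^ k for c, k in zip(ciphertext, keystream))
    body, icv = data[:-4], data[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == icv

keystream = os.urandom(14 + 4)                  # stand-in for RC4(IV || key)
frame = encapsulate(b"PAY 010 TO BOB", keystream)
delta = bytearray(14)
delta[4:7] = bytes(a ^ b for a, b in zip(b"010", b"999"))
forged = bit_flip(frame, bytes(delta))
print(icv_ok(forged, keystream))                # True: tampered frame still verifies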

V. WEP KEY CRACKING USING AIRCRACK


Practical WEP cracking can easily be demonstrated using tools such as Aircrack [4] (created by the French security researcher Christophe Devine). Aircrack contains three main utilities, used in the three attack phases required to recover the key:
airodump: a wireless sniffing tool used to discover WEP-enabled networks;
aireplay: an injection tool used to increase traffic;
aircrack: the WEP key cracker, which uses the collected unique-IV packets.
Currently aireplay only supports injection on specific wireless chipsets, and support for injection in monitor mode requires the latest patched drivers. Monitor mode is the equivalent of promiscuous mode in wired networks, preventing the rejection of packets not intended for the monitoring host (which is usually done in the physical layer of the OSI stack) and thus allowing all packets to be captured. The main goal of the attack is to generate traffic in order to capture unique IVs used between a legitimate client and an access point. Some encrypted data is easily recognizable because it has a fixed length, a fixed destination address, etc. This is the case with ARP request packets, which are sent to the broadcast address (FF:FF:FF:FF:FF:FF) and have a fixed length of 68 octets. ARP requests can be replayed to generate new ARP responses from a legitimate host, resulting in the same wireless messages being encrypted with new IVs. In this example, 00:13:10:1F:9A:72 is the MAC address of the access point (BSSID) on channel 1 with the SSID hakin9demo, and 00:09:5B:EB:C5:2B is the MAC address of a wireless client. Executing the sniffing commands requires root privileges. We proceed in the following way:

1. Activating monitor mode
The first step is to activate monitor mode on our wireless card, so we can capture all the traffic, using the command:
# airmon.sh start ath0

2. Discovering nearby networks and their clients
The next step is to discover nearby networks and their clients by scanning all 14 channels that Wi-Fi networks can use:
Syntax: # airodump <interface> <output prefix> [channel] [IVs flag]
Interface is the wireless interface to use (required). Output prefix is the filename it will prepend (required). Channel is the specific channel to scan; leave it blank or use 0 to channel-hop. The IVs flag is either 0 or 1, depending on whether you want all packets logged or just the IVs.
# airodump ath0 wep-crk 0
Once the target network has been located, capture should be started on the correct channel to avoid missing packets while scanning other channels.

3. ARP injection
Next, we can use the previously gathered information to inject traffic using aireplay. Injection will begin when a captured ARP request associated with the targeted BSSID appears on the wireless network:
Syntax: # aireplay -3 -b <AP MAC Address> -h <Client MAC Address> ath0
# aireplay -3 -b 00:13:10:1F:9A:72 -h 00:0C:F1:19:77:5C -x 600 ath0
The -3 option specifies the type of attack (3 = ARP replay).

4. Using the collected IVs to recover the key
Finally, aircrack is used to recover the WEP key. Using the pcap file makes it possible to launch this final step while airodump is still capturing data:
Syntax: # aircrack [options] <input pcap file>
# aircrack -x -0 wep-crk.cap

Other types of Aircrack attacks

Aircrack also makes it possible to conduct other interesting types of attack. Let's have a look at some of them.

Attack 1: De-authentication
This attack can be used to recover a hidden SSID (i.e. one that isn't broadcast), capture a WPA 4-way handshake or force a denial of service. The aim of the attack is to force the client to re-authenticate, which, coupled with the lack of authentication for control frames (used for authentication, association etc.), makes it possible for the attacker to spoof MAC addresses. A wireless client can be de-authenticated using the following command, causing de-authentication packets to be sent from the BSSID to the client MAC by spoofing the BSSID:
# aireplay -0 5 -a 00:13:10:1F:9A:72 -c 00:0C:F1:19:77:5C ath0
Mass de-authentication is also possible (though not always reliable), involving the attacker continuously spoofing the BSSID and resending the de-authentication packet to the broadcast address:
# aireplay -0 0 -a 00:13:10:1F:9A:72 ath0

Attack 2: Decrypting arbitrary WEP data packets without knowing the key
This attack is based on the KoreK proof-of-concept tool called chopchop, which can decrypt WEP-encrypted packets without knowledge of the key. The integrity check implemented in the WEP protocol allows an attacker to modify both an encrypted packet and its corresponding CRC.


Moreover, the use of the XOR operator in the WEP protocol means that a selected byte in the encrypted message always depends on the same byte of the plaintext message. Chopping off the last byte of the encrypted message corrupts it, but also makes it possible to guess the value of the corresponding plaintext byte and correct the encrypted message accordingly. If the corrected packet is then re-injected into the network, it will be dropped by the access point if the guess was incorrect (in which case a new guess has to be made), but for a correct guess it will be relayed as usual. Repeating the attack for all message bytes makes it possible to decrypt a WEP packet and recover the keystream. Remember that incrementing the IV is not mandatory in the WEP protocol, so it is possible to reuse this keystream to spoof subsequent packets (reusing the same IV). The wireless card must be switched to monitor mode on the right channel. The attack must be launched against a legitimate client (still 00:0C:F1:19:77:5C in our case), and aireplay will prompt the attacker to accept each encrypted packet:
# aireplay -4 -h 00:0C:F1:19:77:5C ath0

Reading a pcap file from the attack:
# tcpdump -s 0 -n -e -r replay_dec-0916-114019.cap
Two pcap files are created: one for the unencrypted packet and another for its related keystream. The resulting file can be made human-readable.

Replaying a forged packet:
# aireplay -2 -r forge-arp.cap ath0
Finally, aireplay is used to replay this packet. This method is less automated than Aircrack's own ARP request spoofing (the -1 option), but it is more scalable: the attacker can use the discovered keystream to forge any packet that is no longer than the keystream (otherwise the keystream has to be expanded).

Attack 3: Fake authentication
The WEP key cracking method described earlier and the decryption of WEP packets without knowing the key (Attacks 1 and 3) require a legitimate client associated with the access point, to ensure that the access point does not discard packets due to a non-associated destination address. If open authentication is used, any client can be authenticated and associated with the access point, but the access point will drop any packets not encrypted with the correct WEP key. Aireplay can fake the authentication:
Syntax: # aireplay -1 30 -e '<ESSID>' -a <BSSID> -h <Fake MAC> ath0
# aireplay -1 0 -e hakin9demo -a 00:13:10:1F:9A:72 -h 0:1:2:3:4:5 ath0
In this example, aireplay is used to fake an authentication and association request for the SSID hakin9demo (BSSID: 00:13:10:1F:9A:72) with the spoofed MAC address 0:1:2:3:4:5. Some access points require clients to re-associate every 30 seconds; this behaviour can be mimicked in aireplay by replacing the second option (0) with 30.

VI. SOME ANTICIPATED PROBLEMS

There are several problems that can come up and make the above fail, or work very slowly [3].
No traffic: No traffic is being passed, therefore you cannot capture any IVs. In this case we need to inject some special packets to trick the AP into broadcasting.
MAC address filtering: The AP only responds to connected clients, probably because MAC address filtering is on. In this case, use the airodump screen to find the MAC address of an authenticated user and change our MAC to theirs and continue, or use the -m option to make aircrack filter packets by MAC address, e.g. -m 00:12:5B:4C:23:27.
Can't crack even with lots of IVs: Some of the statistical attacks can create false positives and lead you in the wrong direction. In this case, try using -k N (where N = 1..17) or -y to vary the attack method, or increase the fudge factor. By default it is 2; specifying -f N (where N >= 2) will increase the chances of a crack, but takes much longer. Doubling the previous fudge factor is a good option.


VII. IMPROVEMENTS IN WEP


WEP doesn't protect the data well enough. So to configure the WEP securely, follow the suggestions: Use the highest security available: If your devices support 128-bit WEP, then use it. It is extremely hard to brute-force a 128-bit WEP key. If you cannot dynamically change WEP keys, then set the 128-bit policy and change the key periodically [1]. The length of this period depends on how busy your wireless network traffic will be. Use MIC when available: We also mentioned that the WEP key does not protect the integrity of the packet. For integrity of data, Message Integrity Code (MIC) for TKIP is computed by a new algorithm namely Michael [9]. Message Integrity Code (MIC) is computed to detect errors in the data contents, either due to transfer errors or due to purposeful alterations. A 64bit MIC is added to the Data and the ICV. Currently IEEE is working on 802.11i both to fix the problem in WEP and to implement 802.11x and Message Integrity Checksum (MIC) for data confidentiality and integrity. MIC will generate checksums for the encrypted data to ensure the integrity of the data where ICV is CRC of Data and MIC. Use improved data encryption - TKIP: Temporal Key Integrity Protocol (TKIP) uses a hashing algorithm and, by adding an integritychecking feature using 128 bits as secret key, is an alternative to WEP that fixes all the security problems and does not require new hardware. Like WEP, TKIP uses the RC4 stream cipher as the encryption and decryption processes and all involved parties must share the same secret key. This secret key must be 128 bits and is called the "Temporal Key" (TK). TKIP also uses an Initialization vector (IV) of 48-bit and uses it as a counter. Even if the TK is shared, all involved parties generate a different RC4 key stream. Since the communication participants perform a 2-phase generation of a unique "Per-Packet Key" (PPK) that is used as the key for the RC4 key stream [1]. TKIP is a TGis response to the need to do something to improve security for equipment that already deployed in 802.11. TGi has proposed TKIP as a mandatory to implement security enhancement for 802.11, and patches implementing it will likely be available for most equipment.TKIP is a suite of algorithms wrapping WEP, to achieve the best

security that can be obtained given the problem design constraints. TKIP adds four new algorithms to WEP: A cryptographic message integrity code, or MIC, called Michael, to defeat forgeries; A new IV sequencing discipline, to remove replay attacks from the attackers arsenal; A per-packet key mixing function, to decorrelate the public IVs from weak keys; and A re-keying mechanism, to provide fresh encryption and integrity keys, undoing the threat of attacks stemming from key reuse. The TKIP re-keying mechanism updates what are called temporal keys, which are consumed by the WEP encryption engine and by the Michael integrity function. Use dynamic WEP keys: Dynamic keys Changes WEP keys dynamically. When using 802.1x, dynamic WEP keys should be used if possible. A dynamic key is generated using the static shared key and a random number. A hash function is used to generate the dynamic key where the static shared key and the random number are the inputs. This uses the same frame size as WEP protocol. But it provides dynamic key for encryption at each authentication session and based on this key, another temporary dynamic key is generated per data frame basis. The authentication mechanism used is a more improved version compared with WEP. Using dynamic WEP keys will provide a higher level of security and will counter some of the known WEP insecurities. To use dynamic WEP an access point has to be purchased that can provide them and use clients that support them [10]. Use 802.1xSecure Authentication through EAP: The IEEE 802.1X authentication protocol (also known as Port- Based Network Access Control) is a framework originally developed for wired networks, providing authentication, authorization and key distribution mechanisms, and implementing access control for users joining the network. . It also manages the WEP key by periodically and automatically sending a new key, to avoid some of the known WEP key vulnerabilities. The data confidentiality will be protected by these dynamic WEP keys. The IEEE 802.1X architecture is made up of three functional entities: The supplicant joining the network,


The authenticator providing access control, The authentication server making authorization decisions [11]. In wireless networks, the access point serves as the authenticator. Each physical port (virtual port in wireless networks) is divided into two logical ports making up the PAE (Port Access Entity). The authentication PAE is always open and allows authentication frames through, while the service PAE is only opened upon successful authentication (i.e. in an authorized state) for a limited time (3600 seconds by default). The decision to allow access is usually made by the third entity, namely the authentication server (which can either be a dedicated Radius server or for example in home networks a simple process running on the access point). The 802.11i standard makes small modifications to IEEE 802.1X for wireless networks to account for the possibility of identity stealing. Message authentication has been incorporated to ensure that both the supplicant and the authenticator calculate their secret keys and enable encryption before accessing the network [1]. The supplicant and the authenticator communicate using an EAP-based protocol. The role of the authenticator is essentially passive it may simply forward all messages to the authentication server. EAP is a framework for the transport of various authentication methods, allowing only a limited number of messages (Request, Response, Success, and Failure), while other intermediate messages are dependent on the selected authentications method: EAP-TLS, EAP-TTLS, PEAP, Kerberos V5, EAPSIM etc. When the whole process is complete, both entities have a secret master key. Communication between the authenticator and the authentication server proceeds using the EAPOL (EAP Over LAN) protocol, used in wireless networks to transport EAP data using higher-layer protocols such as Radius.
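As a minimal sketch of the dynamic WEP key idea described above (hashing the static shared key with a random number to obtain a per-session key), the snippet below is illustrative only: the function name, the choice of SHA-256 and the nonce length are assumptions, and real deployments should rely on 802.1X/EAP key management rather than a bare hash.

import hashlib, os

def derive_dynamic_wep_key(static_key, nonce, length=13):
    # Hash the static shared key together with a random nonce and truncate the
    # digest to a 104-bit (13-byte) WEP key; purely a sketch of the idea above.
    digest = hashlib.sha256(static_key + nonce).digest()
    return digest[:length]

nonce = os.urandom(16)
session_key = derive_dynamic_wep_key(b"static-shared-secret", nonce)
print(session_key.hex())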

VIII. CONCLUSION

Wireless networks such as Wi-Fi, being among the most widespread technologies in the world, are vulnerable to the threat of hacking. It is very important to protect a network from hackers in order to prevent the exploitation of confidential data. In this paper we explained the authentication mechanisms and the algorithm used in WEP at the transmitter and receiver side. We then discussed the major problems in WEP and walked through practical attacks on WEP-encrypted Wi-Fi in order to show how a system can be protected. Finally, we explained improvements and solutions based on stronger security policies and more secure data encryption techniques. The experience with WEP shows that it is difficult to get security right: its flaws derive from the protocol's design.

IX. REFERENCES

[1] S. Zhao and C. A. Shoniregun, "Critical Review of Unsecured WEP," 2007 IEEE Congress on Services (SERVICES 2007).
[2] R. H. A. M. Garcia, "An Analysis of Wireless Security," CCSC: South Central Conference, 2006.
[3] A. Habibi Lashkari, F. Towhidi, and R. S. Hoseini, "Wired Equivalent Privacy (WEP)," ICFCC Kuala Lumpur Conference, 2009.
[4] S. Vinjosh Reddy, K. Rijutha, K. SaiRamani, and Sk. Mohammad Ali, "Wireless Hacking - A WiFi Hack by Cracking WEP," 2010 2nd International Conference on Education Technology and Computer (ICETC), 2010.
[5] Min-kyu Choi, Rosslin John Robles, Chang-hwa Hong, and Tai-hoon Kim, "Wireless Network Security," International Journal of Multimedia and Ubiquitous Engineering, Vol. 3, No. 3, July 2008.
[6] S. R. Fluhrer, I. Mantin, and A. Shamir, "Weaknesses in the Key Scheduling Algorithm of RC4," Selected Areas in Cryptography, 2001, pp. 1-24.
[7] K. Curran and E. Smyth, "Demonstrating the Wired Equivalent Privacy (WEP) Weaknesses Inherent in Wi-Fi Networks," Information Systems Security, Vol. 15, Issue 4, Sep/Oct 2006, pp. 17-38.
[8] T. Karygiannis and L. Owens, "Wireless Network Security: 802.11, Bluetooth and Handheld Devices," 2003. Available from: http://csrc.nist.gov/publications/drafts/draft-sp800-48.pdf
[9] N. Ferguson, "Michael: An Improved MIC for 802.11 WEP," IEEE 802.11-02/020r0, 2002.
[10] M. Sandirigama and R. Idamekorala, "Security Weaknesses of WEP Protocol IEEE 802.11b and Enhancing the Security with Dynamic Keys," TIC-STH 2009, pp. 433-438.
[11] A. Mishra and W. A. Arbaugh, "An Initial Security Analysis of the IEEE 802.1X Standard," UMIACS-TR-2002-10, pp. 1-12.


Impact of MNCs on Entrepreneurship


Ms. Sonia
Lecturer, Department of Management
Jagan Institute of Management & Sciences, Rohini
Sonia_smile27@yahoo.com

Abstract
Multinational corporations (MNCs) are huge industrial organizations having a wide network of branches and subsidiaries spread over a number of countries. There may be agreements among companies of different countries in respect of the division of production, markets, etc. Their operations extend beyond their own countries and cover not only the developed countries but also the less developed countries (LDCs). Many MNCs have annual sales volumes in excess of the entire GNPs of the developing countries in which they operate. MNCs thus have a great impact on the development process of underdeveloped countries and play an important role in their economic development, for example by:
Filling the trade and revenue gap: An inflow of foreign capital can reduce or even remove the deficit in the balance of payments and the revenue gap between targeted governmental tax revenues and locally raised taxes. By taxing MNC profits, LDC governments are able to mobilize public financial resources for development projects.
Filling the management/technological gap: Multinationals not only provide financial resources but also supply a package of needed resources including management experience, entrepreneurial abilities, and technological skills. These can be transferred to their local counterparts by means of training programmes and the process of learning by doing. Moreover, MNCs bring with them sophisticated technological knowledge about production processes while transferring modern machinery and equipment to capital-poor LDCs, enabling such transfers of knowledge, skills, technology and much more.
This paper also highlights that the focus of research in entrepreneurship has been extended over the last decade. Most remarkably, several studies have recently given greater consideration to entrepreneurial behaviour, the search processes selected by different types of entrepreneur, the differing organizational forms through which entrepreneurial behaviour is expressed, and the importance of the external environment. The focus of future research should increasingly be to gather more information on wealth creation and the behaviour of entrepreneurs and the patterns exhibited by entrepreneurs and their organizations in a variety of industrial, regional, national and cultural settings.

Key words: innovation, opportunity, traditionally, liberalisation, economic policy reforms

Introduction
"An entrepreneur is one who always searches for change, responds to it and exploits it as an opportunity. Innovation is the basic tool of entrepreneurs, the means by which they exploit change as an opportunity for a different business or service." - Peter Drucker
To put it very simply, an entrepreneur is someone who perceives an opportunity, organizes the resources needed for exploiting that opportunity, and exploits it. Laptops, mobile phones, motorbikes, credit cards, courier services, and ready-to-eat foods are all examples of entrepreneurial ideas that got converted into products or services.

ENTREPRENEURS
"Systematic innovation consists in the purposeful and organized search for changes, and in the systematic analysis of the opportunities such changes might offer for economic and social innovation." - Peter Drucker
Entrepreneurship is the practice of starting new organizations or revitalizing mature organizations, particularly new businesses, generally in response to identified opportunities. Entrepreneurship is a creative human act involving the mobilization of resources from one level of productive use to a higher level of use. "It is the process by which the individual pursues opportunities without regard to resources currently controlled." Entrepreneurship involves a willingness to take responsibility and the ability to put one's mind to a task and see it through from beginning to finishing point. Entrepreneurs are considered to be a significant determinant of economic development, and new entrepreneurial activities play a critical part in the process of creative innovation, employment, and growth. While India has traditionally been an entrepreneurial country, it fares poorly in various global studies exploring the entrepreneurial and business potential of countries. On the other hand, even on the most traditional basis, our domestic consumption


has the potential to at least double, or treble, from current levels virtually in any sector perhaps, just to catch up with a country like China. Then, there is the entire global opportunity, across various sectors internationally, the "Made in India" tag is now an increasingly respected brand, valued for quality, reliability, and competitiveness. Truly, with economic reforms in the country, and with the virtual removal of all trade barriers, the world is now our market and our opportunity. The search of these opportunities requires a strong spirit of Entrepreneurship.Entrepreneurial activities are largely largley depending on the type of organization that is being started. Entrepreneurs ranges in scale from solo projects (even involving the entrepreneur only part-time) to major undertakings creating many job opportunities. Many "high-profile" entrepreneurial ventures seek venture capital or angel funding in order to raise capital to build the business. Angel investors generally seek returns of 20-30% and more extensive involvement in the business. GROWTH OF ENTREPRENEURSHIP IN INDIA The proper understanding of the growth of Entrepreneurs of any country would evolve within the situation of the economic history of the particular country becomes the subject matter of this section. The growth of Entrepreneurs in India is, therefore, presented into two sections viz. Entrepreneurs in pre-Independence India Entrepreneurs in post-Independence India ENTREPRENEURS IN PRE-INDEPENDENCE ERA The evolution of the Indian Entrepreneurs can be traced back to even as early as Rigveda, when metal handicrafts were prevalent in the society. This would bring the point home that handicrafts Entrepreneurs in India were as old as the human civilization itself, Before India came in contact with the West, people were organized in a particular type of economic and social system of the village community system. Evidently, organized industrial activity was observable among the Indian artisans in a few recognizable products in cities like Varanasi, Allahabad, Gaya, and Puri&Mirzapur which were established on their river basins. Very possibly this was because the rivers served as a means of transportation facilities. Unfortunately, so much of the impressive Indian handicraft industry, which was basically a cottage and small sector, declined at the end of the 18th century for various reasons. These may be listed as1. Disappearance of the Indian royal courts which patronized the crafts earlier.

2. The lukewarm attitude of the British colonial


government towards the Indian crafts. 3. Imposition of heavy duties on the imports of the Indian goods into England. 4. Low priced British made goods produced on large scale which reduced the competing capacity of the product of the Indian handicrafts. 5. Development of transport in Indian facilitating the easy access of British product even to far-flung remote part of the country. 6. Changes in the tastes and habits of the Indian, developing a craze for foreign products.and the unwillingness of the Indian craftsmen to adapt to the changing tastes and needs of the people. Some scholars hold the view that manufacturing Entrepreneurs in India emerged as the latent and clear effect of the East India Companys introduction in India. The actual appearance of manufacturing Entrepreneurs can be noticed in the second half of the 19th century. Prior to 1850, in the beginning, the Parsis were the founder manufacturing entrepreneurs in India. Ranchodlalchotalal, a Nagar Brahman, was the first Indian to think of setting up the textile manufacturing in modern factory lines in 1847, but failed. In his second attempt, he succeeded in setting up a textile mill in 1861 at Ahmadabad.Historical evidences also do confirm that after the East India Company lost monopoly in 1813, the European Managing Agency Houses entered in to business, trade and banking. And, these houses markedly influenced eastern India's Industrial scene Reasons for slow growth of entrepreneurshipduring the British period in India. 1. Not given proper protection: The enterprises were not given proper protection by the British Government. 2. Discouragement by the British Government: Only those industries in which the British Government put their own capital were given encouragement. 3. High railway freight charges: The railway freight charges were higher for locations not nearer to the ports. This proved that the transportation of the goods manufactured for the Indian markets were more expensive than goods meant for exports. 4. Very high tariffs: The British imposed very high tariffs on goods made in India . 5. Constantly harassing for getting licenses: Entrepreneurs were constantly harassed for getting licenses and finance to established and run industries. 6. No facilities for technical education: there were almost no facilities for technical education


which alone could strengthen Indian industrial entrepreneurship.
7. Unfair competition from abroad: Indian entrepreneurs faced unfair competition from machine-made goods exported to India from abroad.
8. Lack of transportation and communication facilities: Lack of transportation and communication facilities acted as a stumbling block in the way of industrial growth.
9. No encouragement for the establishment of heavy industries: The British Government did not encourage the establishment of heavy industries like heavy machinery and iron and steel, which are necessary for rapid industrialization.
10. Political confusion: Political confusion and the closing down of large courts discouraged the growth of entrepreneurship.
11. Multi-currency system: The prevalence of a multi-currency system affected the business environment and blocked growth.
In spite of the above problems, the export trade in textiles in the 17th century showed an ascending trend. During this period, the grouping of Indian merchants into joint stock associations for the purpose of managing the supply of textiles to the European companies was highly significant. This helped in exporting huge volumes of textiles to the European markets, leading to favorable terms of trade.

PARTITION OF UNDIVIDED INDIA ON 15th AUGUST 1947
Before we take up our review of entrepreneurial growth in the post-Independence era, it will be in the fitness of things to shed some light on the effects of Partition on India's industrial economy, so as to describe independent India's industrial background. Following are some major effects of the Partition of 15th August 1947 on the Indian industrial economy:
Demographic effects: 77% of the area and 82% of the population remained in India, while the rest went to Pakistan.
Industrial activity: 90% of the total industrial establishments with 93% of industrial workers (jute, iron and steel, and paper industries) remained in India, whereas 10% of the total industrial establishments with 7% of industrial workers (cotton textile, sugar, cement, glass and chemical industries) landed in Pakistan.
Mineral and natural resources: 97% of the total value of minerals, including the major deposits of coal, mica, manganese, iron, etc., remained in India, whereas 3% of the total value of minerals, with the major deposits of

gypsum, rock salt, etc., became part of Pakistan.
Manpower and managerial skill: India was at a loss, as those Muslims who possessed these skills migrated to Pakistan.
Transport facilities: 83% of the total route mileage remained in India, whereas only 7% of the total road mileage went to Pakistan.
Major ports: India lost major ports, which badly affected India's exports.

ENTREPRENEURSHIP DURING THE POST-INDEPENDENCE ERA
After heaving a long sigh of political relief in 1947, the Government of India tried to spell out its priorities and create a scheme for achieving balanced growth. For this purpose the Government came forward with the first Industrial Policy in 1948, which was revised from time to time. The Government, in its various industrial policy statements, identified the responsibility of the State to promote, assist and develop industries in the national interest. It also openly recognized the very important role of the private sector in accelerating industrial development and, for this, enough space was reserved for the private sector. The Government took three important measures in its industrial resolutions: (i) to maintain a proper distribution of economic power between the private and public sectors; (ii) to encourage the tempo of industrialization by spreading entrepreneurship from the existing centers to other cities, towns and villages; and (iii) to disperse the entrepreneurial acumen concentrated in a few dominant communities to a large number of industrially potential people of varied social backgrounds. To achieve these important objectives, the Government focused on the development of small-scale industries in the country. Particularly since the Third Five Year Plan, the Government started providing various incentives and concessions in the form of capital, technical know-how, markets and land to potential entrepreneurs to establish their businesses. Several institutions like the Directorate of Industries, Financial Corporations, Small-Scale Industries Corporations and the Small Industries Service Institute were also established by the Government to help new entrepreneurs in setting up their enterprises. Small-scale units emerged very rapidly in India, witnessing a remarkable increase in number from 121,619 in 1966 to 190,727 in 1970, an increase of about 17,000 units per year during the period under reference. The sector contributes almost 40% (2011) of the gross industrial value added in the Indian economy, and in 2011 the SSI sector plays a major role in India's export performance.


45-50% of Indian exports are contributed by the SSI sector.

Liberalization paved the way for the growth of entrepreneurship in India. "Post-liberalization, entrepreneurship has generally increased in India," Dr Mani told Business Line, adding that knowledge-intensive entrepreneurship in sectors such as IT and biotechnology has also increased since the economic liberalization process started in 1991. The number of new companies formed during the 1980-2006 period points to a possible growth in the number of entrepreneurs. Figures from the Ministry of Corporate Affairs show that from 1980 to 1991 the average number of companies formed each year was 14,379, while from 1992 to 2006 the average number of companies formed per year was 33,835. According to the paper, liberalization itself kick-started the growth of entrepreneurship in India because it presented businesses in the country with new market opportunities. Liberalization also reduced entry barriers for new entrepreneurs, as it dispensed with or reduced regulatory measures such as industrial licensing. Similarly, improved availability of financial support from both official and private sources boosted the growth of entrepreneurship. However, entrepreneurship in India could have grown much faster if the capital market had been strengthened to support the system. Even today, the capital market is not a major source of finance for enterprises, which mostly rely on internal sources of funding or debt. A study of 588 start-ups that participated in a competition conducted recently by the National Entrepreneurship Network revealed that 70 per cent relied on personal savings for initial funding, he pointed out. Government-supported and public-private partnership ventures such as the National Science and Technology Entrepreneurship Development Board, promotion programs, and business incubators in colleges and technology parks have also facilitated the growth of entrepreneurship in India.

CURRENT SCENARIO OF ENTREPRENEURSHIP IN INDIA
According to the Global Entrepreneurship Monitor (2010) report, India ranked highest on necessity-based entrepreneurship (7.5%) and fifth from the bottom on opportunity-based entrepreneurship (3.7%). Surprisingly, it was the more developed countries like France, Belgium, Japan and Israel that ranked lower than India. According to the NSS 62nd round, in rural India almost 50% of all workers are self-employed (57% among males and nearly 62% among females), while the corresponding figures in urban India are 42% for males and 44% for females. The NSSO defines a self-employed person

as one who has worked in a household enterprise as an own-account worker, as an employer, or as a helper.

According to the 5th Economic Census conducted by the Central Statistical Organisation (CSO), there are 41.83 million establishments in the country engaged in economic activities other than crop production and plantation. Five states, viz. Tamil Nadu (10.60%), Maharashtra (10.10%), West Bengal (10.05%), Uttar Pradesh (9.61%) and Andhra Pradesh (9.56%), together account for about 50 per cent of the total establishments in the country. The same five states also have a combined share of about 50 per cent of total employment.

SCOPE OF ENTREPRENEURSHIP DEVELOPMENT IN INDIA
In India there is a lack of quality people in industry, which demands high-quality entrepreneurship development programs throughout the country for the growth of the Indian economy. "India is now truly a land of opportunity" (John Redwood, Economic Competitiveness Policy Group, UK).
The scope for entrepreneurship development in a country like India is great, especially since there is widespread concern that the acceleration in GDP growth in the post-reforms period has not been accompanied by an equal expansion in employment. Results of the 64th round of the National Sample Survey Organisation (NSSO) show that unemployment figures in 2004-05 were as high as 8.9 million, and one million more Indians joined the ranks of the unemployed between 2005-06 and 2007-08. The rising unemployment rate (9.2%, 2008 est.) in India has resulted in growing frustration among the youth. In the 64th round, at the all-India level, the unemployment rate was nearly 8 per cent on the current daily status approach, nearly 4 per cent on the current weekly status approach and 2 per cent on the usual status approach. In addition there is always the problem of underemployment. As a result, increasing entrepreneurial activity in the country is the only solace. Incidentally, both the reports prepared by the Planning Commission to generate employment opportunities for 10 crore people over the next ten years have strongly recommended self-employment as a way out for the teeming unemployed youth. We have all the requisite technical and knowledge base to take up the entrepreneurial challenge; the success of Indian entrepreneurs in Silicon Valley is proof of this. The only thing that is lacking is confidence and mental preparation. We are more of a reactive kind of people. We need to get out of this and


become more proactive. What is more important than the skill and knowledge base is the courage to take the plunge. Our problem is that we do not stretch ourselves. It is encouraging, however, that the current generation of youth does not have hang-ups about the previous legacy and is willing to experiment. These are the people who will bring about entrepreneurship in India. At present, there are various organizations at the national and state levels offering support to entrepreneurs in various ways. The Government of India and various State Governments have been implementing schemes and programs aimed at nurturing entrepreneurship over the last four decades. For example, MCED in Maharashtra provides systematic training, disseminates information and data regarding all aspects of entrepreneurship, and conducts research on entrepreneurship. Then there are various Government-sponsored schemes for budding entrepreneurs.

Hardly a day passes without hearing about the rise of India in the global economy; newspapers, magazines, television shows and the internet keep reporting on it. The reasons are many: a country with the world's second largest population and one of the largest economies has been growing fast for a long time. GDP in India expanded 8.20% in the fourth quarter of 2010 over the same quarter of the previous year, and from 2004 to 2010 India has become one of the most promising and fastest growing economies. This has raised fears: will India lead the world economy, and, through its low cost, will it bid down wages elsewhere? Real GDP growth has averaged 8.6% since FY 2003 and is expected to average 9% a year through 2012. Since the emerging countries have increased their contribution to the world economy, more and more MNCs have started showing greater interest in them, with India at the top due to its large and attractive consumer base. India and China are also called re-emerging countries because they have started regaining their former prominence. Until the 19th century, these two were the world's biggest economies, producing on average 80% of world GDP. Europe's industrial revolution and globalization, however, dipped their contribution to only 40% by 1950. But with annual growth of around 7% over the past 5 years, compared to 2.7% in the rich economies, they have bounced back. According to IMF forecasts, they are expected to grow at an average of 6.8% a year, while developed countries will grow at a mere 2.7% over the next 5 years.

India ranked 48 in the Global Competitiveness Index and 31 in the Business Competitiveness Index 2007-2008 (Exhibits III (a) and III (b)). India has emerged stronger on the global investment radar, well ahead of the US and Russia; it was ranked the second best FDI destination after China in 2010. India's value proposition is based on three major parameters: its low-cost, high-quality scalability model, giving it an edge over other emerging destinations; a quality pool of knowledgeable English speakers; and the ability to focus on core competencies and talent to strengthen and expand existing business offerings. Its market potential and macroeconomic stability are the key drivers of FDI attractiveness. However, compared to its close competitor China, a survey by A.T. Kearney revealed that investors favor China over India for its market size, access to export markets, government incentives, favorable cost structure, infrastructure, and macroeconomic climate. The same investors cite India's highly educated workforce, management talent, rule of law, transparency, cultural affinity, and regulatory environment as more favorable than what China presents. Moreover, they maintain that China leads in manufacturing and assembly, while India leads for IT, business processing, and R&D investments. Respondents mark the system of government, lack of infrastructure and an unclear policy framework as factors that negatively impact MNCs operating in India. Almost 70% of the multinational corporations (MNCs) participating in the first CII-A.T. Kearney MNC Survey 2005 evinced a high likelihood of making additional medium- and long-term investments in India. The investment outlook in the medium term appears not to be dictated by their current performance in India, with most companies indicating a medium to high likelihood of investment irrespective of performance. Further, three out of every four MNCs state that their performance in India has met or exceeded internal targets and expectations. Tailoring products and prices to suit Indian tastes, appointing local leadership and indigenization are, in their experience, key factors for success in India. Survey findings indicate that in comparison with other emerging Asian economies (China, Malaysia, Thailand, and the Philippines), India is perceived to be at par in terms of FDI attractiveness, even though the current performance of MNCs in India compares favorably (i.e. Indian operations are perceived to perform better than those in most other South-East Asian countries). While more than three-quarters of the survey respondents ranked India higher than Malaysia, Thailand, and the Philippines in terms of MNC performance, they were more conservative in their


outlook on India's FDI attractiveness relative to these economies. India's market potential, labour competitiveness and macroeconomic stability were unanimously highlighted as the key drivers of FDI attractiveness. Global investors view India and China as two distinct markets: investors favour China over India for its market size, access to export markets, government incentives, favorable cost structure, infrastructure, and macroeconomic climate, while citing India's highly educated workforce, management talent, rule of law, transparency, cultural affinity, and regulatory environment as more favorable than what China presents. They also maintain that China leads in manufacturing and assembly, while India leads for IT, business processing, and R&D investments.

A key take-away from the study is that while India will attract global investor interest due to the total size of its economy, much more needs to be done to become more investor-friendly and maintain investor interest. Respondents put forth that the system of government, lack of infrastructure, and an ambiguous policy framework are specific challenges that adversely impact MNCs operating in India and influence the perception of India vis-à-vis other emerging economies. There was agreement among the MNC participants that the government needs to rationalize policies (i.e. rationalize the tax structure, reduce trade barriers), invest in infrastructure, both physical and information technology, and accelerate reforms (political reforms to improve stability, privatization and deregulation, labour reforms), which in turn would help in accelerating additional investment. In India, 77% ranked operating costs and 72% ranked employee wages as very important advantages; in China, 64% and 61% did. In China, 76% ranked access to local markets as very important; in India, 64% did. The availability of qualified workers was perceived as a more significant advantage in India than in China, with 60% in India and 43% in China saying this was very important.

MNCs' RISING INTEREST IN INDIA:

With globalization, trade barriers have come down and business giants have spread across the world, with emerging economies as their profitable markets. With flaring global interest in the Indian economy and its huge consumer base, many multinational companies (MNCs) have started foraying into it to capture the maximum market share. Some view India as a high-potential market, while others want to exploit it as a low-cost manufacturing base. In spite of India's huge potential, MNCs have shown a mixed performance: many that were remarkably successful elsewhere have failed or are yet to succeed. The Indian market poses special challenges due to its heterogeneity in terms of economic development, income, religion, cultural mix and tastes. On top of this is the heated competition among local players as well as the leading MNCs. Not all companies have been struggling to understand Indian consumer behavior. Doing business in India is at a turning point; market entry strategies that clicked once, for example, do not always promise success. Success in India will not happen overnight; it requires commitment, management drive and focus on long-term objectives. Proper business models are needed. They are not prescribed, but need to be derived from the mechanisms that enable companies to develop both global management processes (providing global support and technology) and local management processes (driving local autonomy and capability). Critical success factors for MNCs in India are highlighted in Exhibit IV. MNCs need to invest heavily in market research to analyze local preferences and craft their marketing and branding strategies accordingly. Among the various MNCs having subsidiaries in India are Colgate-Palmolive, Procter & Gamble, General Electric, IBM, Intel, Pepsi, Coca-Cola, Microsoft, Oracle, Unilever, etc. Almost 70% of MNCs that participated in the first CII-A.T. Kearney MNC Survey 2005 evinced a high likelihood of making additional medium- to long-term investments in India. Apart from that, three out of every four MNCs have met or


exceeded their internal targets and expectations in India. MNCs in India face a range of challenges; the critical success factors they need are examined here.

The Indian Innovation System after 1991: The macro picture of the Indian economy changed over the 1990s. Today, India's manufacturing sector accounts for approximately 17 per cent of real GDP, 12 per cent of the total workforce and 80% of merchandise exports.

Total manufacturing gross value added showed a trend growth rate of 6.8% between 1980 and 2010 (compared to 12.8% in China and 11.2% in Malaysia). More interestingly, industrial growth during the 1980s and 1990s was roughly the same, and the share of manufacturing in GDP also remained roughly the same between 1990 and 2000 (Nagaraj, 2003). The service sector grew at about the same rate as industry, 7.6%, during 1992-97, but in 1997-2001 services grew at an annual rate of 8.1% compared to the 4.8% growth of industry (Acharya, 2002). The service sector in India is larger than either the agriculture sector or the industrial sector, and it has been growing at least as fast as the industrial sector and faster than the agriculture sector, reinforcing its dominance. FDI inflows into India are believed to be less than 10% of those into China during the same period, and basic goods account for the largest proportion of FDI approvals, at almost 39%.

Government Support for Technological Innovation: The large government scientific agencies in the atomic energy and space programs also have programs to involve industry in developing technologies and products for their programs as well as commercializing spin-offs. A recent and more ambitious effort has been an initiative (the New Millennium Indian Technology Leadership Initiative) to attain a global leadership position in selected niche areas by supporting scientific and technological innovation in those areas. Fiscal incentives for industrial R&D include:
- Both revenue and capital expenditure on R&D are 100% deductible from taxable income under the Income Tax Act.

- A weighted tax deduction of 125% is allowed for sponsored research in approved national laboratories and institutions of higher technical education.
- A weighted tax deduction of 150% is allowed on R&D expenditure by companies in government-approved in-house R&D centers in selected industries.
- A company whose principal objective is research and development is exempt from income tax for ten years from its inception.
- Accelerated depreciation is allowed for investment in plant and machinery made on the basis of indigenous technology.
- Customs and excise duty exemptions apply to capital equipment and consumables required for R&D.
- Excise duty exemption is available for three years on goods designed and developed by a wholly owned Indian company and patented in any two of: India, the United States, Japan and any country of the European Union.

Economic Reforms:
FDI Policy - Most sectors, including manufacturing activities, permit 100% FDI under the automatic route (no prior approval required).
Industrial Licensing - Licensing is limited to only 5 sectors (on security and public health and safety considerations).
Exchange Control - All investments are on a repatriation basis; original investment, profits and dividends can be freely repatriated.
Taxation - Companies incorporated in India are treated as Indian companies for taxation

purposes; India has a Convention on the Avoidance of Double Taxation with 71 countries, including Korea.

Future of India: The Indian economy is projected to be the fastest growing economy over the next 3-5 decades. In dollar terms, the Indian economy will be one of the largest in the world, and India's per capita income in dollar terms is projected to grow about 35 times over the next 47 years (i.e. by 2050). The Indian rupee is likely to appreciate by almost 300% over the next 3-5 decades. India is becoming a research hub for the world.
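To make the weighted-deduction arithmetic in the R&D tax incentives listed earlier in this section concrete, here is a minimal sketch; the expenditure figure and the corporate tax rate are illustrative assumptions, not figures from the paper.

```python
# Worked example of a 150% weighted R&D tax deduction (all figures are illustrative assumptions).
rnd_spend = 10_000_000        # hypothetical in-house R&D expenditure (INR)
weight = 1.5                  # 150% weighted deduction for approved in-house R&D centres
corporate_tax_rate = 0.30     # assumed corporate tax rate

deduction = rnd_spend * weight
extra_deduction = deduction - rnd_spend          # benefit beyond the normal 100% deduction
tax_saved_vs_normal = extra_deduction * corporate_tax_rate
print(f"Deduction claimed: {deduction:,.0f} INR; "
      f"additional tax saved vs. a plain 100% deduction: {tax_saved_vs_normal:,.0f} INR")
```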


[Exhibit: India in 1998 versus India today]


The objective of the paper is to identify a set of themes that best illustrate progress and developments in the focus of entrepreneurship research over the past ten years (see Westhead and Wright (2000) for a critical review of the literature related to selected themes). Following a review of the existing literature surrounding these themes, we identify several research gaps that need to be addressed. Key aspects of the entrepreneurial process and context are highlighted in Figure 1, and the exposition in the paper follows the themes highlighted in the figure. First, we examine the changing focus of entrepreneurship theory, in particular the shift of emphasis towards the examination of behavior. Second, we review studies focusing on different types of entrepreneur. Third, the process of entrepreneurship is examined in terms of (a) studies focusing on opportunity recognition, information search and learning, and (b) resource acquisition and the competitive strategies selected by entrepreneurs. Fourth, the organizational modes selected by entrepreneurs are examined with regard to corporate venturing, management buy-outs and buy-ins, franchising and the inheritance of family firms. Fifth, the external environment for entrepreneurship is discussed. Sixth, the outcomes of entrepreneurial endeavors are analyzed.

CONCLUSION

There are ample opportunities in small businesses in India, and such opportunities will transform India in the coming years. For such a revolution to happen there needs to be support at both the governmental and societal levels. For the government, it is important to realize that the goal of many small business owners will be to remain self-employed. Such people may not need financial assistance, but they will need marketing and legal assistance in order to sustain themselves. Practical and cost-effective programs need to be developed to address their needs, because self-employed people will represent an important segment in economic revitalization. Entrepreneurship development is a key factor in fighting unemployment and poverty and in preparing ourselves for globalization

in order to achieve overall Indian economic progress. Entrepreneurship development should be viewed as a way not only of solving the problem of unemployment but also of achieving the overall economic and social advancement of the nation. Wide-scale development of entrepreneurship can help not only in generating self-employment opportunities, thereby reducing unrest and social tension amongst unemployed youth, but also in introducing small-business dynamism, encouraging innovative activities and facilitating the process of balanced economic development. Entrepreneurs can be developed through appropriately designed programs. A large number of institutions at all levels, in the private as well as public sectors, have been rendering services through incentives, training and facilities for the promotion of entrepreneurship. The National Alliance of Young Entrepreneurs (NAYE), the Small Entrepreneurial Development Institute of India (SEDII), the National Institute for Entrepreneurship and Small Business Development (NIESBUD), the Centre for Entrepreneurship Development (CED), the Integrated Rural Development Program (IRDP), the Prime Minister's Rozgar Yojana (PMRY), the Small Industries Development Bank of India (SIDBI), the District Industries Centres (DIC), the National Employees Board (NEB), Training of Rural Youth for Self-Employment (TRYSEM), Trade Related Entrepreneurship Assistance and Development (TREAD), the Self-Employment Program for Educated Youth (SEPEY), the Village and Khadi Commission (VKC), etc., are among the various programs and institutions promoting entrepreneurship.

BIBLIOGRAPHY
1. Khanka, S.S., Entrepreneurial Development.
2. Desai, Vasant, Dynamics of Entrepreneurial Development and Management, Millennium Edition.
3. http://www.articlesbase.com/Entrepreneurs-articles/
4. http://knowledgeportal.in/
5. http://dobato.blogspot.com/2006/02/scope-of-Entrepreneurs-development.html
6. http://www.thehindubusinessline.com/
7. http://papers.ssrn.com/sol3/
BOOKS: Algert, N.E., Conflict Management.


Multilayered Intelligent Approach - An Hybrid Intelligent Systems



Neeta Verma, Swapna Singh


Inderprastha Engineering College, Ghaziabad, U.P., India
neeta140@gmail.com, singhswapna@yahoo.com
Abstract - This paper discusses the intelligent techniques of hybrid intelligent systems, that is, a hierarchical-layers approach to solving complex real-world problems with the help of intelligent techniques, including traditional hard computing techniques (e.g., expert systems) and soft computing techniques (e.g., fuzzy logic, neural networks, and genetic algorithms). Hybrid Artificial Intelligence Systems combine symbolic and subsymbolic techniques to construct more robust and reliable problem-solving models. In hybrid architectures several AI techniques can be combined, and different aspects of intelligent behavior can be modeled: Machine Learning and Neural Networks model intuitive and inductive reasoning, Evolutionary Systems model adaptive behavior, Fuzzy and traditional Expert Systems model deductive reasoning, and Case-Based Reasoning combines deduction and experience. Hybridization of intelligent systems is a promising research field of modern computational intelligence concerned with the development of the next generation of intelligent systems.
Keywords - Machine learning, expert system, neural network, inductive reasoning, deductive reasoning, intuitive reasoning.

The hybridization or fusion of these techniques has, in recent years, contributed to a large number of new intelligent system designs. Research on hybrid symbolic and subsymbolic systems has provided an excellent foundation for models and techniques that are now used in applications and development tools. The existing systems demonstrate their feasibility and advantages, and many are in use in practical situations.

I. INTRODUCTION

Intelligent hybrid systems are defined as models and computational programs based on more than one Artificial Intelligence technology. Figure 1 shows a schematic view of the AI techniques that can be combined in hybrid architectures. Different aspects of intelligent behaviour can be modelled: Machine Learning and Neural Networks model intuitive and inductive reasoning; Evolutionary Systems model adaptive behaviour; Fuzzy and traditional Expert Systems model deductive reasoning; and Case-Based Reasoning combines deduction and experience.
A fundamental stimulus to the investigation of hybrid intelligent systems is the awareness in the academic community that combined approaches might be necessary if the remaining tough problems in artificial intelligence are to be solved. The aim is to integrate different learning and adaptation techniques so as to overcome individual limitations and to achieve synergetic effects through their hybridization.

There are several features that define a complex system: (a) Uniqueness: complex systems are usually unique, or only a small number of similar systems exist. (b) Hard predictability: complex systems are very hard to predict, meaning it is hard to calculate the next state of a complex system even if the previous states are known.
II. REASONING APPROACHES APPLIED FOR HYBRID SYSTEMS

A. Inductive Reasoning

Inductive reasoning is open-ended and exploratory, especially at the beginning. It works from observations toward generalizations and theories, and is also called a bottom-up approach.

Inductive reasoning starts from specific observations (or measurements, for the mathematician or, more precisely, the statistician), looks for patterns and regularities (or irregularities), formulates hypotheses that we can work with, and finally ends up developing general theories or drawing conclusions. In this approach, we observe a number of specific instances and from them infer a general principle or law.
B. Deductive reasoning

2. Evolutionary Systems model adaptive behaviour.
3. Fuzzy and traditional Expert Systems model deductive reasoning.
4. Case-Based Reasoning combines deduction and experience.
These modules interact during the solving process or prepare data for further processing by other modules. The input data in a hybrid system may first go through a procedural pre-processing method and then be passed to the system modules. The result can be sent to another procedural processing step to prepare the system output. Depending upon the problem complexity, this output may be the input to another hybrid system.

IV. CLASSIFICATION OF HYBRID INTELLIGENT SYSTEMS
There are several architectures for integrating neural and symbolic models, as shown in Figure 2.

Deductive methods involve beginning with a general concept or given rule and moving on to a more specific conclusion. Deduction is the process of reaching a conclusion that is guaranteed to follow if the evidence provided is true and the reasoning used to reach the conclusion is correct. The conclusion must also be based only on the evidence previously provided; it cannot contain new information about the subject matter.
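As a minimal illustration of deduction as described above, the sketch below applies a tiny, hypothetical rule base to a set of facts by forward chaining; the facts and rules are invented for the example and are not from the paper.

```python
# Tiny forward-chaining deduction sketch: conclusions follow necessarily from the given facts and rules.
facts = {"is_bird", "has_wings"}                      # hypothetical evidence
rules = [
    ({"is_bird"}, "lays_eggs"),                       # IF is_bird THEN lays_eggs
    ({"is_bird", "has_wings"}, "can_fly"),            # IF is_bird AND has_wings THEN can_fly
]

changed = True
while changed:                                        # keep applying rules until nothing new is derived
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)   # the derived conclusions are guaranteed, given the truth of the facts and rules
```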
C. Intuitive Reasoning

Intuitive reasoning has to do with the way something appears to be, how something "seems" or "looks", and is based on unverified guesses. While it may seem very rudimentary, it is useful in giving a starting point from which induction or deduction can proceed. Compared with inductive reasoning, deductive reasoning is narrow in nature and is concerned with testing or confirming hypotheses.
D. Case Based Reasoning

A. Stand-Alone
There is no integration or communication between the modules; they are implemented independently.

Case-based reasoning (CBR) is defined as the process of solving new problems based on the solutions of similar past problems. It has been argued that case-based reasoning is not only a powerful method for computer reasoning, but also a pervasive behaviour in everyday human problem solving; or, indeed, that all reasoning is based on past cases personally experienced. In comparison to rule-based systems (see expert systems), which are useful where only one or a few solutions to a problem are possible, case-based systems are useful in solving complex problems with many alternative solutions.
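To make the idea concrete, the sketch below shows a minimal case-based reasoner over a hypothetical case library: each past case is a feature vector plus a stored solution, and a new problem is solved by reusing the solution of its nearest past case. The case data, feature names and distance measure are illustrative assumptions, not the authors' implementation.

```python
# Minimal case-based reasoning sketch (illustrative; not the paper's implementation).
# A "case" is a dict of numeric features plus the solution that worked for it.
import math

case_library = [  # hypothetical past diagnosis cases
    {"features": {"cpu_load": 0.9, "mem_free": 0.1, "disk_errors": 0}, "solution": "add RAM"},
    {"features": {"cpu_load": 0.2, "mem_free": 0.7, "disk_errors": 5}, "solution": "replace disk"},
    {"features": {"cpu_load": 0.8, "mem_free": 0.6, "disk_errors": 0}, "solution": "optimise process"},
]

def distance(a, b):
    """Euclidean distance between two feature dicts sharing the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def retrieve_and_reuse(new_problem, library):
    """Return the solution of the most similar stored case (the 'retrieve' and 'reuse' steps of CBR)."""
    best = min(library, key=lambda case: distance(new_problem, case["features"]))
    return best["solution"]

print(retrieve_and_reuse({"cpu_load": 0.85, "mem_free": 0.15, "disk_errors": 1}, case_library))
# -> 'add RAM' (nearest past case); a full CBR cycle would also revise and retain the new case.
```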

Fig 2. Different Hybrid intelligence systems

Stand-alone architectures are composed of independent modules without any integration between the parts. This approach can be used, for example, to compare different diagnoses in computer repair. Although these models are not an alternative to hybrid solutions, they are a direct means of comparing the solutions offered by the two techniques for the same problem. In addition, implementing one module after the other allows the validation of the first system.
B. Transformational

Transformational models are similar to stand-alone architectures regarding the independence between the modules, but in transformational models the system begins

III. COMPONENTS OF A HYBRID INTELLIGENCE SYSTEM

1. Machine Learning and Neural Networks model intuitive and inductive reasoning.

as one model (ES or NN) and ends up as the other. The limitations of transformational models include the absence of automated means of transforming one technique into the other and the difficulty of maintaining both modules when new features are added to the solution. This approach has been used in a marketing decision aid: an NN is built to identify trends and relationships within the data and is then used as the basis to build an expert system that assists marketing researchers in allocating advertising.
C. Loosely Coupled
In these models the expert system and the neural network are independent, separate modules that communicate via data files. Either module can act as a pre-processor, post-processor or co-processor in relation to the other. This architecture has been used for forecasting workforce utilization: the NN predicts the workforce and the ES allocates the tasks. Loosely coupled systems are easy to implement (each module can be developed separately in one of several commercially available shells) and to maintain (due to the simple interface between the modules). On the other hand, this architecture is slower in operation and may involve redundancy in the independent development of each module (identical data may be handled independently).
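A loosely coupled hybrid of the kind described above can be sketched as two independent modules exchanging a plain data file: a stubbed neural predictor writes workforce forecasts, and a small rule-based module reads them and allocates tasks. The file name, forecast values and allocation rules are hypothetical, and the predictor stands in for a trained network.

```python
# Loosely coupled hybrid sketch: NN module and rule-based module communicate via a data file.
import json

FORECAST_FILE = "workforce_forecast.json"  # hypothetical exchange file

def nn_module():
    """Stand-in for a trained neural network: writes next-week workforce forecasts to a file."""
    forecast = {"mon": 42, "tue": 38, "wed": 55, "thu": 61, "fri": 47}  # made-up predictions
    with open(FORECAST_FILE, "w") as f:
        json.dump(forecast, f)

def expert_system_module():
    """Independent rule-based module: reads the forecast file and allocates tasks by simple rules."""
    with open(FORECAST_FILE) as f:
        forecast = json.load(f)
    plan = {}
    for day, staff in forecast.items():
        if staff >= 55:
            plan[day] = "run both production lines"
        elif staff >= 40:
            plan[day] = "run line A, schedule maintenance on line B"
        else:
            plan[day] = "run line A only"
    return plan

nn_module()                      # module 1 produces data
print(expert_system_module())    # module 2 consumes it through the shared file
```

Because the only interface is the shared file, either module can be replaced or redeveloped in a different shell without touching the other, which is exactly the maintainability advantage (and the runtime overhead) noted above.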

through memory-resident data structures. Tightly coupled architectures are more suitable for embedded systems; this approach can be used, for example, for forecasting stock prices and the consequent definition of an appropriate strategy.
E. Fully Integrated

The last level of integration is the fully integrated architecture. The NN and ES share data and knowledge representation, and the communication between them is accomplished through the dual (neuro-symbolic) nature of the structures. This can be used for object identification based on feature data received from sensors together with environmental data.
V. STAGES OF HYBRID INTELLIGENT SYSTEMS
There are six stages in the construction of hybrid intelligent systems:
1. Problem analysis
2. Property matching
3. Hybrid category selection
4. Implementation
5. Validation
6. Maintenance
Most current hybrid intelligent systems are built either from scratch or following this development process, which

is based on object-oriented techniques. Hybrid intelligent systems are very important for complex problem solving and decision making; at the same time, they are difficult to build. Many hybrid intelligent systems used in different application fields have appeared in recent years. A typical development cycle for the implementation of these hybrid intelligent systems is shown in Fig. 3.

D. Tightly-Coupled

The next level of integration is the tight coupling model. The only difference from loose coupling is how the communication between the modules takes place: in these models, the NN and ES communicate


The integration of different learning and adaptation techniques, to overcome individual limitations and to achieve synergetic effects through hybridisation, remains a central goal. The successes in integrating expert systems and neural networks, and the advances in theoretical research on hybrid systems, point to similar opportunities when other intelligent technologies are included in the mix. From knowledge of their strengths and weaknesses, we can construct hybrid systems that mitigate the limitations and exploit the opportunities, producing systems that are more powerful than those that could be built with single technologies.
Fig 3. Stages of hybrid intelligence system

VIII. ACKNOWLEDGMENT

VI. APPLICATIONS OF HYBRID SYSTEMS

Hybrid neuro-symbolic systems have several applications, some in commercial environments. These include systems that require fault tolerance, generalization, implicit and explicit reasoning, incremental learning and flexible architectures. Neuro-symbolic systems are used in several areas of engineering, medical and company diagnostics. Hybrid systems are a current trend in Artificial Intelligence. Particularly regarding neuro-symbolic systems, one can expect their dissemination in several areas, especially with the development of hybrid shells that integrate different AI technologies under the same environment. Nevertheless, the large-scale commercial application of hybrid systems depends on research in several topics, including the study of unified architectures rather than hybrid solutions and the development of formal knowledge representation models for neural networks.
VII. CONCLUSIONS

We would like to thank our Director for providing full cooperation and support and for helpful comments on an earlier draft of this paper.

REFERENCES

This paper discussed the main stimulus for investigations into hybrid intelligent systems: the awareness in the research and development communities that combined approaches will be necessary if the remaining tough problems in artificial intelligence are to be solved.

[1] A. Nikitenko, J. Grundspenkis, "The kernel of hybrid intelligent system based on inductive, deductive and case based reasoning," KDS-2001 Conference Proceedings, St. Petersburg, 2001.
[2] P. S. Rosenblum, "Improving Accuracy by Combining Rule-based and Case-based Reasoning," 1996.
[3] Bradley J. Rhodes, "Margin Notes: Building a Contextually Aware Associative Memory," Proceedings of the International Conference on Intelligent User Interfaces (IUI '00), New Orleans, LA, January 9-12, 2000.
[4] M. Pickering, "The Soft Option," 1999; A. R. Golding.
[5] J. R. Quinlan, "Comparing Connectionist and Symbolic Learning Methods," Basser Department of Computer Science, University of Sydney, 1990.
[6] J. R. Quinlan, "Improved Use of Continuous Attributes in C4.5," Journal of Artificial Intelligence Research, 1996.

Green ICT: A Next Generation Entrepreneurial Revolution


Pooja Tripathi Associate Professor , Indraprastha Engg. College, Email: poojatripathi_75@rediffmail.com, trippooja@gmail.com

Abstract - Entrepreneurship is the act of being an entrepreneur, who can be defined as one who undertakes innovations, finance and business acumen in an effort to transform innovations into economic goods. It has assumed great importance for accelerating economic growth in both developed and developing countries, and it promotes capital formation and creates wealth in the country. Green ICT provides a tremendous opportunity to use energy efficiently and reduce carbon emissions. Beyond these benefits, it would help put India on a path to energy independence, but it is a task that requires collective effort from organizations, citizens and the government alike.

Key words: Green ICT, entrepreneurial, Energy Efficiency, Energy Conservation

Introduction
The Indian economy is undergoing a profound transformation from its two-century-old industrial past to a knowledge-society future. Information and communication technology (ICT) now permeates virtually all aspects of our lives. ICT is inextricably linked with our desire for a prosperous and competitive economy, a sustainable environment, and a more democratic, open, healthy society. As global energy consumption is high and rising, conventional fuel sources are becoming increasingly scarce and expensive. Further, emissions resulting from the use of fossil fuels have been linked to global climate change and, in a rising number of countries, are subject to regulation. Consequently, governments, businesses and consumers around the world are seeking products and services that improve energy efficiency.

World marketed energy consumption was 462 quadrillion Btu in 2005. Going forward, global energy consumption is forecast to increase 19% between 2005 and 2015, to 551 quadrillion Btu. Figure 1 shows the total ICT emissions from various sectors.

[Figure 1: ICT emission projections]

Today we are living in a world driven by economic growth, increasing consumerism, skylines studded with skyscrapers and snazzy cars adorning the roads. Our lives have become increasingly comfortable with all the luxuries at our disposal. We take pride in a life where everything is just a click away, but take a deeper look at the quality of life and we realize that we have lost a lot in this bargain for material pleasure. Figure 2 shows the benefits of the technology used in our day-to-day life.

[Figure 2: Green ICT = Economic Benefit + Sustainability]

Somewhere we have smothered the sound of chirping birds and tuned our ears to the endless noise on the roads; in place of the soothing sights of greenery we have man-made concrete monsters all over; and the whiff of fresh air has been replaced by the obnoxious smell of deadly smoke. It seems the environment, the very basis of our lives, has been put at stake in the name of development.

[Figure 3: The significant role of ICT emissions in global climate change - the global CO2 emissions of the IT industry are comparable to those of the airline industry]

Ozone layer depletion is exposing us to harmful UV rays, and increasing levels of pollution are severe threats to our very existence. Conventional fuels such as oil and other liquid petroleum products, natural gas and coal are the world's leading sources of energy. Together, these sources are expected to account for approximately 85% of the world's energy in 2010. Even considering technological advancements and the increasing penetration of renewable energy sources, the share of world energy supplied by conventional fuels is expected to remain flat to 2015. As fossil fuels, these resources are finite and current projections indicate that they will be depleted within a relatively short timeframe. Further, the use of these fuels results in greenhouse gas emissions, which are linked to global climate change. Together with the fact that power generation using these sources is becoming increasingly expensive, current energy use patterns are unsustainable.

The various stakeholders involved in energy management have their own perspectives, as shown in Figure 4. People at one level will think of measures for conserving the energy consumed; for example, solar energy and wind energy are being used to drive various instruments and machines. Another group will think of products that can be energy efficient. Yet another group will look to explore the generation of energy through other means such as renewable energy.

[Figure 4: Stakeholder views on energy management]

Hence we can say anybody and everybody can make a difference when it comes to saving our environment.
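A quick check of the consumption forecast quoted above (a minimal sketch; the Btu figures themselves are the paper's):

```python
# The quoted forecast: 462 quadrillion Btu in 2005 growing to 551 quadrillion Btu by 2015.
base_2005, forecast_2015 = 462.0, 551.0
growth = (forecast_2015 - base_2005) / base_2005
print(f"Implied growth 2005-2015: {growth:.1%}")   # about 19%, matching the stated figure
```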

Green ICT: Solution to the Problem

IT Solutions can reduce emissions by 15%

Various Green ICT Initiatives

There are several Green initiatives, and the group is working towards an objective whereby these initiatives become widely used practices on campus. There are many new Green beginnings that organizations are aspiring to introduce. Some of the implemented and planned initiatives are described below.

1. Automating paper-based processes: Automating paper-based document workflow can reduce cost, improve accuracy, and reduce document processing costs. The Green IT groups are taking steps to reduce paper-based processes by automating various forms, publishing them online and creating fillable versions of various documents. The Mac ID request form is a good example: it is now available online, and it can also be submitted and retrieved online, which eliminates the use of paper. Paper-based reports can also be automated.

2. Virtualization: Virtualization allows combining several applications into virtual machines on one single, powerful physical system, thereby unplugging the original hardware and reducing power and cooling consumption. Virtualization may be both a means to deliver cloud computing and a solution delivered by it. The Green ICT group is currently using virtualization in the data centre and is now looking at desktop virtualization options such as a Citrix solution.

3. Cloud computing: Gartner defines cloud computing as a "service-based, scalable and elastic, shared, metered-by-use" mechanism that "uses internet technologies". One of the tenets of cloud computing is that you use what you need. Additionally, this technology runs on shared infrastructure, and the computers run at high utilization, which results in reduced power usage. Even though the power usage of a single computer increases when consolidating, net power decreases as other systems are taken offline. The Green ICT data centre is based on the cloud computing concept.

4. Teleconferencing: Rising energy costs and environmental concerns bring into question an employee's need to commute or travel for business (meetings, interviews, etc.). Teleconferencing is a good Green alternative to travelling for meetings, so Green ICT groups provide this service to the organizations.
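As a rough back-of-the-envelope sketch of why the consolidation described in the virtualization and cloud items above saves energy, the snippet below compares the power draw of many lightly loaded physical servers against a few well-utilised virtualisation hosts. The wattages, utilisation levels and server counts are illustrative assumptions, not measurements from any Green ICT group.

```python
# Rough estimate of power saved by consolidating lightly loaded servers onto virtualisation hosts.
# All figures below are illustrative assumptions.

IDLE_W, PEAK_W = 120.0, 300.0          # assumed idle and full-load draw of one server (watts)

def server_power(utilisation):
    """Simple linear power model: idle draw plus a utilisation-proportional share of the rest."""
    return IDLE_W + utilisation * (PEAK_W - IDLE_W)

before = 20 * server_power(0.10)       # 20 physical servers at ~10% utilisation
after = 3 * server_power(0.65)         # 3 virtualisation hosts at ~65% utilisation

hours_per_year = 24 * 365
saved_kwh = (before - after) * hours_per_year / 1000.0
print(f"Before: {before:.0f} W, after: {after:.0f} W, "
      f"saving about {saved_kwh:.0f} kWh per year (cooling savings excluded).")
```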

5. Voice over IP (VoIP) technology: This technology reduces the telephony wiring infrastructure by sharing the existing Ethernet copper and is considered a Green alternative to most traditional telephony technologies. The Green ICT groups are exploring VoIP telephony for implementation in all new buildings and major renovations, preceded by an infrastructure upgrade.

6. Energy efficiency: Power consumption is to be managed by using monitoring tools that allow centralized control of power management settings (monitor power management, system standby, and hibernation). The Green ICT group will work towards recognition of the organization by the Environmental Protection Agency, receiving a certificate in recognition of its commitment to a better environment and to reducing greenhouse gas emissions as a participant in the Low Carbon IT Campaign.

Green ICT Challenges
A career in Green ICT is challenging yet rewarding, given that saving the environment and sustainable development are now high on the agenda of policy makers and governments. Green ICT professionals have to be developed, and a competency model has to be framed for these professionals to handle the following major tasks in organizations:
- Develop metrics and set benchmarks: IT industry standards for the data center (server room); office building industry standards; industry averages.
- Consumption study (electric utility, electrician): standards for new equipment.
- Report: SCORE assessment.
- Certification and standards.
Thus, there is a need to start curricula that can train young minds to take up careers in Green ICT.

Conclusion
Amazing opportunities are available in the field of Green ICT, as it is soon to occupy the envious numero uno position in the long list of challenges staring back at humankind.

References
1. Green computing, Wikipedia, http://en.wikipedia.org/wiki/Green_computing
2. World Business Council for Sustainable Development
3. Energy Efficiency, Best Practices, Foundation for Community Association Research
4. Indian Green Building Council, www.igbc.in/site/igbc/index.jsp
5. Urban buildings: Green and Smart, the way to go, www.hindubusinessline.in
6. SMART BUILDINGS: Make Them Tech Smart, http://voicendata.ciol.com/
7. Traffic congestion in Indian cities: Challenges of a rising power; Kyoto of the Cities, Naples
8. Smart grid, Wikipedia, en.wikipedia.org/wiki
9. Global Energy Network Institute, geni.org
10. Smart 2020, United States Report Addendum, GeSI, 2008
11. Automation in power distribution, Vol. 2 No. 2, iitk.ac.in
12. Electricity sector in India, Wikipedia
13. Electrical Distribution in India: An overview, Forum of Regulators
14. Optimize energy use, WBDG Sustainable Development, 12-21-2010

ABSTRACTS & PPTS

AN INNOVATION FRAMEWORK FOR PRACTICE-PREDOMINANT ENGINEERING EDUCATION


Om Vikas
AN INNOVATION FRAMEWORK FOR PRACTICE-PREDOMINANT ENGINEERING EDUCATION
Innovation Paradigm: Creativity involves new ideas/concepts. Innovation is the successful implementation of creative ideas having an impact on the economy and society; it may be linked to improvements in efficiency, productivity, quality, competitive positioning, etc. Invention is constructing something new, out of the box. Discovery is finding something unknown. Jugaad connotes innovating in a non-formal way that may not be explainable instantly in a structured manner; such innovation may even come from an untrained worker. An entrepreneur creates value by converting material into resources and creates new businesses/services.

Om Vikas | Dr.OmVikas@gmail.com | | +91-9868 404 129 | CSI Conference, 14-15 May 2011

Inclusive Innovation: Inclusive innovation may include both structured innovation and jugaad (unstructured/intuitive innovation). This requires changing mindsets and promoting the scientific temper of people at large, at school and college levels and in various sectors of the economy.

Global Innovation Index Ranking Produced jointly by the Boston Consulting Group, the National Association of Manufacturers and the Manufacturing Institute

Innovation Components & Nations Economy

Innovation

HRD Demands of Industry
- Cognitive knowledge (know-what): basic mastery of the discipline.
- Practical proficiency (know-how): ability to translate theory into practice.
- Instinctive perception (know-why): in-depth perception of cause-and-effect relationships.
- Achievement motivation (care-why): desire to achieve success.
- Inter-personal interaction (concern-who): ability to deal with people for a common goal.

Innovation in Education Next Generation would require Knowledge (technical & design +standards ) and Skills (teamwork, organization & multi-user dealing + dealing with cross functional business processes) PBL (Project-based learning) is compromise between Constructivist learning theory and situated learning theory.

Innovation Radar In order to understand Innovation processes in an organization, a new framework : Innovation Radar is proposed that consists of 4 key dimensions : (i) Offerings : Product/services (What), (ii) Users it serves (Who), (iii) Processes it employs (How) and (iv) Point of Presence (Where). When an organization is able to identify and pursue niche / neglected innovation dimensions, it can change the basis of competition, leaving other organizations at a distinct disadvantage. Successful innovation strategies tend to focus on a few highimpact dimensions rather than attempting many dimensions at once.

Innovation Radar: a 360-degree view

ICT in Education Ubiquitous learning. Mobile Learning. Personalized learning. Redefinition of learning spaces. Teacher-generated open content. Smart portfolio assessment. Teacher managers/mentors.

Engineering Education: 5 Years versus 4 Years
IIT Kanpur: 1st Review (1970-72), 2nd Review (1979-81). The duration of the B.Tech program was 5 years until 1980.
Subject proportions in the 5-year B.Tech:
- Humanities & Social Sciences ~20%
- Mathematics & Basic Science ~25%
- Engineering Science ~25%
- Engineering Analysis & Design ~25%
- Electives ~5%

Engineering Education: 5 Years versus 4 Years
The 5-year B.Tech program was reduced to 4 years from 1981 onwards, and IIT Kanpur conducted a Third Review (1990-92). In the 4-year B.Tech program most of the professional courses were retained, while HSS subjects were reduced to about half, from 8-10 to 4-5. Good HSS content is necessary for a well-rounded engineering education. In the L-T-P breakdown, the lecture content is on the high side. New subjects are suggested: more in the professional stream, and communication skills.

Engineering Education: Present Scenario
Students don't have clear concepts of core engineering science subjects. Most students have poor communication skills. Sensitivity to society is very low. Basic concepts of multilingual computing are not introduced in ICT. Focus on R&D is minimal. Use of open technologies/standards is not encouraged. Projects are not properly defined or seriously guided by faculty. Teaching-learning processes are not innovation-centric. E-learning resources are not integrated into classroom instruction.

Engineering Education Training: To Groom Academic Leaders
A modular (trimester) program is suggested for the orientation of fresh teachers. The content covered will include:
- Basic concepts of basic science (PCM): 25%
- Core concepts of engineering science (case-study based): 25%
- Teaching-learning processes / pedagogy: 20%
- Professional communicative skills: 10%
- Managerial skills: 10%
- Life management & ethics in engineering: 5%
- Project: high-tech solution in a low-tech environment: 5%

Apoorv Agarwal
CSE 2nd Yr., Vidya College of Engg.

Apeksha Aggarwal
CSE 3rd Yr. Gyan Bharti Inst. Of Tech.

Mobile Ad-hoc Networks (MANET)
- Mobile: random and constantly changing.
- Ad-hoc: not engineered.
- Networks: data applications which use temporary links to communicate.

Characteristics of MANETs
- Dynamic topology: links are formed and broken with mobility.
- Possibly uni-directional links.
- Independent of fixed architecture.
- Also called "on the fly" networks, i.e. anywhere, any time and for virtually any application.

Ad-hoc On-demand Distance Vector Routing

Dynamic Source Routing

Temporally Ordered Routing Algorithm (TORA)

CONCLUSIONS

APPLICATIONS OF AD HOC NETWORKS

Arun Kumar Sharma


M.C.A.(Final Year) K.E.C., Ghaziabad

THANK YOU

HANISH KUMAR
K.E.C. MCA(IV Sem)

Mr. S.K. Mourya


Mahatma Gandhi Mission College of Engineering & Technology (Noida)

Green ICT: A Next Generation Entrepreneurial Revolution


Prof. Pooja Tripathi, Associate Professor, Indraprastha Engg. College. Correspondence: poojatripathi_75@rediffmail.com, trippooja@gmail.com

Abstract

The Indian economy is undergoing a profound transformation from its two-century old industrial past to a knowledge society future. Information and communication technology (ICT) now permeates virtually all aspects of our lives. ICT is inextricably linked with our desire for a prosperous and competitive economy, a sustainable environment, and a more democratic, open, healthy society.

The paper reviews the global market for ICTs which are both energy efficient in themselves and enable energy conservation within various sectors of the global economy. The paper outlines the need for improved energy efficiency and introduces several of the most significant opportunities to improve energy efficiency through the use of ICTs through 2015.

Key word: Green ICT, entrepreneurial, Energy Efficiency, Energy Conservation

ROLE OF ICT IN THE 21ST CENTURY: NEED OF THE DAY
Saurabh Choudhry*
Introduction: Globalization and technological change, processes that have accelerated in tandem over the past fifteen years, have created a new global economy powered by technology, fuelled by information and driven by knowledge. Information and communication technologies (ICTs) include radio and television as well as newer digital technologies such as computers and the Internet. The IT revolution started by Rajiv Gandhi was one of his biggest achievements. The name Sam Pitroda is best explained by the yellow phone booths all across India: it was mainly because of the efforts of this inventor, technocrat and social thinker that the telecom revolution started in India. Mr. Sam Pitroda revolutionized the state of telecommunications in India. Currently, Mr. Pitroda is the Chairman and CEO of World-Tel Limited, an International Telecommunication Union (ITU) initiative, and he is also the chairman and founder of several high-technology ventures.
Discussion: ICTs stand for information and communication technologies and are defined, for the purposes of this primer, as a diverse set of technological tools and resources used to communicate, and to create, disseminate, store, and manage information.
The Promise of ICTs in Education: For developing countries, ICTs have the potential to increase access to and improve the relevance and quality of education. They thus represent a potentially equalizing strategy for developing countries. ICTs greatly facilitate the acquisition and absorption of knowledge, offering developing countries unprecedented opportunities to enhance educational systems, improve policy formulation and execution, and widen the range of opportunities for business and the poor. One of the greatest hardships endured by the poor, and by many others who live in the poorest countries, is their sense of isolation.

Conclusion: The introduction of ICTs in education, when done without careful deliberation, can result in the further marginalization of those who are already underserved and/or disadvantaged. For example, women have

less access to ICTs and fewer opportunities for ICT-related training compared to men because of illiteracy and lack of education, lack of time, lack of mobility, and poverty.
References:
1. Chitra Bajpai, "New dimensions in knowledge process outsourcing," Journal of Management Development and Information Technology, Vol. 8, pp. 51-59, Dec. 2010.
2. http://www.apdip.net/publications/iespprimers/eprimer-edu.pdf
3. http://www.indobase.com/indians-abroad/sam-pitroda.html

Reusability Of Software Components Using Clustering


Meenakshi Sharma(1), Priyanka Kakkar(2), Dr. Parvinder Sandhu(3), Sonia Manhas(4)
(1, 2, 4) Sri Sai College of Engg. & Tech., Pathankot; (3) Rayat, Kharar
(1) HOD CSE, (2) M.Tech CSE 4th sem, (3) HOD CSE
mss.s.c.e.t@gmail.com, pinkudiya@gmail.com

Abstract: Software professionals have found reuse to be a powerful means of potentially overcoming the software crisis. Anything that is produced from a software development effort can potentially be reused, reducing software development cost. But the issue of how to identify reusable components from existing systems has remained under-explored. The requirement to improve software productivity has promoted research on software metric technology. There are metrics for identifying the quality of reusable components, but the function that makes use of these metrics to determine the reusability of software components is still not clear. If these metrics are identified in the design phase, or even in the coding phase, they can help us reduce rework by improving the quality of component reuse and hence improve productivity through a probabilistic increase in the reuse level. The CK metric suite is the most widely used set of metrics for object-oriented (OO) software.
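As a sketch of the clustering idea in this abstract, the snippet below groups components by their CK-style metric vectors (e.g. WMC, CBO, DIT) so that components with similar design characteristics, and hence potentially similar reuse behaviour, fall into the same cluster. The metric values and the choice of k-means with two clusters are illustrative assumptions, not the authors' actual model.

```python
# Clustering components by CK-style metric vectors to flag candidate reusable groups.
# Metric values and cluster count are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

components = ["Logger", "Parser", "ReportUI", "DBGateway", "Scheduler"]
# Columns: WMC (weighted methods per class), CBO (coupling between objects), DIT (depth of inheritance)
metrics = np.array([
    [ 5,  2, 1],
    [12,  4, 2],
    [30, 15, 4],
    [ 8,  3, 1],
    [28, 12, 3],
], dtype=float)

# Normalise each metric to [0, 1] so no single metric dominates the distance.
normalised = (metrics - metrics.min(axis=0)) / (metrics.max(axis=0) - metrics.min(axis=0))

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(normalised)
for name, label in zip(components, labels):
    print(f"{name}: cluster {label}")
# Low-WMC/low-CBO clusters are the more promising reuse candidates under most metric-based models.
```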

The Proceedings Of
2nd National Conference on Innovation and Entrepreneurship in Information and Communication Technology

The organisers thank all the authors/delegates for sharing their knowledge, views and time.

Disclaimer: The views expressed by the authors do not necessarily represent those of the editorial board or publisher. Although care has been taken to avoid errors, these proceedings are published on the condition and understanding that all the information provided here is merely for reference and must not be taken as having the authority of, or being binding in any way on, the authors, editors and publisher, who do not owe any responsibility for any damage or loss to any person resulting from any action taken on the basis of this work. The publisher shall be obliged if mistakes are brought to its notice. The author is responsible for obtaining copyright releases and corporate and security clearances prior to submitting material for consideration. The publisher assumes that authors have gone through the clearance process and that clearances have been granted when a paper is submitted.
