
UNIVERSITY OF WITWATERSRAND

Vacation Work: MECN310


Util Labs Pty. Ltd.
Joseph Thomas 0710343E 1/20/2012

Executive Summary

The following document details the work undertaken while at Util Labs. The problem statement and focus of the six weeks concerned pattern recognition of electricity consumption in a household, in order to combat incorrect billing. A method for non-intrusive appliance load monitoring was developed: with programmed logic, appliances can be identified by the magnitude of their power draw. These techniques are less computationally intensive than other matching techniques (such as convolution), ensuring that the data can be processed and sent from the MCU without further modifications. The deliverable was a method to characterise a house to prevent cable switching and incorrect metering. Two types of data were analysed, and data with intervals of one second and of five minutes needed different techniques. Per-second data readily established appliance classes, providing a characteristic matrix of a household with multiple appliance classes (Table 1, page 5). Sensitivity analysis found that this method of analysis works for data intervals of up to two minutes (section 4.2, page 6). The data that the national grid provides is energy data (W.h) at five-minute intervals. Habitual analysis applied to five-minute data provides a characteristic matrix dependent on the average power consumption and the time of day (Table 2, page 7).

Declaration

Blank Page


Table of Contents
Executive Summary ................................ i
Declaration ...................................... ii
List of Tables ................................... iv
List of Figures .................................. v
List of Abbreviations ............................ vi
1. Introduction .................................. 1
   1.1 Company ................................... 1
   1.2 Project ................................... 2
2. Objectives .................................... 3
3. Methodology ................................... 3
   3.1 Process ................................... 4
4. Observations .................................. 5
   4.1 Logic ..................................... 5
   4.2 Sensitivity Analysis ...................... 6
   4.3 Habitual Data ............................. 7
5. Analysis ...................................... 8
6. Conclusions ................................... 9
7. Recommendations ............................... 9
References ....................................... 10
Appendix A: Filtering ............................ 11
Appendix B ....................................... 12
Appendix C: Sensitivity Analysis ................. 13
Appendix D: Testing .............................. 15


List of Tables
Table 1: Power Signature Matrix .................. 5
Table 2: Habitual Matrix ......................... 7


List of Figures
Figure 1: Frequency Spectrum ..................................................... 11
Figure 2: Sample data ............................................................ 11
Figure 3: Normalized frequency spectrum .......................................... 11
Figure 4: Localised Differential vs Ordinary ..................................... 12
Figure 5: Discrete pulse data and Localised rate of change (pulse/sec) vs time (sec) . 12
Figure 6: Pulse and rate of change data: 2 second interval ....................... 13
Figure 7: Pulse and rate of change data: 10 second interval ...................... 13
Figure 8: Pulse and rate of change data: 60 second interval ...................... 13
Figure 9: Pulse and rate of change data: 5 minute interval ....................... 14
Figure 10: Per second data testing ............................................... 15
Figure 11: Habitual testing of 5 minute data ..................................... 15

List of Abbreviations

Abbreviation   Meaning
NIALMS         Non-Intrusive Appliance Load Monitoring System
eddi           Electricity Demand Display Unit
w.r.t.         with respect to
ULMS           Utility Load Monitoring System
MCU            Master Control Unit
D&D            Design and Development
OCP            Operations Centre Personnel
VAR            Volt-Ampere Reactive
kW.h           Kilowatt-hour


1. Introduction
1.1 Company
Founded in 2008, Util Labs has 60 employees working at its office in Midrand. The product is a Utility Load Management System (ULMS), a device that measures and displays the power consumption of households in real time. This, in effect, empowers customers to change their consumption habits and monitor the improvements. Additionally, the ULMS allows municipalities to monitor and communicate with individuals, which in turn leads to better customer management and support. An organogram of the organisation is provided on the last page of this document. The structure of the organisation is roughly separated into three groups:

1. Design and Development (D&D)
2. Operations
3. Finance/Administration

The ULMS has the technology capable of making the grid smart. The eddi is a plug-and-play device that measures and displays the power consumption of the house; it works by fitting into any three-pin plug. The display unit alerts customers at times of high load to change their consumption and avoid load shedding. After sufficient warning, including telephone notification, consumers who do not comply can be disconnected from the grid with the click of a mouse. The control point can remotely limit consumption at different levels, depending on the power needs. This avoids over-consumption, preventing waste and controlling segments of the grid in a way that was not previously possible. The device remotely sends data on a low-bandwidth, high-speed line that enables control of power consumption.

Currently there are 14 000 households on the system. The goal is to reach four million South African households, which could decrease the power consumption of the country by up to 10%. The service is unmatched and faces no direct competition at the moment. Util Labs provides an alternative to load shedding and has developed a device that allows Eskom and City Power to communicate with customers in real time, inside their houses.
The service providers, such as Eskom and City Power, are the customers, and the national grid is the client. [1] At the 2011 Eskom eta Awards, Util Labs Pty Ltd won overall in the coveted innovation category. The principles of ISO 9001 are followed within the organisation, setting a benchmark for quality and efficiency. [2]

1.2 Project

Incorrect billing is a serious problem for all parties involved. Customers are annoyed to no end at paying for something they are not consuming. Rectifying the problem manually can take time and expend resources, and the company involved can ultimately lose customers if the problem is not speedily resolved. Most of the time, however, neither the customer nor the provider is aware that wires have been switched.

The purpose of the investigation is to develop software to recognise the appliances of a house from its power usage data. This recognition can differentiate houses, allowing for the digital switching of meters that have been tampered with in the field. An appliance can be identified by its pattern of power usage, the time of day it is used and the length of time it operates for. Assuming different houses have different appliances, a distinction must be found between different houses as opposed to the same house with a new appliance.

The input data was power consumption data (kW.h) from test units in the field. The data was handled in two separate stages: the first stage used per-second power data and the second stage used per-five-minute data. The five-minute data is the actual collection rate for all the devices in the field. The problems facing the analysis include aliasing and filtering of the data. Noise generated by some units interferes with the measurement system, and multiple appliances run concurrently in any particular household in a seemingly random manner.

The entire pattern recognition procedure was done in Microsoft Excel. To be implemented on the national grid, it must be programmed in a language named Octave. My involvement was strictly to find a matching criterion for houses. Research and programming took up the bulk of the work duration; the more experienced programmers would then take over and weed out the issues as the system is tested on a larger data sample.
The sample data first used was per-second power consumption at night. The national grid runs on a pulse-per-five-minute model, 300 times less data intensive. Sensitivity analysis was to be done on the model to check its applicability to per-five-minute data. Decreasing the time between pulses is only viable for micro-experimentation; it is not an option at scale, as the total data to be processed by the central system would increase dramatically.

The testing phase was continual, on a limited data sample. A number of houses were available to gather data from. Program development followed an iterative process of trial and error.

2. Objectives

1) Can houses be differentiated and characterised by their power consumption characteristics?
2) Identify an identification technique for a house using power usage data (kW.h).
3) Separate appliance classes from the power usage data.
4) Find the usage characteristics for each appliance.
5) Normalise the data and compile a matrix of power usage to characterise a house.
6) Identify a matching criterion to differentiate houses from a group.

3. Methodology

The following is an example of a non-intrusive appliance load monitoring system (NIALMS). The received input is discrete power data from the electricity demand display instrument (eddi) for particular houses. Each measurement is taken as a pulse, where each pulse represents 8.5742 W of power drawn for one second. The data acquisition rate is 1 Hz for the initial analysis. The usability of the code developed for 1 Hz data is then tested and modified for an acquisition rate of 0.003 Hz (one pulse per five minutes). The general stages of handling the 1 Hz data are listed below:

1. Work with pulses
2. Smooth the function with an anti-aliasing low-pass filter
3. Numerically differentiate the data
4. Smooth the differentiated data by finding localised peaks
5. Use logic to find the characteristics of each appliance
6. Compile the data for each appliance into a matrix for the specific house
7. Minimise the difference between matrices to produce a match

3.1 Process
The Nyquist theorem states that the sampling rate must be at least twice the maximum frequency of the system [3]. This avoids aliasing and therefore smooths out spurious data patterns. The solution is to run the data through a low-pass filter. The formula is presented below [4]:

y[i] = α · x[i] + (1 − α) · y[i − 1],  with  α = T / (RC + T)

Where:
RC: time constant; frequencies above the corresponding cut-off are attenuated
T:  sampling period
α:  smoothing factor
y:  filtered discrete power point
x:  raw discrete power point
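The recursion above can be sketched in a few lines. This is a minimal illustration of the cited RC low-pass formulation, not the company's implementation; the sample values are made up.

```python
def low_pass(x, alpha):
    """First-order RC low-pass as exponential smoothing:
    y[i] = alpha * x[i] + (1 - alpha) * y[i-1],
    where alpha = T / (RC + T) for sampling period T."""
    y = [x[0]]  # seed the recursion with the first raw sample
    for xi in x[1:]:
        y.append(alpha * xi + (1 - alpha) * y[-1])
    return y

# the report settled on alpha = 0.2; toy alternating input for illustration
smoothed = low_pass([0, 10, 0, 10, 0, 10], 0.2)
```

With alpha = 0.2 the output settles toward the mean of the alternating input, which is exactly the attenuation of high-frequency content that the report relies on.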

Excel has an add-in that can find the Fourier transform of given data. The given power data (watt-seconds) was transformed from the time domain to the frequency domain. The frequency spectrum of the sample data is shown in Appendix A. The reduction of the noise after the data is run through the filter is clear. The Nyquist theorem requires the sampling rate to be at least twice the maximum frequency; in practice, however, this is often increased to four times or above. On this basis an alpha value of 0.2 was finally used.
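The frequency-domain check described above can be reproduced outside Excel. A minimal sketch using NumPy's real FFT on toy alternating data (the sample values and 1 Hz rate are illustrative, not the report's data):

```python
import numpy as np

# Toy power samples at 1 Hz, as in the initial analysis stage.
fs = 1.0
power = np.array([0.0, 10.0, 0.0, 10.0, 0.0, 10.0, 0.0, 10.0])

# Transform to the frequency domain, as the Excel add-in does,
# to see how much energy sits near the Nyquist frequency (fs / 2).
spectrum = np.abs(np.fft.rfft(power))
freqs = np.fft.rfftfreq(len(power), d=1.0 / fs)
```

For this alternating input all the non-DC energy sits in the Nyquist bin, which is the kind of high-frequency noise the low-pass filter removes.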

When the data is corrected for aliasing, segmentation of the data can begin. The first step is to numerically differentiate the discrete data points, to provide the absolute deviation and segment the data into its root components. The central difference method was used. The formula is presented below [5]:

f'(i) ≈ (f(i + 1) − f(i − 1)) / (2h)

Where:
f(i): the discrete data point
h:    the step difference in time between adjacent data points
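The central-difference step can be sketched as follows. The one-sided differences at the end points are an assumption for completeness; the report does not say how the boundaries were handled.

```python
def central_diff(f, h):
    """Numerical derivative of discrete samples f with step h:
    interior points use (f[i+1] - f[i-1]) / (2h);
    the two end points fall back to one-sided differences."""
    d = [(f[1] - f[0]) / h]                      # forward difference at start
    for i in range(1, len(f) - 1):
        d.append((f[i + 1] - f[i - 1]) / (2 * h))  # central difference
    d.append((f[-1] - f[-2]) / h)                # backward difference at end
    return d
```

Applying it twice gives the second differential used below to locate switching spikes.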

4. Observations

The first differential represents the discrete pulses with respect to time. The second differential represents the absolute change of power consumption. To reliably analyse the data with logic, the peaks must be distinct. The differentiated data and the given data are shown in Appendix B, page 12.

Using logic in Excel, with the data from Figure 5, page 12, appliance usage was extracted. There were two apparent trends in the power consumption. The first was a repeated cycle with the inductive kick of a motor, corresponding to a fridge. The second involved the high peak of a resistive element, corresponding to a heater. The total usage time and average cycle time were compiled for each appliance in a matrix. This matrix corresponds to a house's power usage signature. An example of such a matrix is presented in Table 1.

Table 1: Power Signature Matrix

           Total on (sec)   Average Period (sec)   Peak change (pulse/sec)
Element    3605             901.25                 175
Fridge     27087            2083.615               30

With multiple houses, a match can be found by minimising the difference between the matrices using an L2 norm. Each component must be normalised to prevent any one component from dominating the total. The next step is to find the correlation between the matrices, taking the minimum difference as the maximum-likelihood match. [6]
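A minimal sketch of this matching step, assuming each house's signature matrix is flattened into a feature vector. The house names, values and normalisation scales below are illustrative, not the report's data.

```python
import math

def normalise(vec, scales):
    # divide each feature by a population-level scale so that no single
    # component (e.g. total-on seconds) dominates the L2 distance
    return [v / s for v, s in zip(vec, scales)]

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(candidate, known, scales):
    """Return the house whose normalised signature vector has the
    minimum L2 distance to the candidate's vector."""
    c = normalise(candidate, scales)
    return min(known, key=lambda h: l2_distance(c, normalise(known[h], scales)))

# feature order: total on (sec), average period (sec), peak change (pulse/sec)
known = {"house_A": [3605, 901, 175], "house_B": [27087, 2084, 30]}
scales = [30000, 2500, 200]
match = best_match([3700, 910, 170], known, scales)  # close to house_A's signature
```

The normalisation scales would in practice come from the spread of the feature across the whole population of houses.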

4.1 Logic

The two components used different logic to identify them. The fridge had a repeated cycle with an inductive kick which drew a significant amount of power at the start of each cycle. The heating element had a definite on and off time.

The differentiated data was used to find the switching points of the appliances. The element produced a large positive spike when turned on and an equally large negative spike when turned off. These events were marked with a positive and a negative constant respectively. A count function measured the cells between these markers, corresponding to the usage time in seconds for each cycle.

The fridge could be marked at each inductive spike. This marked the start of a new cycle and the end of the previous one. A different counter found the time between each cycle and the cumulative usage time. The running time of a fridge is continuous, so the critical point is the average cycle time and not the total.
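The element-switching logic described above can be sketched as a single threshold pass over the differentiated data. The thresholds and sample spikes below are illustrative, not the values used in the report.

```python
def detect_element_cycles(rate, on_thresh, off_thresh):
    """Find on-durations of a resistive element from differentiated data:
    a large positive spike marks switch-on, a large negative spike
    marks switch-off; the count of samples between them is the cycle
    length (in seconds for 1 Hz data)."""
    durations, on_at = [], None
    for i, r in enumerate(rate):
        if r >= on_thresh and on_at is None:
            on_at = i                      # mark the positive spike
        elif r <= -off_thresh and on_at is not None:
            durations.append(i - on_at)    # count cells between markers
            on_at = None
    return durations

# toy differentiated series: two on/off cycles of 4 s and 2 s
cycles = detect_element_cycles(
    [0, 0, 50, 0, 0, 0, -50, 0, 0, 50, 0, -50], 40, 40)
```

The fridge logic is analogous but keys only on the positive inductive spikes, taking the gap between consecutive spikes as the cycle time.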

4.2 Sensitivity Analysis

The data collection rate used on the grid has five-minute intervals. This is done because of the large customer base, to relieve the processing done by the central server. The five-minute interval makes data handling easier, but data analysis trickier. As the data is gradually converted from per-second to per-five-minute intervals, a few things stop working. The low-pass filter has a negligible effect on the disturbance, as shown in Appendix A, Figure 3 on page 11: the frequency spectrum is normalised, and there is only a noticeable difference when the graph is magnified (notice the maximum is 0.02 instead of 1).

Using the current per-second analysis, the data was extended to longer intervals to study the limits of its usability. As the intervals get longer, the period of the fridge jumps in and out of sync with the collection rate, producing regions of inconsistency in the second differential. The logic for appliance identification needs distinct rate-of-change characteristics, and the analysis and identification become uncertain for collection intervals longer than two minutes. The trend is shown in Appendix C, Figure 6 to Figure 9, pages 13 to 14. The rate-of-change characteristic becomes multi-peaked at five-minute intervals, showing that the collection rate and the period of the appliance are not in sync.

The sensitivity analysis established that the process only works for data intervals that provide distinct change characteristics: per-second analysis techniques only work reliably up to two-minute data collection intervals. It is possible to use localised peaks, as described in the previous section and shown in Appendix B, Figure 4 on page 12, but this carries increased uncertainty as the data interval grows. Time-dependent, habitual usage techniques were therefore considered to find reliable characteristics for houses using five-minute data intervals.
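The re-sampling step of the sensitivity study can be sketched as a simple aggregation of the per-second pulse counts; the factor below is illustrative.

```python
def downsample(pulses, factor):
    """Aggregate per-second pulse counts into longer collection intervals
    (e.g. factor=120 for two-minute data, factor=300 for five-minute data)
    by summing consecutive samples. The section-4.1 logic can then be
    re-run on the coarser series to see where the peaks stop being
    distinct."""
    return [sum(pulses[i:i + factor]) for i in range(0, len(pulses), factor)]
```

When the aggregation factor approaches or exceeds the appliance's cycle period, the on/off spikes of separate cycles merge into one interval, which is the multi-peaked behaviour seen in Figure 9.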

4.3 Habitual Data

This sort of analysis uses the fact that the power habits of an average working household stay constant during the week. The day is segmented into three parts: morning, day and night. Using similar logic as before, the appliances are identified and compiled into a matrix depending on the time of day they are used. An additional characteristic is sleep: the period during which the minimum amount of power is used. In terms of logic, it is the time between the last switch-off and the first switch-on. A sample matrix of habitual data is presented below in Table 2; this represents a house's characteristic power usage. Correlation techniques are then used, and the normalised matrix pair with the minimum difference produces a match. Once the match passes a number of tests, which are trade secrets, the digital switch of the two houses can take place.
Table 2: Habitual Matrix

                        Whole     Morning   Day      Night
Length (5min/period)    5.53      5.52      4.38     6.92
Number of periods       3.47      2.74      0.38     0.35
Usage (5min)            19.24     15.09     1.68     2.44
kW.5mins                431.53    347.47    31.11    52.95
Sleep (hours)           3.21      1.90      0.00     1.31

The analysis will be compared against the same season and time in the week to eliminate variability, and an average over many days is used to compile the matrix to compensate for outliers. This approach adds variables and provides a more accurate matching criterion. Habitual power consumption can hold the key to finding a house's characteristic power demand. With five-minute data this method can reduce the variability, since it takes into account the time in the week, day and season that an appliance is used. With sufficient usage data a characteristic can be found for a particular house. Therefore, if cables are switched in the field, one can be alerted, and with the current capabilities of the Util Labs grid the switch can be done digitally.
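The per-segment statistics can be illustrated with a small sketch. The quantities below loosely mirror rows of Table 2 (number of usage periods, average period length, total usage intervals, total energy); the usage threshold and sample values are assumptions for illustration, not the report's actual logic.

```python
def segment_stats(samples, thresh):
    """Compile habitual statistics for one day segment (morning/day/night).
    samples: energy per five-minute interval; thresh: minimum energy for
    an interval to count as appliance usage. A 'period' is a contiguous
    run of intervals above the threshold."""
    periods, run, energy = [], 0, 0.0
    for s in samples:
        energy += s
        if s > thresh:
            run += 1                 # extend the current usage period
        elif run:
            periods.append(run)      # period ended; record its length
            run = 0
    if run:
        periods.append(run)          # close a period running to the end
    n = len(periods)
    return {
        "periods": n,                             # number of usage periods
        "avg_len": sum(periods) / n if n else 0,  # avg period length (intervals)
        "usage": sum(periods),                    # total usage intervals
        "energy": energy,                         # total energy in segment
    }

stats = segment_stats([0, 5, 5, 0, 5, 0], thresh=1)
```

Averaging these dictionaries over many comparable days, as the text describes, would yield the fractional values seen in Table 2.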

5. Analysis

Sample data was used to check the application of the code. The testing, as with the project, was done in two stages depending on the collection rate. The results for the per-second and five-minute data are shown graphically in Appendix D, Figure 10 and Figure 11 respectively, on page 15. These figures show the base power data overlaid with the appliances identified via logic. The graphs illustrate that appliances can be monitored and identified using power data alone.

The per-second data can be segmented into appliance classes without distortion or significant uncertainty. This is not true for the five-minute data, where the identification of secondary appliances, such as motors and pumps, proves more complex. Primary appliances, such as heating elements, and sleeping patterns form the basis of that analysis. The analysis works for data intervals of up to two minutes, as discussed in section 4.2 Sensitivity Analysis, page 6.

Habitual data provides more variables for the characteristic matrix. This analysis used a limited number of appliances, with data intervals of five minutes. However, there is still variation in potential power consumption, as humans are not habitual by nature; the technique relies on the users being highly regular and employed. The challenge is to incorporate more appliances in the analysis. Using a second data dimension would help: analysing the real power (in watts) and the reactive power (in VARs) simultaneously would provide more reliable results on the type of appliance drawing power (inductive, capacitive or resistive).

The MCU has a limited buffer size, which limits the number of bits it can process at any given time. The techniques used are deliberately light in terms of data processing requirements. With a larger buffer, the number of data points used to differentiate the data could be increased; the more data points used, the smaller the computational error.

The code must be converted to programming software that can handle a large amount of data. A combination of C++ and Octave is used within the company. Once the process is programmed, the system can be tested on larger data samples.

6. Conclusions

1. Differentiation of the power data with logic provides a reasonable approach to pattern recognition.
2. The analysis using per-second data intervals can be applied to data intervals of up to two minutes, as discussed in section 4.2 Sensitivity Analysis, page 6.
3. Habitual data analysis provides a more accurate characteristic for houses using five-minute data intervals.
4. The buffer size needed for this analysis is small; the MCU need not be upgraded.
5. A limited number of appliances could be identified: resistive elements and long-cycle motors could be discerned.
6. Detecting reactive power, via the phase angle, is the key to accurately finding the type of appliance in use.

7. Recommendations
1) A second data dimension input is required. Analysing the reactive power data alongside the real power would provide more reliable results on the type of appliance drawing power.
2) Upgrading the MCU would lead to higher accuracy in the analysis.
3) The code must be converted to a suitable programming language able to handle larger data samples.
4) The analysis using per-second data intervals can be applied to data intervals of up to two minutes. Increasing the data collection rate for a localised section would limit the strain on the main server and would provide the data needed for testing.

On a separate note, unrelated to the project: the materials used in the controller are lacking in finish. The copper terminals used in the MCU arrive flat and have burred edges. The cost associated with filing and shaping them is low; however, the efficiency of assembly would improve if the terminals arrived de-burred.

References

[1] Popular Mechanics, January 2012, page 71.
[2] http://www.utillabs.com/
[3] http://www.wordiq.com/definition/Nyquist_theorem
[4] http://www.dsplog.com/2007/12/02/digital-implementation-of-rc-low-pass-filter/
[5] http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/central-differences/
[6] Robert Schalkoff, Pattern Recognition: Statistical, Structural and Neural Approaches, 1992, page 329.

All websites last visited 12 March 2012.


Appendix A: Filtering

Figure 1: Frequency Spectrum

Figure 2: Sample data

Figure 3: Normalized frequency spectrum


Appendix B

Figure 4: Localised Differential vs Ordinary

Figure 5: Discrete pulse data and Localised rate of change (pulse/sec) vs time (sec)


Appendix C: Sensitivity Analysis

Figure 6: Pulse and rate of change data: 2 second interval

Figure 7: Pulse and rate of change data: 10 second interval

Figure 8: Pulse and rate of change data: 60 second interval


Figure 9: Pulse and rate of change data: 5 minute interval


Appendix D: Testing

Fridge Element Normalised Power

Figure 10: Per second data testing

Sleep Element Normalised Power

Figure 11: Habitual testing of 5 minute data


UTIL LABS (PTY) LTD BOARD OF DIRECTORS

Chief Executive Officer: Joe Paul

Engineering Division (Andrew Goedhart)
    Group Leader / System Analyst, Engineering Projects Management: Hartmut Bohmer
    Testing & Verification: Cedric D'Abreton
    Production Engineering: Jan Olwagen
    Senior Developer: Carl Heymann
    Developers: Nicholas Prozesky, Pierre vd Riet, Duane McKibbin
    ULM System Security: Edward vd Vyver

Operations (Paroshen Naidoo)
    Configurations Controller: Norveshen Pillay
    Production (Companies 0, 1 & 2): Riaad Perreira
    Network Ops / Network Monitor: Thameshan Moodley
    Installations Projects Management: A N Other
    Field Installations: Jon Bonsignore
    Field Team 1: Deven Pillay, Stanley Mkhize
    Field Team 2: Dave Hinchcliffe, Andries Kekana, Peter Matabane
    ULM Technical Support: Stephen vd Merwe
    Production Ops: Adheesh Sewrajan
    Test Dept: Siphelele Msezane
    Warehouse Controller: Kenneth Hlatshwayo
    Procurement Officer: Farai Nyereygona

Corporate
    Marketing: Senosha Naidoo, Haneefa Montani
    Finance: Britto Philipose
    Quality Assurance (QMR): Luigi Slaviero
    QA Officer (Assist QMR): Jodash Singh
    Personnel / Public Liaison and other staff: Japie Greeff, Gregory Moyce, Rajesh Thiru, Sethu Govindasamy, Themba Gama
