1 Introduction
One of the most explored problems in the machine learning literature is linear regression, which consists in
approximating the straight line – or function – that best fits a set of given points. The fitted line can then be used to
predict outcomes for unknown values. Several methods have been proposed for linear regression, including
geometric, mathematical and computational approaches. In this study, the Least Mean Squares (LMS) algorithm is
applied to address this particular problem.
d_i = w x_i + b + ϵ_i = y_i + ϵ_i    (1)

where d_i, x_i, y_i and ϵ_i are the desired, predictor, linearly fitted and error values for each i = 1, 2, ..., N, respectively,
w is the line slope and b is the bias.
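As an illustration, data following the model of Eq. (1) can be simulated. The sketch below is written for Python 3 rather than the paper's Python 2.7, and the slope, bias and noise level are arbitrary assumptions chosen for the example, not the paper's data:

```python
import random

def generate_data(w=0.15, b=1.2, n=50, noise=0.05, seed=42):
    """Draw n samples from d_i = w*x_i + b + eps_i,
    where eps_i is zero-mean Gaussian noise."""
    rng = random.Random(seed)
    xs = [i / n for i in range(n)]  # predictor values in [0, 1)
    ds = [w * x + b + rng.gauss(0.0, noise) for x in xs]
    return xs, ds

xs, ds = generate_data()
```

With a fixed seed the data set is reproducible, which is convenient when comparing fitted lines later.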
In most instances, it is not possible to find a straight line that fits all values. Therefore, a criterion is needed
to estimate which parameters perform best on this task. The mean square error (MSE) is one of
the most commonly adopted criteria, calculated by (Principe et al., 1999)
J = (1 / 2N) ∑_{i=1}^{N} ϵ_i^2    (2)
where J is the averaged sum of squared errors and N is the number of samples to be fitted.
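Eq. (2) translates directly into code. A minimal sketch (the function name is illustrative, and Python 3 is assumed):

```python
def mse_cost(xs, ds, w, b):
    """Eq. (2): J = (1 / 2N) * sum_i eps_i^2,
    with eps_i = d_i - (w*x_i + b)."""
    n = len(xs)
    return sum((d - (w * x + b)) ** 2 for x, d in zip(xs, ds)) / (2 * n)
```

A line passing exactly through every point gives J = 0; any residual increases J quadratically.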
As our main goal is to find a w∗ that minimizes the cost J, the weight is updated iteratively by Eq. (3),

w(k + 1) = w(k) + η ϵ_i x_i    (3)

with an analogous update for the bias b, where k = 1, 2, ..., K indexes the training iterations (or epochs) and η is the step size (or learning rate).
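A minimal sketch of the iterative LMS training loop, assuming the standard per-sample update w ← w + η ϵ_i x_i with an analogous bias update (function and variable names are illustrative, Python 3 assumed):

```python
def lms_fit(xs, ds, eta=0.01, epochs=1000):
    """Fit w and b by per-sample LMS updates: after each error
    eps = d - (w*x + b), nudge the parameters a step of size eta
    in the direction that reduces the squared error."""
    w, b = 0.0, 0.0  # weights initialized to zero, as in the paper
    for _ in range(epochs):
        for x, d in zip(xs, ds):
            eps = d - (w * x + b)
            w += eta * eps * x
            b += eta * eps
    return w, b

# A noiseless toy line d = 2x + 1 is recovered closely:
xs = [i / 10 for i in range(10)]
ds = [2.0 * x + 1.0 for x in xs]
w, b = lms_fit(xs, ds)
```

For a sufficiently small η the updates contract toward the minimizing parameters; too large a step makes the iteration unstable, which motivates the sensitivity analysis in the results.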
3 Development
The algorithm is implemented using Python 2.7.8 (Van Rossum, 1998), including the following libraries:
2. PyLab (https://pypi.python.org/pypi/pylab/) – a scientific library that provides a set of plotting and
charting functions.
The parameters set for the method are summarized in Table 2. All weights w were initialized as 0.00.
4 Results
The first result, obtained using epochs = 1,000 and learning rate = 0.01, is shown in Figure 4. The model
is described by the linear function f(x) = 0.1568x + 1.1918, with error J = 0.0360.
Changing the learning rate to 0.1 resulted in an execution error, returning no values for the weights, while
adopting 0.001 produced a worse solution than the one previously found, with J = 0.1550.
The same sensitivity analysis was performed for the number of epochs: increasing it to 10,000, the method
found a slightly better model, with J = 0.0336, while lowering it to 100 yielded a substantially higher error,
J = 0.1550.
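The sensitivity sweep can be sketched as below. Note this only illustrates the procedure, not the paper's numbers: on this small, well-scaled toy data all three step sizes remain stable, so the overflow the paper observed at η = 0.1 does not reproduce here (names and data are hypothetical, Python 3 assumed):

```python
def lms_fit(xs, ds, eta, epochs):
    """Per-sample LMS updates for slope w and bias b."""
    w = b = 0.0
    for _ in range(epochs):
        for x, d in zip(xs, ds):
            eps = d - (w * x + b)
            w += eta * eps * x
            b += eta * eps
    return w, b

def cost(xs, ds, w, b):
    """MSE cost of Eq. (2)."""
    return sum((d - (w * x + b)) ** 2 for x, d in zip(xs, ds)) / (2 * len(xs))

# Toy data on a noiseless line d = 0.5x + 1.
xs = [i / 10 for i in range(10)]
ds = [0.5 * x + 1.0 for x in xs]

# Sweep the step size with a fixed epoch budget, as in the analysis above.
results = {eta: cost(xs, ds, *lms_fit(xs, ds, eta, epochs=1000))
           for eta in (0.001, 0.01, 0.1)}
```

The qualitative pattern matches the paper's: within the stable range, a larger step size reaches a lower cost under the same epoch budget.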
5 Conclusion
Although the LMS algorithm does not provide an analytical solution for the linear regression model, it yields
a satisfactory approximation that can be found iteratively in many different applications. The method has some
limitations: depending on the initial weights, it can converge to a local rather than a global solution, or start
oscillating and fail to converge at all. These are, however, known issues addressed by other methods, and LMS
also remains important for introducing concepts developed in other machine learning methods.
References
Principe, J. C., Euliano, N. R., & Lefebvre, W. C. (1999). Neural and adaptive systems: fundamentals through
simulations with CD-ROM. John Wiley & Sons, Inc.
Van Rossum, G. (1998). Python: a computer language. Version 2.7.8. Amsterdam, Stichting Mathematisch
Centrum. (http://www.python.org)