A comparative study of artificial neural networks, and decision trees for digital
game content stocks price prediction
Tsung-Sheng Chang
Department of Information Management, National Chung Cheng University, Min-Hsiung, Chia-Yi 621, Taiwan, ROC
Article info
Keywords:
Artificial neural networks (ANN)
C&RT
Decision tree
Stock price forecasting
Abstract
Precise prediction of stock prices is difficult chiefly because of the many intervening factors. Unpredictability is particularly notable in the aftermath of the global financial crisis. Data mining may however be
used to discover highly correlated estimation models. This study looks at artificial neural networks
(ANN), decision trees and the hybrid model of ANN and decision trees (hybrid model), the three common
algorithm methods used for numerical analysis, to forecast stock prices. The author compared the stock
price forecasting models derived from the three methods, and applied the models on 10 different stocks
in 320 data sets in an empirical forecast. Average accuracy of ANN is 15.31%, the highest, in terms of
match with real market stock prices, followed by decision trees, at 14.06%; hybrid model is 13.75%.
The study also discovers that compared to the other two methods, ANN is a more stable method for predicting stock prices in the volatile post-crisis stock market.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
Since mid-2008, when the collapse of Lehman Brothers led to
global economic repercussions, the stock market has been hard
hit. Stock indices fell, and economies went into recession. More
than one year later, stock indices have witnessed sharp fluctuations, especially in new emerging markets. In addition, globalization has made forecasting of stock prices increasingly difficult
(Albuquerque, Francisco, & Marques, 2008; Stock & Watson,
2007). However, we need to know if the study samples of this period and the results of the forecasts of previous researchers are in
line with our expectations. In addition, among Asian emerging
markets, China's economic growth has been the engine spurring
on the development of stock markets in the region; hence the
greater volatility of stock prices in these markets. Thus, we need
to attach greater importance to emerging stock markets (Dutta,
Jha, Laha, & Mohan, 2006; Fidrmuc & Korhonen, 2009). The Taiwan
stock market, which has close inter-connection with Mainland
China, is another example worth observing (Lai, Fan, Huang, &
Chang, 2009; Lin & Yeh, 2009).
Many algorithm methods are used to predict stock prices.
Examples are the artificial neural networks (ANN) (Desai & Bharati, 2007; Kim & Shin, 2007; Pino, Parreno, Gomez, & Priore,
2008; Zhu, Wang, Xu, & Li, 2008), Fuzzy (Khashei, Hejazi, & Bijari,
2008; Lee & Kim, 2007), or other statistical or forecasting methods
(Chen, Gou, Guo, & Gao, 2008; Hu & He, 2007; Ince & Trafalis,
2008). All these methods attempt to predict stock prices under
different market and economic conditions, and of them, ANN
has produced rather good outcomes and has been the favourite
method for many.
Ou and Wang (2009), and Lai et al. (2009) believe that the decision tree (DT) method is good for forecasting stock prices. Levin
and Zahavi (2001) found that problem-correlation using DT is
much clearer than traditional methods. In fact, DT is a very good
forecasting method. The Bayes Theorem may be used as a basis
for scientific forecasts. However, DT studies have focused on
commercial activities (Aitkenhead, 2008; Reyck, Degraeve, &
Vandenborre, 2008). In recent years, there has been a lack of research in the prediction of stock prices using both DT and ANN,
and comparing the results of one method against the other. Most
ANN studies have focused on its evolution and improvement
(Ihme, Marsden, & Pitsch, 2008; Lin & Yeh, 2009; Paliwal & Kumar,
2009), and integrated fuzzy models on the forecasting of stock
prices (Khashei et al., 2008; Lai et al., 2009). This study aims to fill
the research gap by adopting a broader approach through in-depth
empirical studies.
The study adopts a hybrid model, using ANN and DT as the
foundation, to forecast stock prices. We try to find out if this model
produces better forecasts of stock prices, compared to the earlier
two methods. Hence, the results of forecasted stock prices using
the three abovementioned methods (ANN, DT and hybrid model)
are compared against each other to find out the differences. In
doing so, we can see if the results of our forecasts match our expectations, and discover the most stable model.
Stock ID   Name of corporation
(3064)     Astro
(3083)     Chinesegamer
(3086)     Wayi
(3293)     International Games System (IGS)
(3546)     UserJoy
(4415)     Mega Biotech & Electronics
(5478)     Soft-World
(6111)     SoftStar
(6169)     Interserv
(6180)     Gamania
closing OTC index and the daily closing prices of each stock for the
period between 1 July 2008 and 31 June 2009. A total of 5229 records were extracted. The unit cluster for all data (for each stock
and OTC index) is based on the trading date. Finally, 249 cases
were selected as our data set.
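To illustrate how per-series records can be grouped into one case per trading date, the following is a minimal Python sketch. The function name `build_daily_cases`, the record layout `(date, series_id, close)` and the sample values are assumptions made for illustration only; they are not the study's actual procedure or data. The sketch also carries a series' most recent close forward when no trade is recorded on a date, which is the convention this study applies to days without a transaction.

```python
from collections import defaultdict


def build_daily_cases(records, series_ids):
    """Group (date, series_id, close) records into one case per trading date.

    If a series has no close on a date, its most recent earlier close is
    reused; if there is no earlier close, the value stays None.
    """
    by_date = defaultdict(dict)
    for date, series_id, close in records:
        by_date[date][series_id] = close

    cases = []
    last_close = {}
    for date in sorted(by_date):  # ISO-formatted dates sort chronologically
        row = {"date": date}
        for sid in series_ids:
            close = by_date[date].get(sid, last_close.get(sid))
            row[sid] = close
            if close is not None:
                last_close[sid] = close
        cases.append(row)
    return cases


# Hypothetical sample records (not taken from the study's data set).
records = [
    ("2008-07-01", "3064", 10.0),
    ("2008-07-01", "OTC", 120.0),
    ("2008-07-02", "OTC", 118.0),   # no trade in 3064 on this date
    ("2008-07-03", "3064", 10.5),
    ("2008-07-03", "OTC", 119.0),
]
cases = build_daily_cases(records, ["3064", "OTC"])
```

Each element of `cases` is one trading date with a closing value for every series, which is the shape a per-date data set needs before lagged inputs are derived.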
4.2. Symbol description
To better describe the results of our data analysis, mathematical
symbols and alphabetical representations are used for simplification purposes. The symbols represent the following descriptions:
(Stockid): Stock code, a numerical value. For example, 3064 represents the company Astro.
S(stockid): Current day closing price of a stock. The opening
price of a stock cannot be used as the benchmark for inputting
or analysis, as it does not represent its previous day's closing
price due to uncertainties during the period when the market
is closed for trading. At the same time, comparing the day's
closing price with the previous day's closing price provides an
indication of the magnitude of any rise or fall. Hence, the daily
closing price is used as the parameter for our analysis. In addition, when collecting information on stock prices, the day's
closing price is treated as the same closing price of the previous
day if no real transaction is concluded on that day. As such, the
previous day's closing price is indicated as S_{day-1}(stockid).
O: OTC index. Refers to the current day closing OTC index; OTC
index of the previous day is represented by O_{day-1}.
P(stockid): Current day's predicted stock closing price.
PER: Prediction error ratio, the absolute value of the difference between 1 and the ratio of the
current day's predicted stock closing price to the current day's closing price. A smaller ratio indicates a
more accurate and better prediction. Its mathematical formula
is shown as follows:
$$\mathrm{PER} = \left| \frac{P(\mathrm{stockid})}{S(\mathrm{stockid})} - 1 \right|$$
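The PER definition can be sketched in Python as follows. The function name `prediction_error_ratio` and the sample prices are hypothetical and serve only to illustrate the definition.

```python
def prediction_error_ratio(predicted_close, actual_close):
    """Return PER = |P / S - 1| for one stock on one trading day.

    predicted_close is P(stockid); actual_close is S(stockid).
    """
    if actual_close == 0:
        raise ValueError("actual closing price must be non-zero")
    return abs(predicted_close / actual_close - 1.0)


# Hypothetical example: a predicted close of 51.0 against an actual close
# of 50.0 gives a PER of roughly 0.02, i.e. a deviation of about 2%.
per = prediction_error_ratio(51.0, 50.0)
```

Because the ratio is taken in absolute value, over-prediction and under-prediction of the same relative size give the same PER.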
whose regression models are identical to the original ANN prediction model. The regression models for (3546) and
(6180) are the same as the ANN model because there are fewer
independent variables in the C&RT models. The hybrid model does
not contain any regression model that is the same as those in the
C&RT model. Although the C&RT model assumes seven independent variables for observation, the ANN model found more independent variables for prediction than the C&RT
model.
Table 2
ANN forecasting regression models for different stocks.
Stock ID   Regression model   R²
(3064)     96.58              .962*
(3083)     98.45              .993*
(3086)     96.35              .948*
(3293)     97.63              .981*
(3546)     98.34              .992*
(4415)     98.75              .996*
(5478)     97.65              .983*
(6111)     97.72              .988*
(6169)     98.75              .993*
(6180)     98.45              .991*
* p < 0.05.
Table 3
C&RT forecasting regression models for different stocks.
Stock ID   Regression model   R²
(3064)                        .962*
(3083)                        .993*
(3086)                        .948*
(3293)                        .981*
(3546)                        .992*
(4415)                        .995*
(5478)                        .984*
(6111)                        .988*
(6169)                        .993*
(6180)                        .991*
* p < 0.05.
Table 4
Hybrid model forecasting regression models for different stocks.
Stock ID   Regression model   R²
(3064)                        .962*
(3083)                        .994*
(3086)                        .948*
(3293)                        .981*
(3546)                        .992*
(4415)                        .996*
(5478)                        .984*
(6111)                        .989*
(6169)                        .993*
(6180)                        .991*
* p < 0.05.
[Figure: Total score by rating (A, B, C) for ANN, C&RT and the hybrid model.]
[Figure: Total prediction accuracy for ANN, C&RT and the hybrid model.]
the predicted stock prices differ from the actual stock prices by
only 2%. The total number of valid predictions divided by the total
number of predictions is the actual prediction performance (refer
to Fig. 2).
We can therefore deduce that using ANN to predict the closing
prices of stocks is more in line with actual stock prices. A total of
49 data sets are accurately predicted. Therefore, for ANN, there is
a 15.31% probability that a valid prediction will occur across all
samples; C&RT is 14.06%; the hybrid model is 13.75%. This is different from prediction stability. Although the total prediction accuracy
for C&RT and the hybrid model differs only by 1, compared to the
hybrid model, C&RT produces a more even rating distribution
among the 10 sample stocks. Hence, we believe that prediction
accuracy is better for C&RT than the hybrid model.
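The prediction performance described above (valid predictions divided by total predictions) can be sketched in Python as follows. The function name `valid_prediction_rate`, the inclusive 2% tolerance and the sample prices are assumptions for illustration. The count of 49 valid predictions for ANN and the total of 320 data sets come from the text; the counts of 45 and 44 for C&RT and the hybrid model are inferred from the reported percentages.

```python
def valid_prediction_rate(predicted, actual, tolerance=0.02):
    """Return (valid_count, valid_share) for paired predicted/actual closes.

    A prediction counts as valid when |P / S - 1| <= tolerance. Whether the
    study treats the 2% boundary as inclusive is an assumption here.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    valid = sum(
        1 for p, s in zip(predicted, actual) if abs(p / s - 1.0) <= tolerance
    )
    return valid, valid / len(predicted)


# Hypothetical prices: deviations of 1%, 6% and 0.8% from an actual close of 50.
count, share = valid_prediction_rate([50.5, 53.0, 49.6], [50.0, 50.0, 50.0])

# Arithmetic check of the reported percentages over 320 data sets.
TOTAL_PREDICTIONS = 320
valid_counts = {"ANN": 49, "C&RT": 45, "Hybrid model": 44}
accuracy_pct = {
    name: round(100 * n / TOTAL_PREDICTIONS, 2)
    for name, n in valid_counts.items()
}
```

With these counts, `accuracy_pct` reproduces the 15.31%, 14.06% and 13.75% figures quoted for ANN, C&RT and the hybrid model.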
6. Conclusion
Among the three models, ANN demonstrates the greatest stability and accuracy at predicting stock prices during the post-crisis
period. Although C&RT is less stable than the hybrid model, it
is more accurate. The hybrid model proposed by the study clearly
produces different prediction results. This implies that integrated
use of C&RT and ANN requires improved efficiency and further
development, and that more in-depth study is necessary.
By simply using the closing prices of stocks, the study has managed to discover stocks with good investment value. However, the
ever-changing market also implies a broad spectrum of impact factors. In terms of stock analysis, data mining is an effective tool for
discovering the correlation between stocks of a different cluster, so
References
Aitkenhead, M. J. (2008). A co-evolving decision tree classification method. Expert
Systems with Applications, 34(1), 18–25.
Albuquerque, R. A., Francisco, E. De., & Marques, L. B. (2008). Marketwide private
information in stocks: Forecasting currency returns. Journal of Finance, 63(5),
2297–2343.
Antonio, G. B., Claudio, O. U., Manuel, M. S., & Nelson, O. M. (1996). Stock market
indices in Santiago de Chile: Forecasting using neural networks. In Proceedings
of the 1996 IEEE international conference on neural networks (pp. 2172–2175).
Washington, DC: IEEE Press.
Berry, M. J. A., & Linoff, G. (1997). Data mining techniques: For marketing, sales, and
customer support. New York: Wiley.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and
regression trees. Belmont, CA: Wadsworth.
Brida, J. G., & Risso, W. A. (2010). Hierarchical structure of the German stock market.
Expert Systems with Applications, 37(5), 3846–3852.
Chen, F., Gou, C., Guo, X., & Gao, J. (2008). Prediction of stock markets by the
evolutionary mix-game model. Physica A, 387(14), 3594–3604.
Desai, V. S., & Bharati, R. (2007). The efficacy of neural networks in predicting
returns on stock and bond indices. Decision Sciences, 29(2), 405–423.
Dutta, G., Jha, P., Laha, A. K., & Mohan, N. (2006). Artificial neural network models
for forecasting stock price index in the Bombay stock exchange. Journal of
Emerging Market Finance, 5(3), 283–295.
Fidrmuc, J., & Korhonen, I. (2009). The impact of the global financial crisis on
business cycles in Asian emerging economies. Journal of Asian Economics.
doi:10.1016/j.asieco.2009.07.007.
Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (second ed.). San
Francisco, CA: Morgan Kaufmann.
Hong, H., Torous, W., & Valkanov, R. (2007). Do industries lead stock markets?
Journal of Financial Economics, 83(2), 367–396.
Hu, C., & He, L. T. (2007). An application of interval methods to stock market
forecasting. Reliable Computing, 13(5), 423–434.
Ihme, M., Marsden, A. L., & Pitsch, H. (2008). Generation of optimal artificial neural
networks using a pattern search algorithm: Application to approximation of
chemical systems. Neural Computation, 20(2), 573–601.
Ince, H., & Trafalis, T. B. (2008). Short term forecasting with support vector
machines and application to stock price prediction. International Journal of
General Systems, 37(6), 677–687.
Kim, H. J., & Shin, K. S. (2007). A hybrid approach based on neural networks and
genetic algorithms for detecting temporal patterns in stock markets. Applied
Soft Computing, 7(2), 569–576.
Khashei, M., Hejazi, S. R., & Bijari, M. (2008). A new hybrid artificial neural networks
and fuzzy regression model for time series forecasting. Fuzzy Sets and Systems,
159(7), 769–786.
Lai, R. K., Fan, C. Y., Huang, W. H., & Chang, P. C. (2009). Evolving and clustering
fuzzy decision tree for financial time series data forecasting. Expert Systems with
Applications, 36(2P2), 3761–3773.
Lee, K. C., & Kim, W. C. (2007). Integration of human knowledge and machine
knowledge by using fuzzy post adjustment: Its performance in stock market
timing prediction. Expert Systems, 12(4), 331–338.
Levin, N., & Zahavi, J. (2001). Predictive modeling using segmentation. Journal of
Interactive Marketing, 15(2), 2–22.
Lin, C. T., & Yeh, H. Y. (2009). Empirical of the Taiwan stock index option price
forecasting model – Applied artificial neural network. Applied Economics, 41(15),
1965–1972.
Ou, P., & Wang, H. (2009). Prediction of stock market index movement by ten data
mining techniques. Modern Applied Science, 3(12), 28–42.
Paliwal, M., & Kumar, U. A. (2009). Neural networks and statistical techniques: A
review of applications. Expert Systems with Applications, 36(1), 2–17.
Pino, R., Parreno, J., Gomez, A., & Priore, P. (2008). Forecasting next-day price of
electricity in the Spanish energy market using artificial neural networks.
Engineering Applications of Artificial Intelligence, 21(1), 53–62.
Reyck, B. De., Degraeve, Z., & Vandenborre, R. (2008). Project options valuation with
net present value and decision tree analysis. European Journal of Operational
Research, 184(1), 341–355.
Shachmurove, Y., & Witkowska, D. (2000). Utilizing artificial neural network model to
predict stock markets (CARESS working paper #00-11). Philadelphia, PA:
University of Pennsylvania.
Steiner, M., & Wittkemper, H. G. (1997). Portfolio optimization with a neural
network implementation of the coherent market hypothesis. European Journal
of Operational Research, 100(1), 27–40.
Stock, J. H., & Watson, M. W. (2007). Why has US inflation become harder to
forecast? Journal of Money, Credit and Banking, 39(S1), 3–33.
Ture, M., Tokatli, F., & Kurt, I. (2009). Using Kaplan–Meier analysis together with
decision tree methods (C&RT, CHAID, QUEST, C4.5 and ID3) in determining
recurrence-free survival of breast cancer patients. Expert Systems with
Applications, 36(2P1), 2017–2026.
Yang, C. C., Prasher, S. O., Enright, P., Madramootoo, C., Burgess, M., Goel, P. K., et al.
(2003). Application of decision tree technology for image classification using
remote sensing data. Agricultural Systems, 76(3), 1101–1117.
Zhu, X., Wang, H., Xu, L., & Li, H. (2008). Predicting stock index increments by neural
networks: The role of trading volume under different horizons. Expert Systems
with Applications, 34(4), 3043–3054.