Keywords
Implication index; implication field; collaborative filtering; implication threshold; equipotential plane.

1. INTRODUCTION
Because of the rapid growth of data in today's era of information explosion, recommender systems [1][2] have become an essential tool, widely used in electronic commerce and in services such as Amazon, Pandora, and Netflix. The objective of a recommender system is to filter useful information from a large volume of data so that it can predict the rating a user will give to an item and thereby recommend items (products, services, etc.) suitable for that user. Algorithms for recommender systems have attracted the attention of researchers because of their practical applications. Among them, collaborative filtering algorithms [17] are the most widely used and successful technique. Most of these algorithms rely on symmetric measures to filter information and make recommendations for users. Recently, several solutions have been proposed that use asymmetric [...]

The paper is organized in five parts. The first part introduces the context, the issues to be solved by the present system, and the proposed approach. The second part presents the related content: statistical implication analysis and its extension to the implication field. The third part presents the model of the recommender system based on the variation of the implication index in the implication field. The fourth part presents the experiments on the model, with scripts, and the last part gives the conclusions.

2. IMPLICATION STATISTICAL FIELD
2.1 Implication statistical analysis theory
2.1.1 Implication statistical analysis
Statistical implication analysis (SIA) theory [9] [11] [13] [14], proposed by Regis Gras, studies the implication relationships between data variables. The measures used in statistical implication analysis, the implication index (also known as the Gras implication index) and the implication intensity, are used to detect rules or R-rules (rules of rules) with a strong implicative relationship between the two sides of the rule, or
to measure the correlation between two variables (individuals, attributes, ...); these measures are asymmetric. In addition, statistical implication analysis focuses on the analysis of counter-examples. It can be presented as follows:

Let $E$ be a finite set of binary variables, with $n = card(E)$, and let $A$ and $B$ be two subsets of $E$ that contain, respectively, the elements $a \in A$ such that $A(a) = true$ and the elements $b \in B$ such that $B(b) = true$. The sets $\bar{A}$ and $\bar{B}$ are the complements of $A$ and $B$ respectively. Let $n_a = card(A)$ and $n_b = card(B)$ be the cardinalities of $A$ and $B$, let $n_{\bar{a}} = card(\bar{A})$ and $n_{\bar{b}} = card(\bar{B})$ be the cardinalities of $\bar{A}$ and $\bar{B}$, and let $n_{a\bar{b}} = card(A \cap \bar{B})$ be the cardinality of $A \cap \bar{B}$, the set of elements that satisfy $a = true$ and $b = false$. The quantity $n_{a\bar{b}}$ is also called the number of counter-examples. We also select, randomly and independently, two subsets $X$ and $Y$ of $E$ with the same cardinalities as $A$ and $B$ respectively, that is, $card(X) = n_a$ and $card(Y) = n_b$. Let $\bar{X}$ and $\bar{Y}$ be the complements of $X$ and $Y$ in $E$, with cardinalities $n_{\bar{a}} = n - n_a$ and $n_{\bar{b}} = n - n_b$.

The implication relationship between $A$ and $B$ is modeled in statistical implication analysis as follows (see Figure 1).

[...]

From (3), the differential of the function $q$ appears as the scalar product of $grad\ q$ and the vector of increments of its variables on the surface representing the function $q(n, n_a, n_b, n_{a\bar{b}})$. $grad\ q$ expresses the variability of $q$ as a function of four variables, the cardinalities of the sets $E$, $A$, $B$, and $A \cap \bar{B}$, and it points in the direction of steepest increase of $q$ in four-dimensional space. In practice, the value of this differential lies in estimating the change (positive or negative) of $q$, denoted $\Delta q$, relative to the respective variations $\Delta n$, $\Delta n_a$, $\Delta n_b$, and $\Delta n_{a\bar{b}}$. So we have:

$$\Delta q = \frac{\partial q}{\partial n}\Delta n + \frac{\partial q}{\partial n_a}\Delta n_a + \frac{\partial q}{\partial n_b}\Delta n_b + \frac{\partial q}{\partial n_{a\bar{b}}}\Delta n_{a\bar{b}} + o(\Delta q) \qquad (4)$$

where $o(\Delta q)$ is an infinitely small quantity.

Now we examine further the relationship between the implication index and the implication intensity. Differentiating equation (1) with respect to $q$, we have:

$$\frac{d\varphi}{dq} = -\frac{1}{\sqrt{2\pi}}\, e^{-q^{2}/2} < 0 \qquad (5)$$

This confirms that the implication intensity increases as $q$ decreases. The rate of increase is given by formula (5), which allows a more rigorous study of the variability of $\varphi$.
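Formulas (4) and (5) can be checked numerically. The sketch below is only an illustration under stated assumptions: equations (1) to (3) are not reproduced in this part of the text, so the code assumes the usual Gras forms, $q = (n_{a\bar{b}} - n_a n_{\bar{b}}/n) / \sqrt{n_a n_{\bar{b}}/n}$ for the implication index and $\varphi(q) = 1 - \Phi(q)$ (the upper tail of the standard normal law) for the implication intensity. The function names and the counts used are hypothetical.

```python
import math


def implication_index(n, n_a, n_b, n_ab_bar):
    """Implication index q(n, n_a, n_b, n_ab_bar), ASSUMED standard Gras form."""
    expected = n_a * (n - n_b) / n  # expected counter-examples under independence
    return (n_ab_bar - expected) / math.sqrt(expected)


def implication_intensity(q):
    """ASSUMED form phi(q) = 1 - Phi(q), upper tail of the standard normal law."""
    return 0.5 * math.erfc(q / math.sqrt(2))


def dphi_dq(q):
    """Right-hand side of formula (5)."""
    return -math.exp(-q * q / 2) / math.sqrt(2 * math.pi)


def first_order_delta_q(point, deltas, h=1e-6):
    """First-order part of formula (4): sum of partial derivatives times variations.

    The partial derivatives are estimated by central finite differences.
    """
    total = 0.0
    for i, delta in enumerate(deltas):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        partial = (implication_index(*up) - implication_index(*down)) / (2 * h)
        total += partial * delta
    return total


# Hypothetical counts (n, n_a, n_b, n_ab_bar) and variations of each variable.
point = (100.0, 40.0, 60.0, 5.0)
deltas = (0.0, 1.0, 0.0, 1.0)  # one more element in A, one more counter-example

q0 = implication_index(*point)
q1 = implication_index(*(p + d for p, d in zip(point, deltas)))
exact_delta = q1 - q0                              # true change of q
approx_delta = first_order_delta_q(point, deltas)  # estimate from formula (4)

print("q before:", q0, "q after:", q1)
print("exact change:", exact_delta, "first-order estimate:", approx_delta)
print("intensity before:", implication_intensity(q0),
      "after:", implication_intensity(q1))
```

With these counts $q$ rises when counter-examples are added and the intensity falls, in line with the negative sign in formula (5). The gap between the exact change and the first-order estimate corresponds to the remainder term in formula (4).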
- Selecting relevant data: To avoid data that can lead to bias, and also to speed up computation, we ignore films that have been rated only a few times, because the ratings of these films may be biased by the lack of data, and users who have rated only a few films, because their ratings may also be biased.

On the preprocessed dataset, to avoid overfitting and to obtain better accuracy, we conducted the experiments with k-fold cross-validation rather than with the splitting or bootstrapping methods.

226   {Star Wars (1977), Empire Strikes Back, The (1980), Return of the Jedi (1983)} => {Raiders of the Lost Ark (1981)}   -9.000696
86    {Empire Strikes Back, The (1980), Return of the Jedi (1983)} => {Raiders of the Lost Ark (1981)}                     -8.970471

TABLE 4. ERROR INDEXES OF THE ISF MODEL COMPARED WITH THE IBCF AND UBCF MODELS

Model        RMSE        MSE         MAE
ISF          0.9434059   0.8900147   0.7419290
IBCFcosine   1.2372211   1.5307160   0.9264473
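The preprocessing, the k-fold evaluation scheme, and the error indexes reported in Table 4 can be sketched as follows. This is a minimal illustration, not the scripts used in the experiments: the function names, the thresholds `min_per_item` and `min_per_user`, and the sample ratings are hypothetical, since the text does not state the actual cut-off values or the number of folds.

```python
import math
import random


def filter_ratings(ratings, min_per_item, min_per_user):
    """Keep (user, item, rating) triples whose item and user both have enough ratings.

    Counts are taken once on the input; the thresholds are hypothetical.
    """
    item_counts, user_counts = {}, {}
    for user, item, _ in ratings:
        item_counts[item] = item_counts.get(item, 0) + 1
        user_counts[user] = user_counts.get(user, 0) + 1
    return [(u, i, r) for u, i, r in ratings
            if item_counts[i] >= min_per_item and user_counts[u] >= min_per_user]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


def error_indexes(actual, predicted):
    """Return RMSE, MSE and MAE between actual and predicted ratings."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"RMSE": math.sqrt(mse), "MSE": mse, "MAE": mae}


# Hypothetical ratings: film "m3" and user "u3" each appear only once.
sample = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4),
          ("u2", "m2", 2), ("u3", "m3", 1)]
kept = filter_ratings(sample, min_per_item=2, min_per_user=2)

for train, test in k_fold_indices(len(kept), k=2):
    print("train:", train, "test:", test)

print(error_indexes([4, 3, 5], [3, 3, 3]))
```

By definition MSE is the square of RMSE, which gives a quick consistency check on the values in Table 4 (for example, 0.9434059 squared is about 0.8900147).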