National Institute of Technology Data Analytics Exam Questions

National Institute of Technology, Tiruchirappalli
Department of Management Studies

Basic Data Analytics
Marathon Exam
1. How would you calculate the adjusted R square value?
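As a minimal sketch of the usual textbook formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The function name and the input values below are hypothetical and chosen only for illustration.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need n_obs > n_predictors + 1")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs: R^2 = 0.80 from 11 observations and 2 predictors.
example = adjusted_r_squared(r_squared=0.80, n_obs=11, n_predictors=2)
print(round(example, 4))
```

The penalty factor (n - 1)/(n - k - 1) grows with k, so adding a predictor raises adjusted R² only if it improves the fit by more than chance would suggest.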

2. Mention all the assumptions of ordinary least squares.
3. What do you mean by the condition index?
4. How do you use factor analysis to eliminate multicollinearity?
5. What do you understand by the term part-worth functions?
6. Conduct the t-test for the given multiple regression model with 8 degrees of freedom at the α = 0.05 level of
significance.
Attitude = 0.33732 + 0.48108 (Duration) + 0.28865 (Importance)
7. Draw the ANOVA table for the given data.
Sales Ads Expenditure
50 250
70 280
65 320
75 350
85 360
95 390
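One way to build the ANOVA table for this data is sketched below, assuming a simple linear regression of Sales on Ads Expenditure with an intercept. The function name and dictionary keys are illustrative choices, not part of the question.

```python
ads = [250, 280, 320, 350, 360, 390]
sales = [50, 70, 65, 75, 85, 95]


def simple_regression_anova(x, y):
    """Sums of squares, degrees of freedom and F for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)   # total sum of squares
    slope = sxy / sxx
    ssr = slope * sxy                           # regression sum of squares
    sse = sst - ssr                             # error sum of squares
    df_reg, df_err = 1, n - 2
    return {
        "ssr": ssr, "sse": sse, "sst": sst,
        "df_reg": df_reg, "df_err": df_err, "df_total": n - 1,
        "msr": ssr / df_reg, "mse": sse / df_err,
        "f": (ssr / df_reg) / (sse / df_err),
    }


table = simple_regression_anova(ads, sales)
for source, ss, df in (("Regression", table["ssr"], table["df_reg"]),
                       ("Error", table["sse"], table["df_err"]),
                       ("Total", table["sst"], table["df_total"])):
    print(f"{source:<12}{ss:>12.3f}{df:>4}")
print("F =", round(table["f"], 3))
```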
8. What is multicollinearity? How do you use ridge regression to address the multicollinearity problem?
9. You come across a magazine article reporting the following relationship between annual expenditure on prepared
dinners (PD) and annual income (INC):
PD = 23.4 + 0.003 INC
The coefficient of the INC variable is reported as significant.
1) Does the relationship seem plausible? Is it possible to have a coefficient that is small in magnitude and yet
significant?
2) From the information given, can you tell how good the estimated model is?
3) What is the expected expenditure on prepared dinners of a family earning $30,000?
4) If a family earning $40,000 spent $130 annually on prepared dinners, what is the residual?
5) What is the meaning of a negative residual?
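The arithmetic behind parts 3 and 4 can be sketched as follows, taking the residual as actual minus predicted. The function name is illustrative.

```python
def predicted_pd(income: float) -> float:
    """Fitted annual expenditure on prepared dinners: PD = 23.4 + 0.003 * INC."""
    return 23.4 + 0.003 * income


expected_30k = predicted_pd(30_000)          # fitted value at INC = 30,000
residual_40k = 130 - predicted_pd(40_000)    # actual minus fitted at INC = 40,000
print(round(expected_30k, 2), round(residual_40k, 2))
```

A negative residual means the family spent less than the model predicts for its income.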
10. 1. How do you arrive at the relative weightage for the attributes and the part-worth values for the levels?
2. Explain the applications of conjoint analysis.
11. In the conjoint analysis output below, the relative weightage column is not given. Arrive at the relative weightage
for each attribute, and also arrive at the most favorable combination of attribute levels.
Attribute Level Utility Relative weightage
A1 1 0.223 ?
2 -0.556 ?
3 -0.222 ?
A2 1 0.455 ?
2 0.332 ?
3 -0.556 ?
A3 1 0.233 ?
2 0.822 ?
3 0.222 ?
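A common convention, assumed here rather than stated in the question, takes each attribute's relative weightage as its utility range (highest minus lowest part-worth) divided by the sum of ranges over all attributes. The most favorable combination then picks the highest-utility level of each attribute. A minimal sketch:

```python
utilities = {
    "A1": [0.223, -0.556, -0.222],
    "A2": [0.455, 0.332, -0.556],
    "A3": [0.233, 0.822, 0.222],
}

# Range of part-worths within each attribute.
ranges = {attr: max(u) - min(u) for attr, u in utilities.items()}
total_range = sum(ranges.values())

# Relative weightage = attribute range / sum of all ranges.
weights = {attr: r / total_range for attr, r in ranges.items()}

# Most favorable level (1-based) = level with the highest utility.
best_levels = {attr: u.index(max(u)) + 1 for attr, u in utilities.items()}

for attr in utilities:
    print(attr, round(100 * weights[attr], 1), "% best level:", best_levels[attr])
```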
12. 1. What do you understand by the term outlier?

2. Do the residual analysis with the following data
Observations Actual Predicted
1 500 450
2 700 650
3 720 660
4 600 620
5 500 520
6 650 720
7 500 520
8 700 620
13. 1. How do you construct a confusion matrix?
2. When do you use the Mahalanobis D² statistic?
14. In the given table, find the missing values of the factor analysis output.
Variables           F1       F2        Communality  Specific Variance
X1                  0.6254   -0.76663  ?            0.02
X2                  0.7136   ?         0.82         0.18
X3                  0.7144   -0.6792   ?            0.03
X4                  ?        0.1586    0.80         0.20
X5                  0.7421   0.5781    0.88         0.12
Total Variance %    ?        ?         89           11
Common Variance %   ?        ?
Eigen Value         2.7344   1.7161
15. 1. What is meant by oblique rotation? List the algorithms related to oblique rotation.
2. Explain the Thurstone principle.
16. By using the data below, define a binary response variable Z that assumes the value 0 if a firm is bankrupt and 1 if
a firm is not bankrupt.
CF – Cash Flow, TD – Total Debt, NI – Net Income, TA – Total Assets, CA – Current Assets, CL – Current Liabilities, NS – Net Sales
1) Develop the discriminant function for the firm
2) Develop confusion matrix
3) Find the Apparent error rate.
Row X1 = CF/TD X2 = NI/TA X3 = CA/CL X4 = CA/NS Population
1 -.45 -.41 1.09 .45 0
2 -.56 -.31 1.50 .16 0
3 .06 .02 1.01 .40 0
4 -.07 -.09 1.45 .26 0
5 -.10 -.09 1.56 .67 0
6 -.14 -.07 .71 .28 0
7 .04 .01 1.50 .71 0
8 -.06 -.06 1.37 .40 0
9 .07 -.01 1.37 .34 0
10 -.13 -.14 1.42 .44 0
11 -.23 -.30 .33 .18 0
12 .07 .02 1.31 .25 0
13 .01 .00 2.15 .70 0
14 -.28 -.23 1.19 .66 0
15 .15 .05 1.88 .27 0
16 .37 .11 1.99 .38 0
17 -.08 -.18 1.51 .42 0
18 .05 .03 1.68 .95 0
19 .01 -.00 1.26 .60 0
20 .12 .11 1.14 .17 0
21 -.28 -.27 1.27 .51 0
1 .51 .10 2.49 .54 1
2 .08 .02 2.01 .53 1
3 .38 .11 3.27 .35 1
4 .19 .05 2.25 .33 1
5 .32 .07 4.24 .63 1
6 .31 .05 4.45 .69 1
7 .12 .05 2.52 .69 1
8 -.02 .02 2.05 .35 1
9 .22 .08 2.35 .40 1
10 .17 .07 1.80 .52 1
11 .15 .05 2.17 .55 1
12 -.10 -.01 2.50 .58 1
13 .14 -.03 .46 .26 1
14 .14 .07 2.61 .52 1
15 .15 .06 2.23 .56 1
16 .16 .05 2.31 .20 1
17 .29 .06 1.84 .38 1
18 .54 .11 2.33 .48 1
19 -.33 -.09 3.01 .47 1
20 .48 .09 1.24 .18 1
21 .56 .11 4.29 .45 1
22 .20 .08 1.99 .30 1
23 .47 .14 2.92 .45 1
24 .17 .04 2.45 .14 1
25 .58 .04 5.06 .13 1
17. How do you find the partial F statistic?
18. From the given variance-covariance matrix, deduce the correlation matrix.
     X1    X2    X3
X1   3.2   2.5   4.2
X2         4.8   2.8
X3               4.6
19. For the given data use a normalization process to find the normalized values
5
4
8
5
7
6
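The question does not say which normalization is intended, so the sketch below shows two common choices as assumptions: min-max scaling to [0, 1] and z-scores using the sample standard deviation.

```python
import statistics

values = [5, 4, 8, 5, 7, 6]

# Min-max scaling: (x - min) / (max - min), mapping the data onto [0, 1].
low, high = min(values), max(values)
min_max = [(v - low) / (high - low) for v in values]

# Z-scores: (x - mean) / sample standard deviation (n - 1 in the denominator).
mean = statistics.mean(values)
std_dev = statistics.stdev(values)
z_scores = [(v - mean) / std_dev for v in values]

print(min_max)
print([round(z, 3) for z in z_scores])
```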
20. Determine the R2 value for the given data
Y Value Ŷ Value
5 7
6 8
4 7
7 4
21. Explain the relationship between communality and error variance.

22. There are five variables; the Y and Ŷ values are given below. Find the standard error and R² for the 10
observations.
Y Value Ŷ Value
50 48
60 57
63 59
68 59
72 62
73 70
75 73
78 82
85 78
92 85
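A sketch of the calculation, under two assumptions that the question leaves open: R² is taken as 1 - SSE/SST, and "five variables" is read as Y plus four predictors, so the standard error uses n - k - 1 = 5 degrees of freedom. Change `n_predictors` if the intended count differs.

```python
import math

y = [50, 60, 63, 68, 72, 73, 75, 78, 85, 92]
y_hat = [48, 57, 59, 59, 62, 70, 73, 82, 78, 85]


def fit_summary(actual, predicted, n_predictors):
    """Return (SSE, SST, R^2, standard error of estimate)."""
    n = len(actual)
    mean_y = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_y) ** 2 for a in actual)
    r_squared = 1 - sse / sst
    std_error = math.sqrt(sse / (n - n_predictors - 1))
    return sse, sst, r_squared, std_error


# Assumption: four predictors (five variables including Y).
sse, sst, r_squared, std_error = fit_summary(y, y_hat, n_predictors=4)
print(sse, round(sst, 1), round(r_squared, 4), round(std_error, 4))
```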
23. From the given information, find the relative weightages for the attributes
PRODUCT
Attribute 1 Attribute 2 Attribute 3 Attribute 4
α11 α12 α13 α21 α22 α23 α31 α32 α33 α41 α42
Utility values
α11 -0.1
α12 -0.1
α13 0.2
α21 0.3
α22 0.1
α23 -0.4
α31 0.4
α32 -0.3
α33 -0.1
α41 0.01
α42 -0.01
24. Explain dummy-variable regression.
25. Explain Mahalanobis's D² test, and when and how you apply it.
26. Explain the steps involved in stepwise discriminant analysis for four variables.
27. Apply Fisher's linear discriminant function to the given data set and evaluate, by resubstitution, the
probabilities of misclassification.
WAIS subsets:
X1=information
X2=similarities
X3=arithmetic
X4=picture completion
Group II
SUBJECT INFORMATION SIMILARITIES ARITHMETIC PICTURE COMPLETION
1 9 5 10 8
2 10 0 6 2
3 8 9 11 1
4 13 7 14 9
5 4 0 4 0
6 4 0 6 0
7 11 9 9 8
8 5 3 3 6
9 9 7 8 6
10 7 2 6 4
11 12 10 14 3
12 13 12 11 10
MEAN 8.75 5.33 8.5 4.75
Group I
SUBJECT INFORMATION SIMILARITIES ARITHMETIC PICTURE COMPLETION
1 7 5 9 8
2 8 8 5 6
3 16 18 11 9
4 8 3 7 9
5 6 3 13 9
6 11 8 10 10
7 12 7 9 8
8 8 11 9 3
9 14 12 11 4
10 13 13 13 6
11 13 9 9 9
12 13 10 15 7
13 14 11 12 8
14 15 11 11 10
15 13 10 15 9
16 10 5 8 6
17 10 3 7 7
18 17 13 13 7
19 10 6 10 7
20 10 10 15 8
21 14 7 11 5
22 16 11 12 11
23 10 7 14 6
24 10 10 9 6
25 10 7 10 10
26 7 6 5 9
27 15 12 10 6
28 17 15 15 8
29 16 13 16 9
30 13 10 17 8
31 13 10 17 10
32 19 12 16 10
33 19 15 17 11
34 13 10 7 8
35 15 11 12 8
36 16 9 11 11
37 14 13 14 9
MEAN 12.57 9.57 11.49 7.97
28. The annual financial data listed in the table have been analyzed by Johnson with a view toward detecting
influential observations in a discriminant analysis. Consider the variables X1 = CF/TD and X2 = CA/CL.
Row X1=CF/TD X2=CA/CL
1 -0.45 1.09
2 -0.56 1.51
3 0.06 1.01
4 -0.07 1.45
5 -0.1 1.56
6 -0.14 0.71
7 0.04 1.5
8 -0.06 1.37
9 0.07 1.37
10 -0.13 1.42
11 -0.23 0.33
12 0.07 1.31
13 0.01 2.15
14 -0.28 1.19
15 0.15 1.88
16 0.37 1.99
17 -0.08 1.51
18 0.05 1.68
19 0.01 1.26
20 0.12 1.14
21 -0.28 1.27
1 0.51 2.49
2 0.08 2.01
3 0.38 3.27
4 0.19 2.25
5 0.32 4.24
6 0.31 4.45
7 0.12 2.52
8 -0.02 2.05
9 0.22 2.35
10 0.17 1.8
11 0.15 2.17
12 -0.1 2.5
13 0.14 0.46
14 0.14 2.61
15 0.15 2.23
16 0.16 2.31
17 0.29 1.84
18 0.54 2.33
19 -0.33 3.01
20 0.48 1.24
21 0.56 4.29
22 0.2 1.99
23 0.47 2.92
24 0.17 2.45
25 0.58 5.06
29. How do you find the error rate from the confusion matrix for three groups? Assume your own data and
show the results.
30. Explain the simple structure principle.
31. Explain the process of factor rotation with the transformation matrix.
32. Explain orthogonal and oblique rotation, and name the algorithms available under each rotation.
33. Find the residual matrix from the exploratory factor analysis model. The related information is given
below.
Correlation coefficients for exploration and confirmation
1 2 3 4 5 6 7 8 9
1 0.411 0.479 0.401 0.37 0.393 0.078 0.389 0.411
0.245 1 0.463 0.223 0.198 0.244 -0.042 0.169 0.324
0.418 0.362 1 0.231 0.272 0.357 -0.126 0.153 0.307
0.282 0.217 0.425 1 0.659 0.688 0.215 0.221 0.256
0.257 0.125 0.304 0.784 1 0.649 0.293 0.279 0.324
0.239 0.131 0.33 0.743 0.73 1 0.226 0.298 0.294
0.122 0.149 0.265 0.185 0.221 0.118 1 0.602 0.446
0.253 0.183 0.329 0.021 0.139 -0.027 0.601 1 0.63
0.583 0.147 0.455 0.381 0.4 0.235 0.385 0.462 1
Unrestricted maximum likelihood solution for exploration sample

variate λ1 λ2 λ3 ψ
1 0.59 -0.14 0.37 0.49
2 0.37 -0.19 0.45 0.62
3 0.42 -0.32 0.53 0.44
4 0.71 -0.37 -0.27 0.29
5 0.71 -0.26 -0.23 0.37
6 0.74 -0.33 -0.17 0.33
7 0.5 0.58 -0.3 0.32
8 0.65 0.54 0.13 0.27
9 0.64 0.34 0.27 0.4
34. A regression model Y = 0.235 X1 + 0.468 X2 has been established for 10 observations. Conduct the t-test for X1
and X2 at the 5% level of significance.
35. There are two discriminant functions for the given problem. Draw the confusion matrix for the given
data.
Group 1
X1 X2 X3 X4
4 3 4 5
2 4 5 3
5 2 4 6
3 4 5 3
Group 2
5 6 7 5
7 6 7 5
7 6 8 5
5 4 7 6
5 4 3 2
Group 3
9 7 6 5
5 4 8 2
7 6 5 3
4 8 5 2
7 3 5 6
Discriminant function 1 = 0.8 X1 + 0.7 X2
Discriminant function 2 = 0.9 X1 + 0.95 X2
36. How do you arrive at the factor score for one observation?
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
3 2 3 4 4 4 2 0 4 0 0 0
The three extracted factor loadings and the ψ values are:
F1 F2 F3 ψ
-0.95 -0.1 0.35 .867
-0.99 -0.1 0.39 .579
-0.96 0 0.3 .619
-1.07 0.1 0.3 .672
-1.24 0 0.06 .572
-1.18 0.1 -0.3 .479
-0.83 -0.1 -0.4 .796
-0.97 0.1 -0.5 .883
-1.05 0.1 -0.3 .816
-0.11 -0.6 0.06 .338
-0.03 -0.4 -0.1 .153
-0.5 -0.5 -0.1 .894
37. Explain the following:

a. Nominal, Ordinal scale, Interval scale, Ratio scale.
b. Why do you normalize the data? How do you do the normalization?
38. Find the two eigenvectors of the following matrix.
A = 3 4
    2 5
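For a 2x2 matrix the eigenvalues are the roots of λ² - (trace)λ + (determinant) = 0, and each eigenvector solves (A - λI)v = 0. The sketch below assumes real eigenvalues and a nonzero upper-right entry; the function name is illustrative, and any nonzero multiple of a returned vector is an equally valid eigenvector.

```python
import math


def eigen_2x2(a, b, c, d):
    """Eigenpairs of [[a, b], [c, d]], assuming real eigenvalues and b != 0."""
    trace = a + d
    det = a * d - b * c
    discriminant = trace * trace - 4 * det
    if discriminant < 0 or b == 0:
        raise ValueError("sketch only covers real eigenvalues with b != 0")
    root = math.sqrt(discriminant)
    pairs = []
    for lam in ((trace + root) / 2, (trace - root) / 2):
        # First row of (A - lam*I) v = 0: (a - lam) * x + b * y = 0,
        # which is satisfied by v = (b, lam - a).
        pairs.append((lam, (b, lam - a)))
    return pairs


pairs = eigen_2x2(3, 4, 2, 5)
for lam, vec in pairs:
    print("eigenvalue", lam, "eigenvector", vec)
```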
39. From the given data find the Discriminant Loading
Z1 Z2
X1 0.41 0.32
X2 0.51 0.42
X3 0.61 0.35
Variance-covariance matrix:
     X1   X2   X3
X1   3    4    2
X2        4    3
X3             2
Correlation matrix:
1    0.25   0.4
     1      0.34
40. The data below are selected from a much larger body of data referring to candidates for the General Certificate
of Education who were being considered for a special award. Here, Y denotes the candidate’s total mark, out of
1000, in the G.C.E. examination. Of this mark the subjects selected by the candidate account for a maximum of
800; the remainder, with a maximum of 200, is the mark in the compulsory papers- “General” and “ Use of
English” – this mark is shown as X1. X2 denotes the candidate’s mark out of 100, in the compulsory School
Certificate English Language paper taken on a previous occasion.
Compute the multiple regression of Y on X1 and X2, and make the necessary tests to enable you to
comment intelligently on the extent to which current performance in the compulsory papers may be used to
predict aggregate performance in the G.C.E. examination, and on whether previous performance in School
Certificate English Language has any predictive value independently of what has already emerged from the
current performance in the compulsory papers.
Candidate Y X1 X2
1 476 111 68
2 457 92 46
3 540 90 50
4 551 107 59
5 575 98 50
6 698 150 66
7 545 118 54
8 574 110 51
9 645 117 59
10 556 94 97
11 634 130 57
12 637 118 51
13 390 91 44
14 562 118 61
15 560 109 66
41. How do you interpret a partial regression coefficient?
42. Explain the difference between an interval scale and a ratio scale.
43. Draw the format of an ANOVA table.
44. Show the format of a confusion matrix.
45. Define factor loading.
46. a. Find the sum of squared errors for the following data.
Sales = 0.48 + 0.85 (Advertisement Expenditure) - 0.35 (Price)
Sales Advertisement Price

50 25 15
60 40 10
70 50 8
90 70 6
65 75 8
90 80 10
47. a. Explain the steps involved in performing stepwise discriminant analysis.
b. Explain the steps involved in performing multiple regression analysis.
48. a. Explain the differences between logistic regression and discriminant analysis.
b. Describe the assumptions to be considered in performing multiple discriminant analysis.
49. How do you determine the discriminant loading?
50. How do you establish a confusion matrix for two groups in which the first group consists of 20
members and the second group consists of 40 members? The apparent error rate is 20%.
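The apparent error rate is the number of misclassified cases divided by the total number of cases. With 60 members, a 20% rate implies 12 misclassifications in total, but the question does not say how they split between the groups. The split used below (4 and 8) is a hypothetical choice made only to illustrate the layout.

```python
def apparent_error_rate(confusion):
    """Off-diagonal (misclassified) count divided by the total count."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Rows: actual group, columns: predicted group.
# Hypothetical split of the 12 misclassified members: 4 from group 1, 8 from group 2.
confusion = [
    [16, 4],    # group 1 (20 members)
    [8, 32],    # group 2 (40 members)
]

aer = apparent_error_rate(confusion)
print(f"apparent error rate = {aer:.0%}")
```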
51. a. Differentiate between orthogonal rotation and oblique rotation.
b. How do you assign a variable to a factor?
52. Explain the steps involved in performing the maximum likelihood method for factor extraction.
53. Find the correlation between the two variables without using MATLAB.
X Y
5 4
7 8
6 5
4 2
5 4
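The hand calculation follows the Pearson formula r = Sxy / sqrt(Sxx · Syy), where the S terms are sums of squared or cross deviations from the means. A short sketch that mirrors those steps (the function name is illustrative):

```python
import math

x = [5, 7, 6, 4, 5]
y = [4, 8, 5, 2, 4]


def pearson_r(xs, ys):
    """Pearson correlation from deviation sums: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
print(round(r, 4))
```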
54. Find the variance-covariance matrix for the following data.
X Y Z
5 4 3
2 4 6
7 5 3
4 5 8
4 2 6
3 5 1
55. From the data given below, do all the working using MATLAB.
a. Find the correlation matrix.
b. Use the principal factor method to arrive at the factor loadings for four factors.
c. Find the residual matrix after the extraction.
56. Sales = 0.425 (Advertisement Expenditure) - 0.325 (Price) + 15
Sales = 410 (Advertisement Expenditure) - 30 (Price) + 250
The two models were obtained from the same data. Explain the reason for the different values. How do you interpret
a partial regression coefficient? Explain the test used for testing the significance of a partial regression
coefficient.
57. Find the R² for the given data:
State No.  Relative Promotional Expense  Relative Sales  State No.  Relative Promotional Expense  Relative Sales
1 95 98 9 85 93
2 92 94 10 101 107
3 103 110 11 106 114
4 115 125 12 120 132
5 77 82 13 118 129
6 79 84 14 75 79
7 105 112 15 99 105
8 94 99
58. A process for making steel wire turns out wire with a mean tensile strength of 200 psi. The process standard
deviation is 20 psi. The quality control engineer wants to design a test that will indicate whether or not there has
been a shift in the process average, using a sample size of 25 and a level of significance of α = 0.05. State H0 and
H1 for this test.
59. Consider a three-group discriminant analysis on two variables (X1 and X2). The number of observations in
each group (drawn at random from the population) is n1 = 25, n2 = 35, n3 = 25. The within-group sum-of-squares
matrices and group centroids are given below:
X’(1) = (-0.91, 1.08)   X’(2) = (-0.08, -0.17)   X’(3) = (1.03, 1.16)
W1 = 25.2  20.5    W2 = 38.2  33.8    W3 = 20.2  16.9
     20.5  19.8         33.8  41.0         16.9  21.5
Another observation is drawn at random from the population: x’ = (0, 1). To which group would you assign this
observation?
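One possible approach, sketched under stated assumptions: pool the three within-group matrices into S = (W1 + W2 + W3)/(n - g), compute the squared Mahalanobis distance from the new observation to each centroid, and assign it to the nearest group. Equal prior probabilities and equal misclassification costs are assumed here; the question does not specify them.

```python
centroids = {1: (-0.91, 1.08), 2: (-0.08, -0.17), 3: (1.03, 1.16)}
within = {
    1: [[25.2, 20.5], [20.5, 19.8]],
    2: [[38.2, 33.8], [33.8, 41.0]],
    3: [[20.2, 16.9], [16.9, 21.5]],
}
sizes = {1: 25, 2: 35, 3: 25}
x_new = (0.0, 1.0)

# Pooled covariance: summed within-group SS matrices over (n - g) degrees of freedom.
dof = sum(sizes.values()) - len(sizes)
pooled = [[sum(within[g][i][j] for g in within) / dof for j in range(2)]
          for i in range(2)]

# Closed-form inverse of the 2x2 pooled covariance matrix.
det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
inv = [[pooled[1][1] / det, -pooled[0][1] / det],
       [-pooled[1][0] / det, pooled[0][0] / det]]


def mahalanobis_sq(x, centroid):
    """Squared Mahalanobis distance (x - c)' S^-1 (x - c)."""
    d0, d1 = x[0] - centroid[0], x[1] - centroid[1]
    return (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))


distances = {g: mahalanobis_sq(x_new, c) for g, c in centroids.items()}
assigned = min(distances, key=distances.get)
print({g: round(d, 3) for g, d in distances.items()}, "-> group", assigned)
```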
60. Calculate W⁻¹ and B, and the eigenvalues and eigenvectors of W⁻¹B. Carry out a linear discriminant analysis.
Find the apparent error rate and the confusion matrix.
G x1 x2
1 150 15
1 147 13
1 145 14
1 144 16
1 153 13
1 140 15
1 151 14
1 143 14
1 144 14
1 142 15
1 141 13
1 150 15
1 148 13
1 154 15
1 147 14
1 137 14
1 134 15
1 157 14
1 149 13
1 147 13
1 148 14
2 120 14
2 123 16
2 130 14
2 131 16
2 116 16
2 122 15
2 127 15
2 132 16
2 125 14
2 119 13
2 122 13
2 120 15
2 119 14
2 123 15
2 125 15
2 125 14
2 129 14
2 130 13
2 129 13
2 122 12
2 129 15
2 124 15
2 120 13
2 119 16
2 119 14
2 133 13
2 121 15
2 128 14
2 129 14
2 124 13
2 129 14
3 145 8
3 140 11
3 140 11
3 131 10
3 139 11
3 139 10
3 136 12
3 129 11
3 140 10
3 137 9
3 141 11
3 138 9
3 143 9
3 142 11
3 144 10
3 138 10
3 140 10
3 130 9
3 137 11
3 137 10
3 136 9
3 140 10
61. Find the relationship between R² and adjusted R².
62. How do you find and interpret the variance inflation factor?
63. Explain all the assumptions related to the ordinary least squares (OLS) method.
64. Find the inverse of the matrix
A = 2 3
    1 5
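For a 2x2 matrix [[a, b], [c, d]] with nonzero determinant, the inverse is (1/(ad - bc)) · [[d, -b], [-c, a]]. The sketch below uses exact fractions so the result can be checked by hand; the function name is illustrative.

```python
from fractions import Fraction


def inverse_2x2(matrix):
    """Inverse of a 2x2 matrix via the adjugate divided by the determinant."""
    (a, b), (c, d) = matrix
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]


A = [[2, 3], [1, 5]]
A_inv = inverse_2x2(A)
for row in A_inv:
    print([str(entry) for entry in row])
```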
65. Find the discriminant loadings for the two-variable problem.
The discriminant weights are 0.86 and 0.28.
The correlation matrix is
0.5 0.6
0.3 0.4
66. Twenty-five portfolio managers were evaluated in terms of their performance. Suppose Y represents the rate
of return achieved over a period of time, Z1 is the manager's attitude toward risk measured on a five-point scale
from "very conservative" to "very risky", and Z2 is years of experience in the investment business. The
observed correlation coefficients between pairs of variables are
R =      Y      Z1     Z2
Y        1.0   -0.35   0.82
Z1      -0.35   1.0   -0.6
Z2       0.82  -0.6    1.0
i) Interpret the sample correlation coefficients r_YZ1 = -0.35 and r_YZ2 = -0.82.
ii) Calculate the partial correlation coefficient r_YZ1.Z2 and interpret this quantity with respect to the
interpretation provided for r_YZ1 in part (i).
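The first-order partial correlation can be sketched with the standard formula r_YZ1.Z2 = (r_YZ1 - r_YZ2 · r_Z1Z2) / sqrt((1 - r_YZ2²)(1 - r_Z1Z2²)). The inputs below are read from the symmetric entries of the matrix R (r_YZ1 = -0.35, r_YZ2 = 0.82, r_Z1Z2 = -0.6); that reading is an assumption, since part (i) quotes r_YZ2 with a negative sign.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Values taken from the matrix R (assumed reading, see lead-in).
r_y_z1 = -0.35
r_y_z2 = 0.82
r_z1_z2 = -0.6

r_y_z1_given_z2 = partial_correlation(r_y_z1, r_y_z2, r_z1_z2)
print(round(r_y_z1_given_z2, 4))
```

With these inputs the sign flips from negative to positive once experience (Z2) is held fixed, which is the contrast part (ii) asks you to interpret.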
67. Write the probability density function of the multivariate normal distribution.
68. Find the characteristic equation for the given matrix.
2 4
1 3
69. An analysis attempted to identify the factors that determine utility values for computer professionals in a
large corporation. The variables included in the study were
i) Education, defined by the dummy variables E1 and E2, where
(E1, E2) = (1,0) for a high school diploma
           (0,1) for a B.S. degree
           (0,0) for an advanced degree
ii) Whether the individual has management responsibility, defined by the dummy variable
MGT = 1 if the individual has management responsibility
      0 if not
The model considered is
SALARY = 11,032.00 – 2,996.00 E1 + 147.98 E2 + 6,883.50 MGT
Find the part-worth values for each level of the corresponding factor.
70. A sample of n=10 observations gives the values in the following table.
Ordered observations (X j)
-1.00
-0.10
0.16
0.41
0.62
0.80
1.26
1.54
1.71
2.30
Draw the Q-Q plot for the above observations.
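The plot coordinates can be sketched as follows, assuming the common plotting positions (j - 0.5)/n for the j-th ordered observation. Each pair is (standard normal quantile, ordered observation); plotting the pairs gives the Q-Q plot.

```python
from statistics import NormalDist

observations = [-1.00, -0.10, 0.16, 0.41, 0.62, 0.80, 1.26, 1.54, 1.71, 2.30]

n = len(observations)
ordered = sorted(observations)

# Plotting positions (j - 0.5) / n for j = 1..n (an assumed convention).
probabilities = [(j - 0.5) / n for j in range(1, n + 1)]

# Standard normal quantiles matching those probabilities.
quantiles = [NormalDist().inv_cdf(p) for p in probabilities]

qq_points = list(zip(quantiles, ordered))
for q, x_j in qq_points:
    print(f"{q:7.3f}  {x_j:6.2f}")
```

Points falling close to a straight line are consistent with normality; systematic curvature suggests otherwise.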
71. Consider the data shown below
X 1 2 3 4 5 6 7 8 9 10
Y 2 1 2 3 4 5 5 6 7 8
Given (X’X)⁻¹ =  0.4666  -0.0666
                -0.0666   0.0120
Find the ‘b’ values and residuals for each observation.
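A sketch of the calculation b = (X'X)⁻¹ X'y for the straight-line model y = b0 + b1·x, where X has a column of ones and a column of x values. Exact fractions are used so the inverse can be compared with the rounded values quoted in the question; variable names are illustrative.

```python
from fractions import Fraction

x = list(range(1, 11))
y = [2, 1, 2, 3, 4, 5, 5, 6, 7, 8]

n = len(x)
sum_x = sum(x)
sum_x2 = sum(v * v for v in x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))

# X'X = [[n, sum_x], [sum_x, sum_x2]]; invert it in closed form.
det = n * sum_x2 - sum_x ** 2
xtx_inv = [[Fraction(sum_x2, det), Fraction(-sum_x, det)],
           [Fraction(-sum_x, det), Fraction(n, det)]]

# b = (X'X)^-1 X'y with X'y = [sum_y, sum_xy].
b0 = xtx_inv[0][0] * sum_y + xtx_inv[0][1] * sum_xy
b1 = xtx_inv[1][0] * sum_y + xtx_inv[1][1] * sum_xy

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print("b0 =", float(b0), "b1 =", round(float(b1), 4))
print([round(float(e), 3) for e in residuals])
```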
72. Find Λ(X1/X2) and Λ(X3/X4) using MATLAB for the given data (Table 1).
x1 x2 x3 x4 x5
3.9 51 0.2 7.06 12.19
2.7 49 0.07 7.14 12.23
2.8 36 0.3 7 11.3
π1 3.1 45 0.08 7.2 13.01
3.5 46 0.1 7.81 12.63
3.9 43 0.07 6.25 10.42
2.7 35 0 5.11 9
5 47 0.07 7.06 6.1
3.4 32 0.2 5.82 4.69
1.2 12 0.07 5.54 3.15
8.4 17 0.07 6.31 4.55
4.2 36 0.5 9.25 4.95
4.2 35 0.5 5.69 2.22
3.9 41 0.1 5.63 2.94
3.9 36 0.07 6.19 2.27
7.3 32 0.3 8.02 12.92
4.4 46 0.07 7.54 5.76
π2 3 30 0 5.12 10.77
6.3 13 0.5 4.24 8.27
1.7 5.6 1 5.69 4.64
7.3 24 0 4.34 2.99
7.8 18 0.5 3.92 6.09
7.8 25 0.7 5.39 6.2
7.8 26 1 5.02 2.5
9.5 17 0.05 3.52 5.71
7.7 14 0.3 4.65 8.63
11 20 0.5 4.27 8.4
8 14 0.3 4.32 7.87
8.4 18 0.2 4.38 7.98
10 18 0.1 3.06 7.67
7.3 15 0.05 3.76 6.84
9.5 22 0.3 3.98 5.02
8.4 15 0.2 5.02 10.12
8.4 17 0.2 4.42 8.25
π3 9.5 25 0.5 4.44 5.95
7.2 22 1 4.7 3.49
4 12 0.5 5.71 6.32
6.7 52 0.5 4.8 3.2
9 27 0.3 3.69 3.3
7.8 29 1.5 6.72 5.75
4.5 41 0.5 3.33 2.27
6.2 34 0.7 7.56 6.93
5.6 20 0.5 5.07 6.7
9 17 0.2 4.39 8.33
8.4 20 0.1 3.74 3.77
9.5 19 0.5 3.72 7.37
9 20 0.5 5.97 11.17
6.2 16 0.05 4.23 4.18
7.3 20 0.5 4.39 3.5
3.6 15 0.7 7 4.82
6.2 34 0.07 4.84 2.37
7.3 22 0 4.13 2.7
4.1 29 0.7 5.78 7.76
5.4 29 0.2 4.64 2.65
5 34 0.7 4.21 6.5
6.2 27 0.3 3.97 2.97
73. Using MATLAB, find the confusion matrix for the three-group problem using the data given in problem 72.
(Note: use only the first two groups.)
74. Find the d² value for the given data (Table 2), where d² = (Xj – X̄)’ S⁻¹ (Xj – X̄) and S is the
variance-covariance matrix, and find the outliers by subjective observation.
X1 X2 X3 X4 X1 X2 X3 X4
1889 1651 1561 1778 1954 2149 1180 1281
2403 2048 2087 2197 1325 1170 1002 1176
2119 1700 1815 2222 1419 1371 1252 1308
1645 1627 1110 1533 1828 1634 1602 1755
1976 1916 1614 1883 1725 1594 1313 1646
1712 1712 1439 1546 2276 2189 1547 2111
1943 1685 1271 1671 1899 1614 1422 1477
2104 1820 1717 1874 1633 1513 1290 1516
2983 2794 2412 2581 2061 1867 1646 2037
1745 1600 1384 1508 1856 1493 1356 1533
1710 1591 1518 1667 1727 1412 1238 1469
2046 1907 1627 1898 2168 1896 1701 1834
1840 1841 1595 1714 1655 1675 1414 1597
1867 1685 1493 1678 2326 2301 2065 2234
1859 1649 1389 1714 1490 1382 1214 1284
75. i) Why do we perform factor rotation?
ii) Explain the simple structure principle.
iii) Prove mathematically that the use of an orthogonal matrix in factor rotation does not change the
variance-covariance matrix.
76. Using MATLAB, find the variance-covariance matrix for the factor loadings and specific variances given in the
following table.
Variate λ1 λ2 λ3 ψ
1 0.664 0.321 0.074 0.450
2 0.689 0.247 -0.193 0.427
3 0.493 0.302 -0.222 0.617
4 0.837 -0.292 -0.035 0.212
5 0.705 -0.315 -0.153 0.381
6 0.819 -0.377 0.105 0.177
7 0.661 0.396 -0.078 0.4
8 0.458 0.296 0.491 0.462
9 0.766 0.427 -0.012 0.231
77. How do you identify an outlier?
78. How do you determine the discriminant loading?
79. Why do we rotate the factors?
80. Find the covariance matrix for the given data.
Observation Number Attitude Perception Action
1 5 4 3
2 6 3 5
3 7 5 7
4 6 5 5
5 7 6 5
6 8 4 7
7 4 5 7
81. a) Explain the relationship between the correlation matrix and the covariance matrix.
b) Show graphically the relationship between X and Y when the correlation coefficient r_xy = 1 and when r_xy = -1.
82 .Draw the curve related to the sigmoidal function
83. Find the discriminant loadings for the given problem:
Discriminant Function 1 Discriminant Function 2
X1 .42 .51
X2 .32 .15
X3 .26 .31
X4 .36 .26
X5 .17 .28
84. Explain the procedure involved in arriving at two discriminant functions.
85. How do you arrive at the factor score?
86. Discuss the differences between orthogonal rotation and oblique rotation.
87. From the given factor analysis output, fill in the blanks in the table.
Variables            F1      F2      Communality  Specific variance
Verbal               0.272   0.293   .16          .84
                     0.409   -       .36          .64
                     0.477   0.513   .49          .51
Numerical            0.926   -0.179  -            .11
                     -       0.031   .72          .28
                     0.843   0.172   .74          .26
Total Variance (%)   45.9    10.1    56.0         44.0
Common Variance (%)  -       -
Eigen value          2.756   0.604
88. A fast-moving consumer product company's marketing manager thinks there is a strong link between the
advertising and promotional expenditure and the sales in the following week. He collects data from his company
records on sales, advertising expenditure, and promotional (non-advertising) expenditure for one of the large
territories of his company. The data are shown below.
Week No.   Sales in a week (units)   Advertising expenditure in previous week (Rs)   Promotional expenditure in previous week (Rs)
1 120,000 15,000 22,500
2 123,000 25,000 10,000
3 140,000 17,000 17,000
4 115,000 20,000 6,000
5 126,000 15,000 16,000
6 130,000 15,000 18,000
7 115,000 18,000 12,000
8 127,000 10,000 15,000
9 118,000 10,000 10,000
10 121,000 15,000 20,000
11 126,000 15,000 18,000
12 150,000 25,000 20,000
13 140,000 18,000 17,000
14 135,000 20,000 20,000
15 137,000 17,000 22,000
The marketing manager would like you to perform a regression analysis on the data and advise him on how to use the
regression model to predict sales based on advertising and promotional expenditure. What would you tell him?
89. A total of 77 samples were collected and analysed for six physical characteristics, namely specific gravity (SG),
pH value, osmolarity (MOSM), conductivity (MMHO), urea concentration (UREA) and calcium concentration (CALCIUM).
Answer the following:
1. Using a logistic procedure, determine the possible presence of crystals in urine from these
characteristics.
2. Form a logistic regression model.
3. Form a confusion matrix and interpret it.
4. Conduct the log-likelihood test and interpret it.
5. Find the R² value.
SG pH.Value MOSM MMHO UREA CALCIUM

no 1.017 5.74 577 20 296 4.49
no 1.008 7.2 321 14.9 101 2.36
no 1.011 5.51 408 12.6 224 2.15
no 1.005 6.52 187 7.5 91 1.16
no 1.02 5.27 668 25.3 252 3.34
no 1.012 5.62 461 17.4 195 1.4
no 1.029 5.67 1107 35.9 550 8.48
no 1.015 5.41 543 21.9 170 1.16
no 1.021 6.13 779 25.7 382 2.21
no 1.011 6.19 345 11.5 152 1.93
no 1.025 5.53 907 28.4 448 1.27
no 1.006 7.12 242 11.3 64 1.03
no 1.007 5.35 283 9.9 147 1.47
no 1.011 5.21 450 17.9 161 1.53
no 1.018 4.9 684 26.1 284 5.09
no 1.007 6.63 253 8.4 133 1.05
no 1.025 6.81 947 32.6 395 2.03
no 1.008 6.88 395 26.1 95 7.68
no 1.014 6.14 565 23.6 214 1.45
no 1.024 6.3 874 29.9 380 5.16
no 1.019 5.47 760 33.8 199 0.81
no 1.014 7.38 577 30.1 87 1.32
no 1.02 5.96 631 11.2 422 1.55
no 1.023 5.68 749 29 239 1.52
no 1.017 6.76 455 8.8 270 0.77
no 1.017 7.61 527 25.8 75 2.17
no 1.01 6.61 225 9.8 72 0.17
no 1.008 5.87 241 5.1 159 0.83
no 1.02 5.44 781 29 349 3.04
no 1.017 7.92 680 25.3 282 1.06
no 1.019 5.98 579 15.5 297 3.93
no 1.017 6.56 559 15.8 317 5.38
no 1.008 5.94 256 8.1 130 3.53
no 1.023 5.85 970 38 362 4.54
no 1.02 5.66 702 23.6 330 3.98
no 1.008 6.4 341 14.6 125 1.02
no 1.02 6.35 704 24.5 260 3.46
no 1.009 6.37 325 12.2 97 1.19
no 1.018 6.18 694 23.3 311 5.64
no 1.021 5.33 815 26 385 2.66
no 1.009 5.64 386 17.7 104 1.22
no 1.015 6.79 541 20.9 187 2.64
no 1.01 5.97 343 13.4 126 2.31
no 1.02 5.68 876 35.8 308 4.49
yes 1.021 5.94 774 27.9 325 6.96
yes 1.024 5.77 698 19.5 354 13
yes 1.024 5.6 866 29.5 360 5.54
yes 1.021 5.53 775 31.2 302 6.19
yes 1.024 5.36 853 27.6 364 7.31
yes 1.026 5.16 822 26 301 14.34
yes 1.013 5.86 531 21.4 197 4.74
yes 1.01 6.27 371 11.2 188 2.5
yes 1.011 7.01 443 21.4 124 1.27
yes 1.011 6.13 364 10.9 159 3.1
yes 1.031 5.73 874 17.4 516 3.01
yes 1.02 7.94 567 19.7 212 6.81
yes 1.04 6.28 838 14.3 486 8.28
yes 1.021 5.56 658 23.6 224 2.33
yes 1.025 5.71 854 27 385 7.18
yes 1.026 6.19 956 27.6 473 5.67
yes 1.034 5.24 1236 27.3 620 12.68
yes 1.033 5.58 1032 29.1 430 8.94
yes 1.015 5.98 487 14.8 198 3.16
yes 1.013 5.58 516 20.8 184 3.3
yes 1.014 5.9 456 17.8 164 6.99
yes 1.012 6.75 251 5.1 141 0.65
yes 1.025 6.9 945 33.6 396 4.18
yes 1.026 6.29 833 22.2 457 4.45
yes 1.028 4.76 312 12.4 10 0.27
yes 1.027 5.4 840 24.5 395 7.64
yes 1.018 5.14 703 29 272 6.63
yes 1.022 5.09 736 19.8 418 8.53
yes 1.025 7.9 721 23.6 301 9.04
yes 1.017 4.81 410 13.3 195 0.58
yes 1.024 5.4 803 21.8 394 7.82
yes 1.016 6.81 594 21.4 255 12.2
yes 1.015 6.03 416 12.8 178 9.39
90. Find the covariance matrix for the given data (without MATLAB).
Observation Number Attitude Perception Action
1 5 4 3
2 6 3 5
3 7 5 7
4 6 5 5
5 7 6 5
6 8 4 7
7 4 5 7
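The hand calculation can be mirrored in a few lines, assuming the sample covariance with n - 1 in the denominator: entry (i, j) is the sum of products of deviations of variables i and j from their means, divided by n - 1. The function name is illustrative.

```python
attitude = [5, 6, 7, 6, 7, 8, 4]
perception = [4, 3, 5, 5, 6, 4, 5]
action = [3, 5, 7, 5, 5, 7, 7]


def covariance_matrix(columns):
    """Sample covariance matrix (divisor n - 1) of equally long columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    k = len(columns)
    return [[sum((a - means[i]) * (b - means[j])
                 for a, b in zip(columns[i], columns[j])) / (n - 1)
             for j in range(k)]
            for i in range(k)]


cov = covariance_matrix([attitude, perception, action])
for row in cov:
    print([round(value, 4) for value in row])
```

The diagonal holds the variances of Attitude, Perception and Action, and the matrix is symmetric, so only the upper triangle needs to be computed by hand.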
91. How do you establish a confusion matrix for two groups in which the first group consists of 60
members and the second group consists of 50 members? The apparent error rate is 25%.
92. Draw the flowchart for principal axis factoring.