
Table of Contents

Group members and list of work
Introduction
Collecting data
Presenting and Summarizing data
  1. Subject
  2. Number of absent classes per month
  3. Average number of hours spent for self-study
  4. GPA
Regression model
  1. Descriptive statistics
  2. Regression model
Conclusion
Appendix

Group members and list of work


Name                 Student ID    Tasks
Pham Ngoc Anh                      Collecting data; preparing slides
Dang Thi Hien                      Writing "Introduction"; designing survey
Hoang Thanh Ha                     Collecting data; presenting
Phan Duy Hung        1001030153    Processing and analyzing data; writing "Regression model" and "Average number of hours spent for self-study"
Bui Kieu Dieu Linh   1001010500    Processing and analyzing data; writing "Presenting and Summarizing data"
Kim Le Ha Thanh                    Designing survey; writing "Conclusion"

It should be noted that all members of the group were serious, enthusiastic and hard-working. Everyone completed their assigned tasks well and before the deadline. Each member could have taken an equal share of every task, but for the sake of the report's quality and to save time, we delegated the tasks according to our strengths and weaknesses. The percentages here are therefore only a relative measure; in terms of working attitude and efficiency, all members can be regarded as equivalent.

% equivalent: 10, 25, 25, 13, 14, 13

Introduction
Education is one of the most fundamental factors in an individual's success in life. It is also one of the best investments a person can make, because well-educated people have more opportunities to get promising jobs in the future. However, many people do not recognize this importance and waste their time on less worthwhile activities. There are many distractions that tempt students, such as video games and social networks. Within the scope of this assignment, we focus on a few main factors: being absent from class because of the weather, registering for many courses and then skipping classes, and spending too little time on studying. These factors directly affect students of Foreign Trade University (FTU), especially members of the High Quality Classes of Finance and Banking. Restricting the study to this group keeps the data closer to our findings and more accurate.

This topic may be a popular one for research at FTU, but no study has yet been carried out on students of the High Quality Programs in the Faculty of Finance and Banking. We therefore decided to focus on this group and chose the topic "Investigation of absent time of students and their average total marks", in order to show how important it is to study hard, to try one's best to get good results and to keep a positive attitude toward learning. The final results may help students look back on their study history and change their habits for the better at university.

The purpose of this assignment is to illustrate the importance of education, particularly of the time spent at university. In our research, we use five methods of business statistics (collecting, presenting, summarizing, analyzing and forecasting data) to make our study as effective and informative as possible.

Collecting data

In order to study the diligence of students in the High Quality Class of the Faculty of Finance and Banking at Foreign Trade University (a.k.a. CLCTCNH), we collected data both directly and indirectly. For the direct method, our group of six people randomly chose some classes and asked several students on the spot to fill in the survey. In addition, we created an online version and spread it through email and social networks such as Facebook. CLCTCNH has about 600 students in total, and we aimed to survey at least 10% of them (about 60 students). The survey we used to collect the data is shown below.

Investigation of students' study habits

(* marks a required question)

1. According to the credit scale, how many subjects are you studying? *
   - <5
   - 6-8
   - >8
2. How many classes per week do you have? *
3. How much time do you spend on self-studying at home? *
4. How many classes per month are you absent from? *
5. Are you interested in studying in class? *
   - Yes
   - No
   - Only in certain subjects
6. Which thing(s) most make you want to skip a class? *
   - Bad weather
   - Classes without attendance checks
   - Studying without understanding the lesson
   - Busy with other activities
   - Getting up late
   - Other:
7. Are you happy with your studies now? *
   - No
   - Yes
   - Some subjects only
8. Your average GPA (on a scale of 4) at the moment *
9. Which year are you in?
   - K47
   - K48
   - K49
   - K50

The completed survey responses are shown in the Appendix.

Presenting and Summarizing data

1. Subject

After one week of collecting data, we received 75 completed surveys, more than we expected. A quick glance at the pie chart below shows that, of the four cohorts, 2nd-year and 3rd-year students were much more willing to answer the survey than 1st-year and 4th-year students. This may be because final-year students are so busy with their work and graduation essays that they did not have time to answer. As for the 1st-year students, we guess that since they have not yet taken courses such as Economics or Business Statistics, they had little motivation to respond.

[Pie chart: share of respondents by cohort. Legend: K47, K48, K49, K50. Slice labels: 4%, 11%, 47%, 38%.]

2. Number of absent classes per month

Table of distribution

xi    fi    Cumulative frequency    Relative frequency (%)
0     14    14                      18.7%
1     12    26                      16.0%
2     15    41                      20.0%
3     4     45                      5.3%
4     11    56                      14.7%
5     9     65                      12.0%
6     3     68                      4.0%
8     2     70                      2.7%
10    2     72                      2.7%
12    1     73                      1.3%
16    1     74                      1.3%
25    1     75                      1.3%
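The cumulative and relative frequency columns of this table follow mechanically from the xi and fi columns. A minimal Python sketch of that derivation is given here for illustration; the variable names are ours and are not part of the original analysis.

```python
from itertools import accumulate

# Frequency table for "number of absent classes per month", copied from the report.
values = [0, 1, 2, 3, 4, 5, 6, 8, 10, 12, 16, 25]
freqs = [14, 12, 15, 4, 11, 9, 3, 2, 2, 1, 1, 1]

n = sum(freqs)                                      # total number of respondents
cumulative = list(accumulate(freqs))                # running total of fi
relative = [round(100 * f / n, 1) for f in freqs]   # share of each value, in percent

# Print the four columns side by side.
for x, f, c, r in zip(values, freqs, cumulative, relative):
    print(f"{x:>3} {f:>3} {c:>3} {r:>5.1f}%")
```

The last cumulative value equals the sample size, which is a quick check that no response was dropped when the table was built.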

From the above data, we calculated:

The range: R = largest value - smallest value = 25 - 0 = 25

The range tells us that the difference between the largest and the smallest value of the distribution is 25. Although the number of absent classes per month ranges from 0 to 25, there are only 12 distinct values, and most students are absent from fewer than 10 classes (70 of the 75 students, or 93.3%).

The arithmetic mean: Mean = 253/75 = 3.37

This tells us that the average number of absent classes per month is 3.37; in other words, on average, a CLCTCNH student skips 3 to 4 classes per month.

The mean deviation

$\text{Mean deviation} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$

Mean deviation = 2.568

The average absolute difference between the number of absent classes and the mean is 2.568.

Mode

The mode of the data set is the value with the largest frequency. From the above table, the value that appears most often is 2, with a frequency of 15. The graph below also illustrates this.

[Bar chart: frequency (fi) by number of absent classes per month]

These data show that, among the surveyed students, the most common choice is to skip 2 classes per month. However, it is also worth noting that 18.7% of students are never absent from any class, which is only 1.3 percentage points below the share at the mode of 2 (20.0%). We therefore have some evidence that Finance and Banking students are quite hard-working.

Median

Based on the table of distribution, in which the total frequency is 75, the middle item is the 38th item, which corresponds to the value 2.

Median = 2

The variance

$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$

Variance = 63.403136

The average of the squared deviations of each number of monthly absent classes from the mean is 63.40.

The standard deviation

Standard deviation = 2.821890458

The coefficient of variation

Coefficient of variation = 83.65 %
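The mode and median reported in this section can be read off the frequency table programmatically. The sketch below is illustrative only: the helper names `mode_from_table` and `median_from_table` are hypothetical, and the median helper assumes an odd number of observations with values listed in ascending order, as is the case here.

```python
def mode_from_table(values, freqs):
    """Return the value with the largest frequency."""
    return max(zip(values, freqs), key=lambda pair: pair[1])[0]


def median_from_table(values, freqs):
    """Return the middle item of an odd-sized data set given as a frequency table."""
    n = sum(freqs)
    middle = (n + 1) // 2            # position of the middle item (38 when n = 75)
    running = 0
    for x, f in zip(values, freqs):  # values must be in ascending order
        running += f                 # cumulative frequency up to and including x
        if running >= middle:
            return x


# Frequency table for the number of absent classes per month, from the report.
absent_values = [0, 1, 2, 3, 4, 5, 6, 8, 10, 12, 16, 25]
absent_freqs = [14, 12, 15, 4, 11, 9, 3, 2, 2, 1, 1, 1]

print(mode_from_table(absent_values, absent_freqs))    # value with frequency 15
print(median_from_table(absent_values, absent_freqs))  # value of the 38th item
```

The median lookup is the same cumulative-frequency reasoning used in the text: the 38th item falls in the row whose cumulative frequency first reaches 38.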

3. Average number of hours spent for self-study

Table of distribution

xi      fi    Cumulative frequency    Relative frequency (%)
0       6     6                       8%
0.25    1     7                       1.3%
0.5     11    18                      14.7%
1       15    33                      20%
1.5     1     34                      1.3%
2       24    58                      32%
3       6     64                      8%
4       6     70                      8%
5       4     74                      5.3%
10      1     75                      1.3%
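The summary measures reported later in this section (mean, mean deviation, variance, standard deviation and coefficient of variation) can be recomputed from this table. The following Python sketch assumes the population form of each measure, that is, division by n = 75 rather than n - 1; the report does not state which form was used, but this choice reproduces the reported figures to three decimals.

```python
from math import sqrt

# Frequency table for daily self-study hours, copied from the report.
hours = [0, 0.25, 0.5, 1, 1.5, 2, 3, 4, 5, 10]
freqs = [6, 1, 11, 15, 1, 24, 6, 6, 4, 1]

n = sum(freqs)                                            # 75 respondents
total = sum(x * f for x, f in zip(hours, freqs))          # sum of all reported hours
mean = total / n

# Frequency-weighted average absolute deviation from the mean.
mean_dev = sum(f * abs(x - mean) for x, f in zip(hours, freqs)) / n

# Frequency-weighted average squared deviation (population variance, assumed).
variance = sum(f * (x - mean) ** 2 for x, f in zip(hours, freqs)) / n
std_dev = sqrt(variance)
cv = 100 * std_dev / mean                                 # coefficient of variation, in percent

print(f"mean={mean:.3f} mean_dev={mean_dev:.3f} variance={variance:.3f} "
      f"std_dev={std_dev:.3f} cv={cv:.2f}%")
```

The coefficient of variation computed from unrounded inputs differs from the reported 85.293 % only in the third decimal, which is consistent with the report having divided the rounded values 1.618 and 1.897.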

The following dot plot shows the frequency of each value of the time students spend on self-study.

[Dot plot of hours spent on self-study]

From the above data, we calculated:

The range: R = largest value - smallest value = 10 - 0 = 10

The range tells us that the difference between the largest and the smallest value of the distribution is 10. Although the number of hours spent on self-study ranges from 0 to 10, there are only 10 distinct values, and almost all of them are below 10 hours (74 of the 75 students).

The arithmetic mean: Mean = 142.25/75 = 1.897

This tells us that the average time spent on self-study is 1.897 hours per day; in other words, on average, a CLCTCNH student spends about 2 hours a day on home study.

The mean deviation

$\text{Mean deviation} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$

Mean deviation = 1.126

The average absolute difference between the number of hours spent on self-study and the mean is 1.126.

Median

The median of the data set is the value of the middle item when the data items are arranged in ascending order. Based on the table of distribution, in which the total frequency is 75, the middle item is the 38th item, which corresponds to the value 2.

Median = 2

Mode

The mode of the data set is the value with the largest frequency. From the above table, the value that appears most often is 2, with a frequency of 24. The graph below also illustrates this.

[Bar chart: frequency (fi) by number of hours spent on self-study]

These data show that, among the surveyed students, the most common amount of self-study is 2 hours per day.

Mode = 2

As the graph and the frequency distribution table show, only 8% of students never study at home, which indicates that a large majority of CLCTCNH students are aware of their responsibilities.

The variance

$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$

Variance = 2.617

The average of the squared deviations of each number of self-study hours from the mean is 2.617.

The standard deviation

Standard deviation = 1.618

The coefficient of variation

Coefficient of variation = 85.293 %

4. GPA

Table of distribution


xi      fi    xi      fi    xi      fi
2.67    1     3.19    1     3.42    2
2.8     2     3.2     10    3.47    1
2.83    1     3.22    3     3.48    1
2.88    1     3.23    2     3.49    1
2.9     2     3.24    1     3.5     1
2.91    1     3.25    1     3.54    2
2.93    1     3.27    1     3.62    1
2.97    1     3.28    1     3.64    1
3       3     3.3     4     3.65    1
3.03    1     3.31    2     3.67    2
3.04    2     3.34    1     3.76    1
3.1     2     3.35    2     3.79    1
3.13    1     3.38    2     3.8     1
3.15    1     3.4     3     3.9     2
3.17    1     3.41    1     4       1

There are 45 distinct values in this data set, so we created a grouped frequency distribution table to simplify it and make it easier for readers to follow.

Cumulative GPA     f     Frequency (%)    Class midpoint (x)
2.67 up to 2.75    1     1.33%            2.71
2.76 up to 3.00    12    16.00%           2.88
3.01 up to 3.25    26    34.67%           3.13
3.26 up to 3.50    23    30.67%           3.38
3.51 up to 3.75    7     9.33%            3.63
3.76 up to 4.00    6     8.00%            3.88
Total              75    100.00%
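The grouping step can be checked by binning the raw GPA table into the six classes. The sketch below is an illustration in Python: the class limits and midpoints are taken from the grouped table, while the variable names are our own. It also shows that a mean computed from the class midpoints gives the total 245.08 used for the GPA mean in this section.

```python
# Raw GPA frequency table from the report: (GPA value, number of students).
gpa_table = [
    (2.67, 1), (2.8, 2), (2.83, 1), (2.88, 1), (2.9, 2), (2.91, 1), (2.93, 1),
    (2.97, 1), (3, 3), (3.03, 1), (3.04, 2), (3.1, 2), (3.13, 1), (3.15, 1),
    (3.17, 1), (3.19, 1), (3.2, 10), (3.22, 3), (3.23, 2), (3.24, 1), (3.25, 1),
    (3.27, 1), (3.28, 1), (3.3, 4), (3.31, 2), (3.34, 1), (3.35, 2), (3.38, 2),
    (3.4, 3), (3.41, 1), (3.42, 2), (3.47, 1), (3.48, 1), (3.49, 1), (3.5, 1),
    (3.54, 2), (3.62, 1), (3.64, 1), (3.65, 1), (3.67, 2), (3.76, 1), (3.79, 1),
    (3.8, 1), (3.9, 2), (4, 1),
]

# Class limits (inclusive on both ends) and midpoints as given in the grouped table.
classes = [(2.67, 2.75), (2.76, 3.00), (3.01, 3.25),
           (3.26, 3.50), (3.51, 3.75), (3.76, 4.00)]
midpoints = [2.71, 2.88, 3.13, 3.38, 3.63, 3.88]

# Count the students whose GPA falls inside each class.
grouped = [sum(f for x, f in gpa_table if lo <= x <= hi) for lo, hi in classes]
n = sum(grouped)

# Mean of grouped data: each class is represented by its midpoint.
grouped_total = sum(f * m for f, m in zip(grouped, midpoints))
grouped_mean = grouped_total / n

print(grouped, n, round(grouped_total, 2), round(grouped_mean, 2))
```

Because every student in a class is replaced by the class midpoint, the grouped mean is an approximation of the mean of the raw values.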

These can be illustrated by the following ogive:

[Ogive of GPA: cumulative relative frequency by class: 1.33% (2.67 up to 2.75), 17.33% (2.76 up to 3.00), 52.00% (3.01 up to 3.25), 82.67% (3.26 up to 3.50), 92.00% (3.51 up to 3.75), 100.00% (3.76 up to 4.00)]

From the above statistics, we can calculate:

The range: R = largest value - smallest value = 4 - 2.67 = 1.33

Class width: C = lower limit of class N+1 - lower limit of class N = 3.01 - 2.76 = 0.25

With the available data, we grouped the values into 6 classes with a class width of 0.25; the total range is 1.33. Note that GPA here is calculated on a scale of 4, not the usual scale of 10, so 4 is the highest possible mark.

The arithmetic mean

Mean = 245.08/75 = 3.27

From this value, the average GPA of a CLCTCNH student is 3.27.

The mean deviation

md = 0.24

So the average distance between individual GPAs and the mean GPA is 0.24.

The mode

Mode = 3.22

So the GPA that occurs most often among CLCTCNH students is 3.22.
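The reported mode of 3.22 is not the most frequent raw value (3.2 occurs 10 times), which suggests that it was estimated from the grouped table. As a sketch, assuming the usual textbook interpolation formula for the mode of grouped data was used (the report does not state this), with $L$ the lower limit of the modal class "3.01 up to 3.25", $w = 0.25$ the class width, and $f_m$, $f_{m-1}$, $f_{m+1}$ the frequencies of the modal class and of the classes just below and above it:

```latex
\text{Mode} \approx L + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times w
            = 3.01 + \frac{26 - 12}{(26 - 12) + (26 - 23)} \times 0.25
            = 3.01 + \frac{14}{17} \times 0.25
            \approx 3.22
```

Under this assumption the formula reproduces the reported value, so 3.22 is best read as the estimated peak of the grouped distribution rather than a GPA observed most often.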

[Histogram of GPA: frequency by class (2.67 up to 2.75, 2.76 up to 3.00, 3.01 up to 3.25, 3.26 up to 3.50, 3.51 up to 3.75, 3.76 up to 4.00)]

The median

Median = 3.16

This tells us that the middle value of GPA, when the values are arranged in order of size, is 3.16.

The variance

Variance = 0.082

The standard deviation

Standard deviation = 0.287

The coefficient of variation

Coefficient of variation = 8.77%

Regression model

Analyzing the relationship between GPA, number of absent classes and number of self-study hours

1. Descriptive statistics

The dependent variable in this model is GPA, and the independent variables are the number of classes students are absent from and the number of hours spent on self-study. Specifically:

y: GPA
x1: the number of classes students are absent from
x2: the number of hours spent on self-study

2. Regression model

We can calculate the estimates $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$ (or b0, b1 and b2) of the regression function in Excel and check the result with the Gretl software.

Step 1: Calculating in Excel

The population model is:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

where
$\beta_0$: intercept of y
$\beta_1$, $\beta_2$: population slope coefficients
$\varepsilon$: random error

The estimated multiple regression model is:

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

where
$\hat{y}$: estimated (predicted) value of y
$b_0$: estimated intercept
$b_1$, $b_2$: estimated slope coefficients

b0, b1 and b2 are obtained by solving the following system of equations:

$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$
$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$
$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$

We have:

b0 = 3.31947
b1 = -0.0323693
b2 = 0.0739471

The estimated model is:

$\hat{y} = 3.31947 - 0.0323693 x_1 + 0.0739471 x_2$
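The three equations above form a linear system in b0, b1 and b2 that can also be solved outside Excel. The following Python sketch is an illustration only: the function names are hypothetical, the solver is a plain Gauss-Jordan elimination, and the demonstration data are made up (they are not the survey data), chosen so that the true coefficients are known in advance.

```python
def solve_3x3(a, b):
    """Solve the 3x3 linear system a * x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    size = 3
    for col in range(size):
        # Partial pivoting: bring the row with the largest entry into place.
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]


def fit_two_predictors(x1, x2, y):
    """Build the normal equations for y = b0 + b1*x1 + b2*x2 and solve them."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * t for u, t in zip(x1, y))
    s2y = sum(v * t for v, t in zip(x2, y))
    a = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    b = [sy, s1y, s2y]
    return solve_3x3(a, b)


# Made-up data (NOT the survey data): y = 3 - 0.5*x1 + 0.25*x2 with no noise.
x1 = [0, 1, 2, 3, 4, 5]
x2 = [2, 0, 1, 4, 3, 1]
y = [3 - 0.5 * u + 0.25 * v for u, v in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Passing the three appendix columns (Absent, Selfstudy, GPA) to `fit_two_predictors` in place of the made-up lists would apply the same procedure to the survey data.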

Step 2: Testing the calculation by Gretl:

Model 1: OLS, using observations 1-75
Dependent variable: GPA_Y

                 Coefficient    Std. Error    t-ratio    p-value
const             3.31947       0.133021      24.9546    <0.00001  ***
Absent_X1        -0.0323693     0.019043      -1.6998     0.09349  *
Self_study_X2     0.0739471     0.0462074      1.6003     0.11390

Mean dependent var     3.350533    S.D. dependent var     0.662658
Sum squared resid      30.14909    S.E. of regression     0.647099
R-squared              0.072181    Adjusted R-squared     0.046408
F(2, 72)               2.800666    P-value(F)             0.067403
Log-likelihood        -72.24539    Akaike criterion       150.4908
Schwarz criterion      157.4432    Hannan-Quinn           153.2668
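Several figures in the Gretl output are linked by standard OLS identities, so the output can be cross-checked from its own numbers. The Python sketch below is illustrative; it uses only values printed above, and the variable names are ours.

```python
from math import sqrt

n, k = 75, 2                       # observations and number of predictors

# Values copied from the Gretl output.
coef = {"const": 3.31947, "Absent_X1": -0.0323693, "Self_study_X2": 0.0739471}
std_err = {"const": 0.133021, "Absent_X1": 0.019043, "Self_study_X2": 0.0462074}
ssr = 30.14909                     # sum of squared residuals
r2 = 0.072181                      # R-squared

# t-ratio = coefficient / standard error
t_ratio = {name: coef[name] / std_err[name] for name in coef}

# Standard error of the regression = sqrt(SSR / (n - k - 1))
se_regression = sqrt(ssr / (n - k - 1))

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F statistic with (k, n - k - 1) degrees of freedom.
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

for name, t in t_ratio.items():
    print(f"{name:<14} t = {t:.4f}")
print(f"S.E. of regression = {se_regression:.6f}")
print(f"Adjusted R-squared = {adj_r2:.6f}")
print(f"F(2, 72)           = {f_stat:.4f}")
```

Small differences in the last printed digit are expected, because the inputs are themselves rounded values from the output table.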

The sample regression model, with residual e, is: y = 3.31947 - 0.0323693x1 + 0.0739471x2 + e (*)

R2 = 0.072181 means that the variation in the independent variables explains 7.2181% of the total variation in the dependent variable; the remaining variation is due to factors not included in the model. From the final result we can draw the following conclusions:

- b0 = 3.31947 is the predicted GPA of a student with no absences and no self-study hours. It reflects that GPA is affected not only by the number of classes students are absent from and the number of hours spent on self-study but also by other factors.
- Equation (*) shows that the number of classes a student skips has a negative effect on GPA. Each additional absent class lowers the predicted GPA by 0.0323693 if other factors remain unchanged. This coefficient is significant at the 10% level (p-value = 0.09349).
- Equation (*) also shows a positive relationship between the hours a student spends on self-study and GPA. A coefficient of 0.0739471 means that each additional hour of self-study raises the predicted GPA by 0.0739471 if other factors remain unchanged. This coefficient is not statistically significant at the 10% level (p-value = 0.11390).
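To make the interpretation of the coefficients concrete, the estimated equation can be evaluated for a given student. The sketch below is illustrative: the function name `predict_gpa` and the example student (2 absent classes per month, 2 hours of self-study) are hypothetical, and with an R-squared of about 7% any single prediction is only a rough indication.

```python
# Coefficients of the estimated model, copied from the report.
B0, B1, B2 = 3.31947, -0.0323693, 0.0739471


def predict_gpa(absent_classes, self_study_hours):
    """Predicted GPA from the estimated equation y^ = b0 + b1*x1 + b2*x2."""
    return B0 + B1 * absent_classes + B2 * self_study_hours


# Hypothetical student: 2 absent classes per month, 2 hours of self-study.
baseline = predict_gpa(2, 2)

# One more absent class, self-study unchanged: prediction changes by b1.
one_more_absence = predict_gpa(3, 2) - baseline

# One more hour of self-study, absences unchanged: prediction changes by b2.
one_more_hour = predict_gpa(2, 3) - baseline

print(round(baseline, 4), round(one_more_absence, 4), round(one_more_hour, 4))
```

The two differences equal b1 and b2, which is exactly the "other factors remain unchanged" reading of the slope coefficients.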

Conclusion
Due to limited time and resources, our group chose to investigate only a small population, the High Quality Class of the Faculty of Finance and Banking, rather than the whole university or other universities. With more time, we believe we could collect more data and thereby obtain more accurate results.

In our results, the number of hours students spend on self-study has a positive relationship with their GPA, while the number of classes they are absent from has a negative one. The survey results and the data collected therefore suggest that students who want to improve their GPA should be more diligent. Specifically, we students need to spend more time improving our knowledge by studying hard at school, listening carefully to the teachers' lectures and doing homework regularly; moreover, the number of hours spent on self-study is very important for each student.

Finally, through this investigation, we hope not only to offer something useful to students and teachers; we have also gained helpful and practical knowledge that we can use many times afterward.

Appendix
Selfstudy  GPA   Absent    Selfstudy  GPA   Absent    Selfstudy  GPA   Absent
0          2.88  10        1          3.2   2         4          3.27  5
2          3.2   2         0.5        2.67  4         0.5        3.2   5
3          3.54  4         2          2.93  2         10         3.76  3
2          2.91  5         3          8.5   0         3          3.04  25
5          3.22  2         1          3.22  0         0.5        3.8   1
0          3.13  12        1          3.48  1         2          3.3   2
0          3.03  5         1          3.4   5         0.5        3.42  2
0          3.2   10        2          2.9   7         2          3.2   4
1          3.34  4         2          3.28  7         4          3.79  0
0.5        3.1   2         0.5        3.22  4         0.25       3.23  0
1          3.47  4         0.5        3.38  1         2          3.17  4
1          3.24  0         2          3.9   1         2          3     2
1          2.8   2         1.5        3.19  5         3          3.65  1
1          2.97  3         5          3.31  7         2          3.3   8
1          3.49  2         5          3.54  8         2          3.31  4
4          3.2   1         2          2.9   5         2          3.42  0
0          3.3   2         3          3.62  0         4          2.8   5
0.5        3.1   3         0.5        3.5   1         1          3.2   2
2          3.23  0         2          3.67  2         2          3.25  4
2          3     5         1          3.38  1         3          4     0
2          3.41  3         2          3.2   4         2          3.35  2
2          3.04  1         1          3.67  0         0          3.35  16
2          2.83  1         1          3.2   1         4          3.3   0
5          3.2   1         4          3.9   4         0.5        3.15  0
0.5        3     2         2          3.64  0         1          3.4   0
