
MB0040 - Statistics for Management semester-I Assignment Set-1 Q.1a.

What is the difference between a qualitative and quantitative variable?

Ans: Scientific experiments normally have three types of variables: controlled, independent and dependent. A variable is a condition or factor that is used in testing a hypothesis and reaching a conclusion. Each of these three types of variable can be quantitative or qualitative in nature.

Qualitative: By definition, something that is qualitative concerns or describes a quality. A qualitative variable is descriptive, and qualitative variables are sometimes referred to as categorical. The variable may be colors in the light spectrum or a comparison between red and green grapes. Qualitative variables can influence the outcome of an experiment or research because they can influence other factors or parameters. Qualitative variables are frequently used in social research. Qualitative research is considered to be inductive.

Quantitative: By definition, something that is quantitative can be expressed as a quantity or number. Quantitative variables can be measured and are numerical. A quantitative variable can be a percentage of something, a number of units or any other measurement. Temperature is a quantitative variable, measured by the number of degrees. Speed, area, population, voltage and time are all examples of quantitative variables that can be measured. Quantitative variables are most often considered to be deductive in nature.

Deduction and induction in experimentation and research: Deduction works from a general idea to a specific idea. Deductive research starts with a theory, forms a hypothesis, gathers observations and then confirms or disproves the original thought. Induction works in the reverse. Inductive experimentation starts with an observation and then looks for patterns in the observation. Once patterns form, a hypothesis is developed. The hypothesis is then tested for a resulting theory.

The best results in experimentation come from having only one independent variable. The controlled variable is something that does not change and must remain constant. The independent variable is the variable that is changed by the researcher. The dependent variable is the variable that changes due to the independent variable.

An example of quantitative variables in an experiment would be testing the change in speed of a turntable as additional weight is applied. The turntable itself is the controlled variable; the experimenter will only use one. The independent quantitative variable is the amount of weight applied for each measurement. The dependent quantitative variable is the resulting speed that is measured.

An example of a qualitative variable in testing would be the drying time required for red and green grapes at a constant temperature. The outcome, or dependent variable, is time, which is measured and therefore quantitative. The controlled variable is temperature, also quantitative. The independent variable is qualitative: the difference between red and green grapes. In this particular example the weight of each grape, a quantitative variable, would also need to be consistent or controlled.

Q1 (B) A town has 15 neighbourhoods. If you interviewed everyone living in one particular neighbourhood, would you be interviewing a population or a sample from the town? Would this be a random sample? If you had a list of everyone living in the town, called a frame, and you randomly selected 100 people from all neighbourhoods, would this be a random sample?

Ans: If the study concerns only one neighbourhood and you interview everyone living in it, you have interviewed a population. If the study concerns the whole town and you interview only one of its 15 neighbourhoods, you have interviewed a sample from the town. Such a sample is also called a chunk, which refers to a fraction of the population being investigated that is selected neither by probability nor by judgment. If you interviewed everyone in one of the 15 neighbourhoods you would know a lot about how the people in that neighbourhood feel about the issues, but nothing about how the people in the other neighbourhoods feel. So no, it is not a good or random sample of the town. Picking 100 people at random from the frame is (by definition) a random sample. 100 may or may not be large enough.
The number you need to interview depends on how many people are in the town, how certain you need to be of your result, and how good you actually are at choosing randomly and getting people to respond to questions that you have skillfully constructed.
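The contrast between a random sample drawn from a frame and a single-neighbourhood chunk can be sketched in a few lines of Python. The frame below is invented for illustration (15 neighbourhoods of 200 residents each is an assumption, not a figure from the question), and the fixed seed only makes the sketch repeatable.

```python
import random

# Hypothetical frame: (neighbourhood, resident) pairs.
# 15 neighbourhoods x 200 residents is an assumed size, not from the question.
frame = [(n, p) for n in range(1, 16) for p in range(200)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simple random sample: every resident in the frame has the same chance
# of being selected, and nobody is selected twice.
random_sample = rng.sample(frame, 100)

# Chunk: everyone in one neighbourhood, chosen without any random mechanism.
chunk = [person for person in frame if person[0] == 1]

neighbourhoods_in_sample = {n for n, _ in random_sample}
neighbourhoods_in_chunk = {n for n, _ in chunk}

print("random sample size:", len(random_sample))
print("neighbourhoods covered by random sample:", sorted(neighbourhoods_in_sample))
print("neighbourhoods covered by chunk:", sorted(neighbourhoods_in_chunk))
```

The random sample spreads across neighbourhoods, while the chunk covers exactly one, which is why it says nothing about the rest of the town.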

Q2. (A) Explain the steps involved in planning of a statistical survey?

Ans: Stages in a statistical survey:
1. The nature of the problem to be investigated should be clearly defined in an unambiguous manner.
2. The objectives of the investigation should be stated at the outset. Objectives could be to:
* Obtain certain estimates.
* Establish a theory.
* Verify an existing statement.
* Find relationships between characteristics.
3. The scope of the investigation has to be made clear. The scope of the investigation refers to the area to be covered, identification of units to be studied, nature of characteristics to be observed, accuracy of measurement, analytical method, time, cost and other resources required.
4. Whether to use data collected from primary sources or secondary sources should be determined in advance.
5. The organization of the investigation is the final step in the process. It encompasses the determination of the number of investigators required, their training, the supervision work needed and the funds required.

Q.2 b. What are the merits & Demerits of Direct personal observation and Indirect Oral Interview?

Ans: Direct personal observation: In the direct personal observation method, the investigator collects data by having direct contact with the units of investigation. The accuracy of the data depends upon the ability, training, and attitude of the investigator.
Merits:
* We get the original data, which is more accurate and reliable.
* Satisfactory information can be extracted by the investigator through indirect questions.
* Data is homogeneous and comparable.
* Additional information can be gathered.
* Misinterpretation of questions can be avoided.
Demerits:
* This method costs more money.
* This method takes more time.
* This cannot be used when the scope of the investigation is wide.

Indirect oral interview: Indirect oral interview is used when the area to be covered is large. The investigator collects the data from a third party, a witness or the head of the institution. This method is generally used by police departments in cases related to enquiries on causes of fires, thefts or murders.
Merits:
* Economical in terms of time, cost and manpower.
* Confidential information can be collected.
* Information is likely to be unbiased and reliable.
Demerits:
* The degree of accuracy of information is less.

Q.3 a) Draw Ogives from the following data and measure the median value. Verify it by actual calculations.
Central size   Frequency
5              5
15             11
25             21
35             16
45             10

Ans:
Central value   Limits   Frequency   Less than   cf   Greater than   cf
5               0-10     5           10          5    0              63
15              10-20    11          20          16   10             58
25              20-30    21          30          37   20             47
35              30-40    16          40          53   30             26
45              40-50    10          50          63   40             10
Total                    63
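The cumulative columns of the table, and the median they imply, can be reproduced with a short script. This is a minimal sketch that assumes the usual grouped-data interpolation formula, Median = L + ((N/2 - c)/f) * h; the function name `grouped_median` is chosen here for illustration.

```python
from itertools import accumulate

lower_limits = [0, 10, 20, 30, 40]   # lower class limits from the table
width = 10                           # common class width
freqs = [5, 11, 21, 16, 10]

n = sum(freqs)                                         # 63
less_than = list(accumulate(freqs))                    # cf at each upper limit
greater_than = [n - c for c in [0] + less_than[:-1]]   # cf at each lower limit

def grouped_median(lower_limits, freqs, width):
    """Interpolate the median inside the class containing the N/2-th item."""
    half = sum(freqs) / 2
    cum_below = 0
    for lower, f in zip(lower_limits, freqs):
        if cum_below + f >= half:
            return lower + (half - cum_below) / f * width
        cum_below += f
    raise ValueError("empty distribution")

median = grouped_median(lower_limits, freqs, width)
print(less_than)          # [5, 16, 37, 53, 63]
print(greater_than)       # [63, 58, 47, 26, 10]
print(round(median, 2))   # about 27.38
```

The two ogives cross where both cumulative frequencies equal N/2, which is the same point the interpolation formula locates.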

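Now, from the meeting point of these two ogives, if we draw a perpendicular to the X axis, the point where it meets the X axis gives the median of the series. At the meeting point both curves have the cumulative frequency N/2 = 63/2 = 31.5, which falls in the 20-30 class, so the median read from the graph is approximately 27.4.

By actual calculation: here N = 63, so the median is the value of the (N/2)th = 31.5th item, which lies in the 20-30 class (lower limit L = 20, cumulative frequency below the class c = 16, class frequency f = 21, class width h = 10).
Median = L + ((N/2 - c)/f) * h = 20 + ((31.5 - 16)/21) * 10 = 20 + 7.38 = 27.38
So the ogive median and the calculated median are the same.

3 (b) Complete the following distribution, if its Median is 2600 and compute the value of Arithmetic Mean.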
Size        Frequency
1000-1500   120
1500-2000   ?
2000-2500   400
2500-3000   500
3000-4000   ?
4000-5000   50
5000-6000   20
Total       1500

Ans:
Size        f          cf
1000-1500   120        120
1500-2000   f1         120+f1
2000-2500   400        520+f1
2500-3000   500        1020+f1
3000-4000   410-f1*    1430
4000-5000   50         1480
5000-6000   20         1500

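N = 1500 (given)
* Frequency of the 3000-4000 class: f2 = 1500 - (120 + 400 + 500 + 50 + 20) - f1 = 410 - f1
Median = value of the (N/2)th item = 1500/2 = 750th item. The median is 2600 (given), which lies in the 2500-3000 class.
Now M = L1 + ((L2 - L1)/f) * (m - c), where m = N/2, c is the cumulative frequency below the median class and f is the frequency of the median class.
2600 = 2500 + ((3000 - 2500)/500) * (750 - (520 + f1))
=> 2600 = 2500 + (500/500) * (750 - 520 - f1)
=> 2600 - 2500 = 230 - f1
=> 100 = 230 - f1
=> f1 = 130
Then f2 = 410 - 130 = 280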
CI          f      m      fm
1000-1500   120    1250   150000
1500-2000   130    1750   227500
2000-2500   400    2250   900000
2500-3000   500    2750   1375000
3000-4000   280    3500   980000
4000-5000   50     4500   225000
5000-6000   20     5500   110000
Total       1500          3967500
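As a cross-check on the completed table, the missing frequencies and the mean can be recomputed directly. This is a sketch under the same median formula used in the answer; the helper name `missing_frequency` and its parameter names are illustrative only.

```python
def missing_frequency(median, lower, width, f_median, n, known_cf_below):
    """Solve median = lower + (width / f_median) * (n/2 - (known_cf_below + f1)) for f1."""
    return n / 2 - known_cf_below - (median - lower) * f_median / width

n = 1500
# Known cumulative frequency below the 2500-3000 class, excluding f1: 120 + 400.
f1 = missing_frequency(median=2600, lower=2500, width=500,
                       f_median=500, n=n, known_cf_below=120 + 400)
f2 = n - (120 + 400 + 500 + 50 + 20) - f1

midpoints = [1250, 1750, 2250, 2750, 3500, 4500, 5500]
freqs = [120, f1, 400, 500, f2, 50, 20]

total_fm = sum(f * m for f, m in zip(freqs, midpoints))
mean = total_fm / sum(freqs)

print(f1, f2)    # 130.0 280.0
print(total_fm)  # 3967500.0
print(mean)      # 2645.0
```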

Mean = (sum of fm) / (sum of f) = 3967500/1500 = 2645 (ans)

Q.4a) What is the main difference between correlation analysis and regression analysis?

Ans: Correlation analysis: When two or more variables move in sympathy with each other, they are said to be correlated. If both variables move in the same direction then they are said to be positively correlated. If the variables move in opposite directions then they are said to be negatively correlated. If they move haphazardly then there is no correlation between them.

Regression analysis: Regression analysis is used to estimate the values of the dependent variable from the values of the independent variables. Regression analysis is also used to get a measure of the error involved while using the regression line as a basis for estimation. Regression coefficients can be used to calculate the correlation coefficient.

The main difference between these two is: correlation analysis attempts to study the relationship between the variables X and Y. Regression analysis attempts to predict the average X for a given Y. It attempts to quantify the dependence of one variable on the other.

Difference between regression coefficient and correlation coefficient
Correlation coefficient                              Regression coefficient
rxy = ryx                                            byx is not equal to bxy in general
-1 <= r <= 1                                         byx can be greater than one, but then bxy must be less than one, such that byx.bxy <= 1
Nonsense correlation can exist                       There is no such nonsense regression
It indirectly helps in estimation                    It is meant for estimation
It has no units attached to it                       It has units attached to it
It is not based on cause and effect relationship     It is based on cause and effect relationship
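The relationship between the two kinds of coefficient can be illustrated numerically. The five data points below are made up for the sketch and do not come from the assignment; the point is only that r is symmetric in X and Y, that byx and bxy generally differ, and that their product equals r squared.

```python
import math

# Illustrative data, invented for this sketch.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sums_of_squares(a, b):
    """Return (S_ab, S_aa, S_bb): sums of cross-products and squares about the means."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab, s_aa, s_bb

s_xy, s_xx, s_yy = sums_of_squares(x, y)

r_xy = s_xy / math.sqrt(s_xx * s_yy)   # correlation of X with Y
s_yx, _, _ = sums_of_squares(y, x)
r_yx = s_yx / math.sqrt(s_yy * s_xx)   # correlation of Y with X, same value

b_yx = s_xy / s_xx   # regression coefficient of Y on X
b_xy = s_xy / s_yy   # regression coefficient of X on Y

print(round(r_xy, 4), round(r_yx, 4))
print(b_yx, b_xy)
print(round(b_yx * b_xy, 4), round(r_xy ** 2, 4))
```

Because byx.bxy = r squared and r lies between -1 and 1, the product of the two regression coefficients can never exceed one.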

Q.4b) In a multiple regression model with 12 independent variables, what are the degrees of freedom for error? Explain?

Ans: Multiple regression analysis is an extension of two-variable regression analysis. In this analysis, two or more independent variables are used to estimate the values of a dependent variable, instead of one independent variable. Objectives of multiple regression analysis are:
* To derive an equation which provides estimates of the dependent variable from values of the two or more independent variables.
* To obtain a measure of the error involved in using the regression equation as a basis of estimation.
* To obtain a measure of the proportion of variance in the dependent variable accounted for or explained by the independent variables.

In the given question there are k = 12 independent variables. The degrees of freedom for error are v = n - k - 1, where n is the sample size, because k regression coefficients and one intercept are estimated from the data. So the degrees of freedom for error will be n - 12 - 1 = n - 13. The sample size is not given in the question, so the answer is expressed in terms of n.

5. a) Discuss what is meant by Quality control and quality improvement.

Ans: Quality Control is defined as the part of quality management focused on fulfilling quality requirements. Ideally, prevention based controls should prevent problems from occurring, but in reality no system is foolproof and problems do occur. Accordingly, controls to detect quality problems must be established so that customers receive only products that meet their requirements. Detection based controls are reactive: the problem and cost have already occurred and the company is resorting to damage control. The intent of detection is to evaluate output from processes and activities by implementing controls to catch problems when they do occur. An example is final inspection to catch defective product before it gets shipped.

Quality Improvement is defined as the part of quality management focused on increasing the ability to fulfill requirements. Continual improvement results from ongoing actions taken to enhance product characteristics or increase process effectiveness and efficiency. This is one of the key characteristics that differentiate a quality management system from a quality assurance system, i.e., being able to improve the effectiveness and efficiency of a process or activity by setting measurable objectives and using performance data to manage the achievement of these objectives.

Effectiveness is defined as the extent to which planned activities are realized and planned results are achieved. In determining the effectiveness of quality assurance and quality improvement activities, the following questions should be asked:
* To what extent have problems in product or processes been prevented?
* To what extent have planned objectives for quality been met?
Efficiency is defined as the relationship between result achieved and resources used. The measure of efficiency is determined by asking the following:
* Can we get the same output using fewer resources?
* Can we get more output without adding resources?
These questions may be applied to the output of any activity within the quality management system of an organization. It should be noted that ISO 9001 requires organizations to achieve QMS effectiveness through quality assurance and continual improvement activities. QMS efficiency is desirable, but not currently required by ISO 9001. ISO 9004 provides guidelines that consider both the effectiveness and efficiency of the QMS.

Quality improvement actions may include:
* Measuring and analyzing situations
* Establishing improvement objectives
* Searching for possible solutions
* Evaluating these solutions
* Implementing the selected solution
* Measuring, verifying, and analyzing results
* Formalizing the changes

Q.5 b) What are the limitations of a quality control charts?

The quality control chart discussed here, the Pareto chart, is based on the research of Vilfredo Pareto. He found that approximately 80 percent of all wealth of the Italian cities he researched was held by only 20 percent of the families. The Pareto principle has been found to apply in other areas, from economics to quality control. Pareto charts have several disadvantages.

Easy to Make but Difficult to Troubleshoot
* Based on the Pareto principle, any process improvement should focus on the 20 percent of issues that cause the majority of problems in order to have the greatest impact. However, one of the disadvantages of Pareto charts is that they provide no insight into the root causes. For example, a Pareto chart may demonstrate that half of all problems occur in shipping and receiving. Failure Modes Effect Analysis, Statistical Process Control charts, run charts and cause-and-effect charts are needed to determine the most basic reasons that the major issues identified by the Pareto chart are occurring.

Multiple Pareto Charts May Be Needed
* Pareto charts can show where the major problems are occurring. However, one chart may not be enough. To trace the cause of the errors to its source, lower levels of Pareto charts may be needed. If mistakes are occurring in shipping and receiving, further analysis and more charts are needed to show that the biggest contributor is in order-taking or label-printing.
Another disadvantage of Pareto charts is that as more are created with finer detail, it is also possible to lose sight of these causes in comparison to each other. The top 20 percent of root causes in a Pareto analysis two to three layers down from the original Pareto chart must also be compared to each other so that the targeted fix will have the greatest impact.

Qualitative Data versus Quantitative Data
* Pareto charts can only show qualitative data that can be observed. They merely show the frequency of an attribute or measurement. One disadvantage of Pareto charts is that they cannot be used to calculate the average of the data, its variability or changes in the measured attribute over time. They cannot be used to calculate the mean, the standard deviation or other statistics needed to translate data collected from a sample into an estimate of the state of the real-world population. Without quantitative data and the statistics calculated from that data, it isn't possible to mathematically test the values. Quantitative statistics are needed to determine whether or not a process can stay within a specification limit. While a Pareto chart may show which problem is the greatest, it cannot be used to calculate how bad the problem is or how far changes would bring a process back into specification.

Q6. a) Suggest a more suitable average in each of the following cases: (i) Average size of ready-made garments. (ii) Average marks of a student.

Ans: Average size of ready-made garments: The mode will be used, because garments are made in a limited set of discrete sizes and the size in greatest demand is the most useful figure. Average marks of a student: Arithmetic mean will be used because the data are on an interval scale and the distribution is symmetrical.

Q.6b) State the nature of symmetry in the following cases: (i) When median is greater than mean (ii) When mean is greater than median

Ans: When median is greater than mean, the series is said to have negative skewness. The following characteristics can be seen:
* Mode > Median > Mean
* The left tail of the curve is longer than the right tail, when the data are plotted through a histogram, or a frequency polygon.
* The formula of skewness and its coefficients give negative figures.

When mean is greater than median, the series is said to have positive skewness; the following characteristics can be seen * Mean > Median > Mode * The right tail of the curve is longer than its left tail, when the data are plotted through a histogram, or a polygon. * The formula of skewness and its coefficients give positive figures. The following example would show the above distributions and their respective characteristics:
            Positively skewed        Negatively skewed
Value (X)   F     FX     CF          F     FX     CF
10          5     50     5           5     50     5
20          15    300    20          7     140    12
30          13    390    33          9     270    21
40          11    440    44          11    440    32
50          9     450    53          13    650    45
60          7     420    60          15    900    60
70          5     350    65          5     350    65
Total       65    2400               65    2800

Positively skewed: Mean = 2400/65 = 37 (approx.); Median = (65+1)/2 = 33rd item = 30
Negatively skewed: Mean = 2800/65 = 43 (approx.); Median = 33rd item = 50
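The orderings Mean > Median > Mode and Mode > Median > Mean can be checked against the two frequency columns of the example. This is a small sketch; the function name `summarize` is illustrative, and the median is located as the (N+1)/2-th item in the cumulative frequencies, as in the example itself.

```python
values = [10, 20, 30, 40, 50, 60, 70]
positive_freqs = [5, 15, 13, 11, 9, 7, 5]   # positively skewed column
negative_freqs = [5, 7, 9, 11, 13, 15, 5]   # negatively skewed column

def summarize(values, freqs):
    """Return (mean, median, mode) of a discrete frequency distribution."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    target = (n + 1) / 2          # position of the median item
    cumulative = 0
    median = None
    for v, f in zip(values, freqs):
        cumulative += f
        if cumulative >= target:  # first value whose cf reaches the median item
            median = v
            break
    mode = values[freqs.index(max(freqs))]  # value with the highest frequency
    return mean, median, mode

pos_mean, pos_median, pos_mode = summarize(values, positive_freqs)
neg_mean, neg_median, neg_mode = summarize(values, negative_freqs)

print(round(pos_mean, 2), pos_median, pos_mode)
print(round(neg_mean, 2), neg_median, neg_mode)
```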
