http://office.microsoft.com/enus/excel/HP011277241033.aspx
For Excel 2007:
http://office.microsoft.com/enus/excel/HP100215691033.aspx
For Excel 2010:
http://office.microsoft.com/enus/excelhelp/loadtheanalysistoolpakHP010342659.aspx?CTT=1
If you still have problems, see:
http://www.addictivetips.com/windowstips/excel2010dataanalysis

Problem Set #3
Hypothesis Testing
1. University of Maryland University College is concerned that out-of-state students may be receiving lower grades than Maryland students. Two independent random samples have been selected: 165 observations from population 1 (out-of-state students) and 177 from population 2 (Maryland students). The sample means obtained are \bar{X}_{1} = 86 and \bar{X}_{2} = 87. It is known from previous studies that the population variances are 8.1 and 7.3, respectively. Using a level of significance of 0.01, is there evidence that the out-of-state students may be receiving lower grades? Fully explain your answer.
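Because the population variances are stated as known, one common way to approach this problem is a two-sample z test with a lower-tailed alternative (H1: mu1 < mu2). The following is a minimal sketch under that assumption, using only the Python standard library; the function and variable names are illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """z statistic for H0: mu1 - mu2 = 0 with known population variances."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Values from the problem statement
# (population 1 = out-of-state students, population 2 = Maryland students).
z = two_sample_z(86, 87, 8.1, 7.3, 165, 177)

alpha = 0.01
std_normal = NormalDist()                # mean 0, standard deviation 1
z_critical = std_normal.inv_cdf(alpha)   # lower-tail cutoff for H1: mu1 < mu2
p_value = std_normal.cdf(z)              # lower-tail p-value

print(f"z = {z:.3f}, critical value = {z_critical:.3f}, p-value = {p_value:.5f}")
print("Reject H0" if z < z_critical else "Fail to reject H0")
```

The decision rule compares z with the lower-tail critical value; the written answer should still state the hypotheses and interpret the result in terms of student grades.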
Simple Regression
2. A CEO of a large pharmaceutical company would like to determine whether the company should allot more money in next year's budget for television advertising of a new drug marketed for controlling diabetes. He wonders whether there is a strong relationship between the amount of money spent on television advertising for this new drug, called DIB, and the number of orders received. The manufacturing process of this drug is very difficult and requires stability, so the CEO would prefer to generate a stable number of orders. The cost of advertising is always an important consideration in the phase I rollout of a new drug. Data collected over the past 20 months indicate the amount of money spent on television advertising and the number of orders received.
The use of linear regression is a critical tool for a manager's decision-making ability. Please carefully read the example below and try to answer the questions in terms of the problem context. The results are as follows:
Month   Advertising Cost ($)   Number of Orders
1       74,430                 2,856,000
2       62,620                 1,800,000
3       67,580                 1,299,000
4       53,680                 1,510,000
5       69,180                 1,367,000
6       73,140                 2,611,000
7       85,370                 3,788,000
8       76,880                 2,935,000
9       66,990                 1,955,000
10      77,230                 3,634,000
11      61,380                 1,598,000
12      62,750                 1,867,000
13      63,270                 1,899,000
14      86,190                 3,245,000
15      60,030                 1,934,000
16      79,210                 2,761,000
17      67,770                 1,625,000
18      84,530                 3,778,000
19      79,760                 2,979,000
20      84,640                 3,814,000

a. Set up a scatter diagram and calculate the associated correlation coefficient. Discuss how strong you think the relationship is between the amount of money spent on television advertising and the number of orders received. Please use the Correlation procedures within Excel under Tools > Data Analysis. The Scatterplot can more easily be generated using the Chart procedure.
NOTE: If you do not have the Data Analysis option under Tools, you must install it. Go to Tools, select Add-Ins, and then choose the two Analysis ToolPak options. It should take about a minute.
b. Assuming there is a statistically significant relationship, use the least squares method to find the regression equation to predict the advertising costs based on the number of orders received. Please use the regression procedure within Excel under Tools > Data Analysis to construct this equation.
c. Interpret the meaning of the slope, b_{1}, in the regression equation.
d. Predict the monthly advertising cost when the number of orders is 2,300,000. (Hint: Be very careful with assigning the dependent variable for this problem)
e. Compute the coefficient of determination, r^{2}, and interpret its meaning.
f. Compute the standard error of estimate, and interpret its meaning.
g. Do you think that the company should use these results from the regression to base any corporate decisions on? Explain fully.
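The quantities requested in parts a through f can be cross-checked outside Excel. The sketch below is one possible check, not the required Excel procedure: it computes the correlation coefficient, the least squares line, the prediction at 2,300,000 orders, r^{2}, and the standard error of estimate in plain Python. It assumes, following the hint in part d, that advertising cost is the dependent variable and number of orders is the independent variable; all variable and function names are illustrative.

```python
from math import sqrt

# (advertising cost in dollars, number of orders) for months 1 through 20.
data = [
    (74430, 2856000), (62620, 1800000), (67580, 1299000), (53680, 1510000),
    (69180, 1367000), (73140, 2611000), (85370, 3788000), (76880, 2935000),
    (66990, 1955000), (77230, 3634000), (61380, 1598000), (62750, 1867000),
    (63270, 1899000), (86190, 3245000), (60030, 1934000), (79210, 2761000),
    (67770, 1625000), (84530, 3778000), (79760, 2979000), (84640, 3814000),
]
cost = [c for c, _ in data]      # dependent variable (y), assumed per the hint in part d
orders = [o for _, o in data]    # independent variable (x)
n = len(data)

def mean(values):
    return sum(values) / len(values)

def sum_cross(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

s_xx = sum_cross(orders, orders)
s_yy = sum_cross(cost, cost)
s_xy = sum_cross(orders, cost)

r = s_xy / sqrt(s_xx * s_yy)            # part a: correlation coefficient
b1 = s_xy / s_xx                        # parts b and c: slope (dollars per order)
b0 = mean(cost) - b1 * mean(orders)     # part b: intercept (dollars)
predicted_cost = b0 + b1 * 2_300_000    # part d: prediction at 2,300,000 orders
r_squared = r ** 2                      # part e: coefficient of determination

# Part f: standard error of estimate, using n - 2 degrees of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(orders, cost))
std_error = sqrt(sse / (n - 2))

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"cost = {b0:.2f} + {b1:.6f} * orders")
print(f"predicted cost at 2,300,000 orders = {predicted_cost:.2f}")
print(f"standard error of estimate = {std_error:.2f}")
```

The values should agree with the Excel Correlation and Regression output up to rounding; the interpretation asked for in parts c, e, f, and g still has to be written in terms of the problem context.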
Hypothesis Testing on Multiple Populations
3. Dr. Michaella Evans, a statistics professor at the University of Maryland University College, drives from her home to the school every weekday. She has three options to drive there. She can take the Beltway, or she can take a main highway with some traffic lights, or she can take the back road, which has no traffic lights but is a longer distance. Being as data-oriented as she is, she is interested to know if there is a difference in the time it takes to drive each route.
As an experiment she randomly selected the route on 21 different days and wrote down the time it took her for the round trip, getting to work in the morning and back home in the evening. At the .01 significance level, can she conclude that there is a difference between the driving times using the different routes?
Time (in minutes) it took to get to work and back using:
Beltway   Main highway   Back road
88        79             86
94        86             78
91        75             79
88        83             96
98        74             97
84        72             73
90                       68
77

You can check your critical value with the following table: http://www.statsoft.com/textbook/distributiontables
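A comparison of three independent group means is commonly handled with a one-way ANOVA F test. The sketch below assumes that technique, since the problem does not name it, and it assumes a particular reading of the table: 8 Beltway times, 6 main highway times, and 7 back road times, for 21 days in total. The helper function one_way_anova is illustrative, and the critical value is the usual tabulated F value for 2 and 18 degrees of freedom at the 0.01 level.

```python
def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Assumed column assignment of the round-trip times (in minutes).
beltway = [88, 94, 91, 88, 98, 84, 90, 77]
main_highway = [79, 86, 75, 83, 74, 72]
back_road = [86, 78, 79, 96, 97, 73, 68]

f_stat, df_b, df_w = one_way_anova([beltway, main_highway, back_road])

F_CRITICAL = 6.01  # tabulated F(0.01; 2, 18); confirm against the linked table

print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
print("Reject H0" if f_stat > F_CRITICAL else "Fail to reject H0")
```

If the table is read differently, only the three lists need to change; the degrees of freedom and the critical value should then be looked up again.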