# Project #10558 - Statistics Project

Hello, I need help with my Stat project. I don't have a topic. Feel free to choose your own topic as long as it fits the guidelines. Contact me if you have any questions.

The problem you choose should be complex enough to demonstrate your skills at data analysis, including such skills as the identification of any outliers, describing the variables in your data, looking at any relationships between variables that may exist, and testing the significance of any inferences you make about your data. A problem does not need to allow for the use of all of your data analysis skills if it demonstrates some of them well.
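As one way to screen for outliers, the common 1.5 × IQR rule can be sketched in a few lines of Python. This is only a cross-check and not a substitute for the required spreadsheet work. The function name `iqr_outliers` and the sample values are made up for illustration, and the rule itself is an assumed choice, since the guidelines do not prescribe a specific outlier method.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    # method="inclusive" interpolates between the sorted data points;
    # this is assumed to correspond to Excel's QUARTILE.INC, so verify
    # the quartiles against your own sheet.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Made-up illustrative data: one value sits far from the rest.
sample = [10, 12, 11, 13, 12, 14, 11, 13, 12, 95]
print(iqr_outliers(sample))
```

A flagged value is only a candidate outlier; whether to keep, correct, or drop it still has to be argued in the report.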

Use of the spreadsheet is required and all calculations must be shown.  Statistics generated outside Excel must be documented within Excel; for example, the table or graph generated in StatCrunch must be identified as such, with information provided on which Excel data ranges were used.

## Choosing a Problem

1. Where Will I Obtain the Necessary Data?

There are a large number of readily available sources of data. Some of these sources are on the Internet (especially government websites, like www.census.gov).  States and cities may also have downloadable data sets.  Use Google or your favorite search engine (include words like data, dataset, or database along with your topic in the search string).

You may want to look at the links in the Tools for this Course folder for possible project ideas and data. The ESC Library subject area guides are especially good for identifying data sources.  Additional data is available at the Statistics website through StatCrunch (look under the Load data button).

Other sources for data include readily available printed reports (especially government reports such as those produced by the U.S. Bureau of the Census or the U.S. Bureau of Labor Statistics) and, frequently, your own job.  Many employers are willing to give access to data if identifying information is omitted.

If you really prefer to gather your own data, we strongly discourage gathering data by survey. Instead, construct an experiment. For example, one previous project involved testing whether measuring tablespoons that are differently shaped (round, oval, rectangular) are really identical measures for granular material (e.g., cornmeal).  Diets, fuel prices, real estate values, and grades are all sources of data.  Remember that the focus is on showing your ability to set up a problem and use statistical tools to solve it.

2. How Many Observations and Variables Can I Reasonably Work With and Obtain?

You should use at least 40 observations. Keep the number of different variables reasonable. For the best results, you want numerical data that can be separated into several different categories. Be sure to review the data, making sure you understand what the numbers represent. For example, numbers in one column may be in thousands, while another may be in hundreds. This is where common sense and estimation are needed to properly interpret your data.

Also, you should consider how many observations and variables your software can handle. If you are considering working with a large number of observations, you want to download the data from a database.  You are expected to use spreadsheet software in your analysis. When you download your data, save the data in a file and make sure you put the web URL in the file, along with other pertinent citation information.  Data sources must be cited, and data must be included with your report.

3. How Can I Solve This Problem?

Your project should include at least one graph of your data, a numerical summary, a discussion of data distribution, at least two tests, and a confidence interval. Correlation/regression analysis is one appropriate test; z-tests, t-tests, proportion tests, chi-square, and ANOVA are also appropriate. Your choices for tests should reflect the questions being asked and the type of data being analyzed.
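The numerical summary can be cross-checked outside the spreadsheet with a short sketch like the one below. The helper name `numerical_summary` and the sample values are hypothetical; in Excel the same figures would come from functions such as COUNT, AVERAGE, MEDIAN, STDEV.S, MIN, and MAX.

```python
import statistics

def numerical_summary(values):
    """Return basic descriptive statistics as a dictionary."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }

# Made-up illustrative data.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in numerical_summary(sample).items():
    print(f"{name}: {value}")
```

Comparing the mean with the median in this summary already gives a first hint about the shape of the distribution.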

4. What Are the Limitations of Your Data? How Serious Are These Limitations?

Limitations might include, for example, data that are not even close to normally distributed.
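A rough numerical screen for non-normality is the sample skewness: values near 0 suggest a roughly symmetric distribution, while large positive or negative values suggest a long tail. The sketch below uses the adjusted sample formula n/((n-1)(n-2)) · Σ((x - mean)/s)³, which to our knowledge is also the formula behind Excel's SKEW function. The function name and the data are made up for illustration, and this is a screening aid rather than a formal normality test.

```python
import statistics

def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)**3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Made-up illustrative data: one symmetric set, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]
print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
```

A histogram remains the more convincing evidence in the report; the number only backs up what the graph shows.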

## Examples of Projects

To help get you thinking about the problem you want to solve, here are a few examples:

• Which of five variables included in the 1990 Census are most useful in attempting to predict a county's median household income?
• Has there been a relationship between my company's profits and the average number of employees each quarter over the past 10 years?
• A previous project that was particularly impressive was presented by someone who worked in a local fire department, and used data collected from incidents for the last 10 years. First he graphed it to look at the data distribution. Then he used regression techniques to show how incidents were increasing over the years, yet staffing levels were decreasing (a great presentation for a real life meeting). Then he used inference techniques to compare the response levels of several groups (career vs. volunteers; 3 different age groups using ANOVA; and between his fire station and another local one). This was a superb report with clearly stated conclusions! You don't need to do everything we touch on in class, but this is a great example of how to use many techniques on one set of data.
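A trend question like the ones above (profits against employees, or incidents against years) comes down to a least-squares line and a correlation coefficient. The following is a minimal sketch of that calculation; `linear_fit` is a hypothetical helper and the yearly counts are invented, not data from the fire department project. In Excel, SLOPE, INTERCEPT, and CORREL should give the same three numbers.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Made-up yearly incident counts, for illustration only.
years = [1, 2, 3, 4, 5, 6]
incidents = [102, 110, 108, 121, 125, 131]
slope, intercept, r = linear_fit(years, incidents)
print(f"incidents = {intercept:.2f} + {slope:.2f} * year, r = {r:.3f}")
```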

Your instructor is available to discuss any ideas you might have before you actually prepare your proposal. Do not hesitate to ask. Remember that the final presentation will be in the form of a statistical research paper.  The methodology and statistical testing are essential parts of that but so are the problem statement, your conclusions, and analysis.

The following is a list of acceptable tests:

• regression line and equation; correlation
• one-sample t-test
• one-sample t confidence interval
• matched-pairs t-test
• two-sample t-test
• two-sample t confidence interval
• F-test for variances
• ANOVA
• one-sample z test for proportions
• one-sample z confidence interval for proportions
• two-sample z test for proportions
• two-sample z confidence interval for proportions
• chi-square
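To illustrate one entry from this list, here is a sketch of the two-sample z test for proportions with H0: p1 = p2 and a two-sided alternative. It assumes the usual pooled standard error and samples large enough for the normal approximation; the function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled proportion."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up counts: 45 successes out of 100 versus 30 out of 100.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```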

Your report should analyze the data distribution(s) graphically and with summary statistics. It should also have at least two significance tests and a confidence interval.
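For the confidence interval, one simple case is the one-sample z interval for a proportion. The sketch below assumes the standard large-sample formula p̂ ± z*·sqrt(p̂(1 - p̂)/n); the function name, the default 95% level, and the counts are illustrative assumptions.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Large-sample z confidence interval for a single proportion."""
    p_hat = successes / n
    # Critical value z* such that the central area equals `confidence`.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up counts: 40 successes in 100 observations.
low, high = proportion_confidence_interval(40, 100)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```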

Explain what tools you will use to describe the data. Decide which of the tests would best analyze your data. List those tests, state H0 and H1, and describe exactly what you expect to learn from each test. If your data distribution is extremely non-Normal, check whether tests you are proposing would be usable.
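Stating H0 and H1 goes hand in hand with the test statistic. As a sketch, a one-sample t-test of H0: μ = μ0 against H1: μ ≠ μ0 computes t = (mean - μ0) / (s / √n) with n - 1 degrees of freedom. The helper `one_sample_t`, the hypothesized mean of 5, and the data are made up for illustration. The sketch stops at the statistic; the two-sided p-value can then be obtained in Excel, for example with T.DIST.2T(ABS(t), df).

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return (t, df) for H0: mu = mu0 against H1: mu != mu0."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se, n - 1

# Made-up measurements; H0: mu = 5, H1: mu != 5.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2]
t_stat, df = one_sample_t(sample, mu0=5)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```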

Subject: Mathematics | Due By (Pacific Time): 08/15/2013 12:00 am