__Section 1__

Use the dataset linked below. You must use the sample allocated to you based on your student number.

https://app.box.com/s/56pb6hqu0ypcg0f3lhy6cl5szt1jgdla

Note that for Section 1 the answers are provided so you can check your work. The answers will not be provided for the other sections.

- A) Paste the scatterplot for your sample into your Word document and give a simple comment about the relationship between the variables. (You do not need to submit the Excel file.)
- B) Estimate the annual contribution if the income is $200,000, using the regression line from part (A).
- C) Find the z-score of the estimate in part (B). Note that the average of the estimates is $27,000 with a standard deviation of $2,100. Remember to show your work.
- D) Using the z-score from part (C), find P(Z<zscore). You can find the answer using wolframalpha.com.

For example, if the z-score is 1.5, type

P(Z<1.5)

into wolframalpha.com.

- E) If there was a list of 10,000 estimates ranked from lowest to highest, what rank do you think your estimate would be close to?

Hint: just use the formula

expected rank = P(Z<zscore)*10000

Remember to show your work.
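As a rough illustration of parts (C) to (E), the sketch below works through the arithmetic in Python. The estimate of $30,150 is a made-up value (it is not the answer for any sample), and the function names are only illustrative. It assumes the estimates are approximately normally distributed, which is what the P(Z<zscore) step relies on.

```python
from statistics import NormalDist


def z_score(estimate, mean, sd):
    """Number of standard deviations the estimate lies above the mean."""
    return (estimate - mean) / sd


def expected_rank(z, n_estimates):
    """Approximate rank from lowest to highest: P(Z < z) * n_estimates."""
    return NormalDist().cdf(z) * n_estimates


# Hypothetical estimate; the mean and standard deviation come from part (C).
z = z_score(30150, 27000, 2100)
p_below = NormalDist().cdf(z)  # same quantity as typing P(Z<z) into Wolfram Alpha
rank = expected_rank(z, 10000)
print(z, round(p_below, 4), round(rank))
```

With these made-up numbers the z-score is 1.5, the same value used in the Wolfram Alpha example, so the expected rank comes out close to 9332 out of 10,000.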


__Section 2__

Use the dataset linked below. You must use the sample allocated to you based on your student number.

https://app.box.com/s/yvhk3e3oymbs3toy6j5xetid82dsjyz4

- A) Use the PivotTable feature in Excel to find appropriate summary statistics for your sample. This will probably require two PivotTables. You should paste both into Word; you do not need to submit the Excel file.

__Make sure the PivotTable (or PivotTables) include the following statistics__

*Just considering the high risk (riskier type) investments, what is the sample size n_{1} and the proportion of high risk investments that made a loss?

*Just considering the low risk (safer type) investments, what is the sample size n_{2} and the proportion of low risk investments that made a loss?

- B) Use Excel to make an appropriate graph that lets you compare the proportions found in part (A), and paste this into your Word document.
- C) Looking at your answers to parts (A) and (B), make a simple comment about the relationship between the variables investment type (risky or safe) and made a profit (made a profit/made a loss).
- D) i) Using your sample, what is the estimate for p_{1} – p_{2}? In other words, what is the difference between the sample proportions?
- ii) Find the z-score of the estimate in part (i). Note that the average of the estimates is 0.1 with a standard deviation of 0.0743.

- iii) Using part (ii), find P(Z<zscore) using www.wolframalpha.com.

For example, if the z-score is 0.5, type

P(Z<0.5)

into wolframalpha.com.

- iv) If there was a list of 4000 estimates ranked from lowest to highest, roughly what rank do you expect your estimate to have?

Hint: just use the formula

expected rank = P(Z<zscore)*4000

- E) Test the claim that there is a difference in the proportions. Use a 5% level of significance.

- i) State an appropriate H_{0} and H_{1}.

- ii) Find the p-value using only the answers to part (A) and the webpage

http://epitools.ausvet.com.au/content.php?page=z-test-2

Do NOT use any other method to find the p-value.

Do NOT use any other software package such as SPSS or the Analysis ToolPak.

- iii) State whether or not you reject H_{0}.

- iv) Give a conclusion in plain English.
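To see roughly what a two-proportion z-test does with the numbers from part (A), here is a minimal sketch. It assumes the common pooled-proportion form of the test with a two-sided alternative; the webpage may calculate things slightly differently, and the p-value you report must still come from the webpage as instructed. The counts below are made up and the function name is only illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(losses1, n1, losses2, n2):
    """Return (p1 - p2, z statistic, two-sided p-value) using a pooled proportion."""
    p1 = losses1 / n1
    p2 = losses2 / n2
    pooled = (losses1 + losses2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p1 - p2, z, p_value


# Hypothetical counts: 30 losses out of 100 high risk, 20 losses out of 100 low risk.
diff, z, p_value = two_proportion_z_test(30, 100, 20, 100)
reject_h0 = p_value < 0.05  # decision rule at the 5% level of significance
print(round(diff, 2), round(z, 3), round(p_value, 4), reject_h0)
```

The last step is the same decision rule used in part (iii): reject H_{0} only when the p-value is below 0.05.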

__Section 3__

Use the dataset linked below. You must use your own sample.

https://app.box.com/s/z0mbtcfsdqxz1rm7rhw3p9sb75aq7174

- A) Use the PivotTable feature in Excel to find appropriate summary statistics for your sample. The following sample statistics must be found:

Just considering the low risk investments, what is the sample size n_{1}, the sample average return of low risk investments, and the sample standard deviation s_{1}?

Just considering the high risk investments, what is the sample size n_{2}, the sample average return of high risk investments, and the sample standard deviation s_{2}?

Paste the PivotTable into the Word document. You do not need to submit the Excel file.

- B) Give an appropriate graph that shows the relationship between the variables. Note that the information in part (A) is NOT suitable for a graph; you have to get different information.
- C) Make a simple comment about the relationship between the variables using the answers to (A) and (B).
- D)
- i) Using your sample, what is the estimate for µ_{1} – µ_{2}? In other words, what is the difference between the sample means?
- ii) Find the z-score of the estimate in part (i). Note that the average of the estimates is -0.0256 with a standard deviation of 0.0173.

- iii) Using part (ii), what is P(Z<zscore)? You can find the answer using www.wolframalpha.com.

For example, if the z-score is -1, type

P(Z<-1)

into wolframalpha.com.

- iv) If there was a list of 2000 estimates ranked from lowest to highest, what rank do you think your estimate would be close to? Hint: just use the formula

expected rank = P(Z<zscore)*2000

- E) Test the claim that there is a difference between the means using a 5% level of significance.

- i) State an appropriate H_{0} and H_{1}.

- ii) Find the p-value using the answers to part (A) and the webpage

https://www.medcalc.org/calc/comparison_of_means.php

Do NOT find the p-value using any other method.

Do NOT use any other software package such as SPSS or the Analysis ToolPak.

- iii) State whether or not you reject H_{0}.

- iv) Give a conclusion in plain English.
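The comparison-of-means test works from the six summary statistics found in part (A). The sketch below shows one common form, the pooled-variance two-sample t statistic. Whether the webpage uses exactly this form is an assumption, and the p-value must still be taken from the webpage as instructed. All numbers are made up and the function name is only illustrative.

```python
from math import sqrt


def pooled_t_statistic(n1, mean1, s1, n2, mean2, s2):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Hypothetical summary statistics: low risk (group 1) and high risk (group 2).
t_stat, df = pooled_t_statistic(50, 0.05, 0.02, 50, 0.08, 0.04)
print(round(t_stat, 3), df)
```

A t statistic far from zero, such as the one produced by these made-up numbers, corresponds to a small p-value on the webpage.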


__Section 4__

Use the dataset linked below. You must use your own sample.

https://app.box.com/s/kzc6ivy10gvy4vz6d0pgy0lzh929ivx9

Suppose a business has conducted an opinion poll to find out whether its customers support a change to the business.

- Use the PivotTable feature in Excel to find appropriate summary statistics for your sample. You should paste it into Word; you do not need to submit the Excel file.

This PivotTable must have the number of people that answered yes and the number of people that answered no.

- What is the sample size and the sample proportion of people that support the change? Note that this is the estimate for the population proportion p.
- i) Find the z-score of the estimate in part (a). Note that the average of the estimates is 0.6 with a standard deviation of 0.0357.
- ii) Using part (i), what is P(Z<zscore)? You can find the answer using wolframalpha.com.

For example, if the z-score is 2, enter

P(Z<2)

into www.wolframalpha.com.

- iv) If there was a list of 1000 estimates ranked from lowest to highest, what rank do you think your estimate would be close to? Hint: just use the formula

expected rank = P(Z<zscore)*1000

- Find a 95% confidence interval for the proportion of people that support the change.
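One common way to get a 95% confidence interval for a proportion is the normal-approximation formula: sample proportion plus or minus about 1.96 standard errors. The sketch below assumes that method is the one intended (the handout does not name a method), and the counts are made up.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(yes_count, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a population proportion."""
    p_hat = yes_count / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical poll: 120 of 200 customers answered yes.
lower, upper = proportion_confidence_interval(120, 200)
print(round(lower, 4), round(upper, 4))
```

The two inputs are exactly the PivotTable results: the number of yes answers and the total sample size.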


__Section 5__

a) You have to obtain your own dataset.

Your dataset must have the following properties.

It must have at least 5 rows (observations).

It must have at least 2 variables. (Note that the name of each thing in the dataset is NOT a variable.)

At least one of the variables must be categorical.

There are 3 options for getting the dataset

Option 1

*Make up your own dataset. This can be about anything you find interesting, so it could be about businesses, customers, students, athletes, cats, monkeys, ANYTHING AT ALL.

If you make up your own dataset, there is no way it will be the same as another student's.

Option 2

*Find an existing dataset and email it to the lecturer at matthew.maccallum@koi.edu.au

The lecturer will email you a sample of the dataset. Use that sample; this will make sure there is no way your sample is the same as another student's.

Option 3

*Find an existing dataset, make up an extra variable, and email the dataset to the lecturer at matthew.maccallum@koi.edu.au

The lecturer will email you a sample of the dataset. Use that sample; this will make sure there is no way your sample is the same as another student's.

b) Pick two of the variables (make sure one of the variables is categorical) and summarize them with a suitable PivotTable.

- c) Paste the dataset and your summary into the Word file and add a very brief comment. You do not need to submit the Excel file.
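For readers who want to check what a PivotTable summary looks like, the sketch below does the same kind of grouping in Python: one categorical variable defines the groups, and a count and an average are reported for each group. The dataset, the variable names, and the function name are all made up for illustration; the assignment itself still asks for an Excel PivotTable.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dataset: one categorical variable and one numeric variable.
rows = [
    {"species": "cat", "weight_kg": 4.0},
    {"species": "cat", "weight_kg": 5.0},
    {"species": "monkey", "weight_kg": 8.0},
    {"species": "monkey", "weight_kg": 10.0},
    {"species": "monkey", "weight_kg": 9.0},
]


def summarize(rows, category_key, value_key):
    """Group rows by a categorical variable and report count and mean per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[category_key]].append(row[value_key])
    return {
        category: {"count": len(values), "mean": mean(values)}
        for category, values in groups.items()
    }


summary = summarize(rows, "species", "weight_kg")
for category, stats in summary.items():
    print(category, stats["count"], stats["mean"])
```

Note that the animal's name would not count as a variable here; species (categorical) and weight (numeric) are the two variables.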

__Section 6__

Give a brief discussion (300 words in total) of any one of the resources below, or pick two of them, or all 3. Just make sure the total number of words in __Section 6__ is 300 words or less. It is strongly suggested that you discuss the examples given in the resources below.

1) Guide to summarizing datasets

https://app.box.com/s/jxuqhpzjrfj14xiq28x1bnywjv1iayr4

2) A student's assignment from 2015

https://app.box.com/s/2a72e7i9lduyy3wp8nyd0uogsyvvnzrz

3) Discussion of how the mean and standard deviation are used in finance

https://www.youtube.com/watch?v=UwO4JvB9OpE