Assume a question asks you to create a vector of 8 numbers with the entries [1,2,3,4,5,5,4,3], using the colon operator rather than typing the entries directly. The answer is ‘x <- c(1:5,5:3)’. Enter this in R, then display the contents of x. Once you have done this, copy and paste the code from the R command window as follows:

**> x <- c(1:5,5:3)**

**> x**

**[1] 1 2 3 4 5 5 4 3**


You should report as in the bolded lines above. If you do not follow instructions, you will lose points.


Some questions may involve material we have not covered in class. If that is the case, use the help function to learn how to do it.

a. Create a vector with 10 numbers (3, 12, 6, -5, 0, 8, 15, 1, -10, 7) and assign it to x.

**> x <- c(3,12,6,-5,0,8,15,1,-10,7)**

**> x**

**[1] 3 12 6 -5 0 8 15 1 -10 7**

b. What is the data type of x? How can you find out?

Numeric. You can find out with class(x).
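As a minimal sketch of how the type could be checked, the standard functions `class()`, `typeof()`, and `is.numeric()` can be applied to the vector from part (a). The vector is redefined here only so the snippet runs on its own.

```r
# Vector from part (a), repeated so the snippet is self-contained.
x <- c(3, 12, 6, -5, 0, 8, 15, 1, -10, 7)

class(x)       # "numeric"
typeof(x)      # "double" (the underlying storage type)
is.numeric(x)  # TRUE
```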

c. Subtract 5 from the 2nd, 4th, 6th, etc. element in x.
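One possible approach, sketched here on a copy named `y` (a hypothetical name) so that x itself stays unchanged, since the later answers such as sum(x) = 37 are computed on the original values. It assumes "2nd, 4th, 6th, etc." means every even index.

```r
# Vector from part (a), repeated so the snippet is self-contained.
x <- c(3, 12, 6, -5, 0, 8, 15, 1, -10, 7)

# Work on a copy so the original x is untouched.
y <- x

# Even positions: 2, 4, 6, 8, 10.
even_idx <- seq(2, length(y), by = 2)

# Subtract 5 only at those positions.
y[even_idx] <- y[even_idx] - 5
y  # 3 7 6 -10 0 3 15 -4 -10 2
```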

d. Compute the sum and the average for x (there are functions for that).

**> sum(x)**

**[1] 37**


**> mean(x)**

**[1] 3.7**

e. Reverse the order of the elements in x (use a function that reverses a vector).

**> rev(x)**

**[1] 7 -10 1 15 8 0 -5 6 12 3**

f. Find out which numbers in x are negative (using conditionals).

**> x[x<0]**

**[1] -5 -10**


g. Remove all entries with negative numbers from x based on their index (use concatenate function as well as the index numbers).

h. How long is x now? (Use a function.)

i. Remove x from the environment/workspace (session) and list the variables in your workspace.
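Parts (g) through (i) could be approached as in the sketch below. It assumes x still holds the original values from part (a), where the negative entries sit at positions 4 and 9. The helper names `neg_idx` and `n_after` are illustrative only.

```r
# Vector from part (a), repeated so the snippet is self-contained.
x <- c(3, 12, 6, -5, 0, 8, 15, 1, -10, 7)

# (g) Locate the negative entries, then drop them by index using c().
neg_idx <- which(x < 0)  # positions 4 and 9
x <- x[-c(4, 9)]
x                        # 3 12 6 0 8 15 1 7

# (h) Length of x after the removal.
n_after <- length(x)     # 8

# (i) Remove x from the workspace and list what remains.
rm(x)
ls()
```

The output of `ls()` depends on whatever else is defined in the session, so it is not shown here.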

j. Create a vector of strings containing "CSE 8001", "CSE 8002", ..., "CSE 8100" using paste.
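A minimal sketch for part (j): `paste()` is vectorized, so pasting the prefix onto the integer range 8001:8100 yields all 100 strings at once. The variable name `courses` is an assumption for illustration.

```r
# paste() recycles "CSE" across the range and joins with a space by default.
courses <- paste("CSE", 8001:8100)

head(courses, 3)  # "CSE 8001" "CSE 8002" "CSE 8003"
length(courses)   # 100
```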

1. How could prediction models contribute to targeting of treatment and to achieve better care at reduced costs (increases cost-effectiveness) of medical care?

2.

a. What are the problems of dichotomization when studying the effect of one specific predictor, such as age?

b. Why should continuous predictors not be categorized, and when might categorizing them be justified? Give a healthcare example.

3. Why are extreme values (outliers) a problem for data mining? When would truncation be reasonable?

4. Calculate sensitivity, specificity, accuracy, precision, and recall for the following confusion matrix. Show your work; do not just report numbers.

| Model Classification | Actual Class in the Dataset: Positive | Actual Class in the Dataset: Negative |
|---|---|---|
| Positive | 700 | 35 |
| Negative | 80 | 156 |
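The metrics can be sketched in R as follows. This assumes the rows are the model's predictions and the columns are the actual classes, so that TP = 700, FP = 35, FN = 80, and TN = 156. The variable names are illustrative, not part of the assignment.

```r
# Cell counts read from the confusion matrix (rows = model, columns = actual).
TP <- 700  # predicted positive, actually positive
FP <- 35   # predicted positive, actually negative
FN <- 80   # predicted negative, actually positive
TN <- 156  # predicted negative, actually negative

sensitivity <- TP / (TP + FN)                   # 700 / 780
specificity <- TN / (TN + FP)                   # 156 / 191
accuracy    <- (TP + TN) / (TP + FP + FN + TN)  # 856 / 971
precision   <- TP / (TP + FP)                   # 700 / 735
recall      <- sensitivity                      # recall is another name for sensitivity

round(c(sensitivity = sensitivity, specificity = specificity,
        accuracy = accuracy, precision = precision, recall = recall), 4)
# sensitivity 0.8974, specificity 0.8168, accuracy 0.8816,
# precision 0.9524, recall 0.8974
```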


5. Apply the Apriori algorithm to the following transactions to find the sets of associated items. Use a support of 3 and show your work step by step using tables.

| Transaction | Items |
|---|---|
| 10000 | 1, 2, 3, 5 |
| 20000 | 1, 3, 5, 4 |
| 30000 | 3, 4, 6 |
| 40000 | 1, 5, 6 |
| 50000 | 1, 3, 4, 5 |
| 60000 | 2, 3, 4, 5 |
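To check a hand-worked solution, the level-wise search can be sketched in base R. This is a simplified illustration, not a reference implementation: it assumes "support of 3" means an absolute count of at least 3 transactions, and the names `support`, `frequent`, and `all_frequent` are hypothetical helpers.

```r
# Transactions from the table, keyed by transaction ID.
transactions <- list(
  "10000" = c(1, 2, 3, 5),
  "20000" = c(1, 3, 5, 4),
  "30000" = c(3, 4, 6),
  "40000" = c(1, 5, 6),
  "50000" = c(1, 3, 4, 5),
  "60000" = c(2, 3, 4, 5)
)
min_support <- 3  # assumed to be an absolute count

# Number of transactions that contain every item of `itemset`.
support <- function(itemset) {
  sum(vapply(transactions, function(tx) all(itemset %in% tx), logical(1)))
}

# Level 1: frequent single items.
items <- sort(unique(unlist(transactions)))
frequent <- Filter(function(s) support(s) >= min_support, as.list(items))

all_frequent <- list()
k <- 1
while (length(frequent) > 0) {
  all_frequent <- c(all_frequent, frequent)
  k <- k + 1
  # Candidates: k-item combinations of items that are still frequent.
  pool <- sort(unique(unlist(frequent)))
  if (length(pool) < k) break
  candidates <- combn(pool, k, simplify = FALSE)
  # Prune: every (k-1)-subset of a candidate must itself be frequent.
  keys <- vapply(frequent, paste, character(1), collapse = ",")
  candidates <- Filter(function(cand) {
    subs <- combn(cand, k - 1, simplify = FALSE)
    all(vapply(subs, paste, character(1), collapse = ",") %in% keys)
  }, candidates)
  # Keep candidates that meet the minimum support.
  frequent <- Filter(function(s) support(s) >= min_support, candidates)
}

# Print each frequent itemset with its support count.
for (s in all_frequent) {
  cat("{", paste(s, collapse = ", "), "} support =", support(s), "\n")
}
```

Under these assumptions the largest frequent itemsets are {1, 3, 5} and {3, 4, 5}, each with support 3. The candidate {1, 3, 4, 5} is pruned because its subset {1, 3, 4} is not frequent.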

Subject | Mathematics |

Due By (Pacific Time) | 09/30/2013 12:00 pm |
