The data below outlines the cost of hard drives. What are the summary statistics and the correlation coefficient? What kind of relationship, if any, does the cost have with the size? How sensitive is the R value to particular data points?

X = Capacity (TB)   Y = Cost ($)
0.08                29.99
0.12                35
0.2                 299.99
0.25                49.95
0.32                68.99
1                   99.99
2                   200
4                   449
in Statistics Answers by Level 1 User (300 points)

1 Answer

Best answer

All values in this linear regression are rounded to 2 decimal places.

Summary statistics for X: mean=1.00TB, pop SD=1.29TB, sample SD=1.38TB.

Summary statistics for Y: mean=$154.11, pop SD=$141.85, sample SD=$151.64.

Correlation coefficient R=0.80 (suggests good correlation)

Linear regression equation: Y=66.46+87.98X.
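The figures above can be reproduced with a short script. This is a minimal sketch using only Python's standard library; the helper name `fit_line` and the variable names are illustrative choices, not something from the original answer.

```python
import math
import statistics

# Data from the question: capacity in TB and cost in dollars.
capacity_tb = [0.08, 0.12, 0.2, 0.25, 0.32, 1, 2, 4]
cost_usd = [29.99, 35, 299.99, 49.95, 68.99, 99.99, 200, 449]


def fit_line(x, y):
    """Return (intercept, slope, r) for the least-squares line y = a + b*x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


intercept, slope, r = fit_line(capacity_tb, cost_usd)

# Summary statistics: population SD divides by n, sample SD by n - 1.
for label, values in (("X", capacity_tb), ("Y", cost_usd)):
    print(
        f"{label}: mean={statistics.fmean(values):.2f}, "
        f"pop SD={statistics.pstdev(values):.2f}, "
        f"sample SD={statistics.stdev(values):.2f}"
    )
print(f"R={r:.2f}, Y={intercept:.2f}+{slope:.2f}X")
```

The population SD divides the sum of squared deviations by n and the sample SD divides by n-1, which is why two SD values are quoted for each variable.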

X (TB)   Y actual ($)   Y predicted ($)
0.08     29.99          73.50
0.12     35             77.02
0.2      299.99         84.06
0.25     49.95          88.46
0.32     68.99          94.61
1        99.99          154.44
2        200            242.42
4        449            418.38

The table shows the actual measured values and the values predicted by the linear equation. The cost of 0.2TB at $299.99 clearly pulls the regression line upward, so the predicted costs are higher than the actual values everywhere except at 0.2TB and 4TB. R is sensitive to this one high cost, which has also shifted the mean cost.

A notably different result is obtained by treating the cost of 0.2TB as an outlier and removing it. R then becomes 0.99 (very good correlation) and the linear equation becomes Y=18.25+103.62X, which predicts (0.08TB,$26.54), (0.12TB,$30.68), (0.25TB,$44.16), (0.32TB,$51.41), (1TB,$121.87), (2TB,$225.49), (4TB,$432.73).
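One way to look at the sensitivity of R is to drop each data point in turn and recompute R from the rest. The sketch below does this with plain Python; `pearson_r` and `leave_one_out_r` are hypothetical helper names introduced for illustration, and leave-one-out is just one reasonable check, not the method used in the answer.

```python
import math

# Data from the question: capacity in TB and cost in dollars.
capacity_tb = [0.08, 0.12, 0.2, 0.25, 0.32, 1, 2, 4]
cost_usd = [29.99, 35, 299.99, 49.95, 68.99, 99.99, 200, 449]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def leave_one_out_r(x, y):
    """Map each dropped x value to the R of the remaining points."""
    result = {}
    for i in range(len(x)):
        rest_x = x[:i] + x[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        result[x[i]] = pearson_r(rest_x, rest_y)
    return result


r_all = pearson_r(capacity_tb, cost_usd)
r_without = leave_one_out_r(capacity_tb, cost_usd)

for dropped, r in r_without.items():
    print(f"drop {dropped} TB: R={r:.2f} (change {r - r_all:+.2f})")
```

The two rows to watch are the 0.2TB point, whose removal lifts R to about 0.99, and the 4TB point, whose removal leaves a much weaker correlation because it is the point that anchors the upper end of the line.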

by Top Rated User (1.2m points)

Hello Rod, thank you for your help as always.

Based on the above analysis of the R value, which of the options below is the most accurate choice?

A: The x mean is 0.99625 and standard deviation is 1.37765; the y mean is 154.1138 and standard deviation is 151.6404. The R value is 0.799. There is a positive relationship. The R value depends somewhat on the outlier at 0.2TB.

B: One should first take the outlier out at 4 TB. Then we find that the x mean is 0.567 and standard deviation is 0.652; the y mean is 111.9871 and standard deviation is 93.796. The R value is 0.3658. There is a direct relationship. The R value is nearly 0 so our problems are solved.

C: One should first take the outliers out at 0.2 TB and 4 TB. Then we find that the x mean is 0.628 and standard deviation is 0.685; the y mean is 80.65 and standard deviation is 58.23. The R value is 0.99. There is a direct positive relationship. The R value is nearly 1 so our problems are solved.

D: One should first take the outlier out at 0.2 TB. Then we find that the x mean is 1.11 and standard deviation is 1.3395; the y mean is 133.2743 and the standard deviation is 139.72. The R value is 0.994. There is a positive relationship. The R value is nearly 1 so our problems are solved.

E: The x mean is 1.0 and standard deviation is 1.4; the y mean is 154 and standard deviation is 152. The R value is 0.8. There is a positive relationship. The R value depends strongly on the outlier at 0.2TB.

F: The x mean is 0.99625 and standard deviation is 1.37765; the y mean is 154.1138 and standard deviation is 151.6404. The R value is 0.799. There is a direct relationship. The R value depends somewhat on the outlier at 4 TB.

I think D is the most applicable option. The cost of 0.2TB is so high that I even thought it might be a mistake! The cost of 4TB is not really an outlier, so there's no need to remove it from the dataset.
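The figures quoted in options B, C and D can be checked by recomputing the statistics with the named points removed. The following is a rough sketch with Python's standard library; `stats_without` is a hypothetical helper written for this check. It reports both SD conventions because the options do not say which one they use.

```python
import math
import statistics

# Data from the question: capacity in TB and cost in dollars.
capacity_tb = [0.08, 0.12, 0.2, 0.25, 0.32, 1, 2, 4]
cost_usd = [29.99, 35, 299.99, 49.95, 68.99, 99.99, 200, 449]


def stats_without(dropped):
    """Summary statistics and R after removing the given capacities."""
    kept = [(x, y) for x, y in zip(capacity_tb, cost_usd) if x not in dropped]
    xs = [x for x, _ in kept]
    ys = [y for _, y in kept]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "pop_sd_x": statistics.pstdev(xs),
        "pop_sd_y": statistics.pstdev(ys),
        "sample_sd_x": statistics.stdev(xs),
        "sample_sd_y": statistics.stdev(ys),
        "r": sxy / math.sqrt(sxx * syy),
    }


# Option B drops 4 TB, option C drops both points, option D drops 0.2 TB.
option_b = stats_without({4})
option_c = stats_without({0.2, 4})
option_d = stats_without({0.2})

for name, result in (("B", option_b), ("C", option_c), ("D", option_d)):
    print(name, {key: round(value, 4) for key, value in result.items()})
```

Run as written, the standard deviations quoted in options B, C and D appear to match the population SD of the reduced data, while options A and F quote the sample SD of the full data, so the SD values are not directly comparable across options.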
