ŷ=a+bx is the regression equation fitted to the (x,y) data pairs, where ŷ is the estimated output value for a given input value x. The hat symbol distinguishes an estimated value from an actual one: the y values in the dataset are the observed values.
So how are a and b calculated? Here are the formulas:
Slope (gradient, or rate of change): b=(n∑xᵢyᵢ-∑xᵢ∑yᵢ)/(n∑xᵢ²-(∑xᵢ)²). Intercept: a=(∑yᵢ-b∑xᵢ)/n. The sums run over 1≤i≤n, where n is the number of data pairs, which is 7 in this problem.
The table below shows what you need to calculate from the given data:
| x | y | x² | xy |
|---|---|---|---|
| 19 | 410 | 361 | 7790 |
| 31 | 580 | 961 | 17980 |
| 34 | 590 | 1156 | 20060 |
| 35 | 570 | 1225 | 19950 |
| 39 | 640 | 1521 | 24960 |
| 39 | 680 | 1521 | 26520 |
| 43 | 660 | 1849 | 28380 |
| ∑xᵢ=240 | ∑yᵢ=4130 | ∑xᵢ²=8594 | ∑xᵢyᵢ=145640 |
Substituting these totals into the formulas above with n=7 gives b=(7×145640-240×4130)/(7×8594-240²)=28280/2558=11.0555 (approx) and a=(4130-11.0555×240)/7=210.9539 (approx).
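If you want to check the arithmetic, here is a minimal Python sketch of the same calculation. The function name `fit_line` and the variable names are just illustrative choices, not part of the original problem; the code applies the slope and intercept formulas given above to the seven data pairs from the table.

```python
# Data pairs taken from the table above.
xs = [19, 31, 34, 35, 39, 39, 43]
ys = [410, 580, 590, 570, 640, 680, 660]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x.

    fit_line is an illustrative name; it applies the summation
    formulas for the slope b and the intercept a directly.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = fit_line(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

The printed values should agree with the a and b quoted above to four decimal places.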
Take x=34, for example: ŷ=210.9539+11.0555×34=587 approx, compared with the observed value of 590. The line fits this point well.
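The same check can be repeated for every data pair. The sketch below uses the rounded coefficients quoted above and prints the fitted value next to the observed one; the helper name `predict` is an assumption made for illustration.

```python
# Rounded coefficients quoted in the text above.
a, b = 210.9539, 11.0555

# Data pairs taken from the table.
xs = [19, 31, 34, 35, 39, 39, 43]
ys = [410, 580, 590, 570, 640, 680, 660]


def predict(x):
    """Estimated output y-hat = a + b*x (predict is an illustrative name)."""
    return a + b * x


for x, y in zip(xs, ys):
    y_hat = predict(x)
    # The residual is the observed value minus the fitted value.
    print(f"x={x}  observed={y}  fitted={y_hat:.1f}  residual={y - y_hat:+.1f}")
```

Small residuals relative to the size of the y values are what "a good fit" means here.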
The correlation coefficient is r=0.9606 approx, which indicates a strong positive correlation between x and y.
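The answer does not show how r was obtained. Assuming it is the usual Pearson coefficient, r=(n∑xᵢyᵢ-∑xᵢ∑yᵢ)/√((n∑xᵢ²-(∑xᵢ)²)(n∑yᵢ²-(∑yᵢ)²)), it can be reproduced with the sketch below. The function name `pearson_r` is an illustrative choice, and the extra column ∑yᵢ² is computed in the code because it is not in the table.

```python
import math

# Data pairs taken from the table.
xs = [19, 31, 34, 35, 39, 39, 43]
ys = [410, 580, 590, 570, 640, 680, 660]


def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums.

    Assumption: this is the formula behind the quoted r; the
    original answer does not state which one was used.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```

The numerator is the same as the numerator of the slope b, which is why r and b always have the same sign.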