Test Quiz 9

The accuracy of a device for measuring blood pressure at home was investigated. To this end, the blood pressure of one person was measured repeatedly, at 5-minute intervals and under circumstances as identical as possible. The following data were measured:

Measurement no	1	2	3	4	5	6	7	8	9	10	11	12	13	14
Systolic pressure (mmHg)	143	134	138	138	135	131	135	139	141	143	142	141	149	140
Diastolic pressure (mmHg)	98	94	96	89	88	95	85	88	89	92	89	92	93	92

The data are assumed to be normally distributed, and the parameter estimates (sample mean and sample standard deviation) for the two blood pressure measurements are:

\[(\bar{x}_S; s_S) = (139.21; 4.58) \qquad (\bar{x}_D; s_D) = (91.43; 3.61)\]

What is the $95\%$ confidence interval for the mean systolic pressure?

$[126.47,\; 151.95 ]$

$[136.47,\; 141.95 ]$

$[136.56,\; 141.85 ]$

This is a one-sample confidence interval for $\mu$, so we use the method for that from Chapter 3: $139.21 \pm t_{0.975}\frac{4.58}{\sqrt{14}}$. We find the $t$-quantile in Python and do the hand computations in Python as well:

import numpy as np
from scipy import stats
print(stats.t.ppf(0.975, 13))
2.160369
CI = 139.21 + np.array([-1,1]) * stats.t.ppf(0.975, 13) * 4.58/np.sqrt(14)
print(CI)
[136.5656 141.8544]

So the answer is 3: $[136.56,\; 141.85 ]$
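As a cross-check, the same interval can be recomputed from the raw systolic measurements instead of the rounded summary statistics. The following is a minimal sketch; the variable names (`systolic`, `half_width`, `ci`) are chosen here for illustration and are not part of the quiz.

```python
import numpy as np
from scipy import stats

# Systolic pressures (mmHg) copied from the table above
systolic = np.array([143, 134, 138, 138, 135, 131, 135,
                     139, 141, 143, 142, 141, 149, 140])

n = len(systolic)
mean = systolic.mean()
sd = systolic.std(ddof=1)  # sample standard deviation (divisor n - 1)

# One-sample 95% interval: mean +/- t_{0.975}(n - 1) * s / sqrt(n)
half_width = stats.t.ppf(0.975, n - 1) * sd / np.sqrt(n)
ci = (mean - half_width, mean + half_width)

print(round(mean, 2), round(sd, 2))
print(ci)
```

Because the unrounded mean and standard deviation are used, the limits may differ from the hand computation in the third decimal.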

$[136.72,\; 141.70 ]$

$[2.57,\; 9.85]$

In a sports study one wants to investigate whether there is a difference in energy consumption between various types of training. For a single person, the energy consumed was measured in 10 jogs of 30 minutes and 10 bike rides of 30 minutes. Each jog and each ride took place on a different day. The measurements, expressed in kcal, are given in the table below:

Jogs	Bike rides
314	294
340	317
331	317
333	310
329	327
322	300
332	293
330	321
338	307
325	304

The following Python code was run:

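import numpy as np
from scipy import stats

x1 = np.array([314, 340, 331, 333, 329, 322, 332, 330, 338, 325])
x2 = np.array([294, 317, 317, 310, 327, 300, 293, 321, 307, 304])
print(np.var(x1, ddof=1))   # sample variance of the jogs
print(np.var(x2, ddof=1))   # sample variance of the bike rides
print(stats.ttest_ind(x1, x2, equal_var=False))   # Welch two-sample t-test
print(stats.ttest_rel(x1, x2, 20))   # paired t-test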

with the following results:

57.82222222222222
132.0
TtestResult(statistic=4.682272020223192, pvalue=0.0002658270893093172, df=15.615442372621638)
TtestResult(statistic=6.168301872365076, pvalue=0.00016503831607051134, df=9)
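The first test result can be reproduced by hand from the two sample means and variances. The sketch below assumes the usual Welch statistic with Welch-Satterthwaite degrees of freedom; the names `jogs`, `rides`, `t_obs` and `dof` are illustrative only.

```python
import numpy as np
from scipy import stats

jogs = np.array([314, 340, 331, 333, 329, 322, 332, 330, 338, 325])
rides = np.array([294, 317, 317, 310, 327, 300, 293, 321, 307, 304])

# Estimated variance of each sample mean: s^2 / n
v1 = jogs.var(ddof=1) / len(jogs)
v2 = rides.var(ddof=1) / len(rides)

# Welch test statistic and Welch-Satterthwaite degrees of freedom
t_obs = (jogs.mean() - rides.mean()) / np.sqrt(v1 + v2)
dof = (v1 + v2) ** 2 / (v1 ** 2 / (len(jogs) - 1) + v2 ** 2 / (len(rides) - 1))

# Two-sided p-value from the t-distribution
p_value = 2 * stats.t.sf(abs(t_obs), dof)
print(t_obs, dof, p_value)
```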

What is the most correct answer to the question: Is there a difference in mean energy consumption between the two types of activities? (Both conclusion and argument should be correct)

Yes, there is a difference, since the relevant $p$-value is about $0.0003$

No, there is no difference, because the relevant $p$-value is about $0.0003$

Yes, there is a difference, since the relevant $p$-value is about $0.91$

No, there is no difference, because the relevant $p$-value is about $0.09$

Yes, there is a difference, because $ 20.4 $ is greater than $ 20$
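The figures $20.4$ and $0.91$ in the options can be traced to the pairwise differences. SciPy's `ttest_rel` has no parameter for a hypothesized mean difference (its third positional parameter is `axis`), and the paired statistic $6.168$ printed above corresponds to a hypothesized mean difference of $0$. One way to test a mean difference of 20 kcal is a one-sample test on the differences. This is only a sketch, assuming a paired analysis against 20 was intended; whether pairing is appropriate at all depends on the study design described above.

```python
import numpy as np
from scipy import stats

jogs = np.array([314, 340, 331, 333, 329, 322, 332, 330, 338, 325])
rides = np.array([294, 317, 317, 310, 327, 300, 293, 321, 307, 304])

# Pairwise differences (pairing by row order is an assumption)
d = jogs - rides
print(d.mean())

# Paired test of "mean difference = 0", written as a one-sample test
res0 = stats.ttest_1samp(d, popmean=0)

# Paired test of "mean difference = 20"
res20 = stats.ttest_1samp(d, popmean=20)

print(res0.statistic, res0.pvalue)
print(res20.statistic, res20.pvalue)
```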

We have the following observations of $x_1$, $x_2$ and $y$ on 15 persons:

Person	x1	x2	y
1	7.90	16.70	59.00
2	4.60	13.80	44.00
3	5.10	20.20	59.00
4	5.50	14.20	48.00
5	5.20	12.80	45.00
6	6.50	18.60	59.00
7	4.90	20.80	57.00
8	4.60	15.20	45.00
9	4.80	20.50	59.00
10	4.50	22.90	61.00
11	3.80	15.70	46.00
12	4.20	12.30	40.00
13	5.40	16.80	49.00
14	5.80	14.60	47.00
15	4.20	20.50	57.00

And the following Python code was run:

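import numpy as np
import statsmodels.formula.api as smf
# df is assumed to be a pandas DataFrame with the columns x1, x2 and y from the table above
myfit = smf.ols(formula='y ~ x1 + x2', data=df).fit()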
print(myfit.summary(slim=True))
sigma = np.sqrt(myfit.mse_resid)
print(sigma)

with the following results:

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.965
Model:                            OLS   Adj. R-squared:                  0.960
No. Observations:                  15   F-statistic:                     167.5
Covariance Type:            nonrobust   Prob (F-statistic):           1.71e-09
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      4.1793      2.829      1.477      0.165      -1.984      10.343
x1             2.6886      0.374      7.196      0.000       1.875       3.503
x2             1.9769      0.116     17.113      0.000       1.725       2.229
==============================================================================

sigma = 1.4377195
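The code above relies on a data frame `df` that is not shown. As a dependency-light cross-check, the coefficient estimates and the residual standard deviation can be reproduced with plain NumPy least squares. This is a sketch rather than the code that produced the output above; the array names and the design matrix `X` are introduced here for illustration.

```python
import numpy as np

# Observations copied from the table above
x1 = np.array([7.9, 4.6, 5.1, 5.5, 5.2, 6.5, 4.9, 4.6,
               4.8, 4.5, 3.8, 4.2, 5.4, 5.8, 4.2])
x2 = np.array([16.7, 13.8, 20.2, 14.2, 12.8, 18.6, 20.8, 15.2,
               20.5, 22.9, 15.7, 12.3, 16.8, 14.6, 20.5])
y = np.array([59, 44, 59, 48, 45, 59, 57, 45,
              59, 61, 46, 40, 49, 47, 57], dtype=float)

# Design matrix with an intercept column
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares estimates of (intercept, beta_1, beta_2)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual standard deviation: sqrt(RSS / (n - number of parameters))
resid = y - X @ beta
n, k = X.shape
sigma = np.sqrt(resid @ resid / (n - k))

print(beta)
print(sigma)
```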

What kind of analysis is done here?

A two sample $t$-test

A one-way ANOVA

A simple linear regression analysis

A multiple linear regression analysis, MLR

A two-way ANOVA

We use the same observations of $x_1$, $x_2$ and $y$ on 15 persons, the same Python code, and the same results as in the question above.

What is the only correct statement among the following to make here?

$x_1$ influences $y$ significantly whereas $x_2$ does not as the two relevant $p$-values are $0.00001$ and $0.165$

Both $x_1$ and $x_2$ influence $y$ significantly as the two relevant $p$-values are very small

$x_2$ influences $y$ significantly whereas $x_1$ does not as the two relevant $p$-values are $8.54\cdot 10^{-10}$ and $0.165$

Neither $x_1$ nor $x_2$ influence $y$ significantly as the two relevant $p$-values are very small

Neither $x_1$ nor $x_2$ influence $y$ significantly as the two relevant $p$-values are quite big
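The summary output rounds the $p$-values for $x_1$ and $x_2$ to $0.000$. Their approximate magnitudes can be recovered from the printed $t$ statistics, using a $t$-distribution with $15-2-1=12$ degrees of freedom. A minimal sketch, with illustrative variable names:

```python
from scipy import stats

dof = 15 - 2 - 1  # observations minus estimated parameters

# t statistics read off the regression output
t_values = {"Intercept": 1.477, "x1": 7.196, "x2": 17.113}

# Two-sided p-values: 2 * P(T > |t|)
p_values = {name: 2 * stats.t.sf(abs(t), dof) for name, t in t_values.items()}

for name, p in p_values.items():
    print(name, p)
```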

Use the situation described in the exercise above, with the same Python code and results.

What is the estimate of the residual standard deviation, $\hat{\sigma}$?

0.116

0.374

0.965

2.829

1.438

Use again the situation and the regression results from the exercise above.

What is the $95\%$ confidence interval for $\beta_1$, the coefficient describing the relation between $x_1$ and $y$?

$4.1793 \pm 1.477 \cdot 2.8289$

$2.6886 \pm 7.196 \cdot 0.3736$

$1.9769 \pm 17.113 \cdot 0.1155$

$[1.572,\; 4.200]$

$[1.874,\; 3.503]$

We follow the approach in the Example on MLR in the book. The relevant $t$-quantile is the one with $15-2-1=12$ degrees of freedom:

print(stats.t.ppf(0.975, 12))
2.178813

And we can read off the standard error of $\hat{\beta}_1=2.6886$ as $\hat{\sigma}_{\beta_1}=0.3736$. So the confidence interval becomes:

print(2.6886 + np.array([-1, 1])*2.178813*0.3736)
[1.874595 3.502605]

So the correct answer is 5: $[1.874,\; 3.503]$

Ten students took a mathematics test with 25 questions with the following results (number of correct answers): 9, 18, 19, 21, 25, 25, 21, 19, 16, 7.

Which one of the following statements is true? (use the definition from Chapter 1)

The median equals $18$

The median equals $19$

The median equals $25$

The median is undefined in this case

The median equals $36$
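The median can be checked by sorting the observations. The sketch below assumes the usual definition (for an even number of observations, the average of the two middle values in the sorted sample) and compares the result with `np.median`.

```python
import numpy as np

scores = np.array([9, 18, 19, 21, 25, 25, 21, 19, 16, 7])

ordered = np.sort(scores)
n = len(ordered)

if n % 2 == 0:
    # Even n: average of the two middle observations
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd n: the single middle observation
    median = ordered[n // 2]

print(ordered)
print(median, np.median(scores))
```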

Ten students took a mathematics test with 25 questions with the following results (number of correct answers): 9, 18, 19, 21, 25, 25, 21, 19, 16, 7.

What is the sample variance $s^2$ for these numbers?

$36$

Use the basic sample variance formula from Chapter 1. The mean of the observations is 18, and a straightforward calculation yields a variance of $324/9 = 36$. The following Python code also calculates it; here "v" ends up being the sample variance:

import numpy as np
x = np.array([7, 9, 16, 18, 19, 19, 21, 21, 25, 25])
v = 0
for i in x:
	v += (i - x.mean())**2
v = v/(len(x)-1)
print(v)
# or simply
print(np.var(x, ddof=1))

Correct answer is 1.

$32.4$

$6$

$25$

$5$

When making statistical hypothesis tests we often assume that the significance level $\alpha$ is $5\%$.

This means that:

The probability that the null hypothesis is wrong is $5\%$

The probability that the null hypothesis is correct is $5\%$

The probability that we, if the null hypothesis is true, make a wrong conclusion is $5\%$

The probability that we, if the alternative is true, conclude that the hypothesis is wrong is $5\%$

None of the above
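The meaning of the significance level can be illustrated by simulation: when the null hypothesis is true, a test at level $\alpha = 5\%$ should reject in roughly $5\%$ of repeated experiments. The sketch below is illustrative only; the sample size, number of simulations, and seed are arbitrary choices and not part of the quiz.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed for reproducibility
alpha = 0.05
n_sim, n = 10_000, 20           # arbitrary simulation settings

# Each row is one experiment drawn under a true null hypothesis (mean 0)
samples = rng.normal(loc=0.0, scale=1.0, size=(n_sim, n))

# One-sample t-test of "mean = 0" for every row
p_values = stats.ttest_1samp(samples, popmean=0.0, axis=1).pvalue

# Fraction of experiments in which the true null hypothesis is rejected
rejection_rate = np.mean(p_values < alpha)
print(rejection_rate)
```

The printed rejection rate should be close to $0.05$, up to simulation noise.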

02323 · Test Quiz 9
