Preface |
|
vii | |
Symbols and Abbreviations |
|
xvii | |
1 Introduction |
|
1 | (20) |
|
1.1 Deterministic Data and Random Data |
|
|
1 | (4) |
|
1.2 Population, Sample and Statistics |
|
|
5 | (3) |
|
|
8 | (2) |
|
1.4 Probabilities and Distributions |
|
|
10 | (3) |
|
|
10 | (2) |
|
1.4.2 Continuous Variables |
|
|
12 | (1) |
|
1.5 Beyond a Reasonable Doubt |
|
|
13 | (4) |
|
1.6 STATISTICA, SPSS and MATLAB |
|
|
17 | (4) |
2 Presenting and Summarising the Data |
|
21 | (46) |
|
|
21 | (7) |
|
2.1.1 Reading in the Data |
|
|
21 | (4) |
|
2.1.2 Operating with the Data |
|
|
25 | (3) |
|
|
28 | (17) |
|
2.2.1 Counts and Bar Graphs |
|
|
28 | (7) |
|
2.2.2 Frequencies and Histograms |
|
|
35 | (4) |
|
2.2.3 Multivariate Tables, Scatter Plots and 3D Plots |
|
|
39 | (4) |
|
|
43 | (2) |
|
|
45 | (18) |
|
2.3.1 Measures of Location |
|
|
45 | (3) |
|
|
48 | (2) |
|
|
50 | (3) |
|
2.3.4 Measures of Association for Continuous Variables |
|
|
53 | (2) |
|
2.3.5 Measures of Association for Ordinal Variables |
|
|
55 | (4) |
|
2.3.6 Measures of Association for Nominal Variables |
|
|
59 | (4) |
|
|
63 | (4) |
3 Estimating Data Parameters |
|
67 | (18) |
|
3.1 Point Estimation and Interval Estimation |
|
|
67 | (4) |
|
|
71 | (6) |
|
3.3 Estimating a Proportion |
|
|
77 | (3) |
|
3.4 Estimating a Variance |
|
|
80 | (1) |
|
3.5 Estimating a Variance Ratio |
|
|
81 | (2) |
|
|
83 | (2) |
4 Parametric Tests of Hypotheses |
|
85 | (56) |
|
4.1 Hypothesis Test Procedure |
|
|
85 | (4) |
|
4.2 Test Errors and Test Power |
|
|
89 | (6) |
|
4.3 Inference on One Population |
|
|
95 | (5) |
|
|
95 | (4) |
|
|
99 | (1) |
|
4.4 Inference on Two Populations |
|
|
100 | (13) |
|
4.4.1 Testing a Correlation |
|
|
100 | (2) |
|
4.4.2 Comparing Two Variances |
|
|
102 | (3) |
|
4.4.3 Comparing Two Means |
|
|
105 | (8) |
|
4.5 Inference on More Than Two Populations |
|
|
113 | (24) |
|
4.5.1 Introduction to the Analysis of Variance |
|
|
113 | (1) |
|
|
114 | (13) |
|
|
127 | (10) |
|
|
137 | (4) |
5 Non-Parametric Tests of Hypotheses |
|
141 | (50) |
|
5.1 Inference on One Population |
|
|
142 | (16) |
|
|
142 | (2) |
|
|
144 | (4) |
|
5.1.3 The Chi-Square Goodness of Fit Test |
|
|
148 | (4) |
|
5.1.4 The Kolmogorov-Smirnov Goodness of Fit Test |
|
|
152 | (4) |
|
5.1.5 The Lilliefors Test for Normality |
|
|
156 | (1) |
|
5.1.6 The Shapiro-Wilk Test for Normality |
|
|
156 | (2) |
|
|
158 | (11) |
|
5.2.1 The 2×2 Contingency Table |
|
|
158 | (4) |
|
5.2.2 The r×c Contingency Table |
|
|
162 | (2) |
|
5.2.3 The Chi-Square Test of Independence |
|
|
164 | (2) |
|
5.2.4 Measures of Association Revisited |
|
|
166 | (3) |
|
5.3 Inference on Two Populations |
|
|
169 | (11) |
|
5.3.1 Tests for Two Independent Samples |
|
|
169 | (5) |
|
5.3.2 Tests for Two Paired Samples |
|
|
174 | (6) |
|
5.4 Inference on More Than Two Populations |
|
|
180 | (6) |
|
5.4.1 The Kruskal-Wallis Test for Independent Samples |
|
|
180 | (3) |
|
5.4.2 The Friedman Test for Paired Samples |
|
|
183 | (2) |
|
|
185 | (1) |
|
|
186 | (5) |
6 Statistical Classification |
|
191 | (46) |
|
6.1 Decision Regions and Functions |
|
|
191 | (2) |
|
|
193 | (9) |
|
6.2.1 Minimum Euclidean Distance Discriminant |
|
|
193 | (3) |
|
6.2.2 Minimum Mahalanobis Distance Discriminant |
|
|
196 | (6) |
|
6.3 Bayesian Classification |
|
|
202 | (12) |
|
6.3.1 Bayes Rule for Minimum Risk |
|
|
202 | (6) |
|
6.3.2 Normal Bayesian Classification |
|
|
208 | (3) |
|
6.3.3 Dimensionality Ratio and Error Estimation |
|
|
211 | (3) |
|
|
214 | (6) |
|
|
220 | (4) |
|
6.6 Classifier Evaluation |
|
|
224 | (3) |
|
|
227 | (6) |
|
|
233 | (4) |
7 Data Regression |
|
237 | (46) |
|
7.1 Simple Linear Regression |
|
|
238 | (14) |
|
7.1.1 Simple Linear Regression Model |
|
|
238 | (1) |
|
7.1.2 Estimating the Regression Function |
|
|
238 | (5) |
|
7.1.3 Inferences in Regression Analysis |
|
|
243 | (5) |
|
|
248 | (4) |
|
|
252 | (11) |
|
7.2.1 General Linear Regression Model |
|
|
252 | (1) |
|
7.2.2 General Linear Regression in Matrix Terms |
|
|
253 | (3) |
|
7.2.3 Inferences on Regression Parameters |
|
|
256 | (1) |
|
7.2.4 ANOVA and Extra Sums of Squares |
|
|
257 | (4) |
|
7.2.5 Polynomial Regression and Other Models |
|
|
261 | (2) |
|
7.3 Building and Evaluating the Regression Model |
|
|
263 | (10) |
|
|
263 | (2) |
|
7.3.2 Evaluating the Model |
|
|
265 | (2) |
|
|
267 | (6) |
|
7.4 Regression Through the Origin |
|
|
273 | (1) |
|
|
274 | (2) |
|
7.6 Logit and Probit Models |
|
|
276 | (5) |
|
|
281 | (2) |
8 Data Structure Analysis |
|
283 | (22) |
|
|
283 | (6) |
|
8.2 Dimensional Reduction |
|
|
289 | (3) |
|
8.3 Principal Components of Correlation Matrices |
|
|
292 | (7) |
|
|
299 | (3) |
|
|
302 | (3) |
9 Survival Analysis |
|
305 | (22) |
|
9.1 Survivor Function and Hazard Function |
|
|
305 | (1) |
|
9.2 Non-Parametric Analysis of Survival Data |
|
|
306 | (9) |
|
9.2.1 The Life Table Analysis |
|
|
306 | (5) |
|
9.2.2 The Kaplan-Meier Analysis |
|
|
311 | (2) |
|
9.2.3 Statistics for Non-Parametric Analysis |
|
|
313 | (2) |
|
9.3 Comparing Two Groups of Survival Data |
|
|
315 | (3) |
|
9.4 Models for Survival Data |
|
|
318 | (6) |
|
9.4.1 The Exponential Model |
|
|
318 | (2) |
|
|
320 | (2) |
|
9.4.3 The Cox Regression Model |
|
|
322 | (2) |
|
|
324 | (3) |
10 Directional Data |
|
327 | (26) |
|
10.1 Representing Directional Data |
|
|
327 | (4) |
|
10.2 Descriptive Statistics |
|
|
331 | (3) |
|
10.3 The von Mises Distributions |
|
|
334 | (4) |
|
10.4 Assessing the Distribution of Directional Data |
|
|
338 | (8) |
|
10.4.1 Graphical Assessment of Uniformity |
|
|
338 | (2) |
|
10.4.2 The Rayleigh Test of Uniformity |
|
|
340 | (2) |
|
10.4.3 The Watson Goodness of Fit Test |
|
|
342 | (2) |
|
10.4.4 Assessing the von Misesness of Spherical Distributions |
|
|
344 | (2) |
|
10.5 Tests on von Mises Distributions |
|
|
346 | (1) |
|
10.5.1 One-Sample Mean Test |
|
|
346 | (1) |
|
10.5.2 Mean Test for Two Independent Samples |
|
|
346 | (1) |
|
10.6 Non-Parametric Tests |
|
|
347 | (3) |
|
10.6.1 The Uniform Scores Test for Circular Data |
|
|
347 | (1) |
|
10.6.2 The Watson Test for Spherical Data |
|
|
348 | (2) |
|
10.6.3 Testing Two Paired Samples |
|
|
350 | (1) |
|
|
350 | (3) |
Appendix A - Short Survey on Probability Theory |
|
353 | (28) |
|
|
353 | (3) |
|
A.1.1 Events and Frequencies |
|
|
353 | (1) |
|
|
354 | (2) |
|
A.2 Conditional Probability and Independence |
|
|
356 | (2) |
|
A.2.1 Conditional Probability and Intersection Rule |
|
|
356 | (1) |
|
|
356 | (2) |
|
|
358 | (1) |
|
|
359 | (1) |
|
A.5 Random Variables and Distributions |
|
|
360 | (4) |
|
A.5.1 Definition of Random Variable |
|
|
360 | (1) |
|
A.5.2 Distribution and Density Functions |
|
|
361 | (2) |
|
A.5.3 Transformation of a Random Variable |
|
|
363 | (1) |
|
A.6 Expectation, Variance and Moments |
|
|
364 | (4) |
|
A.6.1 Definitions and Properties |
|
|
364 | (3) |
|
A.6.2 Moment-Generating Function |
|
|
367 | (1) |
|
|
368 | (1) |
|
A.7 The Binomial and Normal Distributions |
|
|
368 | (4) |
|
A.7.1 The Binomial Distribution |
|
|
368 | (1) |
|
A.7.2 The Laws of Large Numbers |
|
|
369 | (1) |
|
A.7.3 The Normal Distribution |
|
|
370 | (2) |
|
A.8 Multivariate Distributions |
|
|
372 | (9) |
|
|
372 | (2) |
|
|
374 | (1) |
|
A.8.3 Conditional Densities and Independence |
|
|
375 | (2) |
|
A.8.4 Sums of Random Variables |
|
|
377 | (1) |
|
A.8.5 Central Limit Theorem |
|
|
378 | (3) |
Appendix B - Distributions |
|
381 | (24) |
|
B.1 Discrete Distributions |
|
|
381 | (8) |
|
B.1.1 Bernoulli Distribution |
|
|
381 | (1) |
|
B.1.2 Uniform Distribution |
|
|
382 | (1) |
|
B.1.3 Geometric Distribution |
|
|
383 | (1) |
|
B.1.4 Hypergeometric Distribution |
|
|
384 | (1) |
|
B.1.5 Binomial Distribution |
|
|
385 | (1) |
|
B.1.6 Multinomial Distribution |
|
|
386 | (2) |
|
B.1.7 Poisson Distribution |
|
|
388 | (1) |
|
B.2 Continuous Distributions |
|
|
389 | (16) |
|
B.2.1 Uniform Distribution |
|
|
389 | (2) |
|
B.2.2 Normal Distribution |
|
|
391 | (1) |
|
B.2.3 Exponential Distribution |
|
|
392 | (2) |
|
B.2.4 Weibull Distribution |
|
|
394 | (1) |
|
|
395 | (1) |
|
|
396 | (2) |
|
B.2.7 Chi-Square Distribution |
|
|
398 | (1) |
|
B.2.8 Student's t Distribution |
|
|
399 | (2) |
|
|
401 | (1) |
|
B.2.10 Von Mises Distributions |
|
|
402 | (3) |
Appendix C - Point Estimation |
|
405 | (4) |
|
|
405 | (1) |
|
C.2 Estimation of Mean and Variance |
|
|
406 | (3) |
Appendix D - Tables |
|
409 | (10) |
|
D.1 Binomial Distribution |
|
|
409 | (6) |
|
|
415 | (1) |
|
D.3 Student's t Distribution |
|
|
416 | (1) |
|
D.4 Chi-Square Distribution |
|
|
417 | (1) |
|
D.5 Critical Values for the F Distribution |
|
|
418 | (1) |
Appendix E - Datasets |
|
419 | (18) |
|
|
419 | (1) |
|
|
419 | (1) |
|
|
420 | (1) |
|
|
420 | (1) |
|
|
421 | (1) |
|
|
422 | (1) |
|
|
423 | (1) |
|
|
423 | (1) |
|
|
424 | (1) |
|
|
424 | (1) |
|
|
425 | (1) |
|
|
425 | (1) |
|
|
425 | (1) |
|
|
426 | (1) |
|
|
426 | (1) |
|
|
427 | (1) |
|
|
428 | (1) |
|
|
428 | (1) |
|
|
429 | (1) |
|
|
429 | (1) |
|
|
429 | (1) |
|
|
430 | (1) |
|
|
430 | (1) |
|
|
431 | (1) |
|
|
431 | (1) |
|
|
432 | (1) |
|
|
432 | (1) |
|
|
433 | (1) |
|
|
434 | (1) |
|
|
434 | (1) |
|
|
434 | (1) |
|
|
435 | (2) |
Appendix F - Tools |
|
437 | (2) |
|
|
437 | (1) |
|
|
438 | (1) |
|
|
438 | (1) |
References |
|
439 | |