Elimination of observer bias (or conscious or unconscious influence by researcher and subjects)
Blinding is important --> always advantageous
People in the study are unaware, and cannot become aware, of their group allocation.
Enrolment before allocation.
Not just blind to treatment, but also to allocation.
3rd party (not involved in the study) does the allocation.
Recruited study sample
* Inclusive? Exclusive?
* Exclusive --> Use very tight criteria for inclusion. May limit how well the result of the study can be applied to the rest of the population. Generalisability is reduced; the result may only apply to a subset of the population.
Randomisation
* New therapy
* Standard therapy
Outcome measurement
Identification of study population
Draw study sample
Identify outcome parameters
Always retrospective
Cases vs controls
* Population-base would be better
Different levels of exposure between groups --> people in the control group may also have exposure to the study factor (e.g. smoking), just as people in the case group do.
Good for hypothesis generation. Very poor for proving causation.
Observational.
e.g. look at all lung cancer patients --> check if they are smokers.
At the end of the day, studies are designed to find out whether the difference between groups is due to chance.
Level one: Large RCTs or systematic reviews with formal meta-analysis
Level two: Smaller RCTs or good cohorts
Level three: case-control studies
Level four: Cross-sectional or descriptive studies
Level five: Case reports, anecdote
In general, data can either
* Conform to a described distribution (parametric), OR
* Distribution-free (non-parametric)
Population
Sample
Inference
Error. Sampling error vs non-sampling errors.
A numerical value describing a characteristic of a sample is called a statistic
A statistic is often used as an estimate of a population parameter
Mean, median, mode (central tendency)
Standard deviation, range (dispersion)
Bell-shaped, symmetrical about its mean
Has the property that 32% of the area under the curve lies more than 1 standard deviation from the mean (in either direction), and 5% lies more than 1.96 standard deviations from the mean
Standard deviation = square root of variance
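These tail areas can be checked with the standard library alone; a minimal sketch, using the error function to build the standard normal CDF:

```python
import math

def tail_area(z):
    """Two-tailed area outside +/- z standard deviations of the mean."""
    # Standard normal CDF via the error function: Phi(z) = (1 + erf(z/sqrt(2))) / 2
    phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return 2 * (1 - phi)

print(round(tail_area(1.0), 3))   # ~0.317, i.e. ~32% outside 1 SD
print(round(tail_area(1.96), 3))  # ~0.050, i.e. ~5% outside 1.96 SD
```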
SEM
= standard error of mean
= SD of the distribution of sample means
* Always smaller than the SD for n > 1 (SEM = SD / √n)
* More relevant in confidence interval
Variance = sum of squared differences between each observation and the mean, divided by (n - 1)
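The definitions above can be sketched in a few lines (sample data is hypothetical):

```python
import math

def describe(xs):
    """Sample mean, variance (n - 1 denominator), SD, and SEM."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)
    sd = math.sqrt(variance)          # SD = square root of variance
    sem = sd / math.sqrt(n)           # SEM = SD of the distribution of sample means
    return mean, variance, sd, sem

mean, var, sd, sem = describe([2, 4, 4, 4, 5, 5, 7, 9])
```

Note the (n - 1) denominator: using n would underestimate the population variance when estimating from a sample.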
Use common sense
If not continuous, is it interval or categorical
--> Popular question
Questions can be:
Single group - descriptive only
Two groups
Number at any one time = prevalence
Rate at which things occur = incidence
Measures of frequency and association
Relative risk (RR)
2x2 table (drug 1 and drug 2, outcome 1, and outcome 2)
The ratio of incidence in one group to another
--> Chi-square test
Odds ratio (OR)
Ratio of the odds in two groups. An odds is not a proportion, so an OR is not directly interpretable as a risk.
                  | Drug 1 | Drug 2
Adverse effect    |   A    |   B
No adverse effect |   C    |   D

Odds ratio = (A/C) / (B/D) = AD / BC
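Both measures fall out of the 2x2 table directly; a minimal sketch, with hypothetical counts:

```python
def relative_risk(a, b, c, d):
    """RR: ratio of incidence (risk) of the outcome in group 1 vs group 2.
    a, b = adverse effect on drug 1 / drug 2; c, d = no adverse effect."""
    risk1 = a / (a + c)
    risk2 = b / (b + d)
    return risk1 / risk2

def odds_ratio(a, b, c, d):
    """OR = (a/c) / (b/d) = ad / bc."""
    return (a / c) / (b / d)

# Hypothetical counts: 10/100 events on drug 1, 5/100 on drug 2
print(relative_risk(10, 5, 90, 95))  # RR = 2.0
print(odds_ratio(10, 5, 90, 95))     # OR slightly above 2, since the outcome is uncommon
```

When the outcome is rare, the OR approximates the RR, which is why case-control studies (which can only yield an OR) are still informative.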
      |  IV | Not IV | Total
+ve   |  35 |    380 |   415
-ve   |  15 |    570 |   585
Total |  50 |    950 |  1000
Prevalence = 50/1000 = 5%
Sensitivity
= TP / (TP + FN)
= 35/50 = 70%
= the chance of the test being positive when patient is positive
Specificity
= TN / (FP + TN)
= 570/950 = 60%
= the chance of the test being negative when patient is negative
Positive predictive value
= 35/415
= 8.4%
= depends on prevalence (unlike sensitivity and specificity)
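The worked example above can be reproduced directly from the cell counts:

```python
TP, FP = 35, 380   # test +ve: disease present / disease absent
FN, TN = 15, 570   # test -ve: disease present / disease absent

prevalence  = (TP + FN) / (TP + FP + FN + TN)   # 50/1000  = 5%
sensitivity = TP / (TP + FN)                    # 35/50    = 70%
specificity = TN / (FP + TN)                    # 570/950  = 60%
ppv         = TP / (TP + FP)                    # 35/415   = 8.4%
```

Note how a 70%-sensitive test still gives a PPV of only 8.4% here: with low prevalence, false positives swamp true positives.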
NB:
If we incorrectly reject the null hypothesis, we commit a type I error (alpha)
* Cannot be helped
* e.g. a true null hypothesis is rejected because the data fall outside the 95% CI by chance
* Conventionally 5% of type I error is acceptable
With small samples and/or small real differences, we may accept the null hypothesis when it is incorrect --> a type II error (beta)
* occurs more often
* Conventionally 20% of type II error is acceptable
Increasing sample size is the only way of improving the chance of avoiding both errors at the same time
The difference we choose should be the smallest clinically important difference
To calculate sample size, therefore, we need to know or define
* minimal important effect
* type I error acceptable
* Type II error acceptable
* The variability (or SD) of the characteristics we are measuring in the sample we will use
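The four ingredients above plug into a standard textbook approximation for comparing two means; a sketch, with the z values hardcoded for the conventional alpha = 0.05 (two-sided) and beta = 0.20, and hypothetical inputs:

```python
import math

def n_per_group(sd, min_diff, z_alpha=1.96, z_beta=0.84):
    """Approximate sample size per group for comparing two means:
    n = 2 * (z_alpha + z_beta)^2 * sd^2 / d^2
    (two-sided alpha = 0.05, power = 80% by default)."""
    n = 2 * ((z_alpha + z_beta) ** 2) * (sd ** 2) / (min_diff ** 2)
    return math.ceil(n)

# e.g. SD of 10, smallest clinically important difference of 5
print(n_per_group(10, 5))  # 63 per group
```

Note how the sample size grows with the square of SD/d: halving the detectable difference quadruples the required n.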
The P-value and the point estimate of the effect give us a lot of information, however...
There is no information on the precision of the result and the confidence we should have to apply the result in clinical practice
CI gives us this information
The range of values within which we can be 95% confident that the true population value lies
General formula: d +/- 1.96 x standard error
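The general formula is one line of code (the estimate and standard error below are hypothetical):

```python
def ci95(estimate, se):
    """95% CI: d +/- 1.96 x standard error (general formula above)."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# e.g. a mean difference of 5.0 with a standard error of 0.5
low, high = ci95(5.0, 0.5)
print(low, high)  # roughly 4.02 to 5.98
```

A CI that excludes the null value (0 for a difference, 1 for an RR or OR) corresponds to P < 0.05, but the interval also conveys the precision that the P-value alone does not.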