SSC Quickhelp
Values and settings
General values
- Alpha
Suppose your research hypothesis is that people in London earn more money than people in Dublin. You
take a sample of each population and average the incomes.
The result is that the Londoners in your sample earn clearly more than the Dubliners, and the difference is
statistically significant. So your conclusion is that Londoners earn more money than Dubliners.
In fact, your conclusion may be incorrect due to bad luck. Especially with high variance in the population
and a small sample size, you run the risk of drawing non-representative samples.
Alpha is defined as "the probability that you conclude your research hypothesis (Londoners earn
more than Dubliners) is true while in fact it is false". The opposite risk, missing a difference
that really exists, is called beta.
» See also: One-tailed test
» See also: Beta
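In sample size formulas, alpha usually enters through a critical value of the standard normal distribution. The sketch below shows that conversion. The function name `critical_z` is hypothetical, and nothing here is confirmed to be how SSC computes it internally.

```python
from statistics import NormalDist


def critical_z(alpha: float, one_tailed: bool = True) -> float:
    """Return the standard normal critical value for a given alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # A two-tailed test splits alpha evenly over both tails.
    tail = alpha if one_tailed else alpha / 2.0
    return NormalDist().inv_cdf(1.0 - tail)


print(round(critical_z(0.05), 3))                    # one-tailed: 1.645
print(round(critical_z(0.05, one_tailed=False), 3))  # two-tailed: 1.96
```

A smaller alpha gives a larger critical value, which in turn demands a larger sample.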
- Population size
If the sample makes up a large part of the population, a correction for finite populations applies.
You don't need to assess the size of your population very precisely unless the sample size is indeed
large compared to the population size (say, 20% or higher). If your
population is large, its exact size doesn't matter.
» See also: Small population
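A common textbook form of the finite population correction can be sketched as follows. This is an assumption about the general technique, not a statement of the exact formula SSC applies; the function name is hypothetical.

```python
import math


def finite_population_correction(n0: float, population_size: int) -> int:
    """Shrink an 'infinite population' sample size n0 for a population of size N.

    Uses the common textbook form n = n0 / (1 + (n0 - 1) / N).
    """
    if n0 <= 0 or population_size <= 0:
        raise ValueError("n0 and population_size must be positive")
    corrected = n0 / (1.0 + (n0 - 1.0) / population_size)
    return math.ceil(corrected)


# The same uncorrected sample size of 385 in a small and a large population.
print(finite_population_correction(385, 1_000))      # 279
print(finite_population_correction(385, 1_000_000))  # 385
```

For the large population the correction changes nothing, which is why the exact size only matters when the sample is a sizeable share of the population.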
- Response %
In survey research, the response rate is usually far below 100%. A rate between 30% and 70% is considered
normal.
» See also: Non-response
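If you know how many completed responses you need, the expected response percentage tells you roughly how many people to invite. A minimal sketch, assuming non-response is the only loss and that the function name `invitations_needed` is hypothetical:

```python
import math


def invitations_needed(required_responses: int, response_percent: float) -> int:
    """Number of people to invite so that the expected responses reach the target."""
    if not 0.0 < response_percent <= 100.0:
        raise ValueError("response_percent must be in the range (0, 100]")
    return math.ceil(required_responses * 100 / response_percent)


print(invitations_needed(300, 30))  # 1000
print(invitations_needed(300, 70))  # 429
```

This only scales the number of invitations. It does not address any bias that arises when non-respondents differ from respondents.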
- Tolerance
Tolerance needs to be interpreted differently when estimating a single characteristic than when drawing
conclusions about the difference between two characteristics:
- Single characteristic: how wide an interval (2 x tolerance) do you want the estimate to fall in? The
narrower you want this interval, the more samples you will need. For
example, it requires fewer samples to conclude that a mean lies in the interval [2-10] than to
conclude that it lies in the interval [5.1-5.3].
- Difference between characteristics: how big a difference really makes a difference to you? Suppose
the proportions in two different groups are 0.85 and 0.87.
Would you treat them as two distinct groups? Does this 0.02 difference matter in the real world?
The tolerance is expressed in the same units as the population characteristic. If you measure
a proportion, for example, the tolerance may not
exceed the proportion you expect.
Also note that the full interval around the estimate is twice the tolerance. For example, a
tolerance of 0.5 on an estimated mean of 1 means that you
conclude the true mean lies in the interval [0.5-1.5]. The interval length is 1, that is, 2 x
0.5.
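To see how strongly the tolerance drives the sample size, here is a sketch based on the usual normal approximation for a mean, n = (z x sd / tolerance)^2. The standard deviation of 10 and the two-sided 95% confidence level are assumptions chosen for illustration, and the function name is hypothetical; SSC's own calculation may differ.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(std_dev: float, tolerance: float, alpha: float = 0.05) -> int:
    """Sample size so that the mean is estimated within +/- tolerance (normal approximation)."""
    if std_dev <= 0 or tolerance <= 0:
        raise ValueError("std_dev and tolerance must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    return math.ceil((z * std_dev / tolerance) ** 2)


# Interval [2-10] corresponds to a tolerance of 4, interval [5.1-5.3] to a tolerance of 0.1.
print(sample_size_for_mean(std_dev=10, tolerance=4))
print(sample_size_for_mean(std_dev=10, tolerance=0.1))
```

Because the tolerance is squared in the denominator, halving it roughly quadruples the required sample size.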
Values appropriate to means
- Mean
The estimated mean is not used in these calculations. It only helps you interpret the numeric
results in the report generated by SSC.
- Standard deviation
It is important that you assess the amount of variability in the population as precisely as possible.
The number you input here is a sample standard deviation. Its statistical definition is: take the
sum over all sample elements of the squared differences between
the sample element and the mean, divide that sum by the sample size minus
one, and take the square root of the result.
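The definition above translates directly into a few lines of code. The data values below are made up for illustration; the result is compared against Python's built-in `statistics.stdev`, which uses the same n - 1 denominator.

```python
import math
import statistics


def sample_standard_deviation(values: list[float]) -> float:
    """Square root of the summed squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two sample elements are required")
    mean = sum(values) / n
    squared_differences = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_differences / (n - 1))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
print(round(sample_standard_deviation(sample), 3))  # 2.138
print(round(statistics.stdev(sample), 3))           # 2.138
```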
If you don't know the variability, you can use these sources:
- Conduct a small preliminary study to assess the variability.
- Use your previous research, even if it is not fully relevant.
- Use the literature. Other studies on the same subject often mention variances or standard deviations.
- Use standard scales for which you can guess the variance well.
Be sure to recalculate your sample size after you have done the actual research, to assess whether your original
guess was reliable enough.
Values appropriate to proportions
- Proportion
The proportion is used mainly for yes/no questions. For example: a business analyst wants to know how
many people in a certain population are willing to invest in a
new investment plan. The analyst can ask a simple yes/no question and apply the sample size calculations for
proportions.
If you're unsure about the proportion of yes-sayers, leave it at 0.5. This is the most conservative
estimate, because it yields the largest sample size.
If you suspect the proportion is extreme (below 0.2 or above 0.8), use the correction for extreme proportions
to avoid underestimating your sample size.
» See also: Extreme proportions
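Why 0.5 is the most conservative choice can be seen from the usual normal-approximation formula for a proportion, n = z^2 x p x (1 - p) / tolerance^2. The sketch below assumes that formula and a two-sided 95% confidence level; it does not include the correction for extreme proportions, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p: float, tolerance: float, alpha: float = 0.05) -> int:
    """Sample size to estimate a proportion p within +/- tolerance (normal approximation)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    return math.ceil(z ** 2 * p * (1.0 - p) / tolerance ** 2)


for p in (0.2, 0.35, 0.5, 0.65, 0.8):
    print(p, sample_size_for_proportion(p, tolerance=0.05))
```

The product p x (1 - p) peaks at p = 0.5, so any other guess gives a smaller sample size.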
Values appropriate to differences
- Beta
Beta is the risk you take of missing an existing difference between two populations. Suppose Londoners
indeed earn more than Dubliners (for example, the real difference
between their average incomes is £150 per year) but, due to a small sample size and high variance within
each group, you find no significant difference.
In that case, you would miss the real difference and draw an erroneous conclusion.
Statisticians call the complement of beta the "power" of your test (power = 1 - beta): the more
power you want to detect existing differences, the larger the sample size you need.
» See also: Alpha
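The interplay of alpha, beta and the difference you want to detect can be sketched with the standard normal-approximation formula for comparing two means with equal group sizes and a one-tailed test: n per group = 2 x ((z_alpha + z_beta) x sd / difference)^2. The standard deviation of £1000 is an assumption made up for this example, the function name is hypothetical, and SSC may use a different formula.

```python
import math
from statistics import NormalDist


def sample_size_per_group(difference: float, std_dev: float,
                          alpha: float = 0.05, beta: float = 0.20) -> int:
    """Per-group sample size to detect a difference in means (one-tailed, normal approximation)."""
    if difference <= 0 or std_dev <= 0:
        raise ValueError("difference and std_dev must be positive")
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha)  # guards against a false positive
    z_beta = normal.inv_cdf(1.0 - beta)    # guards against missing a real difference
    return math.ceil(2.0 * ((z_alpha + z_beta) * std_dev / difference) ** 2)


# Detect a real difference of 150 per year, assuming a standard deviation of 1000.
print(sample_size_per_group(150, 1000, beta=0.20))  # power 0.80
print(sample_size_per_group(150, 1000, beta=0.10))  # power 0.90
```

Lowering beta from 0.20 to 0.10 raises the power from 0.80 to 0.90 and increases the required sample size per group.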