Characteristics of Sampling

Sample Design

Sample design is an integral and vital part of the overall research design.

Four facets of sampling surveys are:

definition of population

methodology

number of stages in sample design

stratification of population

Sample size depends on the basic characteristics of the population, the type of information needed and, of course, the cost. It does not depend on the application of some arbitrary percentage.

 

Sample Design

Sample values are estimators of true population values, and so the former inevitably contain some measure of sampling error.

The degree to which numerical data are distributed about an average value is known as the dispersion or variation; a well-known and widely used measure is the standard deviation (or its square, the variance).

The standard deviation of the sample mean, known as the standard error of the mean, indicates the precision and reliability of a sample estimate.
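A minimal sketch of the standard error of the mean, assuming the usual formula SE = s / sqrt(n), which the slide does not state. The function name and the sample values are hypothetical and serve only as illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical sample values, for illustration only.
sample = [6, 7, 8, 7, 9, 5, 7, 8]
print(round(standard_error(sample), 3))
```

A larger sample with the same spread gives a smaller standard error, which is why larger samples yield more precise estimates.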

 

The Normal Distribution: The Mystery of the Bell Curve

The normal distribution is the most widely used distribution and is most commonly known as the bell curve.

When a probability mass function is based on many trials, the curve tends to fill in and become bell shaped.

We call this a probability density function.

The hump in the middle is explained by the Central Limit Theorem. It states that "the distribution of averages of repeated independent samples will take the form of the bell-shaped normal distribution."

This is because a large number of independent samples tend to a central average.
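The Central Limit Theorem can be illustrated with a small simulation. This is only a sketch: the choice of die rolls, the sample size of 30, the 2000 repetitions and the function name are all assumptions made for the example.

```python
import random
import statistics

def sample_means(n_samples, sample_size, seed=0):
    """Return the means of repeated independent samples of die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.mean(rng.randint(1, 6) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(n_samples=2000, sample_size=30)
print(statistics.mean(means))   # close to 3.5, the mean of a single die
print(statistics.stdev(means))  # close to sqrt(35/12) / sqrt(30), about 0.31
```

A single die roll is uniformly distributed, yet a histogram of these averages would show the familiar hump around 3.5.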

 

The Normal Distribution

The normal distribution is easy to use and it closely approximates many real-world phenomena.

Just about anything can be rationalized as an average, hence the usefulness of normal distributions.

The figure at the left shows the normal curve characteristic of many business program classes.

 

Measures of the Normal Curve

The bell-shaped curve is described by two terms, the mean and its standard deviation (SD).

The mean (µ) is the center of the curve. The mean is commonly called the "average." It is the result of adding up the data and dividing by the number of data points.

The standard deviation (σ) is how wide the curve appears.

The SD can also be described as a measure of the "variability from the mean."

Other, less-used measures of the average for a set of data are the median (the item in the middle of the list when sorted by size) and the mode (the item occurring most frequently in the data set).
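The measures described above can be computed with Python's standard statistics module. The data set below is hypothetical, chosen so that the mean, median and mode all differ.

```python
import statistics

# Hypothetical data, for illustration only.
data = [4, 5, 5, 6, 7, 8, 14]

mean = statistics.mean(data)      # sum of the data divided by the count
median = statistics.median(data)  # middle item of the sorted list
mode = statistics.mode(data)      # most frequently occurring item
sd = statistics.pstdev(data)      # SD, treating the data as a whole population

print(mean, median, mode, round(sd, 2))
```

Here the single large value (14) pulls the mean above the median, which is one reason the median is sometimes preferred for skewed data.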

 

Measures of the Normal Curve

Always, the sum of all the outcomes as represented by the region under the curve equals 100 percent.

What makes the normal distribution's curve special is that for any given number of SDs away from the mean or the center, the same probability exists for an event regardless of the normal distribution's shape.
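A short sketch of this property, using NormalDist from Python's standard library. The helper name prob_within and the two sets of parameters are assumptions chosen for illustration.

```python
from statistics import NormalDist

def prob_within(mu, sigma, k):
    """Probability that a normal value lies within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Two differently shaped curves (hypothetical parameters), same k.
print(round(prob_within(0, 1, 1), 4))  # 0.6827
print(round(prob_within(7, 2, 1), 4))  # 0.6827
```

The mean and SD change where the curve sits and how wide it is, but not the share of outcomes that fall within a given number of SDs.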

 

Normal Distribution Retailing Example

Al Bundy, a shoe store owner, wants to make sure he has enough stock for all size requests. He purchased a study of ladies' shoe sizes from the Academy of Feet and received a stack of research data from survey responses.

He plotted the data on graph paper and it appeared as a normal distribution. He also entered the series of sizes in his calculator and hit the "Standard Deviation" key. The answer was 2. Al also took the average or mean of all the survey respondents' sizes and found it to be 7. Looking at the graph he created, he saw that it looked like our trusty normal distribution.

 

Normal Distribution Retailing Example

Just by recognizing the shape, Al could apply the laws of the normal distribution curve. The laws governing the area under all normal curves, measured from the mean to a point the given number of SDs to one side of it, are the following:

1 SD = .3413

2 SD = .4772

3 SD = .49865

4 SD = .4999683

Using these rules, if Mr. Bundy stocks sizes 5 to 9 he has covered .6826 (2 x .3413) of the population. Increasing the sizes to 3 to 11, he has covered .9544 (2 x .4772) of the feet out there. If Al stocked sizes 1 to 13, .9973 (2 x .49865) of customers at his store would be satisfied with his selection. He can always special-order for those feet beyond sizes 1 to 13.
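Al's coverage figures can be checked with a few lines of Python. This sketch assumes shoe sizes are treated as a continuous normal variable with mean 7 and SD 2, as in the example; the function name coverage is hypothetical.

```python
from statistics import NormalDist

# Mean and SD taken from Al's survey data in the example.
shoes = NormalDist(mu=7, sigma=2)

def coverage(low, high):
    """Share of customers whose size falls between low and high."""
    return shoes.cdf(high) - shoes.cdf(low)

for low, high in [(5, 9), (3, 11), (1, 13)]:
    print(f"sizes {low} to {high}: {coverage(low, high):.4f}")
```

The computed values are 0.6827, 0.9545 and 0.9973. The first two differ from the table-based figures in the fourth decimal place only because the table entries are rounded before being doubled.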

 

Normal Distribution of Shoes

Normal Distribution Tables and the Z Value

Of course normal distribution tables have been developed to determine the probability for any specific point on the curve (non-integer SDs away from the mean). To use the tables, a Z value must be calculated.

Z = ((point of interest) - Mean) / SD
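The Z value formula and the table lookup can be sketched in Python. The function names are hypothetical, and the lookup assumes the table convention used in these slides, where each entry is the area between the mean and the point of interest.

```python
from statistics import NormalDist

def z_value(point, mean, sd):
    """Number of SDs between the point of interest and the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (point - mean) / sd

def area_from_mean(z):
    """Area under the standard normal curve between the mean and z."""
    return abs(NormalDist().cdf(z) - 0.5)

z = z_value(9, 7, 2)  # size 9 in the shoe example
print(z, round(area_from_mean(z), 4))  # 1.0 0.3413
```

Unlike a printed table, the function accepts any Z value, including non-integer ones.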

 

A Normal Curve Finance Example

Let's apply these new pieces of probability theory to finance. The monthly stock returns of a volatile stock, Pioneer Aviation, are assumed normally distributed as shown by a plotted graph. A summary of historical returns shows a mean (center) of 1 percent and an SD (dispersion) of 11 percent. Gerald Rasmussen wanted to know the probability that next month's return would be less than 13 percent.

Using our new Z value tool we can figure it out:

Z = (13 - 1) / 11 = 1.09 SDs away from the mean

 

A Normal Curve Finance Example

A normal distribution table tells us that 1.09 SDs = .3621, the area between the mean and the point of interest.

The entire left side of the graph equals .5000, as any complete half of the distribution would.

There is a 50% chance of being above or below the center or mean in any normal distribution.

Combining these pieces of information we calculate that there is a .8621 (.3621 + .5000) probability that there will be a return of less than 13%.

Conversely there will be a .1379 chance that it will be greater (1-.8621).
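The whole calculation can be reproduced in a few lines of Python. This is a sketch under the example's assumption of normally distributed returns; the variable names are hypothetical, and Z is rounded to two decimals only to match the precision of a printed table.

```python
from statistics import NormalDist

mean_return = 1.0   # percent, from the example
sd_return = 11.0    # percent, from the example
target = 13.0       # percent

# Round Z to two decimals, as a printed table would require.
z = round((target - mean_return) / sd_return, 2)

p_below = NormalDist().cdf(z)  # left half (.5000) plus the table area
p_above = 1 - p_below

print(z, round(p_below, 4), round(p_above, 4))  # 1.09 0.8621 0.1379
```

Without the rounding step, Z is about 1.0909 and the probability is slightly higher, roughly .8623.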

 

Sigma Demystified

It can be difficult to understand where sigma (σ) comes from when doing calculation exercises.

What you should remember is that sigma applies to the population as a whole and not to the sample.

Sigma squared (σ²), the variance, is the average squared deviation of all units in the population from the mean.

We never really compute σ² or σ from the population (the object of sampling is to avoid this costly procedure).

Sigma, therefore, is normally estimated from a pilot study.
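The difference between the population sigma and a pilot-study estimate can be sketched as follows. Both data sets are hypothetical, and the sketch assumes the pilot estimate uses the sample SD with an n - 1 denominator, a common choice that the slides do not specify.

```python
import statistics

# Hypothetical population; in practice it is too costly to measure in full.
population = [4, 5, 5, 6, 7, 8, 14]
# Hypothetical pilot study: a few units drawn from that population.
pilot = [5, 7, 14]

mu = sum(population) / len(population)
# Sigma squared: average squared deviation of all units from the mean.
sigma_sq = sum((x - mu) ** 2 for x in population) / len(population)
sigma = sigma_sq ** 0.5

# Estimate of sigma from the pilot study (n - 1 denominator).
s = statistics.stdev(pilot)

print(round(sigma, 2), round(s, 2))
```

With only three pilot units the estimate can be far from the true sigma; a larger pilot study would normally give a closer value.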