GROUP 9 - SAMPLING - Sushant Kulkarni

Sampling

Sampling is the selection of a subset of units (for example, a group of people) from a larger population in order to estimate the characteristics of the whole population.

Why Use Statistical Sampling?

By: Harold Jennings & Robert Schauer

We all know that a taxpayer’s business records are often voluminous. The tax that is due is most often determined at the transaction level, requiring the auditor to look at the source record to make a proper audit determination as to whether an error in reporting exists. A taxpayer could have hundreds, thousands, or even millions of transactions in any tax reporting period. Tax auditors typically audit tax periods that extend over years, so it is often quite impractical to audit every business transaction.

If the auditor did so, that is, gave equal and complete coverage to each transaction within the scope of the audit, then the auditor has done a detailed audit. In a detailed audit, the auditor will compute a total error amount for all audited transactions. This total error could equal zero (a no-change audit) or could represent a net tax overpayment or underpayment. If a detailed audit is possible and practical, it is always the preferred method of determining total error.

But sifting through all transactions is not going to be practical in many audit cases, so the auditor must decide between two alternatives. The auditor could ignore certain business transactions (no audit of certain transactions). Or the auditor could take a sample and presume that the audited sample results, if projected to the population, will be relatively accurate. (Oftentimes, the auditor will do both.)

Note that if a sample is projected, the detailed audit is the standard by which we should judge any sample results. We should be able to use a sample projection if we can prove, with enough confidence, that the difference between the sample projection and the true total error that a detailed audit would have found is relatively small. But how can this be possible if a detailed examination is never performed? The key to proving the accuracy of the sample lies in how the sample is taken from the population. Within the profession, auditors can take samples in a variety of different ways. But in essence, all the different sampling methods can be reduced to two kinds of sampling.
To do a statistical sample, the auditor must take a probability sample. A probability sample is any sample where all population units have a chance of selection, and this chance of selection is known but not necessarily equal. Anything other than a probability sample is a judgmental sample, the other basic form of sampling.

Probability samples include simple random samples, where all members of the sampled population have an equal chance of being selected into the sample. More commonly, auditors will use stratified random samples. In a stratified random sample, the population is divided into groups, or strata. Within each stratum, all units have an equal chance of being selected into the sample. The chance of selection differs from one stratum to another, but the probability of selection for any unit in a stratified population is known. Finally, in judgmental sampling, which includes the block sampling that is common in auditing, the probability of selection is not known for any of the units.

The auditor can use the audit results of a probability sample (either a simple random or a stratified random sample) and objectively prove, using probability theory, the accuracy of the sample. That is, the projected results can be compared, with some degree of confidence, to the results a detailed audit would have produced had one been done. In any type of sampling other than probability sampling, accuracy cannot be objectively measured; the accuracy of the projected sample results is a matter of subjective judgment (hence the name judgmental sampling). Therefore, if objective proof of the accuracy of the sample is a concern, the auditor should be using probability sampling.

But there are other concerns as well. These include efficiency and accuracy. With regard to accuracy, we would like to use a sample of the smallest size that gives us the accuracy we desire. In most cases, this is going to be a probability sample.
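The idea that a probability sample lets the auditor state the accuracy of a projection can be illustrated with a small sketch. It is not taken from the article: the population of transaction errors is invented, and the function name `project_total`, the 95% level (`z=1.96`), and the normal approximation are all assumptions made for illustration.

```python
import math
import random
import statistics

def project_total(sample_errors, population_size, z=1.96):
    """Project a simple random sample's mean error to the whole population.

    Returns (projected_total, margin). The margin is the half-width of an
    approximate confidence interval (normal approximation, with the finite
    population correction). Illustrative helper, not a standard audit formula
    from the article.
    """
    n = len(sample_errors)
    mean_error = statistics.mean(sample_errors)
    std_dev = statistics.stdev(sample_errors)
    fpc = math.sqrt(1 - n / population_size)  # finite population correction
    standard_error = population_size * std_dev / math.sqrt(n) * fpc
    return population_size * mean_error, z * standard_error

# Hypothetical population: the tax error (in dollars) on 10,000 transactions.
rng = random.Random(42)
population = [rng.choice([0, 0, 0, 0, 25, 50, 100]) for _ in range(10_000)]
true_total = sum(population)  # what a detailed audit would have found

# Simple random sample: every transaction has the same chance of selection.
sample = rng.sample(population, 400)
projected, margin = project_total(sample, len(population))

print(f"detailed audit total: {true_total}")
print(f"projected total:      {projected:.0f} +/- {margin:.0f}")
```

In practice the detailed audit total is unknown; it is computed here only so that the projection can be compared with it.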
Block samples tend to be less accurate for any given sample size when compared to probability samples. This often has to do with the fact that a probability sample will come from the entire population, while a block sample will come from only one (or a few) portions of the population (there are other statistical reasons for this as well, which we will not discuss here). On the other hand, convenience often enters into the picture, and auditors opt to take a block sample anyway. The price paid is that the sample results will likely not be as accurate for the number of units audited, and no objective statement of accuracy can be made about the projected sample results.

We believe, as auditors, that accuracy is always of the utmost concern, and therefore statistical sampling, when possible, should be the preferred method of sampling. To that end, the Multistate Tax Commission offers a course in statistical sampling for tax auditors. The Commission also invites others, including those in private practice, to take the training if there is interest. Please visit www.MTC.gov for fee schedules, class times, and registration information.
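The comparison between block samples and probability samples can be sketched with a small simulation. Everything in it is an assumption made for illustration: the 24 months of 500 transactions, the error rate that drifts upward over time, and the $100 error amount are invented, and a population without such a drift would show a smaller gap.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 24 months of 500 transactions each. The error
# rate rises month by month (an assumption, chosen to show the effect).
MONTHS, PER_MONTH = 24, 500
population = []
for month in range(MONTHS):
    error_rate = 0.02 + 0.01 * month
    population += [100.0 if rng.random() < error_rate else 0.0
                   for _ in range(PER_MONTH)]

true_total = sum(population)  # what a detailed audit would have found

def projected(sample):
    """Project the sample's mean error to the whole population."""
    return len(population) * statistics.mean(sample)

# Block sample: audit one whole month. Try every possible month and record
# how far each projection lands from the true total.
block_misses = []
for month in range(MONTHS):
    block = population[month * PER_MONTH:(month + 1) * PER_MONTH]
    block_misses.append(abs(projected(block) - true_total))

# Simple random sample of the same size, drawn from the entire population.
random_misses = []
for _ in range(200):
    sample = rng.sample(population, PER_MONTH)
    random_misses.append(abs(projected(sample) - true_total))

print("average block-sample miss: ", round(statistics.mean(block_misses)))
print("average random-sample miss:", round(statistics.mean(random_misses)))
```

Both methods audit 500 transactions, so the difference in the average miss comes only from how the units were selected.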

Two types of sampling:

Probability sampling

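· Simple random: every unit in the population has an equal probability of being selected. It works very well with a homogeneous population.

· Systematic: units are selected at a fixed interval (every k-th unit) after a random start, so each unit has a fixed probability of selection.

· Cluster sampling: the population is divided into naturally occurring groups (clusters), each containing a mix of different kinds of units, and whole clusters are selected at random.

· Stratified sampling: the population is divided into strata whose members are similar to one another, and a random sample is drawn from each stratum.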
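The four probability sampling techniques listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the sampling frame of 100 invoices in 5 branches is invented, and the function names (`simple_random`, `systematic`, `cluster`, `stratified`, `group_by`) are hypothetical helpers, not part of any library.

```python
import random

rng = random.Random(0)

# Hypothetical sampling frame: 100 invoices spread evenly over 5 branches.
population = [{"id": i, "branch": f"B{i % 5}"} for i in range(100)]

def group_by(units, key):
    """Group units into a dict keyed by the value of units[key]."""
    groups = {}
    for unit in units:
        groups.setdefault(unit[key], []).append(unit)
    return groups

def simple_random(units, n):
    """Every unit has the same chance of selection."""
    return rng.sample(units, n)

def systematic(units, n):
    """Random start, then every k-th unit."""
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]

def cluster(units, key, n_clusters):
    """Pick whole groups at random and keep every unit in them."""
    groups = group_by(units, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [unit for name in chosen for unit in groups[name]]

def stratified(units, key, n_per_stratum):
    """Draw a simple random sample inside every group."""
    groups = group_by(units, key)
    return [unit for name in sorted(groups)
            for unit in rng.sample(groups[name], n_per_stratum)]

print(len(simple_random(population, 10)))
print([u["id"] for u in systematic(population, 10)])
print(sorted({u["branch"] for u in cluster(population, "branch", 2)}))
print(len(stratified(population, "branch", 3)))
```

Note the contrast between the last two: cluster sampling keeps all units from a few groups, while stratified sampling keeps a few units from all groups.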

Why Probability Sampling?

· Probability sampling, where a small randomly selected sample of the population can be used to estimate the distribution of an attitude or opinion in the entire population with statistical confidence, provides the foundation for survey research and political polling. The basis of probability-based random sampling is that every member of the population must have a known, non-zero chance of being selected. Probability sampling provides the means by which the margin of sampling error can be calculated and the level of confidence in survey estimates reported. Sampling error results from collecting data from some rather than all members of the population and is highly dependent on the size of the sample.
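As a rough illustration of how the margin of sampling error depends on sample size, the sketch below uses the usual normal-approximation formula for a proportion estimated from a simple random sample, z * sqrt(p(1 - p) / n). The function name `margin_of_error` and the 95% level (`z=1.96`) are assumptions chosen for the example, not something stated in the text above.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of sampling error for a proportion p estimated
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 gives the widest margin, so polls often quote it as the worst case.
for n in (100, 400, 1600):
    print(f"n = {n:5d}: margin = {margin_of_error(0.5, n):.4f}")
```

Under this formula, quadrupling the sample size halves the margin of error, which is why the gain from each additional respondent shrinks as the sample grows.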

Non-probability sampling

· Convenience Sampling - This technique is considered the easiest, cheapest and least time-consuming.

· Consecutive Sampling - This non-probability sampling technique can be considered the best of all non-probability samples because it includes all subjects that are available, which makes the sample a better representation of the entire population.

· Quota Sampling - Quota sampling is a non-probability sampling technique wherein the researcher ensures equal or proportionate representation of subjects depending on which trait is considered as the basis of the quota.

· Judgmental Sampling - In this type of sampling, subjects are chosen to be part of the sample with a specific purpose in mind.

· Snowball Sampling - In this type of sampling, the researcher asks the initial subject to name further subjects who have the trait of interest.
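Of the non-probability techniques above, quota sampling is the easiest to show in code. The sketch below is a minimal illustration under stated assumptions: the function name `quota_sample`, the list of arrivals, and the quotas are all invented. Subjects are accepted in the order they happen to arrive until each quota is full, so no selection probability can be stated for anyone.

```python
def quota_sample(arrivals, trait, quotas):
    """Accept subjects in arrival order until every quota is filled.

    arrivals: iterable of dicts describing subjects as they turn up
    trait:    the key used as the basis of the quota (e.g. "gender")
    quotas:   how many subjects are wanted for each value of the trait
    """
    remaining = dict(quotas)
    chosen = []
    for subject in arrivals:
        group = subject[trait]
        if remaining.get(group, 0) > 0:
            chosen.append(subject)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is full
    return chosen

# Hypothetical subjects, in the order they walk past the interviewer.
arrivals = [
    {"name": "A", "gender": "F"},
    {"name": "B", "gender": "F"},
    {"name": "C", "gender": "F"},
    {"name": "D", "gender": "M"},
    {"name": "E", "gender": "M"},
]
picked = quota_sample(arrivals, "gender", {"F": 2, "M": 1})
print([s["name"] for s in picked])
```

Subject C is skipped because the quota for that group is already full, and E is never considered because all quotas are met once D is accepted.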

When to Use Non-Probability Sampling?

▪ This type of sampling can be used when demonstrating that a particular trait exists in the population.

▪ It can also be used when the researcher aims to do a qualitative, pilot or exploratory study.

▪ It can be used when randomization is impossible, such as when the population is almost limitless.

▪ It can be used when the research does not aim to generate results that will be used to create generalizations pertaining to the entire population.

▪ It is also useful when the researcher has a limited budget, time and workforce.

▪ This technique can also be used in an initial study which will be carried out again using randomized probability sampling.

Sampling techniques: Advantages and disadvantages

Technique	Description	Advantages	Disadvantages
Simple random	Random sample from whole population	Highly representative if all subjects participate; the ideal	Not possible without complete list of population members; potentially uneconomical to achieve; can be disruptive to isolate members from a group; time-scale may be too long, data/sample could change
Stratified random	Random sample from identifiable groups (strata), subgroups, etc.	Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata list	More complex, requires greater effort than simple random; strata must be carefully defined
Cluster	Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units	Possible to select randomly when no single list of population members exists, but local lists do; collecting data on intact groups may avoid the confounding introduced by isolating members	Clusters at a given level must be equivalent, and some natural clusters are not equivalent in essential characteristics (e.g., geographic areas with equal populations but different unemployment rates)
Stage	Combination of cluster sampling (randomly selecting clusters) and random or stratified random sampling of individuals	Can build a probability sample by random selection at each stage and within groups; possible to select a random sample when population lists are very localized	Complex; combines the limitations of cluster and stratified random sampling
Purposive	Hand-pick subjects on the basis of specific characteristics	Ensures balance of group sizes when multiple groups are to be selected	Samples are not easily defensible as being representative of populations due to potential subjectivity of researcher
Quota	Select individuals as they come to fill a quota by characteristics proportional to populations	Ensures selection of adequate numbers of subjects with appropriate characteristics	Not possible to prove that the sample is representative of designated population
Snowball	Subjects with desired traits or characteristics give names of further appropriate subjects	Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals)	No way of knowing whether the sample is representative of the population
Volunteer, accidental, convenience	Either asking for volunteers, or the consequence of not all those selected finally participating, or a set of subjects who just happen to be available	Inexpensive way of ensuring sufficient numbers for a study	Can be highly unrepresentative
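The "Stage" row of the table combines cluster selection with random selection of individuals. A minimal two-stage sketch follows; the function name `two_stage_sample` and the schools and students are hypothetical and only serve the illustration.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: pick clusters at random.
    Stage 2: pick individuals at random inside each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(3)

# Hypothetical local lists: 8 schools, each with its own list of 30 students.
schools = {f"school_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(8)}

sample = two_stage_sample(schools, n_clusters=3, n_per_cluster=5, rng=rng)
for school, students in sample.items():
    print(school, students)
```

Only the local lists of the three chosen schools are needed, which is the advantage the table gives for this technique when no single list of the whole population exists.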

Wednesday, 9 October 2013
