Research Study Controls for Confounding Variables on the Dependent Variable


The ability to identify and control possible confounding variables is one of the critical tasks that researchers must tackle to produce authentic and reliable results. If a confounding variable is left uncontrolled and affects the outcomes of a study, no meaningful conclusions can be drawn from the exhausting work of designing and conducting the research. As a result, there has been an increase in the number of study methodologies dedicated to this task. It is therefore pertinent to define what a confounding variable is, so as to understand exactly what the researcher is dealing with.

Confounding Variables

In its simplest description, a confounding variable has time and again been regarded as an extraneous variable capable of confounding the research outcome: it is not the focal point of the research, yet it is correlated with the independent variable and can also affect the dependent variable. This means that the confounding variable changes together with the independent variable, and this can produce unintended results under certain conditions (Parodi & Bottarelli, 2005, p. 20). This is a big problem because the goal of a study design is an experiment in which differences in the dependent variable arise from changes in the independent variable (Parodi & Bottarelli, 2005, p. 20). That is what enables the researcher to conclude that manipulating the independent variable causes the change in the dependent variable. However, if an extra factor changes together with the independent variable, it could equally be the cause of the difference in the dependent variable. For authentic conclusions, the confounding variable has to be controlled, and there are several ways to achieve this (Parodi & Bottarelli, 2005, p. 22).

Random Assignment

Random assignment is regarded as a vital process for controlling confounding variables, because it is capable of controlling both known and unknown variables. Owing to this unique advantage, researchers should use the method whenever possible. Random assignment differs from random selection, especially in its objective (Little & Rubin, 2000, p. 122).

Demonstration of How Confounding Variables Are Eliminated

Random assignment begins with a group of participants who have been selected to take part in a study (Parodi & Bottarelli, 2005, p. 23). The researcher then applies a random assignment process that places the participants in arbitrary groups. The most common way of carrying out random assignment is to use a list of random numbers, such as the example below.

 1     2     3     4     5     6     7     8     9     10
8 4   7 2   0 2   1 7   2 1   0 1   7 2   1 6   0 6   5 1
2 1   2 8   9 7   0 3   4 1   4 7   1 3   0 7   3 4   0 9
3 2   8 9   1 7   3 2   1 7   0 7   2 0   0 6   2 1   1 3
1 2   7 9   6 3   3 2   1 8   5 6   0 3   7 5   0 4   6 7
1 3   0 0   4 7   0 2   0 9   7 4   0 3   1 1   9 4   7 5

This list holds 100 random digits under ten column headings. Each digit in each position is randomly placed, and every digit had an equal probability of selection. Placing one digit in one position did not in any way affect the digit in any other position. Because the individual digits are randomly positioned, any combination of digits is also random (Little & Rubin, 2000, p. 122). Suppose a study has 20 participants and the researcher wants to assign them randomly to four groups. First, each participant is allotted an identification number from 0 to 19. The random digits are then blocked into pairs so that each pair is read as a two-digit number. The next step is to pick five participants at random from the sample of 20 and put them in the first group. A second group is then randomly selected and assigned (Little & Rubin, 2000, p. 122), and the same is done for the third and fourth groups. The objective of this random selection is to ensure that the sample obtained is unbiased and representative of the population.

To randomly select the first participant, the researcher proceeds down the first column to find a number less than 20. For instance, from the above list that number is 12, and this becomes the first participant. The next number is 17, which becomes the second participant. The researcher continues with the procedure until all the participants have been selected (Little & Rubin, 2000, p. 123). Whenever a number that has already been selected is encountered, it is disregarded. If a participant's number does not appear in the table, another random table is constructed and the process is repeated. This process ensures that every participant has an equal chance of selection (Stommel & Willis, 2007, p. 107).

Once the participants for each of the four groups have been selected, the groups are in turn assigned to the experimental conditions by a random process. They are assigned numbers from 0 to 4. The researcher reads down the table to find a number less than 5. In this example the first such number is 02, which is assigned to group 1. The next number is 01, and it is allocated to group 2. The researcher continues the process for all four groups. This step ensures that the groups are probabilistically equal in every respect (Stommel & Willis, 2007, p. 107).
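The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used by the cited authors: it swaps the printed random number table for Python's built-in pseudo-random generator, and the function name random_assignment and the seed value are assumptions introduced here.

```python
import random


def random_assignment(participant_ids, n_groups, seed=None):
    """Shuffle the participants and deal them into n_groups equal-sized groups.

    Hypothetical helper: a seeded generator stands in for the printed
    random number table so that the draw can be reproduced.
    """
    ids = list(participant_ids)
    if len(ids) % n_groups != 0:
        raise ValueError("participants cannot be split into equal groups")
    rng = random.Random(seed)
    rng.shuffle(ids)  # every ordering of the participants is equally likely
    size = len(ids) // n_groups
    # Consecutive slices of the shuffled list form the groups.
    return [sorted(ids[i * size:(i + 1) * size]) for i in range(n_groups)]


# 20 participants with identification numbers 0 to 19, four groups of five.
groups = random_assignment(range(20), n_groups=4, seed=42)
for number, members in enumerate(groups, start=1):
    print(f"Group {number}: {members}")
```

Because the shuffle, not the researcher, decides where each participant goes, no participant characteristic can systematically steer people into one group.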

This process clearly shows that the researcher is relieved of the difficulty of deciding which participant goes to which group; instead, mathematical probability theory performs the assignment (Stommel & Willis, 2007, p. 108). In medical research on the effects of drugs, it makes it plausible that the effects observed were indeed due to the administration of the drug through IV. Even when another control technique is used, for instance analysis of variance, the outcome is improved if random assignment is also performed (Stommel & Willis, 2007, p. 108).

Researchers refer to random assignment as the secret to attaining a strong experimental design for any research. Random selection gives the research its external validity, and random assignment secures its internal validity, thereby authenticating the research methodology (Keele, 2010, p. 23). Considering that the major objective of research is to establish concrete grounds for cause and effect, random assignment is critical to achieving it.

The problem of differential influence is eliminated because the process gives each group an equal chance of formation, and the participants who enter the groups are not discriminated against on the basis of extraneous variables (Keele, 2010, p. 23). This means that a characteristic that one participant brings to one group is likely to be brought to another group by another participant as well. The distribution of characteristics is therefore almost equal across the groups.

Besides being able to control variables even when the researcher has no idea what the confounding variables are, random assignment has the advantage of statistically equating the groups of participants, so that any deviation can be quantified (Keele, 2010, p. 23). Inferential statistics help to compute the probability that the groups are equal. The whole process limits the bias that confounding variables could otherwise introduce into the results, because initial differences between the groups are evened out (Stommel & Willis, 2007, p. 110). As a result, extraneous variables should not be correlated with the independent variable.

Matching Confounders

Matching is very common in research, especially in epidemiological investigations. It consists of selecting unexposed subjects, that is, controls that resemble the cases in certain critical characteristics. Matching is habitually used in case-control studies, but it is also very useful in cohort studies, and it most often focuses on two extraneous factors, age and sex (Garay, 2004, p. 3). If data collection is likely to be a costly exercise, it is desirable to optimise the information collected, which can be achieved by matching the cases to be studied with the control group. Matching controls with cases is used extensively in epidemiological studies (Rothman et al., 2008, p. 174). For instance, studies of the prevalence of diseases in the community, such as cancers and cardiovascular diseases, use this method most of the time.

Matching is a relevant way of controlling confounding variables when it is anticipated that extraneous variables will produce a considerable difference in results between the cases and the controls (Garay, 2004, p. 6). The confounder is also regarded as the third variable, one that correlates with the two variables in a study: the independent and the dependent variable. The mere existence of a confounder causes bias because of the change it contributes, which makes the outcomes implausible. The researcher cannot confidently say that the observed results are attributable to the independent variable when the confounder was not eliminated.

A common misconception about matching is that its objective is to enhance the validity of the research. Its objective is instead to increase the efficiency of the research (Rothman et al., 2008, p. 174). Even so, efficiency usually improves only by a very small degree when matching is done, unless the confounder is very influential.

Matching deals with confounders at the design stage of the research rather than at the analysis stage. There are mainly two types of matching techniques: individual matching and frequency matching (Garay, 2004, p. 6).

Individual Matching

This process entails matching the controls to the cases on one or several factors, such as gender, age or smoking. Every case-control pair is set to identical values on the matching factors. The analysis is more complex than for unmatched data because the matched data still have to be stratified (Rothman et al., 2008, p. 176). Every matched set has its own defined characteristics, and the set can be treated as a single entity.

Frequency Matching

Here the matching is done on a cell (a category of the matching variables) rather than on individuals. Frequency matching on variables such as age and gender works as follows: if 15% of the cases are women aged 50-54 years, then 15% of the controls are drawn from the same category. This does not require a matched analysis, as the researcher simply picks a random sample from that category, but he or she has to wait until enough cases have accumulated before picking the controls (Rothman et al., 2008, p. 177).

Matching Procedure

The first step is usually to identify the extraneous variables that need to be matched; these are then known as the matching variables. The procedure matches cases to corresponding controls, thereby eliminating the impact of differential influence. It is usually possible to match groups on several extraneous variables (Garay, 2004, p. 6). For instance, a researcher may want to match two groups of a sample, one allocated to treatment and the other serving as control, on Intelligence Quotient (IQ).

IQ will be the matching variable in this case. The researcher ranks the participants by IQ from the highest to the lowest. Some are allocated to the experimental group and the others are placed in the control group. In most cases this allocation is done by a random process, using the random assignment already explained above (Rothman et al., 2008, p. 88). This is very effective because the two processes are merged for experimentation.

The subsequent step involves assigning one of the two individuals with the highest IQ to each of the two groups, the experimental and the control group. The researcher continues the process down the ranks until the individuals with the lowest IQ have been allocated to the two groups. Once this has been achieved, the two groups are said to be matched and ready for the next step of experimentation. If random assignment is not done, the groups may not be matched on several other variables even though IQ is matched properly. This is one major weakness of using matching alone without random assignment (Rothman et al., 2008, p. 88).
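The ranking and pairing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the participant labels and IQ scores are invented for the example, and the function name matched_assignment is hypothetical. Within each ranked pair a random draw decides who goes to which group, which combines matching with random assignment.

```python
import random


def matched_assignment(iq_by_participant, seed=None):
    """Rank participants by IQ, pair neighbours, and split each pair at random."""
    ranked = sorted(iq_by_participant, key=iq_by_participant.get, reverse=True)
    if len(ranked) % 2 != 0:
        raise ValueError("an even number of participants is needed to form pairs")
    rng = random.Random(seed)
    experimental, control = [], []
    for i in range(0, len(ranked), 2):
        pair = [ranked[i], ranked[i + 1]]  # two adjacent ranks form one matched pair
        rng.shuffle(pair)                  # chance decides who goes to which group
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control


# Invented scores, for illustration only.
iq_scores = {"P1": 128, "P2": 95, "P3": 110, "P4": 102,
             "P5": 121, "P6": 99, "P7": 115, "P8": 107}
experimental, control = matched_assignment(iq_scores, seed=1)
print("Experimental group:", experimental)
print("Control group:     ", control)
```

The two groups end up with very similar IQ distributions by construction, while the random split within each pair guards against imbalance on variables that were not matched.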

Demonstration of Matching

An example of data matching: individual matching

Unmatched table

            Cases  Controls  Total
Exposed       242       202    444
Unexposed      36        76    112
Total         278       278    556

Matched data in epidemiology: occurrence of a disease

These data concern the prevalence of Echovirus in Germany and a risk factor for occurrence of the virus.

                      Controls
Cases          Exposed  Unexposed  Total
Exposed            195         47    242
Unexposed            7         29     36
Total              202         76    278

The cells in which the case and the control have the same exposure status (195 and 29) are the concordant pairs, while the cells in which they differ (47 and 7) are the discordant pairs.
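A worked calculation helps to show why the two layouts of the same data matter. The essay itself does not compute a measure of association; the sketch below assumes the odds ratio as that measure, using the usual cross-product for the unmatched table and the ratio of discordant pairs for the matched table. The function names are hypothetical.

```python
def odds_ratio_unmatched(exposed_cases, exposed_controls,
                         unexposed_cases, unexposed_controls):
    """Cross-product odds ratio for an ordinary 2x2 table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


def odds_ratio_matched(case_only_exposed, control_only_exposed):
    """Matched-pair odds ratio: only the discordant pairs carry information."""
    return case_only_exposed / control_only_exposed


# Figures taken from the unmatched and matched tables above.
unmatched_or = odds_ratio_unmatched(242, 202, 36, 76)
matched_or = odds_ratio_matched(47, 7)
print(f"Unmatched odds ratio: {unmatched_or:.2f}")
print(f"Matched odds ratio:   {matched_or:.2f}")
```

Ignoring the pairing gives an odds ratio of about 2.53, whereas the pair-based calculation gives about 6.71, which illustrates why individually matched data call for a matched analysis.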

Frequency Matching

The controls are chosen from each category of the matching variable in proportion to the distribution of the cases under study. The confounder is therefore distributed in the same way among cases and controls.

Age (years) Cases Controls (matched)
0-14 10 10
15-29 15 15
30-44 35 35
>44 25 25
Total 85 85
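The selection of controls for the table above can be sketched as follows. Only the case counts per age band come from the table; the pool of candidate controls (50 labelled individuals per band) and the function name frequency_match are assumptions made for illustration.

```python
import random


def frequency_match(case_counts, control_pool, seed=None):
    """Draw from each stratum as many controls as there are cases in it."""
    rng = random.Random(seed)
    matched = {}
    for stratum, n_cases in case_counts.items():
        candidates = control_pool[stratum]
        if len(candidates) < n_cases:
            raise ValueError(f"not enough candidate controls in stratum {stratum!r}")
        # Sampling without replacement inside the stratum.
        matched[stratum] = rng.sample(candidates, n_cases)
    return matched


# Case counts per age band, taken from the table above.
case_counts = {"0-14": 10, "15-29": 15, "30-44": 35, ">44": 25}
# Hypothetical pool: 50 candidate controls in every age band.
control_pool = {band: [f"{band}/C{i}" for i in range(50)] for band in case_counts}

controls = frequency_match(case_counts, control_pool, seed=7)
for band, chosen in controls.items():
    print(f"{band}: {len(chosen)} controls")
```

The controls then have the same age distribution as the cases (10, 15, 35 and 25, 85 in total), without any control being tied to a particular case.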

Matched stratum 1

0-14 yrs Cases Controls
Exposed 6 1
Unexposed 4 9
Total 10 10

Matched stratum 2

15-29 yrs Cases Controls
Exposed 7 5
Unexposed 8 10
Total 15 15

Even though matching manages to eliminate the confounder effect, the matched sample is usually biased and not a good representation of the population, because the subjects are chosen by selection rather than at random.


It is important to eliminate the effects of confounding variables on the results because, if these variables affect the dependent variable, the validity of the results becomes questionable. The techniques for eliminating confounding effects should always be applied, because at times it is difficult to know which variables are confounders. If matching and random assignment were conducted at the outset, even an unidentified confounding variable would not influence the conclusions. If, on the other hand, such a variable does influence the dependent variable but the groups were equivalent on that potential confounder, its effect would be the same in every group and would therefore not distort the comparison among the groups.

Reference List

Garay, K.W. (2004). The Role of Matching in Epidemiological Studies. Am. J. Pharm. Edu., 3: 1-7.

Keele, R. (2010). Nursing Research and Evidence-Based Practice. Sudbury: Jones and Bartlett Publishers.

Little, R.J. & Rubin, D.B. (2000). Causal Effects in Clinical and Epidemiological Studies via Potential Outcomes: Concepts and Analytical Approaches. Annu. Rev. Public Health, 21: 121-45.

Parodi, S. & Bottarelli, E. (2005). Controlling for Confounding in Case-Control Studies. Ann. Fac. Med. Vet. Parma, 25: 19-46.

Rothman, K., Greenland, S. & Lash, T. (2008). Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins, pp. 174-78.

Stommel, M. & Willis, C. (2007). Clinical Research: Concepts and Principles for Advanced Practice Nurses. Philadelphia: Lippincott Williams & Wilkins, pp. 106-15.