Survey Research Guidance

Surveys can be important tools for studying human behavior. To produce reliable and valid data, surveys must be carefully designed and rigorously pre-tested. This guidance describes best practices for survey research.

The Graduate School requires all students conducting survey research to complete this Survey Development Checklist [DOCX] and submit it with their IRB application in Cayuse IRB.

The Survey Development Checklist encompasses the practices described in this guidance.

Research Question/Hypothesis Generation

All research studies should be designed to generate data that, when analyzed, will either answer a question or support or refute a hypothesis developed from the existing literature. Survey research is no different. A clearly articulated research question and/or hypothesis is needed so that the variables to be assessed can be clearly defined. Variables that are not clearly defined cannot be measured.

Once the research question and/or hypothesis is generated and the variables under study clearly defined, the researcher can go about deciding how to measure the variables and what kind of data will be collected.

Data Collection and Analysis

Before beginning any research study, it is important to understand the type of data that will be collected and to have a plan for how to analyze that data. With survey research, many different types of data can be collected. Data can be qualitative or quantitative. Quantitative data can be nominal/categorical, ordinal, interval or ratio. Each different type of data will require different data analysis techniques and statistical tests. Using data analysis techniques that are not consistent with the type of data collected will result in incorrect findings. Knowingly reporting incorrect findings is research misconduct.
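
As a rough illustration of matching the analysis to the type of data, the sketch below picks a measure of central tendency according to the measurement level. The function name, the level labels, and the sample values are hypothetical and are not part of the Survey Development Checklist.

```python
from statistics import mean, median, mode


def central_tendency(values, level):
    """Return a summary statistic suited to the measurement level.

    Assumption: ordinal responses are already coded as ordered integers.
    """
    if level == "nominal":
        return mode(values)    # most frequent category; categories have no order
    if level == "ordinal":
        return median(values)  # middle rank; spacing between codes is not assumed equal
    if level in ("interval", "ratio"):
        return mean(values)    # arithmetic mean requires equally spaced units
    raise ValueError(f"unknown measurement level: {level}")


# Hypothetical responses, for illustration only.
majors = ["biology", "history", "biology", "nursing"]  # nominal
agreement = [1, 2, 2, 4, 5]  # ordinal: 1 = strongly disagree ... 5 = strongly agree
ages = [22, 25, 31, 40]      # ratio

print(central_tendency(majors, "nominal"))
print(central_tendency(agreement, "ordinal"))
print(central_tendency(ages, "ratio"))
```

The same reasoning extends to statistical tests: the measurement level of each variable limits which tests are appropriate.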

In addition, it is important to determine how much data needs to be collected, that is, how many people must be enrolled in the survey study to collect sufficient data for analysis (i.e., sufficient power). For quantitative data, this often involves a power analysis, which uses the expected effect size, the significance level, and a conventional target for statistical power (commonly 0.80) to determine how many subjects are needed. For qualitative data, sample size is typically determined by saturation, the point at which additional responses no longer yield new information.
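
To make the power analysis concrete, here is a minimal sketch that estimates the sample size per group for comparing two group means. It uses the standard normal approximation, so dedicated power software based on the t distribution may return a slightly larger number. The function name and the default values (two-sided significance level of 0.05, power of 0.80) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided, two-group comparison.

    effect_size is the expected standardized mean difference (Cohen's d).
    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(n_per_group(0.5))  # medium expected effect
print(n_per_group(0.2))  # small expected effect requires many more subjects
```

Smaller expected effects require larger samples, which is why the expected effect size should be estimated from prior literature before enrollment begins.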

It is important to consider sample size when planning survey research, both for statistical validity and for the protection of human subjects. Enrolling too few subjects means that people were put at risk, however minimal, for no benefit, because the study cannot produce viable findings. Enrolling too many subjects puts more people at risk than necessary. In addition, some statistical tests require a minimum sample size to produce valid results. All of this should be determined before data collection begins.

Validity and Reliability


Validity refers to whether the survey or assessment actually measures what it is designed to measure. Surveys must be valid in order to produce interpretable results. There are many different types of validity.

Content Validity is a qualitative assessment of the appropriateness of each survey item given the variables to be measured by the survey. Content validity is assessed by having people with knowledge of the content material and variables review each question to make sure that it belongs in the survey and is relevant to the variables being measured.

Criterion Validity is more stringent than content validity. Criterion validity uses an already-validated, standard measure of the variables under study as a comparison for the survey under design. The survey under design, if valid, should show a high correlation with the criterion measure.
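
The correlation with a criterion measure can be sketched as follows. The example computes a Pearson correlation coefficient between total scores on the survey under design and scores on an established measure. The function name and all scores are hypothetical, and Pearson's r is only one possible choice: it assumes interval-level scores, and a rank-based coefficient may suit ordinal data better.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


# Hypothetical total scores from the same six pilot respondents.
new_survey = [12, 15, 9, 20, 17, 11]  # survey under design
criterion = [30, 36, 25, 45, 41, 28]  # already-validated measure

print(round(pearson_r(new_survey, criterion), 3))
```

A coefficient close to 1 would be consistent with criterion validity. What counts as "high" depends on the field and should be decided before the data are examined.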

Construct Validity assesses the practical utility of the survey. That is, do many different measures of the same variable produce similar results, and do different measures of similar but distinct variables show little to no relation?

There are many different ways to measure validity; however, all surveys should be validated before use in research. Validation can occur as pilot testing in a small sample for the purposes of survey development. Human subject review is typically not needed for validation, provided that the data used to validate the survey will not be combined with the data collected when the survey is deployed.


Reliability refers to the stability or repeatability of a survey or assessment, ensuring that the survey will obtain the same results for the same people under the same circumstances. There are three types of reliability.

Test-retest Reliability examines the correlation in responses to the same survey given to the same people at different time points. If the survey is reliable, the measurement should be very similar at both time points.

Alternate-form Reliability uses two differently worded versions of the same survey, with either question wording or question order slightly changed, to determine the reliability of the survey. Both versions of the survey should produce highly similar results.

Internal Consistency Reliability measures how all survey or scale items, when grouped together, measure the same variable. Internal consistency reflects how well items complement each other in measuring the same variable.
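
One widely used index of internal consistency is Cronbach's alpha. This guidance does not prescribe a particular statistic, so the sketch below is an illustration under that assumption; the function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    items = list(zip(*responses))  # regroup scores by item instead of by respondent
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical pilot data: 5 respondents answering 3 items on a 1-5 scale.
pilot = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

print(round(cronbach_alpha(pilot), 2))
```

Values closer to 1 indicate that the items vary together. The statistic is meaningful only when all items are intended to measure the same variable and are scored in the same direction.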

Different types of reliability may be appropriate for different surveys. For example, internal consistency reliability may not be the best measure of reliability for a demographic information survey because each item in a demographic survey measures a different construct or variable (e.g., a question about gender would not measure the same underlying variable as a question about income or race). It is important to consider the type of information in your survey when choosing a measure of reliability. An inappropriate measure of reliability will not tell you anything meaningful about the survey.

Survey Question Responses

Responses to survey questions must be designed to provide the type of data needed for the data analysis plan that has already been determined. Different statistical tests require different types of data. If the response choices to the survey questions do not provide the type of data required by a particular statistical test or data analysis plan, then the survey is not an accurate measurement tool for the variables under study.

Survey Review by Another Party

It is considered best practice to have a few people, ideally people who are not on your research team, review your survey for clarity. This is different from a review for content validity. The purpose of this review is to make sure that the questions, as worded, are being interpreted as you intended. If your questions are misinterpreted by a research participant, then the responses to those questions will not be valid.

Pilot Testing

Pilot testing refers to deploying your survey to a very small sample of people for evaluation purposes and either surveying or interviewing this sample afterward about the wording, content and clarity of the survey. Pilot testing will reveal if questions are clearly worded, if the survey is easy or burdensome to complete, and if the questions seem relevant to a typical research participant.

Most important, pilot testing will reveal if your questions and response choices produce sufficient variability, which is important for data analysis. For example, if you have a question with four response choices, and all pilot subjects choose the same response, then the question may not be meaningful to include in the survey as is and may need to be changed or reworded. 
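
A simple check for response variability in pilot data might look like the sketch below. It flags questions where a single response choice accounts for at least a given share of the answers. The function name, the threshold parameter, and the pilot responses are hypothetical.

```python
from collections import Counter


def low_variability_items(pilot_responses, max_share=1.0):
    """Return question ids whose most common response has a share >= max_share.

    pilot_responses maps each question id to the list of responses chosen.
    With the default max_share of 1.0, only questions where every pilot
    subject chose the same response are flagged.
    """
    flagged = []
    for question, answers in pilot_responses.items():
        if not answers:
            continue  # unanswered questions cannot be evaluated
        top_count = Counter(answers).most_common(1)[0][1]
        if top_count / len(answers) >= max_share:
            flagged.append(question)
    return flagged


# Hypothetical pilot responses from five subjects.
pilot_responses = {
    "q1": ["A", "A", "A", "A", "A"],  # no variability
    "q2": ["A", "B", "C", "B", "D"],
    "q3": ["B", "B", "B", "B", "C"],  # one response dominates
}

print(low_variability_items(pilot_responses))
print(low_variability_items(pilot_responses, max_share=0.8))
```

Flagged questions are candidates for rewording or for revised response choices, not automatic removal. With a very small pilot sample, low variability can also occur by chance.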

Additional Resources

The Office of Graduate Studies and Research assists faculty and students with research design and statistics, including survey development. Please contact [email protected] for an appointment.
