How to Build a Survey

How to build a survey, avoid the pitfalls, make sure you don’t miss often-overlooked steps, and get it into field like the professionals.

If you just want to know which buttons to press, watch the video or sign up for a trial; there are plenty of tutorials to go through.

Otherwise, read on for the real substance of what to consider when you build a survey.

This is part of a series of articles sharing research methods learned from thousands of commercial and academic studies at SurveyEngine GmbH.

Preparation

Success depends upon previous preparation, and without such preparation there is sure to be failure.

Confucius

It will save you a lot of time later if you get your survey organised before starting with survey coding.

In your survey you will be collecting data. That data will then go on its own journey as it’s cleaned, filtered, organised, shared with colleagues and analysed. Preparing for those stages means thinking now.

Tip 1 – Use Short Variable Names

Many common stats packages are stuck with legacy variable naming conventions or maximum variable lengths.

The table on the right shows how some innocuous variants of the variable gender are accepted by the most popular analysis tools. Many also have a modest maximum size, with SPSS and STATA having a 32-character limit.

Comparison of variable name compatibility with statistics packages.

It’s also possible that these names end up in places you hadn’t considered, such as files or folders. If you haven’t thought about the naming now, you may end up with inconsistencies and therefore a higher chance of mistakes.

You should also consider that your variable names may need to have modifiers appended, such as “male”, “female”, etc.

For maximum compatibility with software that you may use, be as conservative as possible with the naming. You can always have an auxiliary dictionary file to explain the names. A conservative rule would be to allow only names that

  1. are all lowercase, no spaces
  2. use the ASCII character set (abc etc)
  3. start with an alphabetic character (a-z)
  4. use only numbers and underscores as special characters (e.g. q1_abc)
  5. are less than 16 characters long
  6. shorter if you consider appending further descriptives e.g. s1_age, s2_gender
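The rules above are easy to check automatically before any data is collected. The following is a minimal sketch, not a SurveyEngine feature: the function name `is_conservative_name` and the constant `MAX_LENGTH` are hypothetical, and the limit of 15 is simply one reading of "less than 16 characters".

```python
import re

# Hypothetical helper that encodes the conservative naming rules listed above.
MAX_LENGTH = 15  # "less than 16 characters long"

# Lowercase ASCII letter first, then lowercase letters, digits or underscores.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_conservative_name(name: str, max_length: int = MAX_LENGTH) -> bool:
    """Return True if the name satisfies the conservative rules."""
    return len(name) <= max_length and _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["s2_gender", "Gender", "gender (male)", "1_gender", "q1_abc"]:
        print(candidate, is_conservative_name(candidate))
```

Lowering `max_length` leaves room for modifiers that may be appended later.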

Tip 2 – Use a Variable Naming Convention

This will save you lots of time later and avoid potential mistakes. Build a naming system that will allow easy recall of type, order and location of that variable.

Below is an example of a rule for a typical survey:

  • screening questions should start with an ‘s’ and increase numerically from 1, e.g. s1, s2 etc.
  • demographic questions should start with a ‘d’ and increase numerically from 1, e.g. d1, d2 etc.
  • other general questions should start with a ‘q’ and increase numerically from 1, e.g. q1, q2, q3 etc.
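A convention like this can be applied mechanically. The sketch below assumes the prefixes shown in the examples above (s1, d1, q1); the section labels and the function `assign_names` are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Assumed mapping from question section to prefix, following the examples above.
PREFIXES = {"screening": "s", "demographic": "d", "general": "q"}


def assign_names(sections):
    """Give each question a name such as s1, d1 or q1, in order of appearance."""
    counters = defaultdict(int)
    names = []
    for section in sections:
        prefix = PREFIXES[section]
        counters[prefix] += 1  # numbering starts at 1 within each prefix
        names.append(f"{prefix}{counters[prefix]}")
    return names


if __name__ == "__main__":
    order = ["screening", "screening", "demographic", "general", "general"]
    print(assign_names(order))  # ['s1', 's2', 'd1', 'q1', 'q2']
```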

Tip 3 – Name Surveys Properly

Good naming also applies to project names. Even though you may be building only one survey, there will be backups, drafts and simulations. At some point the survey will be shared with others (respondents, reviewers and panel providers) by way of a survey link. As such, it is important that this link doesn’t change too much, as each time it does you may have to recontact everyone you have distributed it to.

At SurveyEngine we use the following project naming convention when we build surveys

<project name>_live – this is the project that will be collecting data – its name never changes
<project name>_bak<n> – a checkpointed backup of project at successive point ‘n’
<project name>_sim<n> – checkpointed webbot simulations of the project
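A naming scheme of this kind is simple to validate in a script. The following sketch parses names of the form shown above; the function `parse_project_name` and the example project name are hypothetical and not part of any SurveyEngine API.

```python
import re

# Matches <project name>_live, <project name>_bak<n> and <project name>_sim<n>.
_PROJECT_PATTERN = re.compile(r"(?P<project>.+)_(?P<kind>live|bak|sim)(?P<n>\d*)")


def parse_project_name(name: str):
    """Return (project, kind, n) for a valid name, or None otherwise.

    n is None for the live project and an integer for backups and simulations.
    """
    match = _PROJECT_PATTERN.fullmatch(name)
    if match is None:
        return None
    kind, n = match["kind"], match["n"]
    if kind == "live":
        # The live project carries no checkpoint number.
        return (match["project"], kind, None) if n == "" else None
    # Backups and simulations must carry a checkpoint number.
    return (match["project"], kind, int(n)) if n != "" else None


if __name__ == "__main__":
    for name in ["pricing_study_live", "pricing_study_bak3", "pricing_study"]:
        print(name, parse_project_name(name))
```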

Tip 4 – Agree Which Direction Is Up

If you have two variables, say ‘satisfaction’ and ‘price’, it’s helpful to make sure that a rating of ‘1’, for example, is the lowest for both or, alternatively, the highest for both. Mixing the order of the meanings will not only confuse your analyst but likely your respondent as well.
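If a scale has already been collected in the opposite direction, it can be flipped before analysis. This is a minimal sketch assuming an integer rating scale with known end points; the function name `reverse_code` and the default 1 to 5 range are illustrative assumptions.

```python
def reverse_code(value: int, low: int = 1, high: int = 5) -> int:
    """Flip a rating so that the lowest score becomes the highest and vice versa."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the scale {low}..{high}")
    return low + high - value


if __name__ == "__main__":
    print([reverse_code(v) for v in [1, 2, 3, 4, 5]])  # [5, 4, 3, 2, 1]
```

Agreeing the direction up front avoids the need for this step altogether.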

Building Screening Questions

Most surveys will have questions designed to only allow relevant respondents to participate in the study. These are called screener questions.

The respondent is asked, for example, their age, and if they are under 18 they are excluded.

Screening plays an important function in many aspects of a survey. In surveys where there is an incentive for completion, paying for irrelevant respondents means there is less budget for the respondents you want.

When we build more sophisticated surveys involving fine stratification, quota and sampling schemes, live screening of respondents is critical and post-collection filtering is not an option.

Tip 1 – Avoid Leading Questions

If you plan to screen respondents according to a self-reported question, it is best not to expose the purpose of the question. So, if you only wanted females from California aged over 50, asking separate, neutral questions about gender, state and age is better than asking a single question that reveals who you are looking for.
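The example of females from California aged over 50 can be sketched as screening logic that runs behind neutral questions, so that the respondent never sees the criteria. The function name `passes_screener` and the answer codes are assumptions made for illustration.

```python
def passes_screener(gender: str, state: str, age: int) -> bool:
    """Apply the hidden criteria to answers from three neutral questions."""
    return gender == "female" and state == "California" and age > 50


if __name__ == "__main__":
    print(passes_screener("female", "California", 62))
    print(passes_screener("female", "Texas", 62))
```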

Tip 2 – Screen at the Start

Put your screener question at the beginning of your survey. Otherwise you are paying respondents that you were not looking for.

Many panel providers will have different rates for when respondents are screened out. Putting the screener in the first few pages means you can claim a discount for ‘early screen-outs’.

Tip 3 – Add Tracking Even if You Don’t Think You Need It

It’s a good idea to at least collect tracking and device data when you build a survey, even if you don’t think you’ll need it. Do this during your pilot and evaluate whether you need to use it to screen out later.

Additional tracking may include:

  • Dropping a cookie – to detect multiple attempts from same respondent
  • Detecting device size – to see if there are any layout biases
  • Logging browser type
  • Mouse movement tracking – useful later if you want to understand engagement
  • Captchas – initially to log if you suspect bots may be completing the survey
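Once tracking data has been collected in a pilot, checks on it can be very simple. The sketch below flags cookie identifiers that appear on more than one response; the field name `cookie_id` and the record layout are hypothetical.

```python
from collections import Counter


def repeated_cookie_ids(records):
    """Return the cookie ids that appear on more than one response, sorted."""
    counts = Counter(record["cookie_id"] for record in records)
    return sorted(cookie_id for cookie_id, n in counts.items() if n > 1)


if __name__ == "__main__":
    pilot = [
        {"respondent": 1, "cookie_id": "a1"},
        {"respondent": 2, "cookie_id": "b2"},
        {"respondent": 3, "cookie_id": "a1"},
    ]
    print(repeated_cookie_ids(pilot))  # ['a1']
```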


Survey Coding

When building more complex surveys you should consider the auditing aspects of the survey. If something unexpected happens, how quickly can you identify and fix it?

Tip 1 – Store Your Intermediate Values

This applies if you are using any sort of dynamic calculations, for example adjusting pricing questions based on the respondent’s income.

Instead of doing the calculation on the fly and displaying it to the respondent, store the result of the calculation in a temporary variable, then show that variable to the respondent.

What problem does this solve? Well, say partway through a study you get the comment:

“I didn’t choose any of the products because they were too expensive”.

Perhaps there was a coding error, you think. Auditing this and recovering what was actually shown is much easier if it’s stored in the respondent’s data. Displaying it on the fly is dangerous because you are relying on implicit data, the calculation, which may have been corrected in the meantime. Resolving what was really displayed will be difficult. Conversely, having the calculated value in your data will make your analyst’s job much easier if they want to use it.
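The idea can be illustrated with a short sketch. The pricing rule, the variable names `d3_income` and `q5_price_shown`, and the function `shown_price` are all hypothetical; the point is only that the value is written to the respondent’s record first and the question text is built from the stored value.

```python
def shown_price(base_price: float, income: float) -> float:
    """Hypothetical pricing rule: higher-income respondents see a higher price."""
    factor = 1.5 if income >= 100_000 else 1.0
    return round(base_price * factor, 2)


# The respondent's record, as it would appear in the collected data.
respondent = {"d3_income": 120_000}

# Store the calculated value first...
respondent["q5_price_shown"] = shown_price(20.0, respondent["d3_income"])

# ...then build the question from the stored value, not from a fresh calculation.
question_text = f"Would you buy this product at ${respondent['q5_price_shown']:.2f}?"
print(question_text)
```

If the pricing rule is later corrected, the stored value still records what this respondent actually saw.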


Tip 2 – Keep the Respondent in the Survey

Sometimes it will be necessary to provide additional information to the respondent as web links, such as privacy policies or university websites.

Provide these either:

  • as text only, such that the respondent would need to manually copy the link into another browser window, or
  • as a popup link, such that the page appears in a different browser window. This may require some technical expertise, in which case the text-only option is the preferred method.

Survey Management

Sounds boring, until you have problems that only going back in time can solve.

Tip 1 – Use Versioning

Each time you publish a survey project file in SurveyEngine, all the changes you made since the last publish will be recorded as a single new version. Versions start at one (1) and increase each time you publish. Most professional systems work in a similar way.

A version publishing scheme has lots of benefits other than auditing changes to the survey. It also allows updates to a survey to happen in parallel with live data collection. Any links you have used will still continue to work. New respondents will automatically be routed to the most recent version of the survey and those in the middle of a previous version will continue as though nothing has changed.

It is good practice to republish the survey once a set of work has been completed that is relevant to the recipients.

The following schedule is typical of a project:

  • publish version 1: send link to my colleagues for initial feedback
  • publish version 2: completed content without graphics
  • publish version 3: final pre-live version for panel to test links
  • publish version 4: clear test data and ready for live
  • publish version 5: correction of minor textual mistakes mid survey

In this schedule, versions 4 and 5 contain the live data for analysis.

Versions 1, 2 and 3 retain the data from previous versions. When publishing version 4, data will be cleared. Whether you retain the data from version 4 for version 5 depends on the project. If you forget to clear your data, the versions will be differentiated in the data spreadsheets.
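Because each response carries its version, test data that was not cleared can be filtered out afterwards. The sketch below assumes each exported row has a `version` field; that column name, the function `rows_from_versions` and the sample rows are hypothetical, and which versions count as live is supplied by the caller.

```python
def rows_from_versions(rows, versions):
    """Keep only the rows collected under the given survey versions."""
    wanted = set(versions)
    return [row for row in rows if row["version"] in wanted]


if __name__ == "__main__":
    exported = [
        {"respondent": 1, "version": 3},  # panel link test
        {"respondent": 2, "version": 4},
        {"respondent": 3, "version": 5},
    ]
    # Here: the versions published once the test data had been cleared.
    print(rows_from_versions(exported, [4, 5]))
```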

Tip 2 – Don’t Run Content Revisions in Parallel

While it’s possible for more than one editor to work on the same project, it is discouraged as bad management practice. Instead, the roles of editing and reviewing should be separated and should not be run in parallel.

It is also a good idea to break down reviewing into two phases: the first allows changes, the second only verification.

A typical good practice workflow would be:

  1. Initial create/edit – the editor roughs in the survey
  2. Initial review and changes – a reviewer checks the survey from the preview link and makes notes on suggested changes
  3. Second round of edits – the editor makes the changes and keeps track of the changes that were implemented
  4. Second review (verify only) – the reviewer checks that the changes have been made correctly but does not add new changes, and either signs off or returns the changes not adequately made in step 3
  5. Final round of edits – the editor corrects any returned changes and hands the survey back to the reviewer

Stages 4 and 5 may iterate, of course, but this process guarantees that the editing and reviewing process will actually terminate. From a management perspective it is critical to set the reviewer’s expectations of how many rounds they will have in which they can add, delete or change content. In the schedule above the reviewer has only one round of changes.


Ensuring Quality of your Survey Build

These tips come from best-practice experience in clinical trial research, where data quality and survey quality are paramount.

Tip 1 – Use Simulations

Simulating respondents before going to field is an excellent way of identifying problems with logic and flow, and it provides you with a data set to review.

BrowserStack and Selenium are tools that can achieve the same as SurveyEngine’s built-in web-bot simulator. Ideally the simulator should behave as an external respondent, clicking buttons and entering data.

Our summarised process for running simulations is:

  1. Make a copy of the survey for simulation purposes
  2. Implement sample segments and publish the survey to activate the segments
  3. Remove any ‘difficult’ branching conditions that require real human knowledge to enable simulation
  4. Run simulations on a minimum of 100 respondents
  5. Run through the entire survey twice as a respondent with no errors
  6. Check simulation data matches expected proportions of treatment allocation, segments, branching, derived values and any other instrument specific data elements.
  7. Internal Signoff that the simulation passed internal test criteria
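The proportion check in step 6 can be partly automated. This is a minimal sketch under stated assumptions: the function `allocation_within_tolerance`, the arm labels and the tolerance of 0.1 are illustrative choices, not internal test criteria from SurveyEngine.

```python
from collections import Counter


def allocation_within_tolerance(assignments, expected, tolerance=0.1):
    """Check that each arm's observed share is within tolerance of its expected share."""
    total = len(assignments)
    if total == 0:
        return False  # nothing simulated, so nothing can be confirmed
    counts = Counter(assignments)
    return all(
        abs(counts[arm] / total - share) <= tolerance
        for arm, share in expected.items()
    )


if __name__ == "__main__":
    simulated = ["a"] * 52 + ["b"] * 48  # 100 simulated respondents
    print(allocation_within_tolerance(simulated, {"a": 0.5, "b": 0.5}))
```

The same pattern applies to segments and branching: count what the simulation produced and compare it with what the design says it should produce.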

Before data collection, the simulated data and the survey need to be signed off in writing by the principal investigator, whose responsibility it is to ensure that the instrument is collecting data in the way it is intended.

A thorough check and signoff is mandatory to save you from delay and costs.

Tip 2 – Take a Rest and Check a 3rd Time

Often you will have looked through the survey what feels like 50 times as an editor. Sometimes you will be your own reviewer, in which case it feels like 100 times. Nevertheless, it is important that the final check be done thoroughly, seriously and from the perspective of the eventual respondent.

  • Take a break, make a cup of tea
  • Republish the survey and sit down one last time with fresh eyes
  • Set yourself up as the respondent would be. If they are receiving the link in an email, for example, read the email and click the link from that email. If you are providing the link to a panel, use that link as panel respondents would.
  • Use the proper forward and backward buttons – don’t use the convenience of the preview menu
  • If there are multiple branches respondents may go down depending on their responses – check all of them
  • If there are complex functions, record your entries manually and then check them by eye against the final data
  • Use the Project Management checklists
  • Don’t skip steps
