Registry Questionnaire Design and Data Standardization

When developing a registry questionnaire, it is useful to incorporate standardized data collection instruments where possible. Standardization will minimize some of the work of data collection, will increase the likelihood that data collected today will be compatible with data collected a few years from now, and will facilitate pooling of data between related registries and studies.

One way of standardizing data collection is to use scientifically validated survey instruments, such as the SF-36. Another approach is to use questions that others have already used on a large scale. The following resources offer potential questions and answer choices:

  • NHANES (National Health and Nutrition Examination Survey)
  • PROMIS (Patient-Reported Outcomes Measurement Information System)
  • PhenX (consensus measures for phenotypes and exposures)
  • dbGaP (database of Genotypes and Phenotypes)
  • PRISM (Patient Registry Item Specifications and Metadata for Rare Diseases)

You will also want to identify any questionnaires currently being used to collect information on your condition of interest. Your medical and scientific advisors can help you identify existing data collection instruments. You will also need to design your own questions to capture specific information required for your registry.

Developing Your Registry Questionnaire

Detailed work is needed to build your survey instrument or data collection form. This planning phase, sometimes called pre-technical planning, is critical to developing your registry. Below are some things to consider when developing your registry questionnaire.

  • What information will you collect? Each registry is different, but many questionnaires collect medical information, participant demographics, lifestyle information, family history, genetic information, diagnosis/treatment information, and quality of life metrics.
  • Where will the information be stored? Every question and answer must have a designated place, as you can’t scribble in the margins of a web form.
  • What questions will you ask?
  • What data type will you select for each answer? Specify answer choices and be explicit. If the answer is numeric, units will need to be specified. If text is required, there must be a text box to enter free text.
  • What format will you choose for each answer? Checkboxes are most appropriate when there can be more than one choice (e.g. race), while dropdowns are preferred when there can be only one choice (e.g. marital status). Radio buttons can be used instead of dropdown menus, but they often take up more space. Free text should be used sparingly, but it is often necessary to supplement specific answers.
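
The considerations above can be written down as a small question specification before any form is built. The following Python sketch is only an illustration: the Question class, its field names, and the example questions and answer choices are assumptions made for this example, not part of any registry platform or standard instrument. It gives every question a storage key, requires units for numeric answers, and picks checkboxes or a dropdown depending on whether more than one choice is allowed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Question:
    """Hypothetical specification for one questionnaire item."""
    key: str                      # where the answer is stored
    prompt: str                   # the question as shown to the respondent
    data_type: str                # "choice", "number", or "text"
    choices: List[str] = field(default_factory=list)
    multiple: bool = False        # True when more than one choice is allowed
    unit: Optional[str] = None    # required for numeric answers

    def __post_init__(self):
        if self.data_type not in ("choice", "number", "text"):
            raise ValueError(f"{self.key}: unknown data type {self.data_type!r}")
        if self.data_type == "number" and not self.unit:
            raise ValueError(f"{self.key}: numeric answers need a unit")
        if self.data_type == "choice" and not self.choices:
            raise ValueError(f"{self.key}: choice questions need answer choices")

    def widget(self) -> str:
        """Pick an input format following the guidance in the list above."""
        if self.data_type == "choice":
            return "checkboxes" if self.multiple else "dropdown"
        if self.data_type == "number":
            return "number box"
        return "text box"

    def validate(self, answer) -> bool:
        """Check that an answer fits the declared data type."""
        if self.data_type == "choice":
            if self.multiple:
                return (isinstance(answer, list) and len(answer) > 0
                        and all(a in self.choices for a in answer))
            return answer in self.choices
        if self.data_type == "number":
            return isinstance(answer, (int, float)) and not isinstance(answer, bool)
        return isinstance(answer, str)


# Example questions; the wording and answer choices are placeholders.
race = Question("race", "What is your race? (check all that apply)", "choice",
                choices=["Option A", "Option B", "Option C"], multiple=True)
marital = Question("marital_status", "What is your marital status?", "choice",
                   choices=["Single", "Married", "Other"])
weight = Question("weight", "What is your weight?", "number", unit="kg")

for q in (race, marital, weight):
    print(q.key, "->", q.widget())
```

Writing the specification this way forces the design decisions (storage location, data type, units, format) to be made explicitly for every question before the form is implemented.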

Developing your registry questionnaire takes thoughtful planning, and there are significant benefits to having a well-designed questionnaire.

Participant- vs. Provider-Entered Data

With any registry, it is important to determine who will enter data. This has become a hot topic, and opinions on it are strong. Generally, data are entered by the participant, the provider, or the provider’s staff. The choice is influenced by the type of information collected and by the resources available.

Research has shown that participants can accurately enter certain information, although complex information may be better obtained from the provider or the medical record. For example, participants often know if they have high cholesterol, high blood sugar, or hypertension, but they may not know the specific lab values. There are also many instruments that have been designed specifically to collect self-reported data (e.g. SF-36 Health Survey).

Resources may influence who enters data, as providers usually need to be compensated for data entry. It is also possible to have providers verify specific information (e.g. diagnosis) or to have participants release their medical records to the registry.

Whether you choose participant-entered or provider-entered data, the questionnaire must be designed for the person entering the data. For example, if the questionnaire is participant-entered, it is helpful to define all medical terms. There is a place for both participant-entered and provider-entered data, and many registries use a combination of the two.
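
A registry that combines both kinds of data entry may find it useful to record the source of every answer. The Python sketch below is one possible way to do that, under stated assumptions: the Entry class, the source labels, the field names, and the rule that a participant-entered diagnosis goes to a provider for verification are all hypothetical choices for illustration, and the values are made up.

```python
from dataclasses import dataclass

# Hypothetical source labels for who supplied an answer.
ALLOWED_SOURCES = {"participant", "provider", "medical_record"}

# Hypothetical policy: fields a provider should verify when self-reported.
VERIFY_IF_SELF_REPORTED = {"diagnosis"}


@dataclass
class Entry:
    """One stored answer, tagged with who entered it."""
    field_name: str
    value: str
    source: str
    provider_verified: bool = False

    def __post_init__(self):
        if self.source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")


def needs_provider_review(entry: Entry) -> bool:
    """True when a participant-entered field still awaits verification."""
    return (entry.source == "participant"
            and entry.field_name in VERIFY_IF_SELF_REPORTED
            and not entry.provider_verified)


# Made-up example entries.
entries = [
    Entry("hypertension", "yes", "participant"),
    Entry("diagnosis", "example condition", "participant"),
    Entry("ldl_cholesterol_mg_dl", "131", "medical_record"),
]

review_queue = [e.field_name for e in entries if needs_provider_review(e)]
print(review_queue)
```

Keeping the source alongside each value lets the registry accept simple self-reported information directly while routing selected fields to a provider for confirmation.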

