Use of population databases with double sampling for choice-based sample surveys and time-to-event data (Project in the Priority Programme on Survey Methodology)

Description

Two-phase or double sampling methods were developed for studies with missing or imprecise covariate information. These methods are typically applied when information on the event and crude information on the exposure of interest are available for a large population (phase 1), while complete exposure information is available only for a stratified subsample (phase 2). The stratification is built from phase 1 information and plays an important role in the combined analysis of both data sources. The aim of the project was to develop empirically guided recommendations for the design of two-phase studies, with a focus on the efficient use of phase 1 information in large administrative databases.
A two-phase study investigating the effect of phenprocoumon on the risk of severe bleeding was conducted based on routine data of a statutory health insurance provider (phase 1, n=26,208) and data of a health survey (phase 2, n=498). The response analysis identified sex and age as predictors of participation in the health survey. In epidemiological studies, these factors are often also risk factors for the event of interest. Consequently, they must be taken into account in the stratification to ensure that the phase 2 data are representative within each stratum. Initial analyses showed that many stratifications are suitable for estimating the parameter of interest (the effect of exposure to phenprocoumon) efficiently and without bias. However, the effect of a covariate was estimated with sufficient precision only if that covariate was included in the stratification.
In total, 29 different stratifications were constructed, based either on a cross-classification of a few covariates or on a disease score that combined information on a multitude of covariates. Since the results of the different stratifications were not robust regarding bias and efficiency, a simulation study was conducted that used the same covariate distribution and the same phase 1 size, while the size of phase 2 was varied (n=500; 1,000; 2,000; 10,000). The simulation showed that stratifications based on percentiles of disease scores did not outperform stratifications based on a cross-classification of a few important risk factors for the outcome. Age was identified as an important factor in the epidemiological application: because comedications and comorbidities were strongly associated with age, a stratification based on age also carried information about the comedications and comorbidities recorded in phase 1.
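The role of the stratification in combining both phases can be illustrated with a minimal Python sketch. It is an assumption-laden toy example, not the analysis method used in the project: it applies a simple weighted (Horvitz-Thompson-type) estimator, in which each phase 2 record is weighted by the ratio of phase 1 to phase 2 counts in its stratum. The stratum labels "A" and "B", the function names, and all counts are hypothetical.

```python
from collections import Counter


def stratum_weights(phase1_strata, phase2_strata):
    """Return the weight N_s / n_s for every phase 1 stratum s.

    N_s is the number of phase 1 subjects in stratum s and n_s the
    number of phase 2 subjects sampled from that stratum.
    """
    n1 = Counter(phase1_strata)
    n2 = Counter(phase2_strata)
    empty = [s for s in n1 if n2[s] == 0]
    if empty:
        raise ValueError(f"no phase 2 subjects in strata: {empty}")
    return {s: n1[s] / n2[s] for s in n1}


def weighted_prevalence(phase2_records, weights):
    """Estimate the phase 1 prevalence of a binary covariate that is
    observed only in phase 2, using stratum-specific weights.

    phase2_records is a list of (stratum, covariate) pairs with the
    covariate coded as 0 or 1.
    """
    numerator = sum(weights[s] * x for s, x in phase2_records)
    denominator = sum(weights[s] for s, _ in phase2_records)
    return numerator / denominator


# Hypothetical phase 1 population: 100 subjects in stratum A, 900 in B.
phase1 = ["A"] * 100 + ["B"] * 900

# Hypothetical phase 2 subsample: 50 subjects per stratum, so stratum A
# is heavily oversampled. The covariate is present in 20 of the 50
# subjects from A and in 5 of the 50 subjects from B.
phase2 = (
    [("A", 1)] * 20 + [("A", 0)] * 30
    + [("B", 1)] * 5 + [("B", 0)] * 45
)

weights = stratum_weights(phase1, [s for s, _ in phase2])
naive = sum(x for _, x in phase2) / len(phase2)
estimate = weighted_prevalence(phase2, weights)

print(f"unweighted phase 2 prevalence: {naive:.2f}")
print(f"weighted estimate for phase 1: {estimate:.2f}")
```

In this toy setting the unweighted phase 2 prevalence overstates the prevalence in the phase 1 population because stratum A is oversampled; the stratum weights correct for that. A stratum without any phase 2 subjects cannot be weighted at all, which is one reason the choice of stratification matters for the design.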

Funding period

Begin:   March 2010
End:   February 2012

Sponsor

  • German Research Foundation

Contact

Prof. Dr. rer. nat. Iris Pigeot

Link

Project description on the GEPRIS homepage of the DFG

Selected project-related publications

    Articles with peer-review

  • Behr S, Schill W, Pigeot I. Does additional confounder information alter the estimated risk of bleeding associated with phenprocoumon use - Results of a two-phase study. Pharmacoepidemiology and Drug Safety. 2012;21(5):535-545.
    https://doi.org/10.1002/pds.3193

    PhD theses

  • Behr S. Efficient use of phase 1 information in two-phase case-control studies based in administrative databases. Bremen: Universität Bremen; 2013.
    http://nbn-resolving.de/urn:nbn:de:gbv:46-00103296-17.

    Presentations at scientific meetings/conferences (invited)

  • Behr S. Two-phase designs in pharmacoepidemiology. International Society for Pharmacoepidemiology (ISPE). 2013 Mid-Year Meeting, 11 April 2013, Munich.
  • Pigeot I, Behr S. Pharmacoepidemiological databases: Strengths, limitations, methodological challenges. 2nd Conference of the Central European Network (CEN), 12-16 September 2011, Zurich, Switzerland.

    Presentations at scientific meetings/conferences

  • Behr S, Schill W, Pigeot I. Zwei-Phasen Designs zur Berücksichtigung zusätzlicher Confounder-Information in pharmakoepidemiologischen Datenbankstudien. Biometrisches Kolloquium, 13-15 March 2012, Berlin.
  • Behr S, Schill W, Pigeot I. Design aspects of pharmacoepidemiological two-phase studies. International Conference on Pharmacoepidemiology & Therapeutic Risk Management (ICPE), 22-26 August 2012, Barcelona, Spain.
  • Behr S, Schill W, Pigeot I. Does additional confounder information alter the results of a database study on the risk of bleeding associated with phenprocoumon use? 27th International Conference on Pharmacoepidemiology & Therapeutic Risk Management (ICPE), 14-17 August 2011, Chicago, USA. (Abstract published in: Pharmacoepidemiology & Drug Safety. 2011;20(Suppl.1):254-255)
  • Behr S, Schill W, Pigeot I. Does additional confounder information alter the results of a database study on the risk of bleeding associated with phenprocoumon use? "Biometrie, Epidemiologie und Informatik - Gemeinsam forschen für Gesundheit". 56. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS) und 6. Jahrestagung der Deutschen Gesellschaft für Epidemiologie (DGEpi), 26-29 September 2011, Mainz.
  • Schill W, Behr S. Impact of stratification in choice-based sample studies with double sampling. "Advancing Survey Methods", 2nd International Conference of the Priority Programme on Survey Methodology (PPSM), 17-18 November 2011, Bremen.