ISPOR Europe 2018
Barcelona, Spain
November 2018
PRM88
Multiple Diseases/No Specific Disease
Research on Methods (RM)
Databases & Data Management Methods (DM)
THE ADVANCEMENT OF TOOLS FOR AUTOMATING DATA EXTRACTION IN SYSTEMATIC REVIEWS
Scott DA1, Colquitt J2, Loveman E2, Royle P3
1University of Leicester, Leicester, UK, 2Effective Evidence LLP, Waterlooville, UK, 3Warwick University, Coventry, UK
OBJECTIVES: The potential for automation to reduce the time required for the key stages of systematic reviews has driven a growth in software applications in recent years. A recent systematic review found no unified information extraction framework for automating data extraction in systematic reviews. Whilst risk of bias and “process management” tools have proliferated, data extraction from text and tables in systematic reviews has received less attention. We reviewed available tools and evaluative studies to determine progress in automation in this area.

METHODS: Systematic searches of Medline, Embase, and Scopus were undertaken up to May 2018. A bespoke search strategy was developed; search terms included automation, machine learning, and data mining. Studies that developed automated tools or methods for data extraction from published primary studies were included. We excluded tools used for “process management” or risk of bias assessment, and tools intended solely for use on abstracts. Search results were cross-checked against the SRToolbox (York Health Economics Consortium).

RESULTS: After deduplication, 3563 records were screened by two independent reviewers and 23 were selected for full-paper screening. Six studies were included. The included studies presented methods or proofs of concept for extracting information from the text of full papers. Bespoke programs were used to detect and classify elements within a paper, identify key information, and annotate and extract that information. Similar techniques were applied to the detection, classification, decomposition, and extraction of data from tables, a key source of trial data. Only one “user ready” tool, ExaCT, was identified; the others were at the theoretical stage. ExaCT identifies key phrases containing data such as interventions and reported outcomes for the reviewer to check. However, extraction was limited, and development appears to have ceased.

CONCLUSIONS: Our review identified promising automation technologies that have the potential to assist, rather than displace, systematic reviewers. However, challenges remain and established tools have yet to emerge.