Medical practice too often fails to incorporate recent medical advances. The two main reasons are that over 25 million scholarly medical articles have been published, and medical practitioners do not have the time to perform literature reviews. Systematic reviews aim at summarizing published medical evidence, but writing them requires tremendous human efforts. In this thesis, we propose several natural language processing methods based on artificial neural networks to facilitate the completion of systematic reviews. In particular, we focus on short-text classification, to help authors of systematic reviews locate the desired information. We introduce several algorithms to perform sequential short-text classification, which outperform state-of-the-art algorithms. To facilitate the choice of hyperparameters, we present a method based on Gaussian processes. Lastly, we release PubMed 20k RCT, a new dataset for sequential sentence classification in randomized control trial abstracts.
Thesis Supervisor: Peter Szolovits