Mining scientific articles with the Public Library of Science (PLOS) -- Elizabeth Seiver

November 29, 2017, 5:00-6:30pm in BIDS, 190 Doe Library

Link to presentation slides

CSV datasets

Have you ever wanted to learn how to mine the text and data from scientific articles? Come join us at The Hacker Within for a tutorial and mini-hackathon!

We’ll start with a brief tutorial on the basic structure of XML documents, the JATS XML structure used by PLOS and other scientific publishers, and the XML parsing tools in allofplos, a Python library that downloads and parses PLOS articles. Then we’ll have time to mine the corpus, contribute to the allofplos codebase, or do whatever else you want with hundreds of thousands of research articles at your fingertips!
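To give a flavor of what parsing JATS XML looks like, here is a minimal sketch using only Python's standard library. It does not use allofplos itself: the sample document, the placeholder DOI, and the `summarize_article` helper are all made up for illustration, and real PLOS articles contain far more metadata than this. The element names (`article-meta`, `article-id`, `article-title`, `sec`) follow the general JATS layout.

```python
import xml.etree.ElementTree as ET

# A tiny, hand-written JATS-style document (hypothetical content and DOI).
JATS_SAMPLE = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.0000001</article-id>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <abstract><p>A one-sentence abstract.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p></sec>
    <sec><title>Methods</title><p>Second paragraph.</p></sec>
  </body>
</article>"""


def summarize_article(xml_text):
    """Return the DOI, title, and top-level section titles of a JATS string.

    Hypothetical helper for illustration; not part of allofplos.
    """
    root = ET.fromstring(xml_text)
    # The DOI is stored in an <article-id> whose pub-id-type attribute is "doi".
    doi = root.findtext(".//article-id[@pub-id-type='doi']")
    title = root.findtext(".//article-title")
    # Top-level sections are the direct <sec> children of <body>.
    sections = [sec.findtext("title") for sec in root.iterfind("./body/sec")]
    return {"doi": doi, "title": title, "sections": sections}


summary = summarize_article(JATS_SAMPLE)
print(summary)
```

The same pattern (find an element by path, read its text or attributes) carries over to real article files; the allofplos tools covered in the tutorial wrap this kind of lookup for the PLOS corpus.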

Spots are limited, so please sign up here:

The tutorial portion will be broadcast live and recorded on YouTube. While a working knowledge of Python is helpful, we will also provide CSV files of allofplos’s metadata that can be parsed in R.

Pizza will be provided.


About the presenter

Elizabeth Seiver is a Researcher at the Public Library of Science, a non-profit Open Access publisher. She wrote the codebase for allofplos.