Strengthening Reproducibility in Network Science

NetSci 2017 Reproducibility Workshop

Welcome

As a half-day satellite of the NetSci2017 conference, the workshop will be held on Monday afternoon, June 19, 2017, in Indianapolis, Indiana, USA. It will feature invited talks from publishing, funding, industry, and academic perspectives, followed by a panel discussion and a breakout session that aims to identify actions the community can take to strengthen scientific rigor in the field of network science. The details can be viewed in the workshop program. All attendees of this symposium must be registered for the NetSci2017 conference.

Background

The holy grail of science is truth, and at the heart of science is rigor. Through careful observation and experimentation with rigorous methods, we are able to draw inferences and derive knowledge that we rely on as truth. Each new study is a building block of science, taking as fact the results of prior studies. In this way, the foundation of science is built, painstakingly, brick by brick. In the past few years, the quality of this foundation, on which scientific advancement stands, has come into question because of a systematic failure of the scientific enterprise to prioritize reproducibility and replicability (R&R). The ramifications of this failure are dire, forcing not only a reconsideration of the veracity of nearly all scientific evidence and knowledge to date, but also calling into question the prevailing scientific practices of our time. Empirical studies of reproducibility have revealed a variety of flaws in the scientific enterprise that contribute to this state of affairs, including the incentive structure of academia, editorial bias toward novel and surprising findings, lack of interest in replication studies, inadequate sharing of the information necessary to reproduce results, and more. Recognizing this, the scientific community has begun to devise ways to remedy or at least mitigate the situation. Leading the way are scientific journals and funding agencies that have issued new guidelines and requirements designed to improve reproducibility and replicability.

Against this backdrop, Network Science (NS) is emerging as a transformative, interdisciplinary field. To succeed, Network Science must, above all else, stand for scientific rigor, including attention to reproducibility and replicability. Network Science must foster a community of scholars committed to discovering the novel features of NS that require our attention to ensure adequate rigor. For example, when attempting to reproduce or replicate another's study, one typically relies on the availability of the original data. However, some data is private. Standard methods for anonymizing data before sharing may be less effective in NS: the relational nature of network data may mean that identities can be more easily discovered when the data is combined with publicly available data sets. Therefore, innovative new methods of privacy preservation may be needed for sharing data. A second example of how network science may need special attention to issues of R&R relates to data provenance. Twitter data, for example, which is a treasure trove of social media data, is subject to updates and deletions every second. So, depending on when the data is accessed, even if it covers the same time period, its contents may differ. Because of its relational nature, big network data also suffers more from missing links and requires additional version control considerations. For other data sets, the issues may be more about data quality. Bibliographic data is fundamental to understanding the progress and process of scientific discovery. A critical feature of such data is citations, which create vast intertwining webs of coauthor and co-citation networks ripe for mining. However, the problem of author disambiguation is non-trivial. How do the disambiguation problem and other data quality problems contribute to challenges in R&R?

In this satellite, we plan to take on the problem of R&R in Network Science. Through this workshop, participants can expect to: 1) get a broad overview of the importance of R&R in the scientific process, the current state of the field, and what funding agencies and journals are doing to ensure R&R moving forward; 2) learn about the R&R issues in two data sets used by network scientists: social media data (Twitter), with its privacy and provenance concerns, and an open data set, the Microsoft Academic Graph; 3) learn what has been done in related domains to promote R&R and discuss common concerns and guidelines for NS in general; and 4) participate in small groups to identify actions the NS community can take to address R&R, strengthen scientific rigor, and comport with industry, journal, and funding agency practices and guidelines. The result will be an interdisciplinary group of network scientists, galvanized to bring scientific rigor to our field and poised to carry out the action items generated during the workshop.

Invited Speakers

Barbara R. Jasny Deputy Editor, Emeritus, at the journal Science. Expert in reproducibility of scientific research.
Philip E. Bourne Former NIH Associate Director for Data Science, led the Big Data to Knowledge (BD2K) initiative. He will be joining the University of Virginia as the Stephenson Chair of Data Science, Director of the Data Science Institute, and Professor in the Department of Biomedical Engineering.
Richard Shiffrin Distinguished Professor and Luther Dana Waterman Professor of Psychological and Brain Sciences at Indiana University. Expert in Bayesian statistics and reproducibility of research in general.

Organizers

Kuansan Wang Managing Director at Microsoft Research, MSR Outreach Innovation, leads the Microsoft Academic project.
Santo Fortunato Professor of Informatics and Computing at Indiana University, Scientific Director at Indiana University Network Science Institute.
Patricia L. Mabry Executive Director and Senior Research Scientist at Indiana University Network Science Institute. Former Senior Advisor for Disease Prevention in the Office of Disease Prevention (ODP) at NIH.
Xiaoran Yan Assistant Research Scientist at Indiana University Network Science Institute.