All resources in Researchers

How significant are the public dimensions of faculty work in review, promotion and tenure documents?

Much of the work done by faculty at both public and private universities has significant public dimensions: it is often paid for by public funds; it is often aimed at serving the public good; and it is often subject to public evaluation. To understand how the public dimensions of faculty work are valued, we analyzed review, promotion, and tenure documents from a representative sample of 129 universities in the US and Canada. Terms and concepts related to public and community are mentioned in a large portion of documents, but mostly in ways that relate to service, which is an undervalued aspect of academic careers. Moreover, the documents make significant mention of traditional research outputs and citation-based metrics: however, such outputs and metrics reward faculty work targeted to academics, and often disregard the public dimensions. Institutions that seek to embody their public mission could therefore work towards changing how faculty work is assessed and incentivized.

Material Type: Reading

Authors: Carol Muñoz Nieves, Erin C McKiernan, Gustavo E Fischman, Juan P Alperin, Lesley A Schimanski, Meredith T Niles

Data policies of highly-ranked social science journals

By encouraging and requiring that authors share their data in order to publish articles, scholarly journals have become an important actor in the movement to improve the openness of data and the reproducibility of research. But how many social science journals encourage or mandate that authors share the data supporting their research findings? How does the share of journal data policies vary by discipline? What influences these journals’ decisions to adopt such policies and instructions? And what do those policies and instructions look like? We discuss the results of our analysis of the instructions and policies of 291 highly-ranked journals publishing social science research, where we studied the contents of journal data policies and instructions across 14 variables, such as when and how authors are asked to share their data, and what role journal ranking and age play in the existence and quality of data policies and instructions. We also compare our results to the results of other studies that have analyzed the policies of social science journals, although differences in the journals chosen and how each study defines what constitutes a data policy limit this comparison. We conclude that a little more than half of the journals in our study have data policies. A greater share of the economics journals have data policies and mandate sharing, followed by political science/international relations and psychology journals. Finally, we use our findings to make several recommendations: Policies should include the terms “data,” “dataset” or more specific terms that make it clear what to make available; policies should include the benefits of data sharing; journals, publishers, and associations need to collaborate more to clarify data policies; and policies should explicitly ask for qualitative data.

Material Type: Reading

Authors: Abigail Schwartz, Dessi Kirilova, Gerard Otalora, Julian Gautier, Mercè Crosas, Sebastian Karcher

Use of the Journal Impact Factor in academic review, promotion, and tenure evaluations

The Journal Impact Factor (JIF) was originally designed to aid libraries in deciding which journals to index and purchase for their collections. Over the past few decades, however, it has become a relied upon metric used to evaluate research articles based on journal rank. Surveyed faculty often report feeling pressure to publish in journals with high JIFs and mention reliance on the JIF as one problem with current academic evaluation systems. While faculty reports are useful, information is lacking on how often and in what ways the JIF is currently used for review, promotion, and tenure (RPT). We therefore collected and analyzed RPT documents from a representative sample of 129 universities from the United States and Canada and 381 of their academic units. We found that 40% of doctoral, research-intensive (R-type) institutions and 18% of master’s, or comprehensive (M-type) institutions explicitly mentioned the JIF, or closely related terms, in their RPT documents. Undergraduate, or baccalaureate (B-type) institutions did not mention it at all. A detailed reading of these documents suggests that institutions may also be using a variety of terms to indirectly refer to the JIF. Our qualitative analysis shows that 87% of the institutions that mentioned the JIF supported the metric’s use in at least one of their RPT documents, while 13% of institutions expressed caution about the JIF’s use in evaluations. None of the RPT documents we analyzed heavily criticized the JIF or prohibited its use in evaluations. Of the institutions that mentioned the JIF, 63% associated it with quality, 40% with impact, importance, or significance, and 20% with prestige, reputation, or status. In sum, our results show that the use of the JIF is encouraged in RPT evaluations, especially at research-intensive universities, and indicates there is work to be done to improve evaluation processes to avoid the potential misuse of metrics like the JIF.

Material Type: Reading

Authors: Carol Muñoz Nieves, Erin C. McKiernan, Juan Pablo Alperin, Lesley A. Schimanski, Lisa Matthias, Meredith T. Niles

Assessing data availability and research reproducibility in hydrology and water resources

There is broad interest to improve the reproducibility of published research. We developed a survey tool to assess the availability of digital research artifacts published alongside peer-reviewed journal articles (e.g. data, models, code, directions for use) and reproducibility of article results. We used the tool to assess 360 of the 1,989 articles published by six hydrology and water resources journals in 2017. Like studies from other fields, we reproduced results for only a small fraction of articles (1.6% of tested articles) using their available artifacts. We estimated, with 95% confidence, that results might be reproduced for only 0.6% to 6.8% of all 1,989 articles. Unlike prior studies, the survey tool identified key bottlenecks to making work more reproducible. Bottlenecks include: only some digital artifacts available (44% of articles), no directions (89%), or all artifacts available but results not reproducible (5%). The tool (or extensions) can help authors, journals, funders, and institutions to self-assess manuscripts, provide feedback to improve reproducibility, and recognize and reward reproducible articles as examples for others.

Material Type: Reading

Authors: Adel M. Abdallah, David E. Rosenberg, Hadia Akbar, James H. Stagge, Nour A. Attallah, Ryan James

On the Plurality of (Methodological) Worlds: Estimating the Analytic Flexibility of fMRI Experiments

How likely are published findings in the functional neuroimaging literature to be false? According to a recent mathematical model, the potential for false positives increases with the flexibility of analysis methods. Functional MRI (fMRI) experiments can be analyzed using a large number of commonly used tools, with little consensus on how, when, or whether to apply each one. This situation may lead to substantial variability in analysis outcomes. Thus, the present study sought to estimate the flexibility of neuroimaging analysis by submitting a single event-related fMRI experiment to a large number of unique analysis procedures. Ten analysis steps for which multiple strategies appear in the literature were identified, and two to four strategies were enumerated for each step. Considering all possible combinations of these strategies yielded 6,912 unique analysis pipelines. Activation maps from each pipeline were corrected for multiple comparisons using five thresholding approaches, yielding 34,560 significance maps. While some outcomes were relatively consistent across pipelines, others showed substantial methods-related variability in activation strength, location, and extent. Some analysis decisions contributed to this variability more than others, and different decisions were associated with distinct patterns of variability across the brain. Qualitative outcomes also varied with analysis parameters: many contrasts yielded significant activation under some pipelines but not others. Altogether, these results reveal considerable flexibility in the analysis of fMRI experiments. This observation, when combined with mathematical simulations linking analytic flexibility with elevated false positive rates, suggests that false positive results may be more prevalent than expected in the literature. This risk of inflated false positive rates may be mitigated by constraining the flexibility of analytic choices or by abstaining from selective analysis reporting.

Material Type: Reading

Author: Joshua Carp
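
The pipeline counts in the abstract above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the abstract gives the totals (6,912 pipelines, five thresholding approaches, 34,560 maps) but not how many strategies each of the ten steps had, so the per-step split used here (six steps with two strategies, three with three, one with four) is an assumption chosen because it is consistent with those totals.

```python
from itertools import product
from math import prod

# ASSUMPTION: a hypothetical split of strategies across the ten analysis steps.
# The abstract only says "two to four strategies" per step; this split is one
# that multiplies out to the reported total, not the study's documented design.
strategies_per_step = [2] * 6 + [3] * 3 + [4]

# Number of unique pipelines = product of the per-step strategy counts.
n_pipelines = prod(strategies_per_step)

# Cross-check by explicitly enumerating every combination of choices.
n_enumerated = sum(1 for _ in product(*(range(k) for k in strategies_per_step)))

# Each pipeline's activation map is thresholded five ways (from the abstract).
n_thresholds = 5
n_maps = n_pipelines * n_thresholds

print(n_pipelines, n_enumerated, n_maps)
```

The point of the enumeration is that the number of pipelines grows multiplicatively with each added analysis decision, which is why a modest number of choices per step yields thousands of distinct analyses.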

Does use of the CONSORT Statement impact the completeness of reporting of randomised controlled trials published in medical journals? A Cochrane review

Background: The Consolidated Standards of Reporting Trials (CONSORT) Statement is intended to facilitate better reporting of randomised clinical trials (RCTs). A systematic review recently published in the Cochrane Library assesses whether journal endorsement of CONSORT impacts the completeness of reporting of RCTs; those findings are summarised here. Methods: Evaluations assessing the completeness of reporting of RCTs based on any of 27 outcomes formulated based on the 1996 or 2001 CONSORT checklists were included; two primary comparisons were evaluated. The 27 outcomes were: the 22 items of the 2001 CONSORT checklist, four sub-items describing blinding and a ‘total summary score’ of aggregate items, as reported. Relative risks (RR) and 99% confidence intervals were calculated to determine effect estimates for each outcome across evaluations. Results: Fifty-three reports describing 50 evaluations of 16,604 RCTs were assessed for adherence to at least one of 27 outcomes. Sixty-nine of 81 meta-analyses show relative benefit from CONSORT endorsement on completeness of reporting. Between endorsing and non-endorsing journals, 25 outcomes are improved with CONSORT endorsement, five of these significantly (α = 0.01). The number of evaluations per meta-analysis was often low with substantial heterogeneity; validity was assessed as low or unclear for many evaluations. Conclusions: The results of this review suggest that journal endorsement of CONSORT may benefit the completeness of reporting of RCTs they publish. No evidence suggests that endorsement hinders the completeness of RCT reporting. However, despite relative improvements when CONSORT is endorsed by journals, the completeness of reporting of trials remains sub-optimal. Journals are not sending a clear message about endorsement to authors submitting manuscripts for publication. As such, fidelity of endorsement as an ‘intervention’ has been weak to date. Journals need to take further action regarding their endorsement and implementation of CONSORT to facilitate accurate, transparent and complete reporting of trials.

Material Type: Reading

Authors: David Moher, Douglas G Altman, Kenneth F Schulz, Larissa Shamseer, Lucy Turner

Instead of "playing the game" it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond

The last ten years have witnessed increasing awareness of questionable research practices (QRPs) in the life sciences, including p-hacking, HARKing, lack of replication, publication bias, low statistical power and lack of data sharing (see Figure 1). Concerns about such behaviours have been raised repeatedly for over half a century but the incentive structure of academia has not changed to address them. Despite the complex motivations that drive academia, many QRPs stem from the simple fact that the incentives which offer success to individual scientists conflict with what is best for science. On the one hand are a set of gold standards that centuries of the scientific method have proven to be crucial for discovery: rigour, reproducibility, and transparency. On the other hand are a set of opposing principles born out of the academic career model: the drive to produce novel and striking results, the importance of confirming prior expectations, and the need to protect research interests from competitors. Within a culture that pressures scientists to produce rather than discover, the outcome is a biased and impoverished science in which most published results are either unconfirmed genuine discoveries or unchallenged fallacies. This observation implies no moral judgement of scientists, who are as much victims of this system as they are perpetrators.

Material Type: Reading

Authors: Christopher D. Chambers, Eva Feredoes, Peter Etchells, Suresh Daniel Muthukumaraswamy

Recommendations for Increasing Replicability in Psychology

Replicability of findings is at the heart of any empirical science. The aim of this article is to move the current replicability debate in psychology towards concrete recommendations for improvement. We focus on research practices but also offer guidelines for reviewers, editors, journal management, teachers, granting institutions, and university promotion committees, highlighting some of the emerging and existing practical solutions that can facilitate implementation of these recommendations. The challenges for improving replicability in psychological science are systemic. Improvement can occur only if changes are made at many levels of practice, evaluation, and reward.

Material Type: Reading

Authors: Brent W. Roberts, Brian A. Nosek, David C. Funder, Filip De Fruyt, Hannelore Weber, Jaap J. A. Denissen, Jan De Houwer, Jelte M. Wicherts, Jens B. Asendorpf, Klaus Fiedler, Manfred Schmitt, Marcel A. G. van Aken, Marco Perugini, Mark Conner, Reinhold Kliegl, Susann Fiedler

A Short Introduction to the Reproducibility Debate in Psychology

The Journal of European Psychology Students (JEPS) is an open-access, double-blind, peer-reviewed journal for psychology students worldwide. JEPS is run by highly motivated European psychology students and has been publishing since 2009. By ensuring that authors are always provided with extensive feedback, JEPS gives psychology students the chance to gain experience in publishing and to improve their scientific skills. Furthermore, JEPS provides students with the opportunity to share their research and to take a first step toward a scientific career.

Material Type: Reading

Author: Cedric Galetzka

The citation advantage of linking publications to research data

Efforts to make research results open and reproducible are increasingly reflected by journal policies encouraging or mandating authors to provide data availability statements. As a consequence of this, there has been a strong uptake of data availability statements in recent literature. Nevertheless, it is still unclear what proportion of these statements actually contain well-formed links to data, for example via a URL or permanent identifier, and if there is an added value in providing them. We consider 531,889 journal articles published by PLOS and BMC which are part of the PubMed Open Access collection, categorize their data availability statements according to their content and analyze the citation advantage of different statement categories via regression. We find that, following mandated publisher policies, data availability statements have become common by now, yet statements containing a link to a repository are still just a fraction of the total. We also find that articles with these statements, in particular, can have up to 25.36% higher citation impact on average: an encouraging result for all publishers and authors who make the effort of sharing their data. All our data and code are made available in order to reproduce and extend our results.

Material Type: Reading

Authors: Barbara McGillivray, Giovanni Colavizza, Iain Hrynaszkiewicz, Isla Staden, Kirstie Whitaker

Releasing a preprint is associated with more attention and citations for the peer-reviewed article

Preprints in biology are becoming more popular, but only a small fraction of the articles published in peer-reviewed journals have previously been released as preprints. To examine whether releasing a preprint on bioRxiv was associated with the attention and citations received by the corresponding peer-reviewed article, we assembled a dataset of 74,239 articles, 5,405 of which had a preprint, published in 39 journals. Using log-linear regression and random-effects meta-analysis, we found that articles with a preprint had, on average, a 49% higher Altmetric Attention Score and 36% more citations than articles without a preprint. These associations were independent of several other article- and author-level variables (such as scientific subfield and number of authors), and were unrelated to journal-level variables such as access model and Impact Factor. This observational study can help researchers and publishers make informed decisions about how to incorporate preprints into their work.

Material Type: Reading

Authors: Darwin Y Fu, Jacob J Hughey

Transparency of CHI Research Artifacts: Results of a Self-Reported Survey

Several fields of science are experiencing a "replication crisis" that has negatively impacted their credibility. Assessing the validity of a contribution via replicability of its experimental evidence and reproducibility of its analyses requires access to relevant study materials, data, and code. Failing to share them limits the ability to scrutinize or build upon the research, ultimately hindering scientific progress. Understanding how the diverse research artifacts in HCI impact sharing can help produce informed recommendations for individual researchers and policy-makers in HCI. Therefore, we surveyed authors of CHI 2018–2019 papers, asking if they share their papers' research materials and data, how they share them, and why they do not. The results (N = 460/1356, 34% response rate) show that sharing is uncommon, partly due to misunderstandings about the purpose of sharing and reliable hosting. We conclude with recommendations for fostering open research practices. This paper and all data and materials are freely available at https://osf.io/csy8q

Material Type: Reading

Authors: Chatchavan Wacharamanotham, Florian Echtler, Lukas Eisenring, Steve Haroz

Evaluating Registered Reports: A Naturalistic Comparative Study of Article Impact

Registered Reports (RRs) is a publishing model in which initial peer review is conducted prior to knowing the outcomes of the research. In-principle acceptance of papers at this review stage combats publication bias, and provides a clear distinction between confirmatory and exploratory research. Some editors raise a practical concern about adopting RRs. By reducing publication bias, RRs may produce more negative or mixed results and, if such results are not valued by the research community, receive less citations as a consequence. If so, by adopting RRs, a journal’s impact factor may decline. Despite known flaws with impact factor, it is still used as a heuristic for judging journal prestige and quality. Whatever the merits of considering impact factor as a decision-rule for adopting RRs, it is worthwhile to know whether RRs are cited less than other articles. We will conduct a naturalistic comparison of citation and altmetric impact between published RRs and comparable empirical articles from the same journals.

Material Type: Reading

Authors: Brian A. Nosek, Felix Singleton Thorn, Lilian T. Hummer, Timothy M. Errington

The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research

The widespread use of ‘statistical significance’ as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process (according to the American Statistical Association). We review why degrading p-values into ‘significant’ and ‘nonsignificant’ contributes to making studies irreproducible, or to making them seem irreproducible. A major problem is that we tend to take small p-values at face value, but mistrust results with larger p-values. In either case, p-values tell little about reliability of research, because they are hardly replicable even if an alternative hypothesis is true. Also significance (p ≤ 0.05) is hardly replicable: at a good statistical power of 80%, two studies will be ‘conflicting’, meaning that one is significant and the other is not, in one third of the cases if there is a true effect. A replication can therefore not be interpreted as having failed only because it is nonsignificant. Many apparent replication failures may thus reflect faulty judgment based on significance thresholds rather than a crisis of unreplicable research. Reliable conclusions on replicability and practical importance of a finding can only be drawn using cumulative evidence from multiple independent studies. However, applying significance thresholds makes cumulative knowledge unreliable. One reason is that with anything but ideal statistical power, significant effect sizes will be biased upwards. Interpreting inflated significant results while ignoring nonsignificant results will thus lead to wrong conclusions. But current incentives to hunt for significance lead to selective reporting and to publication bias against nonsignificant findings. Data dredging, p-hacking, and publication bias should be addressed by removing fixed significance thresholds. Consistent with the recommendations of the late Ronald Fisher, p-values should be interpreted as graded measures of the strength of evidence against the null hypothesis. 
Also larger p-values offer some evidence against the null hypothesis, and they cannot be interpreted as supporting the null hypothesis, falsely concluding that ‘there is no effect’. Information on possible true effect sizes that are compatible with the data must be obtained from the point estimate, e.g., from a sample average, and from the interval estimate, such as a confidence interval. We review how confusion about interpretation of larger p-values can be traced back to historical disputes among the founders of modern statistics. We further discuss potential arguments against removing significance thresholds, for example that decision rules should rather be more stringent, that sample sizes could decrease, or that p-values should better be completely abandoned. We conclude that whatever method of statistical inference we use, dichotomous threshold thinking must give way to non-automated informed judgment.

Material Type: Reading

Authors: Fränzi Korner-Nievergelt, Tobias Roth, Valentin Amrhein
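
The "one third of the cases" figure in the abstract above can be illustrated with a short calculation. This is a minimal sketch, assuming two independent studies of a true effect that each have the same power; the function name is hypothetical and the code is not taken from the paper.

```python
def conflict_probability(power: float) -> float:
    """Probability that exactly one of two independent studies is significant,
    given a true effect and the same statistical power in both studies."""
    # Study A significant and B not, or B significant and A not.
    return 2 * power * (1 - power)


# At 80% power: 2 * 0.8 * 0.2, i.e. roughly one third of study pairs "conflict".
print(round(conflict_probability(0.80), 2))
```

Under these assumptions the disagreement rate is highest at 50% power (one half of all pairs) and stays substantial even at the conventional 80% target, which is the abstract's reason for not reading a single nonsignificant replication as a failure.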

Signaling the trustworthiness of science

Trust in science increases when scientists and the outlets certifying their work honor science’s norms. Scientists often fail to signal to other scientists and, perhaps more importantly, the public that these norms are being upheld. They could do so as they generate, certify, and react to each other’s findings: for example, by promoting the use and value of evidence, transparent reporting, self-correction, replication, a culture of critique, and controls for bias. A number of approaches for authors and journals would lead to more effective signals of trustworthiness at the article level. These include article badging, checklists, a more extensive withdrawal ontology, identity verification, better forward linking, and greater transparency.

Material Type: Reading

Authors: Kathleen Hall Jamieson, Marcia McNutt, Richard Sever, Veronique Kiermer

An excess of positive results: Comparing the standard Psychology literature with Registered Reports

When studies with positive results that support the tested hypotheses have a higher probability of being published than studies with negative results, the literature will give a distorted view of the evidence for scientific claims. Psychological scientists have been concerned about the degree of distortion in their literature due to publication bias and inflated Type-1 error rates. Registered Reports were developed with the goal to minimise such biases: In this new publication format, peer review and the decision to publish take place before the study results are known. We compared the results in the full population of published Registered Reports in Psychology (N = 71 as of November 2018) with a random sample of hypothesis-testing studies from the standard literature (N = 152) by searching 633 journals for the phrase ‘test* the hypothes*’ (replicating a method by Fanelli, 2010). Analysing the first hypothesis reported in each paper, we found 96% positive results in standard reports, but only 44% positive results in Registered Reports. The difference remained nearly as large when direct replications were excluded from the analysis (96% vs 50% positive results). This large gap suggests that psychologists underreport negative results to an extent that threatens cumulative science. Although our study did not directly test the effectiveness of Registered Reports at reducing bias, these results show that the introduction of Registered Reports has led to a much larger proportion of negative results appearing in the published literature compared to standard reports.

Material Type: Reading

Authors: Anne M. Scheel, Daniel Lakens, Mitchell Schijen

Estimating the prevalence of transparency and reproducibility-related research practices in psychology (2014-2017)

Psychological science is navigating an unprecedented period of introspection about the credibility and utility of its research. A number of reform initiatives aimed at increasing adoption of transparency and reproducibility-related research practices appear to have been effective in specific contexts; however, their broader, collective impact amidst a wider discussion about research credibility and reproducibility is largely unknown. In the present study, we estimated the prevalence of several transparency and reproducibility-related indicators in the psychology literature published between 2014-2017 by manually assessing these indicators in a random sample of 250 articles. Over half of the articles we examined were publicly available (154/237, 65% [95% confidence interval, 59% to 71%]). However, sharing of important research resources such as materials (26/183, 14% [10% to 19%]), study protocols (0/188, 0% [0% to 1%]), raw data (4/188, 2% [1% to 4%]), and analysis scripts (1/188, 1% [0% to 1%]) was rare. Pre-registration was also uncommon (5/188, 3% [1% to 5%]). Although many articles included a funding disclosure statement (142/228, 62% [56% to 69%]), conflict of interest disclosure statements were less common (88/228, 39% [32% to 45%]). Replication studies were rare (10/188, 5% [3% to 8%]) and few studies were included in systematic reviews (21/183, 11% [8% to 16%]) or meta-analyses (12/183, 7% [4% to 10%]). Overall, the findings suggest that transparent and reproducibility-related research practices are far from routine in psychological science. Future studies can use the present findings as a baseline to assess progress towards increasing the credibility and utility of psychology research.

Material Type: Reading

Authors: Jessica Elizabeth Kosie, John Ioannidis, Joshua D Wallach, Mallory Kidwell, Robert T. Thibault, Tom Elis Hardwicke
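
The bracketed ranges in the abstract above are 95% confidence intervals around sample proportions. As a rough illustration of how such an interval can be computed, the sketch below uses the Wilson score method. The abstract does not say which interval method the authors used, so the choice of Wilson here is an assumption, and the function name is hypothetical.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%).

    ASSUMPTION: the Wilson method is used for illustration only; the abstract
    does not state how its intervals were calculated.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Publicly available articles, counts taken from the abstract: 154 of 237.
low, high = wilson_interval(154, 237)
print(f"{154 / 237:.0%} [{low:.0%} to {high:.0%}]")
```

For the 154/237 count this method gives bounds that round to the 59% to 71% range quoted in the abstract; other counts in the abstract may round differently depending on the method actually used.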

Open Science Practices are on the Rise: The State of Social Science (3S) Survey

Has there been meaningful movement toward open science practices within the social sciences in recent years? Discussions about changes in practices such as posting data and pre-registering analyses have been marked by controversy, including controversy over the extent to which change has taken place. This study, based on the State of Social Science (3S) Survey, provides the first comprehensive assessment of awareness of, attitudes towards, perceived norms regarding, and adoption of open science practices within a broadly representative sample of scholars from four major social science disciplines: economics, political science, psychology, and sociology. We observe a steep increase in adoption: as of 2017, over 80% of scholars had used at least one such practice, rising from one quarter a decade earlier. Attitudes toward research transparency are on average similar between older and younger scholars, but the pace of change differs by field and methodology. According with theories of normal science and scientific change, the timing of increases in adoption coincides with technological innovations and institutional policies. Patterns are consistent with most scholars underestimating the trend toward open science in their discipline.

Material Type: Reading

Authors: David J. Birke, Edward Miguel, Elizabeth Levy Paluck, Garret Christensen, Nicholas Swanson, Rebecca Littman, Zenan Wang

Are choices based on conditional or conjunctive probabilities in a sequential risk-taking task?

In this study, we examined participants' choice behavior in a sequential risk-taking task. We were especially interested in the extent to which participants focus on the immediate next choice or consider the entire choice sequence. To do so, we inspected whether decisions were either based on conditional probabilities (e.g., being successful on the immediate next trial) or on conjunctive probabilities (of being successful several times in a row). The results of five experiments with a simplified nine-card Columbia Card Task and a CPT-model analysis show that participants' choice behavior can be described best by a mixture of the two probability types. Specifically, for their first choice, the participants relied on conditional probabilities, whereas subsequent choices were based on conjunctive probabilities. This strategy occurred across different start conditions in which more or less cards were already presented face up. Consequently, the proportion of risky choices was substantially higher when participants started from a state with some cards facing up, compared with when they arrived at that state starting from the very beginning. The results, alternative accounts, and implications are discussed.

Material Type: Reading

Authors: Peter Haffke, Ronald Hübner