September 15, 2024

Why Novelty Scores for science are a game changer

Philipp Koellinger

Scientists now have an automated mathematical measure for novelty

Evaluating a paper's contributions to the scientific literature is one of the core tasks of peer review. But how effective is the current system at selecting impactful, innovative work for publication or funding? As we release our new novelty scores feature for over 250 million research articles on DeSci Publish, co-founder and CEO Dr. Philipp Koellinger explains what these scores are, how they are developed, and their potential effect on science.

Discovering novel insights that expand human knowledge is one of the most essential functions of scientific research. As a result, evaluating the novelty of scientific manuscripts and grant applications takes centre stage in the peer review process. For example, the United States’ National Institutes of Health (NIH) asks referees to evaluate the “significance and innovation” of grant proposals on a 1-9 scale.

However, the current peer review process is subjective, slow, labour-intensive, and prone to bias and inaccuracy. Referees often disagree on whether a particular contribution is novel.

The scores we have released are based on an objective, mathematical definition of novelty introduced by Professor James Evans and Dr Feng Shi from the University of Chicago and published in Nature Communications in 2023.

Figure A: Illustration of the embedding space and example combinations


Note: This is Figure 1a from Shi & Evans (2023), “Surprising combinations of research contents and contexts are related to impact and emerge with scientific outsiders from distant disciplines”, Nature Communications, Article 1641.

Professor Evans and Dr Shi developed a hypergraph model (Figure A) that looks at the combinations of keywords and journals mentioned in a particular scientific manuscript and compares them with the combinations found in previous publications. The estimated model is dynamic and evolves over time: it first analyses the combinations observed in previous years and then extrapolates from them to predict what type of papers will be published in the following year. The model predicts the kind of manuscripts scientists will write with very high accuracy (area under the curve (AUC) > 0.95). Thus, science typically progresses in a highly predictable manner, with most scientists working on minor variations of well-established topics and methods in a particular field of research. This is exemplified by the orange and blue hypergraphs in Figure A, which connect common combinations of keywords or journals in a particular science niche. Novelty is defined as deviation from expectations, exemplified by the green hypergraph that connects distant keywords or journals in a surprising way.
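As a rough illustration of "novelty as deviation from expectations", the sketch below scores a combination of keywords by how improbable it is under a simple expectation model. It is a toy under stated assumptions, not the fitted model of Shi and Evans: the two-dimensional embeddings are hand-picked, the keywords are invented for the example, and the function names `propensity` and `novelty` as well as the Poisson assumption are hypothetical choices made here.

```python
import math

# Hypothetical, hand-picked embeddings: each keyword has one weight per latent
# "niche" dimension. A real model would estimate these from past publications.
EMBEDDINGS = {
    "genome": (0.90, 0.05),
    "heritability": (0.80, 0.10),
    "superconductor": (0.05, 0.90),
    "lattice": (0.10, 0.80),
}


def propensity(combo, embeddings=EMBEDDINGS):
    """Assumed expected rate of a combination: within each latent dimension,
    multiply the weights of all members, then sum across dimensions."""
    dims = len(next(iter(embeddings.values())))
    return sum(
        math.prod(embeddings[node][d] for node in combo)
        for d in range(dims)
    )


def novelty(combo, embeddings=EMBEDDINGS):
    """Surprisal of seeing the combination at least once, assuming the number
    of occurrences is Poisson-distributed with the rate computed above."""
    rate = propensity(combo, embeddings)
    p_seen = 1.0 - math.exp(-rate)
    return -math.log(p_seen)


within_niche = ("genome", "heritability")      # both load on dimension 0
across_niches = ("genome", "superconductor")   # load on different dimensions

print(f"within niche:  {novelty(within_niche):.2f}")
print(f"across niches: {novelty(across_niches):.2f}")
```

Under these assumptions, the combination that bridges two niches has a lower expected rate and therefore receives the higher novelty score.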

What do the novelty scores measure?

This approach and the available data lead to two novelty metrics—one measuring content novelty and the other context novelty.

Context novelty is calculated by examining how unique the combination of journals cited in the reference list is, while content novelty is calculated in a similar way by examining the combinations of keywords that describe the manuscript.

The original model from Evans and Shi was estimated using 20 million open-access papers in the medical literature extracted from PubMed, and physics papers from the American Physical Society (APS).

“We wanted a measure of surprise that captured the native complexity of the combinations of conceptual elements, and of sources, rather than summing over pairwise combinations,” explained Professor James Evans. “Incidentally, we found that calculating it over the complete hypergraph rather than the sum of pairwise surprises, we improved our prediction of "surprising reception" or citation hit likelihood by ~100%.”
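The difference between scoring a whole combination and summing pairwise surprises can be seen in a small example. The sketch below is only an illustration with an invented corpus and a simple frequency-based surprisal with add-one smoothing. It does not reproduce the published model, and the names `surprisal`, `pairwise_surprise`, and `joint_surprise` are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical corpus: each prior paper is represented by its keyword set.
PRIOR_PAPERS = (
    [frozenset({"CRISPR", "zebrafish"})] * 3
    + [frozenset({"zebrafish", "optogenetics"})] * 3
    + [frozenset({"CRISPR", "optogenetics"})] * 3
    + [frozenset({"CRISPR", "zebrafish", "imaging"})] * 3
)


def surprisal(items, corpus):
    """-log of the smoothed share of prior papers that contain every item."""
    hits = sum(1 for paper in corpus if items <= paper)
    return -math.log((hits + 1) / (len(corpus) + 2))


def pairwise_surprise(combo, corpus):
    """Sum of the surprisals of all keyword pairs in the combination."""
    pairs = combinations(sorted(combo), 2)
    return sum(surprisal(frozenset(pair), corpus) for pair in pairs)


def joint_surprise(combo, corpus):
    """Surprisal of the full combination treated as a single unit."""
    return surprisal(frozenset(combo), corpus)


never_together = {"CRISPR", "zebrafish", "optogenetics"}
seen_together = {"CRISPR", "zebrafish", "imaging"}

for name, combo in [("never together", never_together),
                    ("seen together", seen_together)]:
    print(name,
          f"pairwise={pairwise_surprise(combo, PRIOR_PAPERS):.2f}",
          f"joint={joint_surprise(combo, PRIOR_PAPERS):.2f}")
```

In this toy corpus every pair occurs equally often in both triples, so the pairwise sums are identical, while the joint score separates the triple that has never appeared from the one that has.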

DeSci Labs re-estimated Shi and Evans's model using 58 million open-access articles from the OpenAlex database, using concepts and topics instead of keywords to estimate content novelty.

Analysis of Nature, Science, and Cell

Figures B and C show the relationship between the novelty scores of papers and the number of citations those papers received, grouping the data by deciles of the citation count distribution (x-axis) and the novelty score (y-axis).

Both figures restrict the analysis to papers that were published in 2010. The figures show a positive relationship between novelty scores and citation counts at the aggregate level: the higher the novelty score decile, the higher the citation decile. However, there is a lot of variation of individual observations around the medians, implying that the novelty scores by themselves are not sufficient to predict the future impact of any individual paper with high accuracy.
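A decile summary of this kind can be sketched in a few lines. The example below is an assumption about how such a plot might be prepared, not the code behind the figures: it assigns papers to citation deciles by rank and reports the median novelty score together with the 25th and 75th percentiles for each decile. The data are synthetic and randomly generated, and the function names are hypothetical.

```python
import random
import statistics


def rank_deciles(values):
    """Assign each value a decile from 1 to 10 according to its rank.
    Ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    deciles = [0] * len(values)
    for rank, index in enumerate(order):
        deciles[index] = rank * 10 // len(values) + 1
    return deciles


def novelty_by_citation_decile(novelty, citations):
    """Return one (decile, q1, median, q3) row of novelty per citation decile."""
    groups = {d: [] for d in range(1, 11)}
    for score, decile in zip(novelty, rank_deciles(citations)):
        groups[decile].append(score)
    rows = []
    for decile in range(1, 11):
        q1, median, q3 = statistics.quantiles(groups[decile], n=4)
        rows.append((decile, q1, median, q3))
    return rows


# Synthetic stand-in data (not real papers): a noisy positive relationship
# between a novelty score in [0, 1] and a citation measure.
rng = random.Random(0)
novelty_scores = [rng.random() for _ in range(2000)]
citations = [10 * s + rng.gauss(0, 3) for s in novelty_scores]

rows = novelty_by_citation_decile(novelty_scores, citations)
for decile, q1, median, q3 in rows:
    print(f"citation decile {decile:2d}: "
          f"median novelty {median:.2f} (IQR {q1:.2f} to {q3:.2f})")
```

The interquartile range plays the role of the whiskers: a clear trend in the medians can coexist with wide spread within each decile.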

Figure B: Relationship between context novelty percentiles and citation count deciles


Note: The figure is based on 305,462 papers published in 2010, with whiskers showing the spread between the 25th and 75th percentiles.

The observed patterns are very similar to those for papers published in other years, and they resemble the results reported by Shi and Evans in 2023.

Figure C: Relationship between content novelty percentiles and citation count deciles

Note: The figure is based on 1,118,871 papers published in 2010, with whiskers showing the spread between the 25th and 75th percentiles.

Figures D and E add observations from papers published in three high-impact journals (Nature, Science, and Cell) to the graphs. Cell outperforms Science and Nature in context novelty but is less effective than either at converting this novelty into citation counts (Figure D).

Figure D: Context novelty and citations for papers published in Nature, Science and Cell in 2010


Note: Mean context novelty was estimated for 299,832 papers published in 2010. The size of the circles is proportional to the number of papers published. Citation percentiles are relative to the three focal journals in this figure (Nature, Science, Cell).

Science outperforms the other two journals on content novelty but underperforms on converting this content novelty into citations, except for papers in the top decile of the content novelty distribution. Cell is well-aligned with the average correlation between content novelty and citations. Surprisingly, Nature publishes many papers with very low content novelty but outperforms Cell and Science in converting low content novelty scores into citations (Figure E).

All three journals have a remarkable concentration of papers in the top deciles of the context and content novelty score distributions, reflecting that their editors are consciously aiming to select highly novel work for publication. This editorial policy is also reflected in the aims and scope statements of these journals and in several editorial statements. Contrary to the aim of publishing primarily highly innovative research, the refereeing and editorial selection process also admitted many papers that are novel in neither content nor context, indicating that the current curation process of these journals is suboptimal.

All three journals follow the same trend that we see on average: papers with higher novelty scores published in these journals systematically receive more citations. This suggests that even high-impact journals could improve their impact factor further if they took the novelty scores of the manuscripts submitted to them into account.

Figure E: Content novelty and citations for papers published in Nature, Science and Cell in 2010


Note: Mean content novelty was estimated for 1,069,055 papers published in 2010. The size of the circles is proportional to the number of papers published. Citation percentiles are relative to the three focal journals in this figure (Nature, Science, Cell).

The future with novelty scores

Objective and widely available novelty scores could play an essential part in how science is evaluated and incentivised. For example, journal editors and funding agencies could consult this metric in addition to the evaluations they receive from referees. Doing so systematically is expected to increase the impact factor of journals and the innovativeness of the research that receives funding. This may help reverse the reported trend that major scientific breakthroughs are getting rarer over time despite increased spending on research (see here and here).

Furthermore, users can explore the newest preprints for topics they are interested in, helping them to find the most innovative scientific contributions. They can search for authors, topics, journals, or institutions they care about and create ad-hoc novelty rankings.

Scientists will soon be able to see novelty scores for every version of their manuscript they upload on DeSci Publish, while future premium features will allow users to calculate novelty scores for any scientific manuscript they care about, including grant applications.

Get started with exploring truly novel scientific breakthroughs on publish.desci.com!