In this blog post, we will look at the problems with evaluating scientific research by journal impact factor and consider whether this metric reflects the true value of research.
In the scientific community, the quality of researchers and the importance of papers are commonly evaluated by the journal impact factor. In recent years, however, this practice has drawn growing criticism. Last May, more than 100 researchers in science and technology gathered in San Francisco and announced the "San Francisco Declaration on Research Assessment," and tens of thousands of researchers have since joined the movement. They point out several fatal flaws in evaluation based on the journal impact factor. This article examines their criticisms in detail and argues that the journal impact factor cannot serve as a yardstick for evaluating scientific research.
The journal impact factor is a number that quantifies the influence of a journal. It was originally devised for librarians: each library needed to weigh the relative importance of many candidate journals in order to decide which ones to subscribe to regularly. The calculation is simple. For example, if a journal called Science and Technology has published 20 papers in the past two years, and these papers have been cited 200 times in total, its impact factor is 200/20, or 10. In other words, an impact factor of 10 means that papers published in Science and Technology over the past two years have been cited 10 times on average. In this sense, the journal impact factor is a quantitative measure of a journal's importance.
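To make the arithmetic concrete, here is a minimal Python sketch of the simplified calculation above; the journal name and the citation counts are the hypothetical figures from the example, not real data.

```python
# A minimal sketch of the simplified impact-factor calculation described above.
# "Science and Technology" and its citation counts are hypothetical figures
# from the example, not real data.

def impact_factor(citation_counts):
    """Average citations per paper over the two-year counting window."""
    return sum(citation_counts) / len(citation_counts)

# 20 papers published over the past two years, cited 200 times in total.
science_and_technology = [10] * 20  # spread evenly here only for simplicity

print(impact_factor(science_and_technology))  # 200 / 20 = 10.0
```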
However, the problem is that the journal impact factor is applied as is to evaluate the importance of individual papers: the journal's impact factor becomes the score of every paper published in it. For example, if there are two journals, Science and Technology with an impact factor of 10 and Monthly Engineering with an impact factor of 90, every paper published in Science and Technology is rated at 10, and every paper published in Monthly Engineering at 90. These scores are then carried over to the papers' authors. Researcher A, who has published a paper in Science and Technology, receives 10 points, while researcher B, who has published a paper in Monthly Engineering, receives 90. An 80-point gap opens up between A and B, regardless of their qualifications as researchers or the originality of their papers.
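The scoring practice being criticized can be written out in a few lines. The sketch below uses the hypothetical journals and numbers from the example to show how every paper, and by extension its author, simply inherits the journal's impact factor.

```python
# A sketch of the scoring practice criticized here: every paper, and by
# extension its author, inherits the impact factor of the journal it
# appeared in. Journal names and values follow the hypothetical example.

journal_impact_factor = {
    "Science and Technology": 10,
    "Monthly Engineering": 90,
}

def researcher_score(journal_name):
    # The paper's own citation count and originality play no role at all.
    return journal_impact_factor[journal_name]

score_a = researcher_score("Science and Technology")  # researcher A: 10
score_b = researcher_score("Monthly Engineering")     # researcher B: 90
print(score_b - score_a)                              # the 80-point gap
```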
The first problem with the journal impact factor is the "statistical trap." Because the impact factor is an average, the citation counts of papers published in the same journal can vary widely, and the influence of an individual paper is not simply proportional to the impact factor of the journal that published it. For example, A's paper may itself have been cited dozens or even hundreds of times, while the journal's average was dragged down to 10 because other papers in Science and Technology were rarely cited. Conversely, B's paper may have been cited only once or twice, while the impact factor was pushed up to 90 by other heavily cited papers in Monthly Engineering. In this situation, comparing A's and B's research by journal impact factor is meaningless; it is more reasonable to compare the citation counts of the two papers themselves and conclude that A's paper has had the greater impact. According to the San Francisco Declaration, roughly 25% of the papers in a journal account for about 90% of all its citations. In other words, if the journal impact factor is applied to individual papers as is, some papers will inevitably be penalized while others enjoy an undeserved windfall.
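The trap is easy to reproduce with invented numbers. In the sketch below, the citation counts are made up so that the two journals come out at impact factors of 10 and 90, even though A's individual paper is cited far more often than B's.

```python
# A sketch of the "statistical trap". All citation counts below are invented
# for illustration; they are chosen only so that the journal averages match
# the hypothetical impact factors of 10 and 90.

def impact_factor(citation_counts):
    return sum(citation_counts) / len(citation_counts)

# The first entry in each list is the paper by A or B, respectively.
science_and_technology = [120] + [5] * 16 + [0] * 3   # A's paper: 120 citations
monthly_engineering    = [2] + [94] * 17 + [100] * 2  # B's paper: 2 citations

print(impact_factor(science_and_technology))  # 10.0 -> journal-level score for A
print(impact_factor(monthly_engineering))     # 90.0 -> journal-level score for B
print(science_and_technology[0], monthly_engineering[0])  # 120 vs 2 at the paper level
```

The journal-level averages point one way, the paper-level counts the other, which is exactly the mismatch described above.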
Second, research evaluation based on the journal impact factor fails to reflect the distinct characteristics of each discipline. In medicine and biology, for example, when a theory is proposed, numerous experiments are conducted to verify it, and clinical trials tied to a single paper are repeated in follow-up studies, so the impact factors of biology and medical journals are inevitably higher than those in other natural sciences or in engineering. In pure mathematics, by contrast, a single paper is usually complete in itself; no follow-up research or experiments are needed. Citation counts for mathematical papers are therefore relatively low, and so are the impact factors of pure-mathematics journals. Moreover, in a highly specialized field with few researchers, citation counts are naturally small, whereas in a large field with many active researchers they are naturally high. Evaluation by journal impact factor thus has the limitation of ignoring fundamental differences in the nature of each discipline.
The third problem is an adverse side effect of the journal impact factor: it concentrates papers in a handful of popular journals. A researcher about to publish a paper naturally wants it to appear in journals such as Cell, Nature, and Science, which boast world-class authority, because their impact factors are the highest. When this bias toward certain journals becomes excessive, getting published can matter more to researchers than the research itself. The culture of recognizing only papers that appear in top journals has become a global phenomenon. If this "luxury journal" mentality, which undervalues the diligent research process and prizes only visible results, continues, it could distort the very nature of science. Some journals even encourage "self-citation," the citation of papers published in their own pages, in order to inflate their impact factor. In these ways, evaluation by journal impact factor fuels irrational and unethical competition.
Of course, the journal impact factor is not without its merits. It is widely used because it allows quick and easy evaluation of researchers. The editors of each journal act as expert evaluators, rapidly sorting the important and noteworthy research out of the flood of results. In today's rapidly changing and expanding scientific community, this is no small benefit.
However, as we have already seen, fatal problems lie behind this convenience. The first is the statistical trap: the journal impact factor can diverge sharply from the actual citation counts of individual papers. The second is that the impact factor ignores the characteristics of individual disciplines; in some fields, citation counts bear little relation to influence. The last is that this evaluation method creates a monopoly for a few well-known journals and fuels wasteful competition in the scientific community.
To overcome these problems and put scientific development on a sounder footing, the scientific community is now searching for new evaluation standards. The simplest remedy is to base evaluation on the citation counts of a researcher's individual papers, which corrects for the statistical trap. Another is to use a correction index that accounts for the particular characteristics of each discipline: if a paper's journal impact factor is divided by the average citation count of the top 20% of journals in the paper's field, the resulting index normalizes away much of the difference between disciplines (a sketch follows below). A more fundamental alternative would be to encourage peer review and develop qualitative criteria rather than quantitative ones such as the journal impact factor or raw citation counts. What the scientific community needs now is self-reflection and dialogue to devise a reasonable method of evaluation.
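As an illustration of how such a correction index might work, the sketch below divides a journal's impact factor by the average impact factor of the top 20% of journals in its own field. The field lists, the values, and the exact form of the normalization are assumptions made for illustration, not an established formula.

```python
# A sketch of one possible correction index: a journal's impact factor divided
# by the average impact factor of the top 20% of journals in its own field.
# All journal figures are invented; the normalization is an illustrative
# interpretation of the idea described in the text.

def correction_index(journal_if, field_impact_factors):
    """Normalize a journal's impact factor against the top 20% of its field."""
    ranked = sorted(field_impact_factors, reverse=True)
    top_n = max(1, round(len(ranked) * 0.2))      # at least one journal
    top_average = sum(ranked[:top_n]) / top_n
    return journal_if / top_average

# A strong pure-mathematics journal with a modest raw impact factor...
maths_field = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
print(correction_index(3.0, maths_field))    # 3.0 / 3.75 = 0.8

# ...and a biomedical journal whose raw impact factor is far higher.
biomed_field = [5, 8, 10, 12, 15, 20, 25, 30, 40, 50]
print(correction_index(30, biomed_field))    # 30 / 45 ≈ 0.67
```

Even in this toy example, the two journals end up on a comparable scale despite their very different raw impact factors, which is the point of a field-aware correction.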