Citation: criterion and its measurements

From DrugPedia: A Wikipedia for Drug discovery

Revision as of 11:59, 7 May 2009

Purpose and importance of Citation

In all types of scholarly and research writing it is necessary to document the source works that underpin particular concepts, positions, propositions and arguments with citations. These citations serve a number of purposes:

Help readers identify and relocate the source work

Readers often want to relocate a work you have cited, either to verify the information or to learn more about the issues and topics it addresses. Readers should be able to relocate your source works easily and efficiently from the information included in your citations (see the “Citation Structure” topic on the following page for details), using the sources available to them, which may or may not be the same as the sources available to you.

Provide evidence that the position is well-researched

Scholarly writing is grounded in prior research. Citations allow you to demonstrate that your position or argument is thoroughly researched and that you have referenced, or addressed, the critical authorities relevant to the issues.

Give credit to the author of an original concept or theory presented

Giving proper attribution to those whose thoughts, words, and ideas you use is an important concept in scholarly writing. For all of these reasons, it is important to make a habit of collecting the bibliographic information needed for correct citations in an organized and thorough manner.

G-index

The g-index is an index for quantifying the scientific productivity of physicists and other scientists based on their publication record. It was suggested in 2006 by Leo Egghe.

The index is calculated based on the distribution of citations received by a given researcher's publications.

Given a set of articles ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g² citations.

An alternative definition is

Given a set of articles ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received on average at least g citations.
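The two definitions agree because the top g articles average at least g citations exactly when together they have at least g × g citations. A minimal Python sketch of the calculation follows. The function name g_index and the sample citation counts are illustrative assumptions, not taken from Egghe's paper, and in this sketch g is capped at the number of published articles.

```python
def g_index(citations):
    """Largest g such that the top g articles together have at least g*g citations.

    `citations` is a list of per-article citation counts (hypothetical input).
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count              # citations of the top `rank` articles
        if total >= rank * rank:    # same test as "average of top `rank` >= rank"
            g = rank
    return g


# Made-up record: cumulative sums 10, 15, 18, 19, 19 against squares 1, 4, 9, 16, 25
print(g_index([10, 5, 3, 1, 0]))  # 4
```

Because the average of the top g articles can only fall as g grows, the test fails for every rank beyond the largest one that passes, which is why the index is unique.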

This index is very similar to the h-index, and attempts to address its shortcomings. Like the h-index, the g-index is a natural number and thus lacks discriminatory power. Therefore, Richard Tol proposed a rational generalisation.

Tol also proposed a successive g-index.

Given a set of researchers ranked in decreasing order of their g-index, the g1-index is the (unique) largest number such that the top g1 researchers have on average at least a g-index of g1.
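The successive index applies the same averaging rule one level up, to researchers instead of articles. The sketch below assumes each researcher's g-index has already been computed; the function name successive_g_index and the sample values are hypothetical.

```python
def successive_g_index(researcher_g_indices):
    """Largest g1 such that the top g1 researchers have a mean g-index of at least g1."""
    ranked = sorted(researcher_g_indices, reverse=True)
    total = 0
    g1 = 0
    for rank, g in enumerate(ranked, start=1):
        total += g
        # mean of the top `rank` g-indices >= rank  <=>  their sum >= rank * rank
        if total >= rank * rank:
            g1 = rank
    return g1


# Hypothetical group of five researchers with these individual g-indices
print(successive_g_index([10, 8, 5, 4, 3]))  # 5
```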

H-index

Advantages of H-index

The h-index was intended to address the main disadvantages of other bibliometric indicators, such as total number of papers or total number of citations. Total number of papers does not account for the quality of scientific publications, while total number of citations can be disproportionately affected by participation in a single publication of major influence. The h-index is intended to measure simultaneously the quality and sustainability of scientific output and, to some extent, the diversity of scientific research. The h-index is much less affected by methodological papers proposing successful new techniques, methods or approximations, which can be extremely highly cited.

For example, one of the most cited condensed-matter theorists, John P. Perdew, has been very successful in devising new approximations within the widely used density functional theory. He has published 3 papers cited more than 5000 times and 2 cited more than 4000 times. Several thousand papers using density functional theory are published every year, most of them citing at least one paper by Perdew. His total citation count is close to 39 000, while his h-index, 51, is large but not unique. In contrast, the condensed-matter theorist with the highest h-index (94), Marvin L. Cohen, has a lower citation count of 35 000. One can argue that in this case the h-index reflects the broader impact of Cohen's papers in solid-state physics, owing to his larger number of highly cited papers.
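For reference, Hirsch defines h as the largest number such that h of a researcher's papers have each received at least h citations. The Python sketch below computes it and illustrates the point made above: a single extremely highly cited paper inflates the total citation count but leaves h unchanged. The function name h_index and both citation records are made up for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so no later rank can pass
    return h


# Two hypothetical records that differ only in their most cited paper
with_blockbuster = [5000, 12, 9, 7, 4, 2]
without_blockbuster = [40, 12, 9, 7, 4, 2]

print(sum(with_blockbuster), h_index(with_blockbuster))        # 5034 4
print(sum(without_blockbuster), h_index(without_blockbuster))  # 74 4
```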

Criticism of H-index

  • Michael Nielsen points out that "...the h-index contains little information beyond the total number of citations, and is not properly regarded as a new measure of impact at all". According to Nielsen, to a good approximation, h ~ sqrt(T)/2, where T is the total number of citations.
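As a rough check of this rule of thumb, the sketch below applies it to the two citation totals quoted in the previous section. The function name nielsen_estimate is an assumption made for illustration; the totals and h values are the approximate figures given in this article.

```python
import math


def nielsen_estimate(total_citations):
    """Rule-of-thumb estimate h ~ sqrt(T) / 2, where T is the total citation count."""
    return math.sqrt(total_citations) / 2


# (name, approximate total citations, reported h-index), as quoted in this article
reported = [("Perdew", 39000, 51), ("Cohen", 35000, 94)]

for name, total, h in reported:
    print(name, round(nielsen_estimate(total), 1), h)
# Perdew 98.7 51
# Cohen 93.5 94
```

On these two figures the estimate is close for Cohen but roughly double the reported value for Perdew, whose total is dominated by a handful of very highly cited papers. Two data points do not test the approximation; they only show the kind of record for which it can fail.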

There are a number of situations in which h may provide misleading information about a scientist's output:

  • The h-index is bounded by the total number of publications. This means that scientists with a short career are at an inherent disadvantage, regardless of the importance of their discoveries. For example, Évariste Galois' h-index is 2, and will remain so forever. Had Albert Einstein died in early 1906, his h-index would be stuck at 4 or 5, despite his being widely acknowledged as one of the most important physicists, even considering only his publications to that date.
  • The h-index does not consider the context of citations. For example, citations in a paper are often made simply to flesh out an introduction and have no other significance to the work. The h-index also does not distinguish other contexts, such as citations made in a negative context or citations of fraudulent or retracted work. (This is true for other metrics using citations, not just for the h-index.)
  • The h-index does not account for confounding factors. These include the practice of "gratuitous authorship", which is still common in some research cultures, the so-called Matthew effect, and the favorable citation bias associated with review articles.
  • The h-index has been found to have slightly less predictive accuracy and precision than the simpler measure of mean citations per paper. However, this finding was contradicted by another study.
  • The h-index is a natural number and thus lacks discriminatory power. Ruane and Tol therefore propose a rational h-index that interpolates between h and h+1.
  • While the h-index de-emphasizes singular successful publications in favor of sustained productivity, it may do so too strongly. Two scientists may have the same h-index, say, h = 30, but one has 20 papers that have been cited more than 1000 times and the other has none. Clearly, the scientific output of the former is more valuable. Several recipes to correct for this have been proposed, such as the g-index, but none has gained universal support.
  • The h-index is affected by limitations in citation databases. Some automated searching processes find citations to papers going back many years, while others find only recent papers or citations. This issue is less important for those whose publication record started after automated indexing began around 1990. Citation databases also contain some citations that are not quite correct and therefore will not be matched to the correct paper or author.
  • The h-index does not account for the number of authors of a paper. If the impact of a paper is the number of citations it receives, it might be logical to divide that impact by the number of authors involved. (Some authors will have contributed more than others, but in the absence of information on contributions, the simplest assumption is to divide credit equally.) Not taking into account the number of authors could allow gaming the h-index and other similar indices: for example, two equally capable researchers could agree to share authorship on all their papers, thus increasing each of their h-indices. Even in the absence of such explicit gaming, the h-index and similar indices tend to favor fields with larger groups, e.g. experimental over theoretical. An individual h-index normalized by the average number of co-authors in the h-core has been introduced by Batista et al. They also found that the distribution of the h-index, although it depends on the field, can be normalized by a simple rescaling factor. For example, taking the h values for biology as the standard, the distribution of h for mathematics collapses onto it if each h is multiplied by three; that is, a mathematician with h = 3 is equivalent to a biologist with h = 9.
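The author-count normalization in the last item can be sketched as follows. This is a minimal reading of "normalized by the average number of co-authors in the h-core", assuming the h-core is the set of the h most cited papers and that the normalization is a plain division by the mean author count of those papers. The function names and the sample records are hypothetical, and the code is not taken from Batista et al.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def individual_h_index(papers):
    """h divided by the mean number of authors of the h-core papers.

    `papers` is a list of (citations, number_of_authors) pairs.
    Assumption: the h-core is the h most cited papers.
    """
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    h = h_index([citations for citations, _ in ranked])
    if h == 0:
        return 0.0
    core = ranked[:h]
    mean_authors = sum(n_authors for _, n_authors in core) / h
    return h / mean_authors


# The same made-up citation record, written alone or always shared with one co-author
solo = [(10, 1), (8, 1), (5, 1), (4, 1), (3, 1)]
shared = [(10, 2), (8, 2), (5, 2), (4, 2), (3, 2)]

print(individual_h_index(solo))    # 4.0
print(individual_h_index(shared))  # 2.0
```

Under this assumption, sharing authorship on every paper no longer raises the index for free: the plain h-index is 4 in both cases, but the normalized value halves when every paper has two authors.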


References

1. An index to quantify an individual's scientific research output, by J. E. Hirsch (PNAS)

2. Does the h index have predictive power?, by J. E. Hirsch (PNAS)