Citation: criterion and its measurements

From DrugPedia: A Wikipedia for Drug discovery


Revision as of 12:36, 7 May 2009

Purpose and importance of Citation

In all types of scholarly and research writing it is necessary to document the source works that underpin particular concepts, positions, propositions and arguments with citations. These citations serve a number of purposes:

Help readers identify and relocate the source work

Readers often want to relocate a work you have cited, either to verify the information or to learn more about the issues and topics it addresses. Your citations should therefore contain enough information (see the “Citation Structure” topic on the following page for details) for readers to relocate your source works easily and efficiently in the sources available to them, which may or may not be the same as the sources available to you.

Provide evidence that the position is well-researched

Scholarly writing is grounded in prior research. Citations allow you to demonstrate that your position or argument is thoroughly researched and that you have referenced, or addressed, the critical authorities relevant to the issues.

Give credit to the author of an original concept or theory presented

Giving proper attribution to those whose thoughts, words, and ideas you use is an important concept in scholarly writing. For these reasons, it is important to adopt habits of collecting the bibliographic information on source works necessary for correct citations in an organized and thorough manner.

Misuse of Impact Factors

  • The impact factor is often misused to predict the importance of an individual publication based on where it was published. This does not work well because a small number of publications are cited much more than the majority: for example, about 90% of Nature's 2004 impact factor was based on only a quarter of its publications. The importance of any one publication will therefore differ from, and on average be less than, the overall number. Because the impact factor averages over all articles, it underestimates the citations of the most-cited articles while exaggerating the citations of the typical publication.
  • Academic reviewers involved in programmatic evaluations, particularly those for doctoral degree granting institutions, often turn to ISI's proprietary IF listing of journals in determining scholarly output. This builds in a bias which automatically undervalues some types of research and distorts the total contribution each faculty member makes.
  • The absolute value of an impact factor is meaningless. A journal with an IF of 2 would not be very impressive in Microbiology, while it would in Oceanography. Such values are nonetheless sometimes advertised by scientific publishers.
  • The comparison of impact factors between different fields is invalid. Yet such comparisons have been widely used for the evaluation of not merely journals, but of scientists and of university departments. It is not possible to say, for example, that a department whose publications have an average IF below 2 is low-level. This would not make sense for Mechanical Engineering, where only two review journals attain such a value.
  • Outside the sciences, impact factors are relevant for fields that have a similar publication pattern to the sciences (such as economics), where research publications are almost always journal articles that cite other journal articles. They are not relevant for literature, where the most important publications are books citing other books. Therefore, Thomson Scientific does not publish a JCR for the humanities. Nor are they relevant for many areas of computer science, where the majority of the important publications appear in refereed conference proceedings and cite other conference proceedings.
  • Even though in practice they are applied this way, impact factors cannot correctly be the only thing to be considered by libraries in selecting journals. The local usefulness of the journal is at least equally important, as is whether or not an institution's faculty member is editor of the journal or on its editorial review board.
  • Though the impact factor was originally intended as an objective measure of the reputability of a journal (Garfield), it is now being increasingly applied to measure the productivity of scientists. The way it is customarily used is to examine the impact factors of the journals in which the scientist's articles have been published. This has obvious appeal for an academic administrator who knows neither the subject nor the journals.
  • The absolute number of researchers, the average number of authors on each paper, and the nature of results in different research areas, as well as variations in citation habits between different disciplines, particularly the number of citations in each paper, all combine to make impact factors between different groups of scientists incommensurable. Generally, for example, medical journals have higher impact factors than mathematical journals and engineering journals. This limitation is accepted by the publishers; it has never been claimed that impact factors are useful between fields, and such a use is an indication of misunderstanding.
  • HEFCE was urged by the Parliament of the United Kingdom Committee on Science and Technology to remind Research Assessment Exercise (RAE) panels that they are obliged to assess the quality of the content of individual articles, not the reputation of the journal in which they are published.
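The skew described in the first point above can be illustrated with a small sketch. The citation counts below are hypothetical, chosen only so that a quarter of the articles hold 90% of the citations; the helper name `top_share` is also an assumption made for this illustration.

```python
import math
from statistics import mean, median


def top_share(citations, fraction):
    """Share of all citations received by the most-cited `fraction` of articles."""
    ranked = sorted(citations, reverse=True)
    # Number of articles in the top group (at least one).
    k = max(1, math.ceil(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical citation counts for eight articles in one journal (not real data).
citations = [90, 54, 4, 3, 3, 2, 2, 2]

print(top_share(citations, 0.25))  # share held by the top quarter of articles
print(mean(citations))             # the journal-level average
print(median(citations))           # what a typical article receives
```

With these made-up numbers the journal-level average is 20 citations per article, while the median article has only 3, which is why an average says little about any single publication.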

Google Scholar

Google Scholar: The New Generation of Citation Indexes [1] (http://www.librijournal.org/pdf/2005-4pp170-180.pdf)

G-index

The g-index is an index for quantifying the scientific productivity of physicists and other scientists based on their publication record. It was suggested in 2006 by Leo Egghe.

The index is calculated based on the distribution of citations received by a given researcher's publications.

Given a set of articles ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g² citations.

An alternative definition is

Given a set of articles ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received on average at least g citations.
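The two definitions are equivalent because a total of at least g² citations over g articles is the same as an average of at least g. A minimal sketch of the calculation follows; the function name `g_index` and the sample citation counts are assumptions for illustration, and the sketch caps g at the number of published articles.

```python
from itertools import accumulate


def g_index(citations):
    """Largest g such that the g most-cited articles together have >= g*g citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    # accumulate() yields the running citation total of the top `rank` articles.
    for rank, total in enumerate(accumulate(ranked), start=1):
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for six articles.
print(g_index([10, 8, 5, 4, 3, 0]))  # 5: the top five have 30 citations, and 30 >= 25
```

In this example the top five articles average 6 citations each, which also satisfies the alternative definition for g = 5, while all six articles total only 30 citations, short of the 36 needed for g = 6.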

This index is very similar to the h-index, and attempts to address its shortcomings. Like the h-index, the g-index is a natural number and thus lacks discriminatory power. Therefore, Richard Tol proposed a rational generalisation.

Tol also proposed a successive g-index.

Given a set of researchers ranked in decreasing order of their g-index, the g1-index is the (unique) largest number such that the top g1 researchers have on average at least a g-index of g1.
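Read literally, this definition applies the same averaging rule to researchers instead of articles. The sketch below follows that reading; the function name `successive_g_index` and the sample values are assumptions for illustration, not taken from Tol's paper.

```python
def successive_g_index(researcher_g_indices):
    """Largest g1 such that the top g1 researchers average a g-index of at least g1."""
    ranked = sorted(researcher_g_indices, reverse=True)
    g1 = 0
    running_total = 0
    for rank, g in enumerate(ranked, start=1):
        running_total += g
        # An average of at least `rank` is the same as a total of at least rank * rank.
        if running_total >= rank * rank:
            g1 = rank
    return g1


# Hypothetical g-indices of six researchers in one group.
print(successive_g_index([6, 4, 3, 2, 1, 1]))  # 3: the top three average 13/3, the top four only 15/4
```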

H-index

The h-index is an index that attempts to measure both the scientific productivity and the apparent scientific impact of a scientist. The index is based on the set of the scientist's most cited papers and the number of citations that they have received in other people's publications. The index can also be applied to the productivity and impact of a group of scientists, such as a department or university or country. The index was suggested by Jorge E. Hirsch, a physicist at UCSD, as a tool for determining theoretical physicists' relative quality and is sometimes called the Hirsch index or Hirsch number.
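The text above does not spell out the calculation. Under the usual definition from Hirsch's paper (reference 1), a scientist has index h if h of their papers have at least h citations each. A minimal sketch under that assumption, with a hypothetical function name and sample data:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            # Counts only decrease from here, so no later rank can qualify.
            break
    return h


# Hypothetical citation counts for six papers.
print(h_index([10, 8, 5, 4, 3, 0]))  # 4: four papers have at least 4 citations each
```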

Hirsch suggested that, for physicists, a value for h of about 10–12 might be a useful guideline for tenure decisions at major research universities. A value of about 18 could mean a full professorship, 15–20 could mean a fellowship in the American Physical Society, and 45 or higher could mean membership in the United States National Academy of Sciences.

Advantages of H-index

The h-index was intended to address the main disadvantages of other bibliometric indicators, such as total number of papers or total number of citations. Total number of papers does not account for the quality of scientific publications, while total number of citations can be disproportionately affected by participation in a single publication of major influence. The h-index is intended to measure simultaneously the quality and sustainability of scientific output, as well as, to some extent, the diversity of scientific research.

The h-index is much less affected by methodological papers proposing successful new techniques, methods or approximations, which can be extremely highly cited. For example, one of the most cited condensed-matter theorists, John P. Perdew, has been very successful in devising new approximations within the widely used density functional theory. He has published 3 papers cited more than 5000 times and 2 cited more than 4000 times. Several thousand papers using density functional theory are published every year, most of them citing at least one paper by Perdew. His total citation count is close to 39 000, while his h-index of 51 is large but not exceptional. In contrast, the condensed-matter theorist with the highest h-index (94), Marvin L. Cohen, has a lower citation count of 35 000. One can argue that in this case the h-index reflects the broader impact of Cohen's papers in solid-state physics due to his larger number of highly cited papers.

Criticism of H-index

  • Michael Nielsen points out that "...the h-index contains little information beyond the total number of citations, and is not properly regarded as a new measure of impact at all". According to Nielsen, to a good approximation, h ~ sqrt(T)/2, where T is the total number of citations.
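As a rough check, Nielsen's approximation can be applied to the two citation totals quoted in the previous section (about 39 000 for Perdew and 35 000 for Cohen). The function name `nielsen_estimate` is an assumption for this sketch.

```python
import math


def nielsen_estimate(total_citations):
    """Nielsen's rule of thumb: h is roughly sqrt(T) / 2."""
    return math.sqrt(total_citations) / 2


# (total citations, reported h-index) as quoted in this article.
examples = {"Perdew": (39_000, 51), "Cohen": (35_000, 94)}

for name, (total, reported_h) in examples.items():
    estimate = nielsen_estimate(total)
    print(f"{name}: estimated h = {estimate:.1f}, reported h = {reported_h}")
```

The estimate is about 93.5 for Cohen, close to his reported 94, but about 98.7 for Perdew against a reported 51. On these two cases, the approximation appears to hold less well when a few papers account for a large share of the citations.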

There are a number of situations in which h may provide misleading information about a scientist's output:

  • The h-index is bounded by the total number of publications. This means that scientists with a short career are at an inherent disadvantage, regardless of the importance of their discoveries. For example, Évariste Galois' h-index is 2, and will remain so forever. Had Albert Einstein died in early 1906, his h-index would be stuck at 4 or 5, despite his being widely acknowledged as one of the most important physicists, even considering only his publications to that date.
  • The h-index does not consider the context of citations. For example, citations in a paper are often made simply to flesh out an introduction, and have no other significance to the work. The h-index also does not distinguish other contexts, such as citations made in a negative context and citations made to fraudulent or retracted work. (This is true for other metrics using citations, not just for the h-index.)
  • The h-index does not account for confounding factors. These include the practice of "gratuitous authorship", which is still common in some research cultures, the so-called Matthew effect, and the favorable citation bias associated with review articles.
  • The h-index has been found to have slightly less predictive accuracy and precision than the simpler measure of mean citations per paper. However, this finding was contradicted by another study.
  • The h-index is a natural number and thus lacks discriminatory power. Ruane and Tol therefore propose a rational h-index that interpolates between h and h+1.
  • While the h-index de-emphasizes singular successful publications in favor of sustained productivity, it may do so too strongly. Two scientists may have the same h-index, say, h = 30, but one has 20 papers that have been cited more than 1000 times and the other has none. Clearly the scientific output of the former is more valuable. Several recipes to correct for that have been proposed, such as the g-index, but none has gained universal support.
  • The h-index is affected by limitations in citation databases. Some automated searching processes find citations to papers going back many years, while others find only recent papers or citations. This issue is less important for those whose publication record started after automated indexing began around 1990. Citation databases also contain some citations that are not quite correct and therefore will not be matched to the correct paper or author.
  • The h-index does not account for the number of authors of a paper. If the impact of a paper is the number of citations it receives, it might be logical to divide that impact by the number of authors involved. (Some authors will have contributed more than others, but in the absence of information on contributions, the simplest assumption is to divide credit equally.) Not taking into account the number of authors could allow gaming the h-index and other similar indices: for example, two equally capable researchers could agree to share authorship on all their papers, thus increasing each of their h-indices. Even in the absence of such explicit gaming, the h-index and similar indices tend to favor fields with larger groups, e.g. experimental over theoretical. An individual h-index normalized by the average number of co-authors in the h-core has been introduced by Batista et al. They also found that the distribution of the h-index, although it depends on the field, can be normalized by a simple rescaling factor. For example, taking the h values for biology as the standard, the distribution of h for mathematics collapses onto it if h is multiplied by three; that is, a mathematician with h = 3 is equivalent to a biologist with h = 9.
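The author-normalized index mentioned in the last point can be sketched as follows, assuming it is computed as h divided by the mean number of authors of the papers in the h-core (equivalently, h² divided by the total author count of those papers). The function name, the input format and the sample data are assumptions for illustration. When papers tie on citations at the edge of the h-core, this sketch keeps them in input order, which is an arbitrary choice.

```python
def individual_h_index(papers):
    """papers: list of (citations, number_of_authors) tuples.

    Returns (h, h_individual), where h_individual is h divided by the
    mean number of authors of the h most-cited papers (the h-core).
    """
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    h = 0
    for rank, (citations, _) in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            break
    if h == 0:
        return 0, 0.0
    authors_in_core = sum(n_authors for _, n_authors in ranked[:h])
    # h / (authors_in_core / h) simplifies to h * h / authors_in_core.
    return h, h * h / authors_in_core


# Hypothetical record: (citations, number of authors) for six papers.
papers = [(10, 2), (8, 4), (5, 3), (4, 3), (3, 1), (0, 5)]
h, h_individual = individual_h_index(papers)
print(h, round(h_individual, 2))  # h = 4; the four core papers average 3 authors, so 4 / 3
```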

References

1. An index to quantify an individual's scientific research output by J. E. Hirsch (PNAS) [2]

2. Does the h index have predictive power? by J. E. Hirsch (PNAS) [3]

3. Reflections on the h-index by Prof. Anne-Wil Harzing [4]