Wednesday, March 17. 2010
In announcing Alma Swan's Review of Studies on the Open Access Impact Advantage, I had suggested that the growing body of studies on the OA Impact Advantage was clearly ripe for a meta-analysis. Here is an update:
Sunday, February 21. 2010
The following is a (belated) critique of:
"Impact Assesment," by Paul Chrisp (publisher, Core Medical Publishing) & Kevin Toale (Dove Medical Press). Pharmaceutical Marketing September 2008
"Open access has emerged in the last few years as a serious alternative to traditional commercial publishing models, taking the benefits afforded by technology one step further. In this model, authors are charged for publishing services, and readers can access, download, print and distribute papers free at the point of use."Incorrect.
Open Access (OA) means free online access and OA Publishing ("Gold OA") is just one of the two ways to provide OA (and not the fastest, cheapest or surest):
The fastest, cheapest and surest way to provide OA is OA Self-Archiving (of articles published in conventional non-OA journals: "Green OA") in the author's Institutional Repository.
"Although its ultimate goal is the free availability of information online, open access is not the same as free access – publishing services still cost money."Incorrect.
There are two forms of OA: (1) Gratis OA (free online access) and (2) Libre OA (free online access plus certain re-user rights)
"Other characteristics of open access journals are that authors retain copyright and they must self-archive content in an independent repository."Incorrect.
This again conflates Green and Gold OA:
Gold OA journals make their own articles free online.
In Green OA, authors self-archive their own articles (published in conventional non-OA journals).
"researchers are depositing results in databases rather than publishing them in journal articles"Incorrect.
This conflates unrefereed preprint self-archiving with refereed, published postprint self-archiving.
Green OA is the self-archiving of refereed, published postprints.
The self-archiving of unrefereed preprints is an optional supplement to, not a substitute for, postprint OA.
"a manuscript may be read more times than it is cited, and research shows that online hits per article do not correlate with IF".Incorrect.
"Research shows" that online hits (downloads) do correlate with citations (and hence with citation impact factors).
See references cited below.
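The download-citation correlation those studies report can be illustrated with a small sketch. All the numbers below are invented for demonstration, and the plain-Python Spearman rank correlation is just one simple way to test such a relationship, not the method of any one cited study:

```python
# Illustrative sketch: do early download counts predict later citation
# counts? Computed as a Spearman rank correlation in plain Python.
# The per-article data below are hypothetical.

def rank(values):
    """Average 1-based ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: downloads in the first 6 months vs. citations at 2 years.
downloads = [120, 45, 300, 80, 15, 210, 60, 150]
citations = [10, 3, 25, 6, 1, 18, 2, 9]
print(round(spearman(downloads, citations), 3))  # → 0.952
```

A high positive rho here is exactly the pattern the cited studies found in real data: downloads are an early, openly measurable predictor of later citation impact.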
"Faculty of 1000 (www.f1000medicine.com)... asks opinion leaders in clinical practice and research to select the most influential articles in 18 medical specialties. Articles are evaluated and ranked..."Expert rankings are rankings and metrics (such as hit or citation counts) are metrics.
Metrics can and should be tested and validated against expert rankings. Validated metrics can then be used as supplements to -- or even substitutes for -- rankings. But the validation has to be done on a much broader and more systematic basis than Faculty of 1000, and on a much richer set of candidate metrics.
Nor is the purpose of metrics "pharmaceutical marketing": It is to monitor, predict, navigate, analyze and reward research influence and importance.
Bollen, J., Van de Sompel, H., Hagberg, A. and Chute, R. (2009) A principal component analysis of 39 scientific impact measures. PLoS ONE 4(6): e6022.
Brody, T., Harnad, S. and Carr, L. (2006) Earlier Web Usage Statistics as Predictors of Later Citation Impact. Journal of the American Society for Information Science and Technology (JASIST) 57(8): 1060-1072.
Harnad, S. (2008) Validating Research Performance Metrics Against Peer Rankings. Ethics in Science and Environmental Politics 8(11). doi:10.3354/esep00088. Special Issue: The Use And Misuse Of Bibliometric Indices In Evaluating Scholarly Performance.
Harnad, S. (2009) Open Access Scientometrics and the UK Research Assessment Exercise. Scientometrics 79(1). Also in: Torres-Salinas, D. and Moed, H. F. (Eds.) Proceedings of the 11th Annual Meeting of the International Society for Scientometrics and Informetrics 11(1), pp. 27-33, Madrid, Spain (2007).
Lokker, C., McKibbon, K. A., McKinlay, R. J., Wilczynski, N. L. and Haynes, R. B. (2008) Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study. BMJ 336: 655-657.
Moed, H. F. (2005) Statistical Relationships Between Downloads and Citations at the Level of Individual Documents Within a Single Journal. Journal of the American Society for Information Science and Technology 56(10): 1088-1097.
O'Leary, D. E. (2008) The relationship between citations and number of downloads. Decision Support Systems 45(4): 972-980.
Watson, A. B. (2009) Comparing citations and downloads for individual articles. Journal of Vision 9(4): 1-4.
Tuesday, January 5. 2010
Whether Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research
Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research
Authors: Yassine Gargouri, Chawki Hajjem, Vincent Larivière, Yves Gingras, Les Carr, Tim Brody, Stevan Harnad
Abstract: Articles whose authors make them Open Access (OA) by self-archiving them online are cited significantly more than articles accessible only to subscribers. Some have suggested that this "OA Advantage" may not be causal but just a self-selection bias, because authors preferentially make higher-quality articles OA. To test this we compared self-selective self-archiving with mandatory self-archiving for a sample of 27,197 articles published 2002-2006 in 1,984 journals. The OA Advantage proved just as high for both. Logistic regression showed that the advantage is independent of other correlates of citations (article age; journal impact factor; number of co-authors, references or pages; field; article type; or country) and greatest for the most highly cited articles. The OA Advantage is real, independent and causal, but skewed. Its size is indeed correlated with quality, just as citations themselves are (the top 20% of articles receive about 80% of all citations). The advantage is greater for the more citeable articles, not because of a quality bias from authors self-selecting what to make OA, but because of a quality advantage, from users self-selecting what to use and cite, freed by OA from the constraints of selective accessibility to subscribers only.
Thursday, March 19. 2009
The Times Higher Education Supplement (THES) has reported the results of a study it commissioned from Evidence Ltd, which found that the ranking criteria for assessing and rewarding research performance in the UK Research Assessment Exercise (RAE) changed from RAE 2001 to RAE 2008. As a result, citations, which correlated highly with the RAE 2001 rankings, correlated less highly with RAE 2008: a number of universities whose citation counts had decreased were rewarded more in 2008, and a number whose citation counts had increased were rewarded less.
(1) Citation counts are only one (though an important one) among many potential metrics of research performance.
(2) If the RAE peer panel raters' criteria for ranking the universities varied or were inconsistent between RAE 2001 and RAE 2008 then that is a problem with peer ratings rather than with metrics (which, being objective, remain consistent).
(3) Despite the variability and inconsistency, peer ratings are the only way to initialise the weights on metrics: Metrics first have to be jointly validated against expert peer evaluation by measuring their correlation with the peer rankings, discipline by discipline; then the metrics' respective weights can be updated and fine-tuned, discipline by discipline, in conjunction with expert judgment of the resulting rankings and continuing research activity.
(4) If only one metric (e.g., citation) is used, there is the risk that expert ratings will simply echo it. But if a rich and diverse battery of multiple metrics is jointly validated and initialized against the RAE 2008 expert ratings, then this will create an assessment-assistant tool whose initial weights can be calibrated and used in an exploratory way to generate different rankings, to be compared by the peer panels with previous rankings as well as with new, evolving criteria of research productivity, uptake, importance, influence, excellence and impact.
(5) The dawning era of Open Access (free web access) to peer-reviewed research is providing a wealth of new metrics to be included, tested and assigned initial weights in the joint battery of metrics. These include download counts, citation and download growth and decay rates, hub and authority scores, interdisciplinarity scores, co-citations, tag counts, comment counts, link counts, data-usage, and many other openly accessible and measurable properties of the growth of knowledge in our evolving "Cognitive Commons."
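The initialise-then-validate loop described in points (3) and (4) can be sketched in miniature. Everything below is invented for illustration, and the weighting scheme (per-metric correlation weights rather than a full multiple regression, discipline by discipline) is deliberately simplified:

```python
# Illustrative sketch: initialise weights on a battery of metrics by
# correlating each metric with expert peer rankings, then check how well
# the weighted battery tracks the peer scores. Hypothetical data;
# a real exercise would use proper multiple regression per discipline.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def zscores(x):
    n = len(x)
    m = sum(x) / n
    s = (sum((a - m) ** 2 for a in x) / n) ** 0.5
    return [(a - m) / s for a in x]

# Hypothetical RAE-style data: peer panel scores for six departments,
# alongside three candidate metrics.
peer      = [3.1, 2.4, 3.8, 1.9, 3.3, 2.7]
citations = [410, 220, 690, 150, 480, 300]
downloads = [9000, 5000, 12000, 4000, 8000, 7000]
coauthors = [2.1, 3.5, 2.8, 1.9, 2.4, 3.0]   # weakly related on purpose

metrics = {"citations": citations, "downloads": downloads,
           "coauthors": coauthors}

# Step 1: correlate each metric with the peer scores (initial weights).
weights = {name: pearson(vals, peer) for name, vals in metrics.items()}

# Step 2: combine z-scored metrics using those correlations as weights.
z = {name: zscores(vals) for name, vals in metrics.items()}
battery = [sum(weights[n] * z[n][i] for n in metrics)
           for i in range(len(peer))]

# Step 3: validate the battery against the peer ranking.
print({n: round(w, 2) for n, w in weights.items()})
print("battery vs peer:", round(pearson(battery, peer), 2))
```

The point of the sketch is the shape of the procedure: weakly correlated metrics (here, co-authorship) automatically get small weights, while the validated battery as a whole tracks the expert ranking better than any arbitrary single choice would.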
Brody, T., Kampa, S., Harnad, S., Carr, L. and Hitchcock, S. (2003) Digitometric Services for Open Archives Environments. In Proceedings of European Conference on Digital Libraries 2003, pp. 207-220, Trondheim, Norway.
Brody, T., Carr, L., Harnad, S. and Swan, A. (2007) Time to Convert to Metrics. Research Fortnight pp. 17-18.
Brody, T., Carr, L., Gingras, Y., Hajjem, C., Harnad, S. and Swan, A. (2007) Incentivizing the Open Access Research Web: Publication-Archiving, Data-Archiving and Scientometrics. CTWatch Quarterly 3(3).
Carr, L., Hitchcock, S., Oppenheim, C., McDonald, J. W., Champion, T. and Harnad, S. (2006) Extending journal-based research impact assessment to book-based disciplines. Technical Report, ECS, University of Southampton.
Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How it Increases Research Citation Impact. IEEE Data Engineering Bulletin 28(4) pp. 39-47.
Harnad, S. (2001) Research access, impact and assessment. Times Higher Education Supplement 1487: p. 16.
Harnad, S. (2007) Open Access Scientometrics and the UK Research Assessment Exercise. In Proceedings of 11th Annual Meeting of the International Society for Scientometrics and Informetrics 11(1), pp. 27-33, Madrid, Spain. Torres-Salinas, D. and Moed, H. F., Eds.
Harnad, S. (2008) Self-Archiving, Metrics and Mandates. Science Editor 31(2) 57-59
Harnad, S. (2008) Validating Research Performance Metrics Against Peer Rankings. Ethics in Science and Environmental Politics 8(11). doi:10.3354/esep00088. Special Issue: The Use And Misuse Of Bibliometric Indices In Evaluating Scholarly Performance.
Harnad, S. (2009) Multiple metrics required to measure research performance. Nature (Correspondence) 457 (785) (12 February 2009)
Harnad, S., Carr, L., Brody, T. & Oppenheim, C. (2003) Mandated online RAE CVs Linked to University Eprint Archives: Improving the UK Research Assessment Exercise whilst making it cheaper and easier. Ariadne 35.
Harnad, S., Carr, L. and Gingras, Y. (2008) Maximizing Research Progress Through Open Access Mandates and Metrics. Liinc em Revista.
Thursday, February 19. 2009
The valid portion of Evans & Reimer's (2009) study (E & R) is timely and useful, showing that a large portion of the Open Access citation impact advantage comes from providing the developing world with access to the research produced by the developed world. Using a much bigger database, E & R refute (without citing it!) a recent flawed study (Frandsen 2009) that reported that there was no such effect (as well as a premature response hailing that study as "Open Access: No Benefit for Poor Scientists").
E & R found the following (their main finding is #4):
#1 When articles are made commercially available online their citation impact becomes greater than when they were commercially available only as print-on-paper. (This is unsurprising, since online access means easier and broader access than just print-on-paper access.)
#2 When articles are made freely available online their citation impact becomes greater than when they were not freely available online. (This confirms the widely reported "Open Access" (OA) Advantage.)
(E & R cite only a few other studies that have previously reported the OA advantage, stating that those were only in a few fields, or within just one journal. This is not correct; there have been many other studies that likewise reported the OA advantage, across nearly as many journals and fields as E & R sampled. E & R also seem to have misunderstood the role of prepublication preprints in those fields (mostly physics) that effectively already have post-publication OA. In those fields, all of the OA advantage comes from the year(s) before publication -- "the Early OA Advantage" -- which is relevant to the question, raised below, about the harmful effects of access embargoes. And last, E & R cite the few negative studies that have been published -- mostly the deeply flawed studies of Phil Davis -- that found no OA Advantage or even a negative effect (as if making papers freely available reduced their citations!).)
#3 The citation advantage of commercial online access over commercial print-only access is greater than the citation advantage of free access over commercial print plus online access. (This too is unsurprising, but it is also somewhat misleading, because virtually all journals have commercial online access today: hence the added advantage of free online access is something that occurs over and above mere online (commercial) access -- not as some sort of competitor or alternative to it. The comparison today is toll-based online access vs. free online access.)
(There may be some confusion here between the size of the OA advantage for journals whose contents were made free online after a post-publication embargo period and those whose contents were made free online immediately upon publication -- i.e., the OA journals. Commercial online access is of course never embargoed: you get access as soon as it is paid for! Previous studies have made within-journal comparisons, field by field, between OA and non-OA articles within the same journal and year. These studies found much bigger OA Advantages, both because they were comparing like with like and because they were based on a longer time-span: the OA advantage is still small after only a year, because it takes time for citations to build up; this is even truer if the article becomes "OA" only after it has been embargoed for a year or longer!)
#4 The OA Advantage is far bigger in the Developing World (i.e., for Developing-World first-authors, when they cite OA compared to non-OA articles). This is the main finding of the article, and this is what refutes the Frandsen study.
What E & R have not yet done (and should!) is to check for the very same effect, but within the Developed World, by comparing the "Harvards vs. the Have-Nots" within, say, the US: The ARL has a database showing the size of the journal holdings of most research university libraries in the US. Analogous to their comparisons between Developed and Developing countries, E & R could split the ARL holdings into 10 deciles, as they did with the wealth (GNI) of countries. I am almost certain this will show that a large portion of the OA impact advantage in the US comes from the US's "Have-Nots", compared to its Harvards.
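The proposed analysis can be sketched as follows. All numbers are invented, and for brevity the sketch splits institutions into two tiers rather than the ten deciles suggested above:

```python
# Hypothetical sketch of the "Harvards vs. Have-Nots" comparison:
# split institutions by the size of their journal holdings (as the ARL
# data would allow) and compare the mean OA citation advantage per tier.
# All numbers below are invented for demonstration.

# (journal holdings size, OA advantage = OA citations / non-OA citations)
data = [
    (95000, 1.1), (88000, 1.2), (76000, 1.1), (64000, 1.3),
    (41000, 1.5), (33000, 1.6), (21000, 1.8), (12000, 2.1),
]

data.sort(key=lambda d: d[0], reverse=True)   # richest holdings first
half = len(data) // 2
harvards, have_nots = data[:half], data[half:]

def mean_advantage(group):
    return sum(adv for _, adv in group) / len(group)

print("well-stocked libraries:", round(mean_advantage(harvards), 2))
print("poorly-stocked libraries:", round(mean_advantage(have_nots), 2))
```

If the prediction above is right, the real ARL data would show the pattern built into this toy example: the OA advantage growing as institutional holdings shrink.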
The other question is the converse: The OA advantage for articles authored (rather than cited) by Developing World authors. OA does not just give the Developing World more access to the input it needs (mostly from the Developed World), as E & R showed; but OA also provides more impact for the Developing World's research output, by making it more widely accessible (to both the Developing and Developed world) -- something E & R have not yet looked at either, though they have the data! Because of what Seglen (1992) called the "skewness of science," however, the biggest beneficiaries of OA will of course be the best articles, wherever their authors: 90% of citations go to the top 10% of articles.
Last, there is the crucial question of the effect of access embargoes. It is essential to note that E & R's results are not based on immediate OA but on free access after an embargo of up to a year or more. Theirs is hence not an estimate of the increase in citation impact that results from immediate Open Access; it is just the increase that results from ending Embargoed Access.
It will be important to compare the effect of OA on embargoed versus unembargoed content, and to look at the size of the OA Advantage after an interval of longer than just a year. (Although early access is crucial in some fields, citations are not instantaneous: it may take a few years' work to generate the cumulative citation impact of that early access. But it is also true in some fast-moving fields that the extra momentum lost during a 6-12-month embargo is never really recouped.)
Evans, JA & Reimer, J. (2009) Open Access and Global Participation in Science. Science 323(5917) (20 February 2009)
Stevan Harnad
American Scientist Open Access Forum
Thursday, January 29. 2009
Peter Suber wrote in Open Access News:
Notifying authors when they are cited
It is clear who should notify whom -- once the global research community's (Green OA) task is done. Our task is first to get all refereed research journal articles self-archived in their authors' Institutional Repositories (IRs) immediately upon acceptance for publication. (To accomplish that we need universal Green OA self-archiving mandates to be adopted by all institutions and funders, worldwide.)
Once all current and future articles are being immediately deposited in their authors' IRs, the rest is easy:
The articles are all in OAI-compliant IRs. The IR software treats the articles in the reference list of each of its own deposited articles as metadata, to be linked to the cited article, where it too is deposited in the distributed network of IRs. A citation harvesting service operating over this interlinked network of IRs can then provide (among many, many other scientometric services) a notification service, emailing each author of a deposited article whenever a new deposit cites it. (No proprietary firewalls, no toll- or access-barriers: IR-to-IR, i.e., peer-to-peer.)
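The matching step at the heart of such a service can be sketched in a few lines. The data model below (a `deposit` function over an in-memory `deposited` registry, keyed on the canonical author/year/title metadata) is invented for illustration, not the EPrints or OAI-PMH API:

```python
# Toy sketch of the notification service described above: each deposit
# carries its reference metadata; on each new deposit, its references are
# matched against articles already deposited, and the cited authors are
# notified. (Hypothetical data model; names invented for illustration.)

deposited = {}   # (author, year, title) -> contact email, one per article

def deposit(author, year, title, email, references=()):
    """Register a new deposit, then notify authors of any cited deposits."""
    deposited[(author, year, title)] = email
    notices = []
    for ref in references:                    # ref = (author, year, title)
        if ref in deposited:
            notices.append(
                f"notify {deposited[ref]}: cited by {author} ({year})")
    return notices

deposit("Brody", 2006, "Earlier Web Usage Statistics", "tb@example.org")
msgs = deposit("Gargouri", 2010, "Self-Selected or Mandated",
               "yg@example.org",
               references=[("Brody", 2006, "Earlier Web Usage Statistics")])
print(msgs[0])   # → notify tb@example.org: cited by Gargouri (2010)
```

In the distributed case, the lookup would run over the OAI-harvested metadata of the whole IR network rather than one local dictionary, but the peer-to-peer logic is the same.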
American Scientist Open Access Forum
Thursday, January 22. 2009
The fundamental importance of capturing cited-reference metadata in Institutional Repository deposits
On 22-Jan-09, at 5:18 AM, Francis Jayakanth wrote on the eprints-tech list:
"Till recently, we used to include references for all the uploads that are happening into our repository. While copying and pasting metadata content from the PDFs, we don't directly paste the copied content onto the submission screen. Instead, we first copy the content onto an editor like notepad or wordpad and then copy the content from an editor on to the submission screen. This is specially true for the references.The items in an article's reference list are among the most important of metadata, second only to the equivalent information about the article itself. Indeed they are the canonical metadata: authors, year, title, journal. If each Institutional Repository (IR) has those canonical metadata for every one of its deposited articles as well as for every article cited by every one of its deposited articles, that creates the glue for distributed reference interlinking and metric analysis of the entire distributed OA corpus webwide, as well as a means of triangulating institutional affiliations and even name disambiguation.
Yes, there are some technical problems to be solved in order to capture all references, such as they are, filtering out noise, but those technical problems are well worth solving (and sharing the solution) for the great benefits they will bestow.
The same is true for handling the numerous (but finite) variant formats that references may take: Yes, there are many, including different permutations in the order of the key components, abbreviations, incomplete components etc., but those too are finite, can be solved once and for all to a very good approximation, and the solution can be shared and pooled across the distributed IRs and their softwares. And again, it is eminently worthwhile to make the relatively small effort to do this, because the dividends are so vast.
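One way to see that the variant-format problem is finite and solvable is a pattern-pooling sketch. The two regular expressions and sample strings below are invented for illustration; a production parser would pool many more patterns (and share them across IRs), but each new pattern handles a whole class of variants once and for all:

```python
# Illustrative sketch: normalising reference-string variants into the
# canonical metadata (author, year, title, journal). Patterns and sample
# strings are invented examples, not an exhaustive or production parser.
import re

PATTERNS = [
    # Variant A: "Harnad, S. (2008) Title. Journal 31(2) 57-59"
    re.compile(r"(?P<author>[^(]+?)\s*\((?P<year>\d{4})\)\s*"
               r"(?P<title>[^.]+)\.\s*(?P<journal>[^0-9]+)"),
    # Variant B (Vancouver-style): "Harnad S. Title. Journal. 2008;31:57-59"
    re.compile(r"(?P<author>[^.]+)\.\s*(?P<title>[^.]+)\.\s*"
               r"(?P<journal>[^.]+)\.\s*(?P<year>\d{4})"),
]

def parse_reference(ref):
    """Return canonical metadata for the first matching pattern, else None."""
    for pat in PATTERNS:
        m = pat.match(ref)
        if m:
            return {k: v.strip() for k, v in m.groupdict().items()}
    return None   # unparsed noise: log it, so a new pattern can be pooled

r1 = parse_reference(
    "Harnad, S. (2008) Self-Archiving, Metrics and Mandates. Science Editor 31(2) 57-59")
r2 = parse_reference(
    "Harnad S. Self-Archiving, Metrics and Mandates. Science Editor. 2008;31:57-59")
print(r1["year"], r2["year"])   # both variants yield the same canonical year
```

Two quite different surface formats collapse to the same author/year/title/journal record, which is exactly the glue needed for IR-to-IR reference interlinking.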
I hope the IR community in general -- and the EPrints community in particular -- will make the relatively small, distributed, collaborative effort it takes to ensure that this all-important OA glue unites all the IRs in one of their most fundamental functions.
(Roman Chyla has replied to eprints-tech with one potential solution: "The technical solution has been there for quite some time, look at citeseer where all the references are extracted automatically (the code of the citeseer, the old version, was available upon request - I dont know if that is the case now, but it was in the past). That would be the right way to go, imo. I think to remember one citeseer-based library for economics existed, so not only the computer-science texts with predictable reference styles are possible to process. With humanities it is yet another story.")
Stevan Harnad
American Scientist Open Access Forum
Wednesday, January 14. 2009
"[A]n investigation of the use of open access by researchers from developing countries... show[s] that open access journals are not characterised by a different composition of authors than the traditional toll access journals... [A]uthors from developing countries do not cite open access more than authors from developed countries... [A]uthors from developing countries are not more attracted to open access than authors from developed countries. [underscoring added]"(Frandsen 2009, J. Doc. 65(1))Open Access is not the same thing as Open Access Journals.
Articles published in conventional non-Open-Access journals can also be made Open Access (OA) by their authors -- by self-archiving them in their own Institutional Repositories.
The Frandsen study focused on OA journals, not on OA articles. It is problematic to compare OA and non-OA journals, because journals differ in quality and content, and OA journals tend to be newer and fewer than non-OA journals (and often not at the top of the quality hierarchy).
Some studies have reported that OA journals are cited more, but because of the problem of equating journals, these findings are limited. In contrast, most studies that have compared OA and non-OA articles within the same journal and year have found a significant citation advantage for OA. It is highly unlikely that this is only a developed-world effect; indeed it is almost certain that a goodly portion of OA's enhanced access, usage and impact comes from developing-world users.
It is unsurprising that developing world authors are hesitant about publishing in OA journals, as they are the least able to pay author/institution publishing fees (if any). It is also unsurprising that there is no significant shift in citations toward OA journals in preference to non-OA journals (whether in the developing or developed world): Accessibility is a necessary -- not a sufficient -- condition for usage and citation: The other necessary condition is quality. Hence it was to be expected that the OA Advantage would affect the top quality research most. That's where the proportion of OA journals is lowest.
The Seglen effect ("skewness of science") is that the top 20% of articles tend to receive 80% of the citations. This is why the OA Advantage is more detectable by comparing OA and non-OA articles within the same journal, rather than by comparing OA and non-OA journals.
We will soon be reporting results showing that the within-journal OA Advantage is higher in "higher-impact" (i.e., more cited) journals. Although citations are not identical with quality, they do correlate with quality (when comparing like with like). So an easy way to understand the OA Advantage is as a quality advantage -- with OA "levelling the playing field" by allowing authors to select which papers to cite on the basis of their quality, unconstrained by their accessibility. This effect should be especially strong in the developing world, where access-deprivation is greatest.
American Scientist Open Access Forum
Tuesday, January 13. 2009
Harnad, Stevan (2009) Multiple metrics required to measure research performance. Nature (Correspondence) 457 (785) (12 February 2009) doi:10.1038/457785a
Nature's editorial "Experts still needed" (Nature 457: 7-8, 1 January 2009) is right that no one metric alone can substitute for the expert evaluation of research performance (based on already-published, peer-reviewed research), because no single metric (including citation counts) is strongly enough correlated with expert judgments to take their place. However, some individual metrics (such as citation counts) are nevertheless significantly correlated with expert judgments; and it is likely that a battery of multiple metrics, used jointly, will be even more strongly correlated with expert judgments. That is the unique opportunity that the current UK Research Assessment Exercise (RAE) -- and our open, online age, with its rich spectrum of potential performance indicators -- jointly provide: the opportunity to systematically cross-validate a rich and diverse battery of candidate metrics of research productivity, performance and impact (including citations, co-citations, downloads, tags, growth/decay metrics, etc.) against expert judgments, field by field. The rich data that the 2008 RAE returns have provided make it possible to do this validation exercise now too, for all disciplines, on a major nation-sized database. If successfully validated, the metric batteries can then not only pinch-hit for experts in future RAEs, but they will provide an open database that allows anyone, anywhere, any time to do comparative evaluations of research performance: continuous assessment and answerability.
(Note that what is at issue is whether metrics can substitute for costly and time-consuming expert rankings in the retrospective assessment of published, peer-reviewed research. It is of course not peer review itself -- another form of expert judgment -- that metrics are being proposed to replace [or simplify and supplement], for either submitted papers or research proposals.)
Harnad, S. (2008) Validating Research Performance Metrics Against Peer Rankings. Ethics in Science and Environmental Politics 8 (11) doi:10.3354/esep00088 Special Issue: The Use And Misuse Of Bibliometric Indices In Evaluating Scholarly Performance
American Scientist Open Access Forum
Saturday, November 22. 2008
In "Open Access: The question of quality," Richard Poynder writes:
"Open Access scientometrics... raise the intriguing possibility that if research becomes widely available on the Web the quality of papers published in OA journals may start to overtake, not lag [behind], the quality of papers published in TA journals... Why? Because if these tools were widely adopted the most important factor would no longer be which journal you managed to get your paper published in, but how other researchers assessed the value of your work — measured by a wide range of different indicators, including for instance when and how they downloaded it, how they cited it, and the different ways in which they used it."All true, but how does it follow from this that OA journals will overtake TA journals? As Richard himself states, publishing in an OA journal ("Gold OA") is not the only way to make one's article OA: One can publish in a TA journal and self-archive ("Green OA"). OA scientometrics apply to all OA articles, Green and Gold; so does the OA citation advantage.
Is Richard perhaps conflating TA journals in general with top-TA journals (which may indeed lose some of their metric edge because OA scientometrics is, as Richard notes, calculated at the article- rather than the journal-level)? The only overtaking I see here is OA overtaking TA, not OA journals overtaking TA journals. (Besides, there are top-OA journals too, as Richard notes, and bottom-rung TA ones as well.)
It should also be pointed out that the top journals differ from the rest of the journals not just in their impact factor (which, as Richard points out, is a blunt instrument, being based on journal averages rather than individual-article citation counts) but in their degree of selectivity (peer review standards): If I am selecting members for a basketball team, and I only accept the tallest 5%, I am likely to have a taller team than a team that is less selective on height.
Selectivity is correlated with impact factor, but it is also correlated with quality itself. The Seglen "skewness" effect (that about 80% of citations go to the top 20% of articles) is not just a within-journal effect: it is true across all articles across all journals. There is no doubt variation within the top journals, but not only are their articles cited more on average, but they are also better quality on average (because of their greater selectivity). And the within-journal variation around the mean is likely to be tighter in those more selective journals than the less-selective journals.
OA will give richer and more diverse metrics; it will help the cream (quality) to rise to the top (citations) unconstrained by whether the journal happens to be TA or OA. But it is still the rigor and selectivity of peer review that does the quality triage in the quality hierarchy among the c. 25,000 peer reviewed journals, not OA.
(And performance evaluation committees are probably right to place higher weight on more selective journals -- and on journals with established, longstanding track-records.)
American Scientist Open Access Forum
The American Scientist Open Access Forum has been chronicling and often directing the course of progress in providing Open Access to Universities' Peer-Reviewed Research Articles since its inception in the US in 1998 by the American Scientist, published by the Sigma Xi Society.
The Forum is largely for policy-makers at universities, research institutions and research funding agencies worldwide who are interested in institutional Open Access Provision policy. (It is not a general discussion group for serials, pricing or publishing issues: it is specifically focussed on institutional Open Access policy.)
You can sign on to the Forum here.