Central versus institutional self-archiving

Stevan Harnad harnad at ecs.soton.ac.uk
Sun Nov 23 08:08:28 EST 2003


Dear colleagues,

I would like to make some supplementary comments to the message from
my much admired and very dear friend Arun:

> There are two trends: open access journals and open access archives. 

There is actually only *one* trend, of which these are merely two
aspects. The trend is: To provide free full-text online access to the
peer-reviewed research literature.

The two key properties are (1) "free, full-text, online access,"
which we now abbreviate as "open access," and (2) "peer-reviewed."

The very specific target of the open-access movement is 
peer-reviewed research, rather than anything and everything! We
know exactly what this target is: It is the annual contents of
the planet's 24,000 peer-reviewed journals -- across fields,
languages, and countries: about 2,500,000 peer-reviewed articles
per year.

The *principal* goal is to provide open access to all of those articles,
present, future and past. They are the literature to which access
is currently blocked by publishers' access-tolls (subscriptions,
site-licenses, payment-per-item).

A secondary goal is open access to the pre-peer-review "preprints"
of those articles (as well as to any postpublication "postprints"
with updates and corrections). But we must not mix up the primary
goal, which is open access to the toll-gated peer-reviewed literature, with
the secondary goal, which is enhanced access to pre-peer-review or
non-peer-reviewed research. 

The reason it is so important to distinguish these is that, in a sense,
the only problem with access to the pre- or non-peer-reviewed research
is that it is mostly not made public at all (unlike the peer-reviewed
PUBLICation). So the problem there is not access-blocking tolls, but simply
the absence of the author practice of making pre-peer-reviewed research public.

The author's prerogative *not* to make his pre-peer-review research public
must not be challenged. This is the author's choice. Authors can be
*encouraged* to make their preprints public, for the sake of the great
benefits (in the speed of their scientific progress and the size of
their scientific impact) that this has been shown to provide, for example,
for physicists, because of their systematic practice (pre-dating the
Internet) of publicising their preprints.

But it is important not to mix up this secondary goal -- of encouraging all
researchers to emulate physicists, where possible, by making their
preprints openly accessible -- with the primary goal of getting all
researchers to make their peer-reviewed postprints openly accessible.

It is to the validated, quality-controlled and certified journal
literature that we currently lack access, and that is what is needlessly
losing so much of its potential impact and progress because most of
its would-be users worldwide are denied access by the unaffordable
access-toll-barriers.

Remedying *that* condition, as soon as possible, is what this is all about.

The two (complementary) ways to remedy it are for the author to provide
open access to his peer-reviewed research by (1) publishing it (whenever
possible) in an open-access rather than a toll-access journal (i.e.,
whenever a suitable open-access journal exists -- which is unfortunately
the case for only <5% of research output today) and (2) publishing it in a
suitable toll-access journal otherwise (>95%) but also *self-archiving*
it in an open-access archive.

> There are two main kinds of archives:
> Centralised (like arXiv for physics, and CiteSeer for
> computer science) and self archiving. 

Actually, there are *eight* main kinds of archives, but some of the
cells of this 2 x 2 x 2 matrix are empty, for logical reasons: 

Open-Access Archives can be (1) OAI+ or OAI- (i.e., tagged or not tagged
according to the OAI protocol for making all the different archives'
contents interoperable, and hence seamlessly integrated across archives);
their full-texts can be (2) harvested or self-deposited (H+/H-); and
they can be (3) central or institutional (I+/I-).

ArXiv is OAI+, H-, and I-: It is OAI-interoperable, self-deposited, and central.

Citeseer is OAI-, H+, and I-: *not* OAI-interoperable, harvested, and central.

Eprints Archives are all OAI+, H- and I+: OAI-interoperable, self-deposited,
and institutional.

The logically empty categories are H+/I+ (institutional archives are just
for self-depositing: they do not harvest their full-texts from other 
institutions' archives) and also OAI+/H+ (if the metadata are OAI-interoperable,
there is no need to harvest the full-texts, just the metadata).
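
As a rough illustration only, the 2 x 2 x 2 matrix above can be enumerated
in a few lines of Python. The function names and status strings are
hypothetical; the exclusion rule simply encodes the two logically empty
categories as stated (H+/I+ and OAI+/H+), which together rule out three
of the eight cells.

```python
from itertools import product

# Examples named in the text, keyed by (oai, harvested, institutional).
EXAMPLES = {
    (True, False, False): "ArXiv",
    (False, True, False): "Citeseer",
    (True, False, True): "Eprints Archives",
}


def label(oai, harvested, institutional):
    """Render a cell in the OAI+/-, H+/-, I+/- notation used above."""
    def sign(flag):
        return "+" if flag else "-"
    return f"OAI{sign(oai)}, H{sign(harvested)}, I{sign(institutional)}"


def is_logically_empty(oai, harvested, institutional):
    """Empty cells as stated in the text: H+/I+ and OAI+/H+."""
    return harvested and (institutional or oai)


def matrix():
    """Return {cell label: status} for all eight cells of the matrix."""
    cells = {}
    for cell in product([True, False], repeat=3):
        if is_logically_empty(*cell):
            status = "logically empty"
        else:
            # Cells without a named example are possible in principle;
            # the text simply does not give one.
            status = EXAMPLES.get(cell, "possible, no example given")
        cells[label(*cell)] = status
    return cells


if __name__ == "__main__":
    for name, status in matrix().items():
        print(f"{name}: {status}")
```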

The trouble with harvested, non-OAI archives like citeseer is that,
because their contents are harvested and not OAI-interoperable, their
usability and usefulness are far more limited.

And the trouble with central archives like ArXiv lies in the central
"entity" behind them, whether it is a voluntary subset of individuals
in the discipline, as with ArXiv, or a Learned Society, or some
national organisation, as with PubMed: these central entities are either
just virtual, or they are the wrong entities for being in a position to
mandate or monitor open-access provision. Hence central archiving grows
too slowly and uncertainly.

In contrast, researchers' own employing institutions *are* in a position
to mandate and monitor open-access provision, and could do so most
naturally and effectively by implementing their own departmental OAI+
archives. Researchers' own institutions already mandate publication itself,
with the carrot/stick of their "publish or perish" policies. Moreover,
these existing publish-or-perish policies have already evolved, quite
naturally, into ones in which they take into account not merely the *number*
of each researcher's publications, but also their research *impact*:
i.e., how much are they read, used and cited by other researchers? So the
institutional rule is already "publish impactfully."
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html

Hence it means taking only one more small and natural further step in
the evolution of institutional publication policy in the online era to
extend institutions' existing publication mandate to "publish
open-accessibly" -- in order to maximize research impact, progress
and productivity (and its rewards: salary, promotion, tenure,
research-funding, prizes, prestige).

Researchers' institutions can mandate open-access provision, and they
can monitor compliance. Central archives are growing far too slowly,
because they are not the kinds of entity that can mandate self-archiving.

There is one other central entity that *can* and should mandate
self-archiving, though, and that is the research-funding councils and
agencies, especially the governmental ones, which can mandate maximising
research impact by maximising research access through mandated open-access
for the results of all research that has been funded by the tax-payer,
so as to maximize the benefits to the tax-payer. But even for mandated
self-archiving of centrally funded research, it probably makes more
sense to self-archive, and monitor compliance, at the institutional
level -- especially with the help of OAI-interoperability, allowing the
harvesting and monitoring of the relevant metadata by all the parties
(home-institution, funders) concerned.
http://www.ariadne.ac.uk/issue35/harnad/

The policy to mandate open-access provision is also quite simple:
Researchers must provide open access to all peer-reviewed research
publications, as described above, by

    (1) publishing them (whenever possible) in an open-access rather
    than a toll-access journal (i.e., whenever a suitable open-access
    journal exists: <5% of research output today)

    and

    (2) publishing them otherwise (>95%) in a suitable toll-access
    journal but *also* self-archiving them in their own institutional
    open-access OAI archives.
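
To make the monitoring side of this policy concrete, here is a minimal
sketch in Python of how an institution might check compliance. The
record fields and function names are hypothetical, not part of any
existing archive software; the rule itself is just the two routes
above: an article complies if it appeared in an open-access journal
or if it was self-archived.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """One peer-reviewed article (hypothetical record layout)."""
    title: str
    open_access_journal: bool  # route (1): published in an open-access journal
    self_archived: bool        # route (2): deposited in the institutional archive


def is_compliant(pub):
    """An article complies if either route provides open access to it."""
    return pub.open_access_journal or pub.self_archived


def compliance_rate(pubs):
    """Fraction of articles that are openly accessible (0.0 if no articles)."""
    if not pubs:
        return 0.0
    return sum(is_compliant(p) for p in pubs) / len(pubs)


def non_compliant(pubs):
    """Titles the institution would still need to chase up."""
    return [p.title for p in pubs if not is_compliant(p)]
```

For instance, a department with three articles, one in an open-access
journal, one self-archived, and one neither, would have a compliance
rate of 2/3 and one title left to deposit.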

> Self archiving can be by an individual, a small group or by an
> institution. IISc, B'lore. has set up an institutional archive, but the
> IISc faculty and students are NOT posting all their papers. May be we
> need to persuade them.

Precisely. And that is why existing publish-or-perish policy must be given
this small further extension: That policy already protects researchers
from the temptation to do the research, write up the results, and then
put them in a desk-drawer and move on to the next piece of research! If
researchers were allowed to do that, then the research may as well
not have been done, for it would not be accessible to any potential
users to apply and build upon. 

Researchers are accordingly induced by the publish-or-perish carrot/stick
to take the further step of first revising their results to meet the
peer-review standards of a suitable journal, which then certifies and
publishes the final, peer-reviewed outcome. Now, in the online/open-access
era, researchers must also provide open access to the paper (through just
the few extra keystrokes required to deposit it in their institutional
OAI-compliant open-access research archives).

> CiteSeer run by Steve Lawrence of NEC, Princeton, is a different kind
> of server; no one needs to post one's article; Steve's team spiders the
> net and gathers the articles automatically (through a special software,
> I guess)!

Yes, the article does have to be "posted" for a harvester like citeseer
to be able to harvest it! It must be deposited by *someone* (usually
the author!) on any website, in any form (i.e., not necessarily
OAI-compliantly) for citeseer to be able to find and harvest it. The
non-OAI-compliance -- hence the lack of interoperability with the other
kinds of archive, and the uncertainty of the detection and the quality
of the untagged contents (is it a preprint or a reprint? is it research
at all?) -- makes this generic, google-style harvesting nonoptimal for
the peer-reviewed research literature.

Think of OAI-interoperability as generating google-power that is
restricted to all and only the peer-reviewed research literature
(preprint and postprint). Oaister gives a foretaste of what this will
be like -- once we have implemented policies that mandate the filling of
the institutional OAI archives: http://oaister.umdl.umich.edu/o/oaister/
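
For readers who have not seen what OAI metadata harvesting involves,
here is a minimal sketch in Python. It assumes the OAI-PMH
"ListRecords" verb with the "oai_dc" (Dublin Core) metadata format;
the base URL, identifier and sample record are invented for
illustration, and no network request is made. A real harvester would
also have to handle resumption tokens and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH 2.0 responses and Dublin Core metadata.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request URL a harvester would send to one archive."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Extract identifier and title from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        records.append({
            "identifier": rec.findtext(f"{OAI}header/{OAI}identifier"),
            "title": rec.findtext(f".//{DC}title"),
        })
    return records


# Invented sample response, trimmed to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical peer-reviewed article</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://archive.example.org/oai"))
    print(parse_records(SAMPLE))
```

Because every OAI-compliant archive answers the same request in the
same format, the same two functions would work unchanged across all of
them, which is the point of interoperability.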

> All we in India want is that scientists and scholars all over the world
> should make their findings available through interoperable open archives
> and may be publish their papers in refereed OPEN ACCESS journals.

Charity begins at home (and here the Golden Rule of reciprocity will
inspire emulation!): Let India mandate its own open-access-provision policy.
She will thereby immediately begin reaping the benefits
of maximizing the access to and hence the impact of her own
research output. And that will inspire other nations to do the same:
"Self-Archive Unto Others As Ye Would Have Them Self-Archive Unto You"
http://www.universityaffairs.ca/current_issue/articles/opinion_e.html

> We should set a target of say 31 December 2004 and
> persuade S&T and social science research and higher
> education institutions in India to set up institutional
> archives (interoperable and compatible with other such
> archives around the world). For this we should talk to
> Academies, UGC, Councils, Heads of institutions and of
> course scientists and scholars. SARAI's Media Lab can
> produce some posters and persuasive messages. 

Bravo! From your lips to the ears of the policy-makers! In your efforts
to inform them, please feel free to use these powerpoints showing the
powerful and dramatic causal connection between research access and
research impact.
http://www.universityaffairs.ca/current_issue/articles/opinion_e.html

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):
    http://amsci-forum.amsci.org/archives/september98-forum.html
    http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html
    Post discussion to: september98-forum at amsci-forum.amsci.org 

Dual Open-Access Strategy:
    BOAI-2 ("gold"): Publish your article in a suitable open-access
            journal whenever one exists.
    BOAI-1 ("green"): Otherwise, publish your article in a suitable
            toll-access journal and also self-archive it.
    http://www.soros.org/openaccess/read.shtml
    http://www.ecs.soton.ac.uk/~harnad/Temp/berlin.htm
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0026.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0021.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0024.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0028.gif




