Exploring the development of the
independent, electronic, scholarly journal, by Alison
Wells
Why do we need to know who is reading electronic
journals?
The circulation of paper journals can be determined
and audited with precision, because both the number of
copies printed and the number of subscriptions can be
counted. Without surveillance of library shelves,
however, it is very difficult to know who is reading
them, or indeed whether they are being read at all.
Producing journals electronically offers the chance for
the publisher to get to know the user. Dawson
(1998) feels that "it is certainly easier
and cheaper to collect usage figures for electronic
periodicals than for printed ones." The
most important reason to know who is reading is to get
funding through advertising or sponsorship.
Advertisers demand this kind of information, which is why
a lot of commercial publications, e.g. electronic
newspapers, require compulsory registration involving a
long list of personal questions. Other funding bodies,
such as universities, tend to want to know how
many people are reading rather than who they
are.
What methods could be used to determine readership?
- Using software to get hit rates on each
page :
Many readily available programs can be run on any
server to produce statistics on which pages have been
downloaded. The log files produced are designed to be
of most use to the person running the server that hosts
the web site, because the statistics include a lot of
information about the amount of work the server is
doing in sending and receiving data. Dawson (1998)
analyses the data obtained from the BUBL journals
service and uses it to infer user behaviour such as
browsing, reading or searching.
Advantages :
Useful pieces of information include the number
of unique hosts visiting a site in a certain
time period. From this information the country of
origin of the reader can be determined, with caution,
from the last part of the address (e.g. .uk for the
UK, or .fr for France). Addresses ending in .com or
.net, however, could be from anywhere, even though
they are usually assumed to be from the US.
The number of hits on each page
can be determined. In the case of electronic
journals, this is useful because it is easy to tell
which papers are most popular, particularly if the
reader has an abstract to read first before
downloading the full paper. However, this does not
tell you how many unique readers you had, as one
person could have downloaded four papers, or four
people could have downloaded one.
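As a rough illustration of the two measures above, the
sketch below counts unique hosts and hits per page from
a few made-up log lines. It assumes the server writes
the Common Log Format; the sample hosts and paths and
the function name summarise are hypothetical and are
not taken from any particular statistics package.

```python
import re
from collections import Counter

# Hypothetical sample in Common Log Format:
# host ident user [time] "request" status bytes
SAMPLE_LOG = """\
pc12.shef.ac.uk - - [14/Jun/1999:10:01:00 +0000] "GET /paper1.html HTTP/1.0" 200 5120
pc12.shef.ac.uk - - [14/Jun/1999:10:02:00 +0000] "GET /paper2.html HTTP/1.0" 200 4096
dial7.example.fr - - [14/Jun/1999:10:05:00 +0000] "GET /paper1.html HTTP/1.0" 200 5120
host3.example.com - - [14/Jun/1999:10:09:00 +0000] "GET /paper1.html HTTP/1.0" 200 5120
"""

LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (\S+)[^"]*" (\d{3}) \S+'
)


def summarise(log_text):
    """Return (unique hosts, hits per page, unique hosts per top-level domain)."""
    hosts = set()
    hits = Counter()
    tlds = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        host, path, status = match.groups()
        if status != "200":
            continue  # count successful downloads only
        hits[path] += 1
        if host not in hosts:
            hosts.add(host)
            last_label = host.rsplit(".", 1)[-1].lower()
            # A numeric last label means the address was never resolved to a name.
            tlds[last_label if last_label.isalpha() else "unresolved"] += 1
    return hosts, hits, tlds


hosts, hits, tlds = summarise(SAMPLE_LOG)
print(len(hosts), "unique hosts")
print(hits.most_common())
print(dict(tlds))
```

In this sample /paper1.html receives three hits from
three different hosts, while the one host from
Sheffield accounts for two of the four hits, which is
the distinction between hits and readers drawn above.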
Some software also produces other useful pieces of
information that are not directly related to the
determination of readership. One is the referring
page that the reader came from, which is usually a
search engine. The data usually includes the search
that was carried out on the search engine, which gives
an idea of the subject the reader was looking for.
Another is the type of browser used and its release
number (e.g. Netscape 4.5, Lynx etc.), which is
interesting, particularly if new features which take
advantage of the capabilities of the newer browsers
are being considered.
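The search carried out can often be recovered from the
referring address itself. The following is a minimal
sketch, assuming that the search engine passes the
search in a query parameter named q, query or p; these
names, the example addresses and the function name
search_terms are assumptions for illustration, since
each search engine uses its own conventions.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed parameter names; real search engines differ.
QUERY_KEYS = ("q", "query", "p")


def search_terms(referrer):
    """Return (referring host, search string or None) for a referrer URL."""
    parts = urlsplit(referrer)
    params = parse_qs(parts.query)  # also decodes "+" and %xx escapes
    for key in QUERY_KEYS:
        if key in params:
            return parts.netloc, params[key][0]
    return parts.netloc, None


# Hypothetical referrers as they might appear in a log file.
print(search_terms("http://search.example.com/find?q=electronic+journals&lang=en"))
print(search_terms("http://www.example.org/links.html"))
```

A referrer with no recognised parameter is reported
with its host only, so ordinary links from other pages
are kept apart from search engine referrals.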
Disadvantages :
Proxies and caches : Many large
sites, for example universities, try to manage their
network traffic by these methods. Using a proxy means
that only the address of the proxy is registered for
each request sent by a member of that site, so even
though 50 people from the University of Sheffield may
have looked at a site, they will all have been lumped
under the address "vampire.shef.ac.uk",
which is the address of the proxy. Caching, i.e.
keeping a local copy of a frequently visited site,
also causes problems, as the hits to the local copy
will not be counted at all on the original site. Even
where you can identify individual addresses, these
relate only to a computer and not to an individual:
several people may share one computer, and one person
may use different computers at work and at home.
Spiders : Otherwise known as
'robots', these are small programs run by search
engines which download pages to the database of the
search engine. Each page they download counts in the
hit rates, but they are not readers.
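One common way to reduce this distortion is to discard
requests whose browser identification looks like a
robot. The sketch below assumes that the log records
the browser identification string for each request and
that robots announce themselves with words such as
"bot" or "spider"; the marker list, the sample records
and the function names are hypothetical, and a robot
that disguises itself as a browser would not be caught.

```python
# Assumed substrings that suggest a robot; not an exhaustive or official list.
ROBOT_MARKERS = ("bot", "crawler", "spider", "robot")


def is_robot(user_agent):
    """Guess whether a browser identification string belongs to a robot."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def reader_hits(records):
    """Keep only the records that do not look like robot requests."""
    return [record for record in records if not is_robot(record["agent"])]


# Hypothetical records already parsed from a log file.
records = [
    {"path": "/paper1.html", "agent": "Mozilla/4.5 [en] (Win98; I)"},
    {"path": "/paper1.html", "agent": "ExampleBot/1.0"},
    {"path": "/paper2.html", "agent": "Lynx/2.8.1rel.2 libwww-FM/2.14"},
    {"path": "/paper2.html", "agent": "ExampleSpider/0.1"},
]
print(len(records), "hits,", len(reader_hits(records)), "from likely readers")
```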
Mirrors : Many sites have mirrors
- identical copies of the original site mounted on
servers across the world - to decrease the load on
the home server and improve response times. However,
this means that now there are as many sets of
statistics as there are mirrors, which need to be
added together to get an idea of readership.
Frequently the home site of the journal does not have
access to these statistics anyway.
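Where the statistics from each mirror can be obtained,
adding them together is straightforward. The following
is a small sketch, assuming each mirror supplies its
hits per page as a simple table; the mirror names, the
figures and the function name combine_mirrors are
invented for illustration.

```python
from collections import Counter


def combine_mirrors(per_mirror_hits):
    """Add together the hits-per-page tables from every mirror."""
    total = Counter()
    for hits in per_mirror_hits.values():
        total.update(hits)  # adds counts page by page
    return total


# Hypothetical figures from a home site and two mirrors.
per_mirror_hits = {
    "home": {"/paper1.html": 120, "/paper2.html": 45},
    "mirror-us": {"/paper1.html": 80},
    "mirror-au": {"/paper1.html": 15, "/paper2.html": 5},
}
print(combine_mirrors(per_mirror_hits))
```

The totals are still hits rather than readers, so the
cautions that apply to a single set of statistics apply
equally to the combined figures.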
These statistics cannot tell you everything you need
without further analysis. It is not true to say that a
page which has been "hit" 10000 times has been read
that many times. As a large proportion of the referring
pages are notoriously inaccurate search engines, the
reader may have taken one look at the front page and
decided that it was not what they wanted. Also, each hit
refers to one file, including graphics, so one person
looking at the front page of this dissertation may
have generated 20 hits, because of the number of
graphics on the page.
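The difference between hits and pages viewed can be
shown by counting only requests for the pages
themselves. This is a minimal sketch, assuming that
pages end in .html or .htm (or have no extension, as
with a directory address) and that everything else is a
graphic or other supporting file; the file names are
hypothetical.

```python
import posixpath

# Assumed extensions for pages; "" covers addresses such as "/" or "/issue3/".
PAGE_EXTENSIONS = {".html", ".htm", ""}


def page_views(paths):
    """Keep only requests for pages, dropping graphics and other files."""
    views = []
    for path in paths:
        without_query = path.split("?", 1)[0]
        extension = posixpath.splitext(without_query)[1].lower()
        if extension in PAGE_EXTENSIONS:
            views.append(path)
    return views


# One visit to a front page carrying 19 graphics produces 20 hits.
requests = ["/index.html"] + [f"/images/fig{n}.gif" for n in range(1, 20)]
print(len(requests), "hits,", len(page_views(requests)), "page viewed")
```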
Finally and most importantly, you cannot tell from
these statistics how many regular
readers you have, and who they are.
- Asking readers to register voluntarily
Advantage :
Using a registration form can help to gather more
information about a reader, such as their job /
position, country of origin, etc.
Disadvantage :
Most people will not fill in such a form because they
do not want to give out personal information
unnecessarily.
- Making registration compulsory with
access by password
Advantage :
This is the only way to get concrete
information about who is reading your
journal and what they are reading. By requiring a
password for access, or by setting a "cookie"
(sending a code to the reader's computer which can be
checked when a page is loaded to see who the reader
is), you get full details on each reader's session.
You can also tell how many regular readers you have
and how often they visit.
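The cookie mechanism can be sketched as follows. This
is only an outline of the idea, assuming the server
passes the incoming Cookie header to a function and
sends back whatever Set-Cookie value it returns; the
cookie name reader_id and the function handle_request
are hypothetical, and a real journal would tie the code
to the reader's registration details.

```python
import uuid
from collections import Counter
from http.cookies import SimpleCookie


def handle_request(cookie_header, visits):
    """Identify the reader from a Cookie header, issuing a new code if needed.

    Returns (reader_id, set_cookie), where set_cookie is the value to send
    in a Set-Cookie header, or None if the reader already had a code.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "reader_id" in cookie:
        reader_id = cookie["reader_id"].value
        set_cookie = None
    else:
        reader_id = uuid.uuid4().hex  # random code for a new reader
        outgoing = SimpleCookie()
        outgoing["reader_id"] = reader_id
        set_cookie = outgoing["reader_id"].OutputString()
    visits[reader_id] += 1  # one more page request by this reader
    return reader_id, set_cookie


visits = Counter()
first_id, set_cookie = handle_request(None, visits)        # first visit
again_id, nothing = handle_request(set_cookie, visits)     # browser returns the code
print(first_id == again_id, visits[first_id])
```

Because the same code comes back with every later
request, requests can be grouped by reader rather than
by computer address, which is what makes it possible to
count regular readers.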
Disadvantages :
Compulsory registration tends to put
people off, even if it is free.
Users can give incorrect information
deliberately, and may not like the "Big
Brother" implications of having their
reading monitored.
Passwords are easily forgotten,
leading users to re-register and so duplicate their
information.
- Providing a mailing list which sends out
contents / abstracts for each issue as it is
published.
Advantages :
The incentive of the alerting
service encourages people to register who would not
have registered voluntarily.
The mailing list should tend to include the
readers who are interested in the subject and will
read it regularly.
There is also the added bonus of advertising new
issues of your journal to people who are interested
and reminding them that it is still there.
Disadvantage:
Many people feel overloaded by e-mail
anyway and do not want the extra information.
- Using a counter
This is one of the crudest methods to count
readers, but also one of the easiest to set up.
Advantage:
It is easy to add a counter to a page. Counters can
usefully be added to individual articles so that you
can check interest in each paper.
Disadvantages :
A counter does not register a hit if it is not loaded,
and counters can often slow down page loading
considerably, which annoys users.
A counter does not give you any of the detailed data
available from the log analysis software described
under the first method above.
Reference
Dawson, A. (1998).
Inferring user behaviour from journal access figures.
[http://bubl.ac.uk/journals/lis/oz/serlib/v35n0399/dawson.htm].
Site visited on 14.6.99.