Electronic Dissertations Library

Exploring the development of the independent, electronic, scholarly journal, by Alison Wells

Why do we need to know who is reading electronic journals?

The circulation of paper journals can be determined and audited with precision, because both the number of copies printed and the number of subscriptions can be counted. Without surveillance of library shelves, however, it is very difficult to know who is reading them, or indeed whether they are being read at all. Producing journals electronically offers the publisher the chance to get to know the user. Dawson (1998) feels that "it is certainly easier and cheaper to collect usage figures for electronic periodicals than for printed ones." The most important reason to know who is reading is to obtain funding through advertising or sponsorship. Advertisers demand this kind of information, which is why many commercial publications, e.g. electronic newspapers, make registration compulsory and ask a long list of personal questions. Other funding bodies, such as universities, tend to want to know how many people are reading rather than who they are.

What methods could be used to determine readership?

  1. Using software to get hit rates on each page :

There are many easily available programs which can be run on any server to produce statistics on which pages have been downloaded, and so on. The logfiles produced are designed to be of most use to the person running the server on which the web site is mounted, because the statistics include a lot of information about the amount of work the server is doing sending and receiving data. Dawson (1998) analyses the data obtained from the BUBL journals service and uses it to infer user behaviour such as browsing, reading or searching.

Advantages :

Useful pieces of information include the number of unique hosts visiting a site in a given time period. From the host addresses, the reader's country of origin can be inferred, with caution, from the last part of the address (e.g. .uk for the UK, or .fr for France). Addresses ending in .com or .net, however, could be from anywhere, even though they are usually assumed to be from the US.
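As a minimal sketch of this kind of inference, the Python below counts each distinct host once and guesses a country from the last part of its address. The lookup table, the function names and the choice of which endings count as "generic" are assumptions made for illustration; they are not taken from any particular statistics package.

```python
from collections import Counter

# Illustrative mapping only; a real table would cover every country-code ending.
COUNTRY_BY_SUFFIX = {"uk": "United Kingdom", "fr": "France", "de": "Germany"}

# Generic endings say nothing reliable about where the reader is.
GENERIC_SUFFIXES = {"com", "net", "org"}


def guess_country(host: str) -> str:
    """Guess a reader's country from the last label of a host name."""
    suffix = host.rstrip(".").rsplit(".", 1)[-1].lower()
    if suffix.isdigit():
        return "unknown (numeric address)"
    if suffix in GENERIC_SUFFIXES:
        return "unknown (generic domain)"
    return COUNTRY_BY_SUFFIX.get(suffix, "unknown")


def countries_of_unique_hosts(hosts):
    """Count each distinct host once, then tally the guessed countries."""
    unique_hosts = {host.lower() for host in hosts}
    return Counter(guess_country(host) for host in unique_hosts)
```

Reporting generic endings as unknown, rather than assuming the US, keeps the caution described above visible in the figures.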

The number of hits on each page can be determined. In the case of electronic journals, this is useful because it is easy to tell which papers are most popular, particularly if the reader has an abstract to read first before downloading the full paper. However, this does not tell you how many unique readers you had, as one person could have downloaded four papers, or four people could have downloaded one.
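The difference between hits and readers can be illustrated with a short Python sketch. It assumes the server writes its log in the widely used Common Log Format (host, identity, user, time, request, status, bytes); the function name and the decision to count only successful GET requests are choices made for this example. Note that a distinct host is still only a rough stand-in for a distinct reader.

```python
import re
from collections import defaultdict

# Common Log Format: host ident user [time] "GET path protocol" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "GET (\S+)[^"]*" (\d{3}) \S+')


def hits_and_hosts(lines):
    """Return {path: (hits, distinct hosts)} for successful GET requests."""
    hits = defaultdict(int)
    hosts = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        host, path, status = match.groups()
        if status != "200":
            continue  # ignore errors and redirects
        hits[path] += 1
        hosts[path].add(host.lower())
    return {path: (hits[path], len(hosts[path])) for path in hits}
```

A paper whose two figures differ widely has been fetched repeatedly from the same few addresses, which the raw hit count alone would hide.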

Some software also produces useful information that is not directly related to determining readership, such as the referring page that the reader came from, which is usually a search engine. The data usually includes the search that was carried out on the search engine, which gives an idea of the subject the reader was looking for. Information can also be found on the type of browser used and its release number (e.g. Netscape 4.5, Lynx etc.), which is interesting, particularly if new features that take advantage of the capabilities of newer browsers are being considered.
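Recovering the search from a referring address might look something like the Python sketch below, which uses the standard urllib.parse module. The parameter names tried ("q", "query", "p") are assumptions; each search engine chooses its own, so a real list would have to be built up by inspecting the referrers actually seen in the log.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed parameter names; each search engine chooses its own.
QUERY_PARAMETERS = ("q", "query", "p")


def search_terms(referrer: str):
    """Return the search phrase embedded in a referring URL, or None."""
    query = parse_qs(urlsplit(referrer).query)
    for name in QUERY_PARAMETERS:
        if name in query:
            return query[name][0]
    return None
```

A referrer with no recognised parameter simply yields None, so ordinary links from other sites are not mistaken for searches.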

Disadvantages :

Proxies and caches : Many large sites, for example universities, try to manage their network traffic by these methods. Using a proxy means that only the address of the proxy is registered for each request sent by a member of that site, so even though 50 people from the University of Sheffield may have looked at a site, they will all have been lumped under the address "vampire.shef.ac.uk", which is the address of the proxy. Caching, i.e. keeping a local copy of a frequently visited site, also causes problems, as the hits to the local copy will not be counted at all on the original site. Even where you can identify individual addresses, these relate only to a computer and not to an individual, who may share a computer or access the site from both work and home.

Spiders : Otherwise known as 'robots', these are small programs run by search engines which download pages to the database of the search engine. Each page they download counts in the hit rates, but they are not readers.

Mirrors : Many sites have mirrors - identical copies of the original site mounted on servers across the world - to decrease the load on the home server and improve response times. However, this means that there are now as many sets of statistics as there are mirrors, and these need to be added together to get an idea of readership. Frequently the home site of the journal does not have access to these statistics anyway.
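Where the mirror statistics can be obtained, adding them together is straightforward. The Python sketch below assumes each mirror can supply a simple table of hits per page; the mirror names and the shape of the data are hypothetical.

```python
from collections import Counter


def combine_mirror_counts(per_mirror):
    """Add together the per-page hit counts reported by each mirror."""
    total = Counter()
    for counts in per_mirror.values():
        total.update(counts)  # update() adds counts rather than replacing them
    return total
```

A page missing from one mirror's table simply contributes nothing from that mirror.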

These statistics cannot provide all the information needed without analysis - it is not true to say that a page which has been "hit" 10000 times has been read that many times. As a large proportion of the referring pages are search engines, which are notoriously inaccurate, the reader may have taken one look at the front page and decided that it was not what they wanted. Also, each hit refers to one file, including graphics, so one person looking at the front page of this dissertation may have generated 20 hits because of the number of graphics on the page.
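Some of this inflation can be removed before counting. The Python sketch below discards requests for graphics and requests whose browser identification looks like a robot, leaving a rough count of pages viewed. The list of file endings and the robot markers are assumptions for illustration; real robots identify themselves in many different ways, and some do not identify themselves at all.

```python
from urllib.parse import urlsplit

GRAPHIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")
# Hypothetical markers; real robots announce themselves in many different ways.
ROBOT_MARKERS = ("spider", "robot", "crawler")


def is_page_view(path: str, user_agent: str) -> bool:
    """True if a request looks like a person loading a page."""
    # Compare only the path part, so "logo.gif?x=1" still counts as a graphic.
    if urlsplit(path).path.lower().endswith(GRAPHIC_SUFFIXES):
        return False
    agent = user_agent.lower()
    return not any(marker in agent for marker in ROBOT_MARKERS)


def count_page_views(requests):
    """Count the (path, user_agent) pairs that pass is_page_view."""
    return sum(1 for path, agent in requests if is_page_view(path, agent))
```

Even after this filtering, a page view shows only that a page was delivered, not that it was read.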

Finally and most importantly, you cannot tell from these statistics how many regular readers you have, and who they are.

  2. Asking readers to register voluntarily

Advantage :

Using a registration form can help to gather more information about a reader, such as their job / position, country of origin, etc.

Disadvantage :

Most people will not fill these forms in because they do not want to give out personal information unnecessarily.

  3. Making registration compulsory with access by password

Advantage :

This is the only way to get concrete information about who is reading your journal and what they are reading. By requiring a password for access, or by setting a "cookie" (a code sent to the reader's computer, which can be checked each time a page is loaded to see who they are), you get full details of each reader's session. You can also tell how many regular readers you have and how often they visit.
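The cookie mechanism can be sketched in Python with the standard http.cookies module. The cookie name "reader_id", the one-year lifetime and both function names are assumptions made for this example. On its own the code identifies a browser rather than a person; it says who the reader is only once it has been linked to the details given at registration.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "reader_id"  # hypothetical name chosen for this sketch


def issue_cookie() -> str:
    """Build a Set-Cookie header value carrying a new random reader code."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = uuid.uuid4().hex
    cookie[COOKIE_NAME]["path"] = "/"
    cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # one year, in seconds
    return cookie[COOKIE_NAME].OutputString()


def reader_from_request(cookie_header: str):
    """Return the reader code sent back by the browser, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    return morsel.value if morsel is not None else None
```

The server would send the value from issue_cookie() on a reader's first visit and call reader_from_request() with the Cookie header of each later request.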

Disadvantages :

Compulsory registration tends to put people off, even if it is free.

Users can give incorrect information deliberately, and may not like the "Big Brother" implications of having their reading monitored.

Passwords are easily forgotten, leading users to re-register and so duplicating information.

  4. Providing a mailing list which sends out contents / abstracts for each issue as it is published.

Advantage:

The incentive of the alerting service encourages people to register who would not have registered voluntarily.

The mailing list should tend to include the readers who are interested in the subject and will read it regularly.

There is also the added bonus of advertising new issues of your journal to people who are interested and reminding them that it is still there.

Disadvantage:

Many people feel overloaded by e-mail anyway and do not want the extra information.

  5. Using a counter

This is one of the crudest methods to count readers, but also one of the easiest to set up.

Advantage:

It is easy to add a counter to a page. Counters can usefully be added to individual articles so that interest in each paper can be checked.

Disadvantage:

A counter does not register a hit if it is not loaded, and counters can often slow down page loading considerably, which annoys users.

A counter does not give you any of the detailed data available from the software described in method 1.


Reference

Dawson, A. (1998). Inferring user behaviour from journal access figures. [http://bubl.ac.uk/journals/lis/oz/serlib/v35n0399/dawson.htm]. Site visited 14.6.99.


MSc in Information Management, 1998/1999
© University of Sheffield - Department of Information Studies (All Rights Reserved)