Scientific research on the process of peer review and editorial decision-making is rather like trying to grasp the intangible. A certain amount can be quantified, such as the number of characteristics of the paper (specified by the journal) which the assessor rates. But as with so many numerical formulations, what is expressed in numbers may leave out much of what is really most important. There has been an increasing tendency, largely imported from America, to measure aspects of quality and then use these numbers to compare individuals or institutions. So we now have successful schools, hospital league tables, efficient police forces, and academic departments which do well or ill on the research assessment exercise. One fire brigade said that the only way it could get a good rating would be to start some fires and then put them out, since its successful preventive work was not counted. There is a danger that similar distortions could apply to the publication process.
Like democracy, peer review is the worst possible system except for all the others. Until about the 1960s, editors of learned journals usually made up their own minds as to whether or not to accept a paper (that was certainly my experience as a neophyte author). Then they began to use a small circle of colleagues to help with more specialised topics, as every discipline began to diversify. So the process has gone on, until Nature now has some 14 000 possible assessors on its books. Assessors are all busy people, reviewing a paper conscientiously may take several hours, and there is no pay for this anonymous work; people do it for the collective good of the scientific community. Any editor soon finds out who is good at it and who skimps it or refuses to do his or her share.
The overwhelming fact of life for all reputable journals is that there is never enough space, so it is not surprising that Howard & Wilkinson found that assessors and editors tended to agree more on what is clearly not suitable for publication. Since at least three papers out of four sent to the British Journal of Psychiatry (BJP) will have to be rejected, manuscripts have to be assessed with a predominantly negative mental set. Not all assessors, and very few authors, understand that only the editor can make the final decision about publication, because only he (or eventually she) knows how one particular manuscript compares with all the others under review. An assessor can only say whether a submission would be suitable for that particular journal, not whether it should be accepted at that time.
Serious interest in peer review has been shown only for about the past decade or so, particularly since the publication of Stephen Lock's (1985) A Difficult Balance. It is right that there should be concern about the process, but how much we have usefully learnt from the studies done during this time is rather uncertain. There are some obvious dangers in peer review, such as a clique with a particular viewpoint obtaining a stranglehold on all publications in one field. Cases like that have occurred, but they seem to be rare, and it is very difficult to see how any formal sanctions could prevent them. It is really up to the practitioners themselves to preserve free speech.
Howard & Wilkinson ask whether an increase in the number of assessors per paper would improve matters. My experience was that three conscientious reports per paper is the maximum useful number. Beyond that, assessors tend to start contradicting one another and confusing the author. Furthermore, it wastes the time of assessors, who could be reviewing other submissions.
This paper has raised some very useful issues, and it is good to hear that further research is under way. The latest report on the BJP's high impact factor shows that it must be doing something right. But in the end, a journal's quality depends on editorial flair, which is very difficult to quantify.
Lock, S. (1985) A Difficult Balance: Editorial Peer Review in Medicine. London: Nuffield Provincial Hospitals Trust.