June highlights from the world of scientific publishing

The launch of Cofactor and what I learnt from Twitter in June: an unusual journal, a very large journal, a new journal and a very slow journal, plus accusations of publisher censorship and more.

The biggest highlight of the month for me was (of course) the launch of Cofactor, my company that aims to help researchers publish their work more effectively. The launch event in London was a great success – over 50 people from science and publishing came to hear about the company and from four speakers on the theme ‘What difference will changes in peer review make to authors and journals?’. Do have a look at the Cofactor website and see if we can help you improve your papers, learn about scientific publishing or decide on your publishing strategy.

On Twitter there was lots of news too!

Journals

Nature published a feature by Peter Aldhous (@paldhous) on ‘contributed’ papers in Proceedings of the National Academy of Sciences USA (PNAS). Members of the National Academy of Sciences can submit up to four papers per year using this track, and they are not peer reviewed after submission – rather, authors must obtain reviews themselves and submit them along with the paper, and the paper and reviews are assessed by members of the editorial board. In 2013, more than 98% of contributed papers were published, compared with only 18% of direct submissions (for which the review process is like that of most conventional journals).

Aldhous analysed papers from the contributed and directly submitted tracks and compared their citation rates:

…the difference between citation rates for directly submitted and contributed papers was not large — controlling for other factors such as discipline, contributed papers garnered about 4.5% fewer citations — but it was statistically significant. Nature‘s analysis also suggests that the gap in citation rates between directly submitted and contributed papers has been narrowing, and this does not seem to be because more-recent papers have yet to acquire enough citations for the difference to show.

The analysis is described in a supplementary information file, but the full dataset is available only on request. I questioned whether it would be better to release the full dataset so that others can reanalyse it. Others agreed, and I have collated the resulting Twitter discussion using Storify. In response, Peter Aldhous kindly put the full dataset on his website. This is a great example of the power of Twitter to get rapid responses to questions.

Meanwhile, a new journal was launching and another was celebrating an incredible milestone. The new journal is The Winnower (@TheWinnower), which describes itself as an open access online science publishing platform that uses open post-publication peer review. There is a charge of US$100 to publish an article, which is not editorially checked but is published straight away, after which anyone can add a review. There aren’t many papers there at the moment but The Winnower is an interesting experiment. It differs from the recently launched ScienceOpen (@Science_Open) in that the latter has editorial checks and only scientists can review papers.

The largest journal in the world is of course PLOS ONE, and this month it published its 100,000th article. There is no doubt that this megajournal is changing scientific publishing and will continue to do so.

Finally, Texas biologist Zen Faulkes (@DoctorZen) posted his experience of publishing in a small regional journal, The Southwestern Naturalist. He submitted the paper in 2011, got reviews back 9 months later and submitted a revised version within a week. It was finally accepted for the December 2013 issue… which actually came out in early June 2014. Although Zen wasn’t in a hurry, he says:

But after this experience, I think I would have been much better off submitting this paper to PLOS ONE or PeerJ or a similar venue.
Except… wait, PeerJ didn’t exist when I submitted this paper. With publications like PeerJ, journals like The Southwestern Naturalist are going to be in trouble soon.

I too wonder why authors would choose a small journal like this, especially if it is closed access, over one that can reach a much broader audience.

Difficulties with publishing criticism of the publishing industry

In late May an article entitled ‘Publisher, be damned! From price gouging to the open road’ was published by four Leicester University management researchers in a fairly obscure Taylor & Francis journal, Prometheus: Critical Studies in Innovation. I would not have noticed it had it not been for coverage by Paul Jump (@PaulJump) in Times Higher Education telling the extraordinary story behind the publication of the article. A senior manager at the publisher demanded that the article be cut, the issue containing it was delayed by 8 months, and the editorial board threatened to resign in protest at the publisher’s actions. There is more detail and analysis in this Library Journal article. Eventually it was published with the names of particular publishers removed. It seems to me that the publishers’ attempt at suppressing the article had the opposite effect and is a good example of the Streisand Effect. (Via @LSEImpactBlog, @RickyPo and @Stephen_Curry)

Paper writing resources

Two very useful resources came to my attention this month. The first (via @digiwonk) was a blog post on The Chronicle of Higher Education’s Careers hub by Kirsten Bell, a research associate from the University of British Columbia. The post, entitled ‘The Really Obvious (but All-Too-Often-Ignored) Guide to Getting Published’, gives five simple tips for getting your article published. They are:

  1. Familiarize yourself with the journal you want to submit to
  2. Make sure you nominate reviewers, if the journal gives you the option to do so
  3. Don’t make it glaringly obvious that your paper has been rejected by another journal
  4. Learn how to write a paper before actually submitting one
  5. Be persistent (when it’s warranted)

The second was actually a guide to reading a scientific paper, but it is well worth reading when you are writing too, so as to understand how your paper is likely to be read. The article, by @JenniferRaff, a postdoc at the University of Texas, appeared in the Huffington Post and is entitled ‘How to Read and Understand a Scientific Paper: A Step-by-Step Guide for Non-Scientists’. She tells readers to identify the ‘big question’ and the ‘specific question’ that the authors are trying to answer with their research, and then, by reading the methods and results, to determine whether the results answer the specific question. If your paper is written so that the specific question and the answer to it are easy to find, you will get more readers and probably more citations. (Via @kirkenglehardt)

Altmetrics

You’ve seen the Altmetric donut – now here’s the Plum Analytics PlumPrint! (via @PlumAnalytics).

ImpactStory (@ImpactStory) have a good roundup of news about altmetrics from June, mentioning the PlumPrint, responses to a UK Higher Education Funding Council for England (HEFCE) consultation on the use of metrics in assessment, and highlights from a recent altmetrics conference.

Miscellany

A post by Manchester University computational biologist Casey Bergman (@CaseyBergman) was popular among the scientists I follow: ‘The Logistics of Scientific Growth in the 21st Century’ argued that the exponential growth in academic research is over, and explored what that might mean for those at different stages in their careers.

On the BioMed Central blog, David Stern describes his unsuccessful efforts to replicate a published effect and the realisation that it was an artefact of data binning. He recommends not just providing the full data but displaying it too. (Via @emckiernan13 and @mbeisen)

There was a great post on the blog ‘The Rest of the Iceberg’ (@ECRPublishing) entitled ‘What To Do When You Are Rejected’, focusing on what you can learn from receiving peer reviews.

February highlights from the world of scientific publishing

Some of what I learned about scientific publishing last month from Twitter: new open access journals, data release debates, paper writing tips, and lots more

New journals

Two important announcements this month, both of open access sister journals to well established ones.

First, at the AAAS meeting it was announced that Science is going to have an online-only open access sister journal, called Science Advances, from early 2015. This will be selective (not a megajournal), will publish original research and review articles in science, engineering, technology, mathematics and social sciences, and will be edited by academic editors. The journal will use a Creative Commons license, which generally allows for free use, but hasn’t decided whether to allow commercial reuse, according to AAAS spokeswoman Ginger Pinholster. The author publishing charge hasn’t yet been announced.

Second, the Royal Society announced that, in addition to their selective open access journal Open Biology, they will be launching a megajournal, Royal Society Open Science, late in 2014. It will cover the entire range of science and mathematics, will offer open peer review as an option, and will also be edited by academic editors. Its criteria for what it will publish include “all articles which are scientifically sound, leaving any judgement of importance or potential impact to the reader” and “all high quality science including articles which may usually be difficult to publish elsewhere, for example, those that include negative findings”; it thus fits the usual criteria for a megajournal in that it will not select for ‘significance’ or potential impact.

These two announcements show that publishers without an open access, less selective journal in their stable are now unusual. Publishers are seeing that there is a demand for these journals and that they can make money. Publishers also see that they can gain a reputation for being friendly to open access by setting up such a journal. This also means that papers rejected by their more selective journals can stay within the publisher (via cascading peer review), which, while saving time for the authors by avoiding the need to start the submission process from scratch, also turns a potential negative for the publisher (editorial time spent on papers that are not published) into a positive (author charges). The AAAS has been particularly slow to join this bandwagon; let’s see if the strong brand of Science is enough to persuade authors to publish in Science Advances rather than in one of the increasingly numerous megajournals.

PLOS data release policy

On 24 February, PLOS posted an updated version of the announcement about data release that they made in December (and which I covered last month). I didn’t pay much attention as the change had already been trailed, but then I had to sit up and take notice because I started seeing posts and tweets strongly criticising the policy. The first to appear was an angry and (in my opinion) over-the-top post by @DrugMonkeyblog entitled “PLoS is letting the inmates run the asylum and this will kill them”. A more positive view was given by Michigan State University evolutionary geneticist @IanDworkin, and another by New Hampshire genomics researcher Matt MacManes (@PeroMHC). Some problems that the policy could cause for small, underfunded labs were pointed out by Mexico-based neuroscience researcher Erin McKiernan (@emckiernan13). The debate got wider, reaching Ars Technica and Reddit – as of 3 March there have been 1045 comments on Reddit!

So what is the big problem? The main objections raised seem to me to fall into six categories:

  1. Some datasets would take too much work to get into a format that others could understand
  2. It isn’t always clear what kind of data should be published with a paper
  3. Some data files are too large to be easily hosted
  4. Others might publish reanalyses that the originators of the data were intending to publish, so the originators would lose the credit for that further research
  5. Some datasets contain confidential information
  6. Some datasets are proprietary

I won’t discuss these issues in detail here, but if you’re interested it’s worth reading the comments on the posts linked above. It does appear (particularly from the update on their 24 February post and the FAQ posted on 28 February) that PLOS is very happy to discuss many of these issues with authors who have concerns, although analyses of proprietary data may have to be published elsewhere from now on.

I tend to agree with the more positive views of this new policy, which argue that data publication will help increase reproducibility, help researchers to build on each other’s work and prevent fraud. In any case, researchers who disagree are free to publish in other journals with less progressive policies. PLOS is a non-profit publisher that says that access to research results, immediately and without restriction, has always been at the heart of its mission, so it is being consistent in applying this strict policy.

Writing a paper

Miscellaneous news

  • Science writer @CarlZimmer explained eloquently at the AAAS meeting why open access to research, including open peer review and preprint posting, benefits science journalists and their readers.
  • Impactstory profiles now show the proportion of a researcher’s articles that are open access and give gold, silver and bronze badges, as well as showing how highly accessed, discussed and cited their papers are.
  • A new site has appeared where authors can review their experience with journals: Journalysis. It looks promising but needs reviews before it can become a really useful resource – go add one!
  • An interesting example of post-publication peer review starting on Twitter and continuing in a journal was described by @lakens here and his coauthor @TimSmitsTim here.
  • Cuban researcher Yasset Perez-Riverol (@ypriverol) explained why researchers need Twitter and a professional blog.
  • I realised when looking at an Elsevier journal website that many Elsevier journals now have very informative journal metrics, such as impact factors, Eigenfactor, SNIP and SJR for several years and average times from submission to first decision and from acceptance to publication. An example is here.
  • PeerJ founder @P_Binfield posted a Google Docs list of standalone peer review platforms.

December highlights from the world of scientific publishing

Some of what I learned last month from Twitter: takedowns, luxury journals, moves in peer review services and more.

‘Luxury journals’

A big talking point on my Twitter feed in December was the provocative comments about journal publishing made by Randy Schekman as he received his Nobel Prize in Physiology or Medicine. Writing in the Guardian on 9 December, he criticised the culture of science that rewards publications in ‘luxury journals’ (which he identified as Nature, Cell and Science). He said “I have now committed my lab to avoiding luxury journals, and I encourage others to do likewise.” Although many applauded this, some pointed out that more junior researchers may not have the freedom to do likewise, and also mentioned Schekman’s potential conflict of interest as Editor-in-Chief of the new, highly selective journal eLife, which aims to compete for the best research with these journals (some responses are summarized by Stephen Curry). Schekman responded to the criticisms in a post on The Conversation, and suggested four ways in which the research community could improve the situation.

Elsevier steps up takedown notices

Subscription journals generally require the author to sign a copyright transfer agreement that, among other things, commits them not to share their paper widely before any embargo period has passed. It appears that in December Elsevier decided to increase their enforcement of this by sending takedown notices to sites where Elsevier papers were posted. Guy Leonard described what happened to him and the reaction on Twitter and elsewhere.

Various peer review developments

Jeremy Fox (@DynamicEcology), a population ecologist at the University of Calgary, explained why he likes the journal-independent peer review service Axios Review (and is joining their editorial board).

Publons, which gives researchers credit for post-publication peer review, announced that researchers can now also get credit for their pre-publication peer reviews for journals.

Wiley announced a pilot of transferable peer review for their neuroscience journals, in which reviews for papers rejected from one journal can be transferred to another journal in the scheme, thus saving time.

F1000Research announced that its peer reviewed articles are now visible in PubMed and PubMed Central, together with their peer reviews and datasets. Articles on F1000Research are published after a quick check and then peer reviewed, and indexing by PubMed and PubMed Central happens once an article has a sufficient number of positive reviews.

Jennifer Raff, an Anthropology Research Fellow at the University of Texas at Austin, published “How to become good at peer review: A guide for young scientists”, a very useful and comprehensive guide that she intends to keep updated as she receives comments on it.

Miscellaneous news

Jeffrey Beall, a librarian who has been curating a useful list of ‘predatory’ open access journals for several years, revealed his antagonism to open access as a whole in an article that surprised many with its misconceptions about the motivations of open access advocates. PLOS founder Mike Eisen has rebutted the article point by point. Although I feel that Beall’s list is still useful for checking out a new journal, it should be taken only as a starting point, together with a detailed look at the journal’s website, what it has already published, and its membership of the Open Access Scholarly Publishers Association (OASPA).

James Hayton (@JamesHaytonPhD) wrote a great post on his 3 Month Thesis blog on the seven deadly sins of thesis writing, which should also all be avoided in paper writing: lies, bullshit, plagiarism, misrepresentation, getting the basics wrong, ignorance and lack of insight.

Finally, I was alerted (by @sciencegoddess) to a new site called Lolmythesis, where students summarize their thesis in one (not always serious) sentence. Worth a look for a laugh – why not add your own?

Why do journals insist that data ‘are’?

Given the controversy over this grammatical point, I argue that journal style guides should allow both ‘data is’ and ‘data are’.

I was recently directed (via @blefurgy and @deb_lavoy on Twitter) to an old blog post on something that frequently bugs me: the question of whether the word ‘data’ is singular or plural. The post, by Norman Gray, an astronomical data management researcher at Glasgow University, UK, dates from 2005 but I haven’t seen a better one on the topic. Gray argues that:

…the word ‘data’, in english, is a singular mass noun. It is thus a grammatical and stylistic error to use it as a plural.

Plural use is barbaric: amongst other crimes, it is a deliberate archaism, and thus a symptom of bad writing.

Strong stuff.

An alternative view is given by Peter Coles (@telescoper), another astronomer, at Cardiff University, UK, who also explains the issue clearly:

For those of you who aren’t up with such things, English nouns can be of two forms: “count” and “non-count” (or “mass”). Count nouns are those that can be enumerated and therefore have both plural and singular forms: one eye, two eyes, etc. Non-count nouns (which is a better term than “mass nouns”) are those which describe something which is not enumerable, such as “furniture” or “cutlery”. Such things can’t be counted and they don’t have a different singular and plural forms. You can have two chairs (count noun) but can’t have two furnitures (non-count noun)…

…Norman Gray asserts that (a) “data” is a non-count noun and that (b) it should therefore be singular.

I tend to look and listen out for instances of ‘data’, and I have very rarely heard someone say ‘the data are’ in natural speech. As Gray says:

The majority of writers who would dutifully pluralise ‘data’ in writing naturally and consistently use it as a mass noun in conversation: they ask how much data an instrument produces, not how many; they talk of how data is archived, not how they are archived; they talk of less data rather than fewer; and they always talk of data with units, saying they have a megabyte of data, or 10 CDs, or three nights, and never saying ‘I have 1000 data’ and expecting to be understood.

You may wonder why this matters at all. Well, practically every scientific paper contains the word ‘data’ somewhere, and all the journals I edit for insist that it is made plural every time. I spend a ridiculous amount of my editing time looking out for instances of ‘the data is’ and similar. And they can’t be found automatically using a macro, either, because the subject and verb can be separated by other words, or the verb may be something else like ‘shows’ or ‘illustrates’ rather than ‘is’. This is a ‘mistake’ that a lot of authors make.
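To make the macro problem concrete, here is a minimal sketch in Python of the kind of pattern search a copyeditor might try. It is only an illustration under stated assumptions: the verb list, the three-word window and the function name flag_singular_data are all hypothetical choices, not anything taken from a journal's tooling. It flags candidates for a human to check; it cannot decide whether a change is actually needed.

```python
import re

# Singular verb forms that a 'data are' house style would want changed.
# Illustrative only: a real list would need to be much longer.
SINGULAR_VERBS = ["is", "was", "has", "shows", "suggests", "indicates", "illustrates"]

# 'data', then up to three intervening words, then one of the singular verbs.
# The three-word window is an arbitrary assumption.
PATTERN = re.compile(
    r"\bdata\b(?:\s+\w+){0,3}?\s+(?:" + "|".join(SINGULAR_VERBS) + r")\b",
    re.IGNORECASE,
)


def flag_singular_data(text):
    """Return phrases that look like singular 'data' and need a manual check."""
    return [match.group(0) for match in PATTERN.finditer(text)]


# The simple case is found.
print(flag_singular_data("The data is shown in Table 1."))

# Subject and verb separated by other words: found, but only within the window.
print(flag_singular_data("The data from both cohorts shows an effect."))

# False positive: the subject here is 'quality', so nothing needs changing.
print(flag_singular_data("The quality of the data set is high."))

# Miss: punctuation and a long clause separate 'data' from its verb.
print(flag_singular_data("The data, which we collected over several months, is available."))
```

The last two calls show why such a search only goes so far: widening the pattern to catch more cases also produces more false alarms, so every hit still has to be read in context.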

So is ‘data is’ really a mistake? Are the journals right to insist on this change?

The argument from etymology

The main argument used for ‘data are’ is that the word is derived from a plural Latin word. Gray dismantles this thoroughly by showing that it never was a simple plural in Latin. It is:

…the neuter plural past participle of the first conjugation verb dare, ‘to give’ (it’s actually also the feminine singular past participle, but that really, really, doesn’t matter).

…there was almost certainly no latin word for the concept that we now identify by the english word ‘data’….

…Put another way, that means that the word ‘data’, as a technical term referring to the ore of observations, which can be painstakingly reduced to extract knowledge, is not a latin word at all. It’s a native english word with a latin past, which means, bluntly, that we get to choose how to use it, and if its meaning changes over time – as it has – then its grammatical analysis can reasonably and properly migrate also.

I find this a convincing argument. It reminds me of the pedants who don’t like split infinitives (‘to boldly go’) because Latin infinitive verbs couldn’t be split, which is pretty irrelevant to how we should treat them in English (see Wikipedia for current views on this issue).

Gray goes on to compare ‘data’ with other similar Latin-derived words, such as ‘agenda’, ‘stamina’, ‘media’ and ‘phenomena’. ‘Stamina’ is at one end of a spectrum: it is never used in the singular (‘stamen’) except in a specialist botanical sense, and it is a singular noun. ‘Phenomena’ is at the other end – the singular ‘phenomenon’ is frequently used and ‘phenomena’ is a plural noun. ‘Agenda’ is almost the same as ‘stamina’ but the singular ‘agendum’ just about makes sense (although ‘agenda item’ would be more usual). ‘Media’ is moving from being a plural of ‘medium’ to being a separate singular noun in its own right. Gray says:

In this spectrum (not ‘spectra’, of course), ‘data’ is clearly located near ‘agenda’.

I would agree with this assessment on the whole, though I disagree with Gray that ‘datum’ is ‘certainly not one of the things that makes up data’. But like ‘agenda item’, a more commonly used term would be ‘data point’.

In fact, there is a technical use of the word ‘datum’, which Gray has dug out: it is a surveying term. But the plural of this usage of ‘datum’ is ‘datums’, not ‘data’.

Peter Coles doesn’t in fact completely agree with the journal publishers’ stipulation that ‘data’ is never singular – rather, he argues that there are contexts in which the plural use makes sense, and others in which singular use is better:

“If I had less data my disk would have more free space on it.” (Non-count)

“If I had fewer data I would not be able to obtain an astrometric solution.” (Count).

I’m fine with this distinction if people want to use it. But why, then, should journals insist that the singular use is incorrect?

A proposal: stop being prescriptive about data

You may or may not agree with Norman Gray (and me) that ‘data are’ is incorrect. But you can surely agree that there is controversy about the issue. The reasons to insist on plural data are hotly contested, to say the least.

So I propose that publishers remove the stipulation in their style guides that ‘data is’ is incorrect and should be changed to ‘data are’. In fact there is no need to be prescriptive on the issue at all: if the author writes ‘data are’, it can stay, but if they write ‘data is’, that can stay too. This would save a not insignificant amount of time for copyeditors, in searching and replacing ‘data is’ and in arguing the point with authors. It would probably save authors some time and annoyance too. And it would also make journals look more modern in this age of terabytes of data.

Who is going to be the first publisher to take a leap into the unknown? You have nothing to lose but your fuddy-duddy reputation.

Your opinions

Grammatical issues like this usually generate more heat than light, so I expect there will be comments on this post. I would particularly like to hear from journal editors who have been involved in discussions about this issue for their style guides, and from authors who have railed against the ‘data are’ rule imposed by a journal. I reserve the right to remove comments that simply rehash old arguments or only say that one or other construction is ‘ugly’ or ‘just wrong’.

Tips on scientific writing from European Science Editors

The European Association of Science Editors guidelines for scientific writing are a great resource.

I have recently joined the European Association of Science Editors (EASE). They have a valuable document on their website: EASE Guidelines for Authors and Translators of Scientific Articles to be Published in English (pdf; published June 2011). These guidelines are full of useful tips for those who write or edit scientific articles in English or translate them into English. They will be especially useful for non-native speakers of English.

Some highlights:

  • A description of what is needed in each of the main sections of a research paper
  • Various things to watch out for where English differs from other languages; for example, full stops should be used for decimal points (not commas), and Roman numerals should not be used for months
  • A recommendation to avoid phrasal verbs, such as ‘find out’ or ‘pay off’, where possible, as they are often difficult for non-native speakers to understand
  • An appendix on how to write an abstract
  • An appendix on ‘empty’ words and sentences that should be avoided, both for conciseness and to avoid ambiguity (such as ‘good’ or ‘big’ when a more specific term could be used)
  • An appendix on how to use linking words and phrases to give cohesion to an article
  • Examples of expressions that can be simplified or deleted (for example, ‘conducted inoculation experiments on’ can be changed to ‘inoculated’)
  • Examples of differences between British and American spelling.

I was particularly intrigued by a note in the guidelines (page 4) that parallel constructions are allowed in English, though not in some other languages. An example of a parallel construction is ‘It was high in A, medium in B, and low in C’. Native speakers of some other languages might (incorrectly) want to change this to something like ‘It was high in A, medium for B, and low in the case of C’.

I hadn’t realised that avoiding parallel constructions was recommended in any other languages. I’d be curious to know which languages these are, and how strict this rule is. If it is Japanese or Chinese, this might explain why I frequently see ‘respectively’ being overused by native speakers of these languages, as it would be one way to avoid a parallel construction (‘it was high, medium, and low in A, B, and C, respectively’). If you are tempted to use ‘respectively’ in this way, consider changing it to the parallel construction, which is more standard English usage.

The guidelines have been translated into many languages by volunteers, including into Japanese, Chinese, Korean, Arabic and most Western European and Eastern European languages. They are free for all to read, and non-commercial printing is allowed.

Do go and read them!