Here are two questions I recently realized I couldn’t answer:
- What counts as a successful article in my field (English)?
- How long does it take before people start citing a published article?
For the first question I’m really thinking about the number of citations an article has. There are other ways to measure success, but this is a big one—especially if, like me, you’d like to get hired somewhere someday—and I suspect that a lot of the other ones wind up correlating with this one anyway. But how many citations does a successful article have? 5? 50? 500? This varies widely by discipline, and I had no idea what the right answer was for English / literary criticism.
The second question is related, but mostly born out of morbid fascination with the glacial pace of knowledge sharing in my field. Obviously we talk to each other like normal people, so ideas get spread around through informal means as quickly as they do in any other walk of life, but our peer-reviewed publication process is notoriously slow. Unless you’re already a well-known scholar, the best timeline you can really hope for when you set out to publish an article is about a year from submission to print, and that’s if you write really fast and get accepted on your first try—it’s not unheard of for an article to exist for years before it finally shows up in a journal.
Given that pace, I wondered how long it takes for an article to start being cited by other scholars. If it takes a year to get published, does that mean it takes another year to get cited? Is the print version of the discipline effectively operating at a two-year lag relative to people’s ideas?
To test both questions, I did some quick and dirty data analysis. This is by no means conclusive of anything, but I think it tells us more than we (or at least I) knew before.
I took article titles from PMLA, arguably the flagship journal in the field, and definitely one of the most important journals, even if you prefer to put something else in the top slot. I used editions running from 2010 to the present, because I wanted to see what happens to an article early in its existence; also collecting the data was kind of time-consuming, so I wanted to keep it limited. I also only looked at what I took to be the main articles, so no notes from the editors, nothing organized under subsections like “Theories and Methodologies”, “Our Changing Discipline”, “Criticism in Translation”, “Little-Known Documents”, etc. Things under “Cluster on” whatever, or “Special Topics” I did use. Basically, if it looked like it was in the middle of the edition, I took it. It’s possible this skews the results somehow, but at the end of the day I just wanted a bunch of articles from this decade in a prominent journal, and I definitely got that—specifically, 152 of them. Still, it’s worth saying that this is not a comprehensive look at PMLA.*
I then used Google Scholar to figure out how many citations each article has so far. I’ve never verified the accuracy of Google’s numbers, but spot-checks have usually panned out, and I expect that they’re within acceptable range of the truth over this many articles. It’s possible that there are little errors here and there, as I logged the numbers by hand while listening to music, and was briefly kicked off Google Scholar because they suspected I was a robot.** But I think they’re accurate, and haven’t noticed any disparities so far.
It turns out that the answer to Question 2 has quite a substantial impact on the answer to Question 1, so let’s start by looking at the relationship between citations and the passage of time.
Here we’ve got circles representing articles at various citation levels; the size tells you how many articles there are at that level. So, for example, that big circle at 0 in 2015 is big because there are 24 articles published that year that have never been cited anywhere. Meanwhile one article from 2013 has been cited 39 times, the most of anything in my corpus. (The article is Valerie Traub’s “The New Unhistoricism in Queer Studies”.)
As you can see, there’s a strong correlation between year of publication and number of citations. If you just correlate Years with Total Citations, you get a coefficient of -.98 (the trend line above tells the same story). Here’s that data in a table:
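The year-level correlation is a one-liner to compute. Here's a sketch of the calculation; the per-year totals below are placeholders, not the actual PMLA counts, so only the shape of the computation matters:

```python
import numpy as np

# Hypothetical per-year citation totals, newest years last.
# These are stand-ins for the real table, not the actual data.
years = np.array([2010, 2011, 2012, 2013, 2014, 2015, 2016])
total_citations = np.array([120, 110, 85, 90, 40, 6, 0])

# Pearson correlation between year of publication and total citations.
r = np.corrcoef(years, total_citations)[0, 1]
print(round(r, 2))  # strongly negative: newer articles have far fewer citations
```

With the real totals, this is the calculation that yields the -.98 reported above.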
| Year | Total Citations | Total Articles | Citations per Article |
|------|-----------------|----------------|-----------------------|
So far we’ve only had one issue in 2016, but leaving it out actually raises the correlation coefficient between Years and Citations to -.99. This story does get a little more complicated if you break things out by issues of the journal, rather than lumping things together by year:
The basic trend holds, though the correlation weakens to a coefficient of -.83. This suggests that citations are not sensitive to time at the level of three months or particular issues, which sounds intuitively right to me; but none of these correlations are based on very long time periods, and the first few are based on very small data sets, so I wouldn’t read too much into them aside from the headline finding.
The Citation Time Lag (CTL) is quite powerful, and appears to exert a strong pressure against any citations within the first year of publication. The average number of citations for an article published in 2011 is higher than the total number of citations for all 28 articles published in 2015. Two years out, the situation is much less bleak: Of the 22 articles published in 2014, 18 have at least one citation. This might be a little hint in favor of the timeline I mentioned at the beginning of the post; that is, if it takes about a year to publish things, then we’re too early for 2015 articles to have put up much of a showing, and just right for 2014 articles to break out.
There are two questions about the CTL that this data does not satisfactorily answer. First, why the brief plateau in citations-per-article (C/A) in 2012-2013? The technical answer is that Traub’s 2013 article is such an outlier that it skews the whole year up; in a world of 1-5 citations, having 39 is huge. If you artificially lower the number to 25 (equivalent to the second-most successful article in the corpus), 2013 has 3.11 C/A, more in line with the rest of the trend. But to me this really reveals just how limited this corpus is; if one article can have such a strong effect, I’d really like to see more articles in the data to even things out. That’s a good reason to expand this research in the future.
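The adjustment described above (lowering Traub's 39 to 25, the second-highest count in the corpus) is just capping an outlier before taking the mean. A minimal sketch, with a hypothetical list of counts standing in for the real 2013 numbers:

```python
def capped_mean(citations, cap):
    """Mean of citation counts after capping every count at `cap`."""
    return sum(min(c, cap) for c in citations) / len(citations)

# Hypothetical 2013-style counts; 39 is the lone outlier.
counts_2013 = [39, 5, 4, 3, 2, 1, 1, 0, 0]

raw = sum(counts_2013) / len(counts_2013)
adjusted = capped_mean(counts_2013, cap=25)  # cap at the second-best article
print(raw, adjusted)  # the capped mean sits noticeably lower
```

This is the same move the paragraph makes by hand: one big value dragged the year's average up, and capping it brings the citations-per-article figure back in line with the trend.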
Second, when does the CTL abate? Obviously the citations per article aren’t likely to increase at this rate forever. A random issue from 50 years ago may well contain no articles that are still cited. The superstar effect would be strong in older issues, too—the one article in a year so prominent that it has stood the test of time would skew things for its issue compared to the others. Of course I can’t answer this question based on this data; that’s another good reason to dig deeper.
Still, we have enough here to offer a provisional answer to Question 1. Fortunately for us humanists, the answer is, It depends. If your article is one year old and has no citations, you’re not a failure; you’re everyone. Articles that are two years old top out at 2-3 citations. After that, the sky’s the limit; Traub’s article is just three years old (though of course, the average across articles continues to rise with time). For articles published in the last five years, 40 citations is about as successful as you can be. Two other articles break 20 citations; the top 5% have at least 15 citations; the top 10% have at least 11.
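Those top-5% and top-10% cutoffs fall straight out of a percentile calculation over the full list of counts. A sketch, using a made-up distribution of 152 counts rather than the real corpus:

```python
import numpy as np

# Hypothetical stand-in for the 152 articles' citation counts:
# a big pile of zeros and low counts, plus a thin top end.
counts = np.array([0] * 80 + [1] * 30 + [2] * 15 + [3] * 10 +
                  [5] * 8 + [8] * 4 +
                  [11, 12, 15, 20, 39], dtype=float)

top_5_cutoff = np.percentile(counts, 95)   # citations needed for the top 5%
top_10_cutoff = np.percentile(counts, 90)  # citations needed for the top 10%
print(top_5_cutoff, top_10_cutoff)
```

Run against the real data, this is how you'd recover the 15- and 11-citation thresholds quoted above.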
This seems to be the order of magnitude for success within this time frame: an article with ten or more citations. The community of scholars in my field appears, at least in print, to evolve very slowly and to form relatively few connections. It’s a bittersweet pill for young scholars; on one hand, your ideas won’t be in this particular kind of circulation anytime soon. On the other hand, if you’re worried about lack of interest in something you’ve published, well, just check back in a few years—the peak of popularity is just a few like minds away.
* I did look at “Theories and Methodologies” (TM) articles for 2013. They averaged slightly fewer citations than articles I categorized as “main” (i.e., not in a subsection), although the main articles average was bolstered by Valerie Traub’s article’s 39 citations; aside from that the citation numbers were similar. Based on this limited sample, TM articles appear to be shorter and to cite fewer things themselves (i.e., their own bibliographies are shorter), but they also might be written by more prominent scholars. At least, I felt that I recognized a higher percentage of them off the bat. In theory this could give them a leg up as far as generating citations more quickly; that could be interesting to test further.
** I was not. Sources used during the collection process include Harvey, P.J., Let England Shake; and Simpson, Sturgill, A Sailor’s Guide to Earth.