If I’m Right You Can Respond in Two Years

Here are two questions I recently realized I couldn’t answer:

  1. What counts as a successful article in my field (English)?
  2. How long does it take before people start citing a published article?

For the first question I’m really thinking about the number of citations an article has. There are other ways to measure success, but this is a big one—especially if, like me, you’d like to get hired somewhere someday—and I suspect that a lot of the other ones wind up correlating with this one anyway. But how many citations does a successful article have? 5? 50? 500? This varies widely by discipline, and I had no idea what the right answer was for English / literary criticism.

The second question is related, but mostly born out of morbid fascination with the glacial pace of knowledge sharing in my field. Obviously we talk to each other like normal people, so ideas get spread around through informal means as quickly as they do in any other walk of life, but our peer-reviewed publication process is notoriously slow. Unless you’re already a well-known scholar, the best timeline you can really hope for when you set out to publish an article is about a year from submission to print, and that’s if you write really fast and get accepted on your first try—it’s not unheard of for an article to exist for years before it finally shows up in a journal.

Given that pace, I wondered how long it takes for an article to start being cited by other scholars. If it takes a year to get published, does that mean it takes another year to get cited? Is the print version of the discipline effectively operating at a two-year lag relative to people’s ideas?

To test both questions, I did some quick and dirty data analysis. This is by no means conclusive, but I think it tells us more than we (or at least I) knew before.

Corpus:

I took article titles from PMLA, arguably the flagship journal in the field, and definitely one of the most important journals, even if you prefer to put something else in the top slot. I used editions running from 2010 to the present, because I wanted to see what happens to an article early in its existence; also collecting the data was kind of time-consuming, so I wanted to keep it limited. I also only looked at what I took to be the main articles, so no notes from the editors, nothing organized under subsections like “Theories and Methodologies”, “Our Changing Discipline”, “Criticism in Translation”, “Little-Known Documents”, etc. Things under “Cluster on” whatever, or “Special Topics” I did use. Basically, if it looked like it was in the middle of the edition, I took it. It’s possible this skews the results somehow, but at the end of the day I just wanted a bunch of articles from this decade in a prominent journal, and I definitely got that—specifically, 152 of them. Still, it’s worth saying that this is not a comprehensive look at PMLA.*

Method:

I then used Google Scholar to figure out how many citations each article has so far. I’ve never verified the accuracy of Google’s numbers, but spot-checks have usually panned out, and I expect that they’re within an acceptable range of the truth over this many articles. It’s possible that there are little errors here and there, as I logged the numbers by hand while listening to music, and was briefly kicked off Google Scholar because they suspected I was a robot.** But I think they’re accurate, and haven’t noticed any discrepancies so far.

Results:

It turns out that the answer to Question 2 has quite a substantial impact on the answer to Question 1, so let’s start by looking at the relationship between citations and the passage of time.

Figure 1: Citations per year

Here we’ve got circles representing articles at various citation levels; the size tells you how many articles there are at that level. So, for example, that big circle at 0 in 2015 is big because there are 24 articles published that year that have never been cited anywhere. Meanwhile one article from 2013 has been cited 39 times, the most of anything in my corpus. (The article is Valerie Traub’s “The New Unhistoricism in Queer Studies”.)

As you can see, there’s a strong negative correlation between year of publication and number of citations: the more recent the article, the fewer citations it has. If you just correlate Years with Total Citations, you get a coefficient of -.98 (the trend line above tells the same story). Here’s that data in a table:

Table 1

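Year     Total Citations   Total Articles   Citations per Article
2010     210               27               7.78
2011     162               24               6.75
2012     109               28               3.89
2013     70                18               3.89
2014     30                22               1.36
2015     5                 28               0.18
2016     0                 5                0.00
Total    586               152              3.86

(The Total row divides all 586 citations by all 152 articles. The unweighted mean of the seven yearly averages is 3.41.)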

 

So far we’ve only had one issue in 2016, but leaving it out actually raises the correlation coefficient between Years and Citations to -.99. This story does get a little more complicated if you break things out by issues of the journal, rather than lumping things together by year:

Figure 2: Citations per issue

The basic trend holds, though the correlation weakens to -.83. This suggests that citations are not sensitive to time at the level of three months or particular issues, which sounds intuitively right to me; but none of these correlations are based on very long time periods, and the first few are based on very small data sets, so I wouldn’t read too much into them aside from the headline finding.
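For anyone who wants to check the year-level coefficients, here is a minimal Python sketch that recomputes them from the Table 1 totals. It assumes a plain Pearson correlation between publication year and total citations; the names (`TABLE_1`, `pearson`, `year_correlation`) are mine, and this is not necessarily how the numbers were first produced.

```python
from math import sqrt

# Rows copied from Table 1: (year, total citations, total articles).
TABLE_1 = [
    (2010, 210, 27),
    (2011, 162, 24),
    (2012, 109, 28),
    (2013, 70, 18),
    (2014, 30, 22),
    (2015, 5, 28),
    (2016, 0, 5),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_correlation(rows):
    """Correlate publication year with total citations for the given rows."""
    years = [year for year, _, _ in rows]
    citations = [cites for _, cites, _ in rows]
    return pearson(years, citations)


# Citations per article for each year (the last column of Table 1).
per_article = {year: cites / articles for year, cites, articles in TABLE_1}

r_all = year_correlation(TABLE_1)
r_without_2016 = year_correlation([row for row in TABLE_1 if row[0] != 2016])

print(f"all years:    {r_all:.2f}")
print(f"without 2016: {r_without_2016:.2f}")
```

The issue-level coefficient can't be rechecked this way, since the per-issue counts aren't listed in the post.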

Analysis:

The Citation Time Lag (CTL) is quite powerful, and appears to exert a strong pressure against any citations within the first year of publication. The average number of citations for an article published in 2011 is higher than the total number of citations for all 28 articles published in 2015. Two years out, the situation is much less bleak: Of the 22 articles published in 2014, 18 have at least one citation. This might be a little hint in favor of the timeline I mentioned at the beginning of the post; that is, if it takes about a year to publish things, then we’re too early for 2015 articles to have put up much of a showing, and just right for 2014 articles to break out.

There are two questions about the CTL that this data does not satisfactorily answer. First, why the brief plateau in citations-per-article (C/A) in 2012-2013? The technical answer is that Traub’s 2013 article is such an outlier that it skews the whole year up; in a world of 1-5 citations, having 39 is huge. If you artificially lower the number to 25 (equivalent to the second-most successful article in the corpus), 2013 has 3.11 C/A, more in line with the rest of the trend. But to me this really reveals just how limited this corpus is; if one article can have such a strong effect, I’d really like to see more articles in the data to even things out. That’s a good reason to expand this research in the future.
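The outlier adjustment is simple enough to spell out. Here is a small Python sketch of the arithmetic, using only the 2013 figures from Table 1 and the two citation counts mentioned above; the variable and function names are made up for illustration.

```python
# 2013 figures from Table 1.
TOTAL_CITATIONS_2013 = 70
TOTAL_ARTICLES_2013 = 18

# Citation counts named in the text.
TOP_ARTICLE_CITATIONS = 39      # the most-cited article in the corpus
SECOND_ARTICLE_CITATIONS = 25   # the second-most-cited article in the corpus


def citations_per_article(total_citations, total_articles):
    """Average citations per article (C/A)."""
    return total_citations / total_articles


original = citations_per_article(TOTAL_CITATIONS_2013, TOTAL_ARTICLES_2013)

# Swap the outlier's 39 citations for 25 and recompute the yearly average.
adjusted_total = (
    TOTAL_CITATIONS_2013 - TOP_ARTICLE_CITATIONS + SECOND_ARTICLE_CITATIONS
)
adjusted = citations_per_article(adjusted_total, TOTAL_ARTICLES_2013)

print(f"2013 C/A as reported:       {original:.2f}")
print(f"2013 C/A with outlier at 25: {adjusted:.2f}")
```

So a single article accounts for a shift of about 0.78 in the yearly average, which is a lot in a corpus where most averages sit below 4.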

Second, when does the CTL abate? Obviously the citations per article aren’t likely to increase at this rate forever. A random issue from 50 years ago may well contain no articles that are still cited. The superstar effect would be strong in older issues, too—the one article in a year so prominent that it has stood the test of time would skew things for its issue compared to the others. Of course I can’t answer this question based on this data; that’s another good reason to dig deeper.

Still, we have enough here to offer a provisional answer to Question 1. Fortunately for us humanists, the answer is, It depends. If your article is one year old and has no citations, you’re not a failure; you’re everyone. Articles that are two years old top out at 2-3 citations. After that, the sky’s the limit; Traub’s article is just three years old (though of course, the average across articles continues to rise with time). For articles published in the last five years, 40 citations is about as successful as you can be. Two other articles break 20 citations; the top 5% have at least 15 citations; the top 10% have at least 11.

This seems to be the order of magnitude for success within this time frame: an article with ten or more citations. The community of scholars in my field appears, at least in print, to evolve very slowly and to form relatively few connections. It’s a bittersweet pill for young scholars; on one hand, your ideas won’t be in this particular kind of circulation anytime soon. On the other hand, if you’re worried about lack of interest in something you’ve published, well, just check back in a few years—the peak of popularity is just a few like minds away.

 


Notes:

* I did look at “Theories and Methodologies” (TM) articles for 2013. They averaged slightly fewer citations than articles I categorized as “main” (i.e., not in a subsection), although the main-article average was bolstered by the 39 citations of Valerie Traub’s article; aside from that the citation numbers were similar. Based on this limited sample, TM articles appear to be shorter and to cite fewer things themselves (i.e., their own bibliographies are shorter), but they also might be written by more prominent scholars. At least, I felt that I recognized a higher percentage of them off the bat. In theory this could give them a leg up as far as generating citations more quickly; that could be interesting to test further.

** I was not. Sources used during the collection process include Harvey, P.J., Let England Shake; and Simpson, Sturgill, A Sailor’s Guide to Earth.

The Edible Ox

By way of introducing this blog, I thought I’d just explain the name. It comes from the hilarious and bizarre satirical novel The Good Soldier Švejk, written by the Czech author Jaroslav Hašek just after World War I. Hašek lived a very fast and chaotic life consisting largely of anarchism, alcoholism, vagrancy, literature, and the kind of lunatic life-consuming humor that makes you wonder exactly how in-on-the-joke the guy living it actually was.

At one point, because of his love for one Jarmila Mayerová, whose parents were respectable enough to be basically horrified by his interest in her, Hašek cleaned up his act a little, cranking out 64 stories in one year and securing a job at a journal called The Animal World. I have no idea what this journal ordinarily did (lists of animals?), but, in the words of Cecil Parrott, who edited the volume I have, Hašek “was soon dismissed for writing articles about non-existent animals which he had invented” (ix). The unforgivable sin at the animal magazine is inventing the animals. Pretty soon Hašek was back to vagrancy and other adventures, like selling dogs, faking his suicide, and founding a political party called “The Party of Moderate and Peaceful Progress Within the Limits of the Law”, which actually railed against the monarchy and prevailing political system. (As Parrott explains, “Of course it was only another hoax, designed partly to satisfy Hašek’s innate thirst for exhibitionism and partly to bolster the finances of the pub where the election meetings were held” (x).)

The obvious question here is: What were those animals? I can’t find any information about Hašek’s inventions in the real magazine, but fortunately a character named Marek in The Good Soldier Švejk has an experience suspiciously similar to Hašek’s. These are the animals he invents:

  • The Sulphur-Bellied Whale, “the size of a cod” and “equipped with a bladder full of formic acid” which he can shoot at fish
  • The Artful Prosperian, “a mammal of the kangaroo family”
  • The Edible Ox, “the ancient prototype of the cow”
  • The Sepia Infusorian, “which I characterized as a sort of sewer rat”
  • The Faraway Bat, a “bat from Iceland”
  • The Irritable Bazouky Stag-Puss, a “domestic cat from the peak of Mount Kilimanjaro”
  • Engineer Khun’s Flea, found in amber and blind “because it lived on an underground prehistoric mole, which was also blind” (all from page 325)

Of these, I thought the ones that sounded most like a blog title were the Artful Prosperian, the Edible Ox, and the Faraway Bat. The first is probably the most apt, but it sounded too stuffy to me. I was worried people would assume it was a reference to an 18th-century satirical newspaper full of inscrutable jokes about Whigs (obviously a reference to a Czech satirical novel is completely different). The Faraway Bat is my favorite joke in the list, but it reminds me too much of Batman. But the Edible Ox has it all.

In practice most of the posts on this blog will probably be about literature, politics, basketball, etc., rather than nonexistent animals. But I’m hoping I can retain the Spirit of Marek:

‘I can say that I did my best and kept to my action programme for running the magazine as far as lay within my own powers. But I soon discovered that my articles went beyond my capabilities.
‘Wishing to offer the public something completely new I invented animals.’
(324)