Future History

Lately I’ve been thinking a lot about the best historical comparisons for Clinton and Trump as potential Presidents. For Clinton, I think it’s clear that she’d fit right into the legacy of Bill and Barack–the expected center-left progress characterized by a lot of little policy achievements, ordinary failures, and one or two big signature moves. At worst, I see her as someone like Jimmy Carter, well meaning and competent but stymied by forces outside the President’s control. At best, the guy who comes to mind is Teddy Roosevelt: Too hawkish (though, like her, also a better diplomat than he often gets credit for) and hilariously wonkish (he was so into managing everything that he got involved in arguments about new rules for football and the spelling of words in English), but a tireless worker, an advocate for fair play, and a brilliant politician and institutional architect.

Where micromanagement leads.

For Trump, the ceiling and floor are somewhat difficult to imagine. For the former I guess I can see a case for a Coolidge-like Presidency, where his utter inattention allows things to just sort of chug along while people of color are the victims of constant terrorist violence and the economy careens toward the Great Depression. At worst I can see him incorporating a sort of greatest-hits of the worst things Presidents have ever done–Watergate is a nice model for his paranoia about election rigging; Wilson’s casual stance toward the re-emergence of the KKK is basically already underway at Trump Campaign headquarters; the Alien & Sedition Acts look like the basis for his position on freedom of the press and libel laws; Harding’s notorious corruption is low-hanging fruit for a guy who literally goes on trial for fraud later this month; and given Trump’s cavalier stance on nuclear weapons, he could end everything by revisiting the Cuban Missile Crisis, except this time swapping out Robert F. Kennedy for Donald Trump, Jr.

If I had to tie Trump to just one President, though, I think I’d go with Andrew Jackson: A populist who combined violent racism with distrust of elites, financiers, and government involvement in the economy. The latter stuff led Jackson to cancel the central bank charter in the U.S. and generally issue policies now thought to have contributed to the Panic of 1837, one of the worst economic recessions in American history. The violent racism, of course, led to the Indian Removal Act, which in turn led to the Trail of Tears.

These kinds of things seem like they’re in play in a Trump Presidency. He rails against money in politics, corrupt insiders, and “this Janet Yellen of the Fed” in particular. It’s easy to imagine him deploying his angry incomprehension of economics to put his thumb on the scales (he has basically indicated that he would like to do this) and essentially wage a destructive war against the modern economy. Certainly his utterly indefensible use of anti-Semitism is partly about its historical role as the socialism of idiots. But it also points to Trump’s real campaign emphasis, the racism. He has based his career on a sort of modern-day Mexican Removal Act, with a Muslim ban thrown in for good measure. It’s important not to be too haphazard with a comparison like that; the Trail of Tears was a human catastrophe, an ethnic cleansing that killed thousands. I don’t think Trump is going to march immigrants through the snow until they die. But it’s not too much of a stretch to say that he advocates ethnic cleansing—the definition of the term varies depending on where you look, but it’s basically the removal of an ethnic or religious group from a territory with the aim of making it ethnically homogenous.1 Trump does not explicitly advocate for the violence that often accompanies ethnic cleansing, but the purging, deportations, and vision of purity are  right in his wheelhouse, and on a scale that our country has not seen in decades (perhaps a scale large enough that, practically, it would require Internment Camps, along the models pioneered by enthusiastic Trump supporter Joe Arpaio, thus checking off another of our Worst Hits).

I think America would survive a Trump Presidency; we survived Jackson’s, and that was in an era where a national collapse was much more thinkable than it is now. But I also think it would do lasting, terrible damage, to the white people duped by a man who understands their problems even less than he cares about them, and even worse to the people of color who would experience the worst of a history most of us thought we were done repeating. I used to wonder how on earth we still had Jackson on our money, given all that he did. I chalked it up to some combination of historical ignorance and apathy. I still think it was that—but, apparently, for a lot of people it was also a kind of aspiration.


1. It’s kind of chilling, when you read different definitions of the term “ethnic cleansing”, how closely they echo Trump’s ideas. On Wikipedia, it’s “the systematic forced removal of ethnic or religious groups from a given territory by a more powerful ethnic group, with the intent of making it ethnically homogeneous.” The Encyclopedia Britannica calls it, even more aptly, “the attempt to create ethnically homogeneous geographic areas through the deportation or forcible displacement of persons belonging to particular ethnic groups.”  In the Oxford English Dictionary (that link is behind a paywall), it’s “The purging, by mass expulsion or killing, of one ethnic or religious group by another, esp. from an area of former cohabitation.” Wikipedia has some more legal definitions here. It’s fair to say that Trump is not explicitly calling for terrorist violence, but he does favor ethnically and religiously targeted mass expulsions of people (especially Mexicans and Muslims), and I think it’s pretty clear that this is based on the misguided dream of homogeneous American whiteness that underlies his “Make America Great Again” slogan.

The President Was Here

This post uses Most Distinctive Words to analyze what we talk about when we talk about Presidents.*


I begin with the Wikipedia pages for each U.S. President. I downloaded these in January and then got distracted with work, so they’re a few months out of date, but still relatively fresh compared to most of the texts I work on. I wasn’t too strict about what I took; basically I started at the top of the article and stopped when I felt the article was over. Just having this much gives you access to an underrated form of quantitative textual analysis: checking how long things are. Here are the word counts for each President’s article:

President Word Count
LBJ 18485
JFK 17098
Ike 16458
FDR 16334
Lincoln 15765
Reagan 15374
Wilson 15234
Harding 15220
Grant 15107
Teddy 14868
Nixon 14366
W 14200
Washington 13809
Andrew Johnson 13674
McKinley 12988
Ford 12764
Jackson 12007
Carter 11958
Tyler 11944
Truman 11905
Jefferson 11643
Garfield 11555
Pierce 11537
Clinton 11497
Obama 11437
Hoover 11420
Madison 11008
Adams 10836
George H.W. Bush 10832
Cleveland 10060
Taft 9512
Coolidge 9239
Arthur 9162
JQA 8917
Hayes 8906
Ben Harrison 8423
Buchanan 7035
Van Buren 6966
Monroe 6801
WHH 6714
Taylor 6194
Polk 6096
Fillmore 4774

To me this variation appears to have barely any rhyme or reason. LBJ is a solid contender for the top spot; his Presidency is very tough to rank, because it includes both an incredible domestic agenda (Civil Rights Act, Medicare) and arguably the worst foreign policy agenda (Vietnam). But if you take the “absolute value” of everything he did, there’s no denying he’s one of the most consequential Presidents. Fillmore is also a decent contender for last place, with just over a fourth of LBJ’s word count; I think he’s probably high in the running for “most forgotten President”.** But in between, things quickly get strange. Eisenhower ahead of 4-termer FDR? John Tyler ahead of Thomas Jefferson? Harding ahead of Teddy Roosevelt? Monroe near the bottom?

The big lesson here is that these pages are pretty weird artifacts. Their authors will have stylistic tics (maybe Tyler got a verbose guy, and Monroe got an Imagiste), and editorial decisions might displace whole sections into other articles. For example, in Jefferson’s article, the Louisiana Purchase gets about 250 words, but there’s also a standalone article about the Louisiana Purchase that’s about 5,000 words long—i.e., more worthy of discussion than the entire administration and life of Millard Fillmore, according to random Wikipedia editors.
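
For anyone who wants to try the "how long is it" check themselves, here is a minimal Python sketch. The two sample texts and the `word_count` helper are hypothetical stand-ins, not the articles or the counting rule actually used for the table above; the sketch simply counts whitespace-separated tokens that contain at least one letter or digit.

```python
import re

def word_count(text: str) -> int:
    """Count whitespace-separated tokens containing at least one letter or digit."""
    return sum(1 for tok in text.split() if re.search(r"\w", tok))

# Hypothetical stand-ins for the downloaded article texts.
articles = {
    "Fillmore": "Millard Fillmore was the 13th President of the United States .",
    "LBJ": "Lyndon Baines Johnson , often referred to as LBJ , "
           "was the 36th President of the United States .",
}

counts = {name: word_count(text) for name, text in articles.items()}

# Print longest article first, like the table above.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(name, n)
```

Different tokenization rules (hyphens, captions, footnotes) will shift the totals somewhat, which is one more reason not to read too much into small gaps in the ranking.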

Most Distinctive Words

Still, even with these idiosyncrasies, we ought to be able to extract something interesting from the language of these articles. For instance, which Presidents’ write-ups have the most to do with slavery, or war? What are the most remarked-upon aspects of, say, Teddy’s life, or the founding fathers, or the Gilded Age? What words, if any, set apart the discourse surrounding an icon like Lincoln from that around a tremendous moral failure like Andrew Jackson?

To explore these questions I turned to Most Distinctive Words (MDWs). This is basically a measure of the words that appear more frequently in a given text than we would expect, based on their frequency in some comparison corpus. In my case, that means checking which words appear disproportionately often in one guy’s article, compared to what we’d see if the words were distributed evenly across all articles.*** So, for instance, we might expect to see “atomic” appear distinctively often for Truman, since he dropped more atom bombs than anyone else—and, in fact, “atomic” is a distinctive word for him (though “bombing” gets you Reagan and LBJ as well).
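
To make the observed-versus-expected idea concrete, here is a small Python sketch of a one-sided Fisher's exact test, the test named in the footnote. This is not the R code used for this post. The function name `fisher_greater` and all the toy numbers are assumptions made up for illustration. It asks: if an article's `n` tokens were drawn at random from a corpus of `N` tokens in which the word occurs `K` times, how likely is it to see the word `a` or more times?

```python
from math import comb

def fisher_greater(a: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact p-value: P(X >= a) for X ~ Hypergeometric(N, K, n).

    a: occurrences of the word in one article
    n: total tokens in that article
    K: occurrences of the word in the whole corpus (article included)
    N: total tokens in the whole corpus
    """
    upper = min(n, K)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(a, upper + 1))
    return tail / comb(N, n)

# Toy numbers (invented): a 10-token article inside a 100-token corpus.
a, n, K, N = 3, 10, 4, 100

expected = n * K / N           # count we'd see if the word were spread evenly
p_value = fisher_greater(a, n, K, N)

print(f"observed={a}, expected={expected:.1f}, p={p_value:.4f}")
```

Exact binomial coefficients get slow at real article sizes, so in practice a library routine is the better choice (for example, `scipy.stats.fisher_exact` with `alternative="greater"`, or `fisher.test` in R).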

A few notes about the MDWs you’ll see in the rest of this post: To make life easier, I converted everything to lowercase (that way “train” and “Train” aren’t different words, just because one appears at the beginning of a sentence). I also removed stop words (things like “the” and “of”, which are so frequent that they can skew things, and also are often boring), numbers, and symbols. Finally, I took out the ordinarily used names of Presidents (so, “andrew”, “jackson”, and “jacksons”, the latter to catch possessives), because otherwise they dominate the data, since they are naturally very distinctive of their articles.
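
The cleanup steps above can be sketched in a few lines of Python. The stop-word list here is a tiny hypothetical one, and dropping apostrophes before tokenizing (so that "Jackson’s" becomes "jacksons") is my assumption about how possessives end up as a single word, not a description of the original R code.

```python
import re
from typing import List

# Tiny, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "was", "his"}

# Names removed for the Andrew Jackson article, as described above.
PRESIDENT_NAMES = {"andrew", "jackson", "jacksons"}

def tokenize(text: str) -> List[str]:
    """Lowercase, drop apostrophes, keep only alphabetic runs, then filter."""
    text = text.lower().replace("’", "").replace("'", "")
    tokens = re.findall(r"[a-z]+", text)   # numbers and symbols fall away here
    return [t for t in tokens
            if t not in STOP_WORDS and t not in PRESIDENT_NAMES]

sample = "Andrew Jackson’s Train left the Bank in 1833; the train returned."
tokens = tokenize(sample)
print(tokens)
```

Note that "Train" and "train" collapse into the same token, and "1833" disappears entirely.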

The System Works

When you check the MDWs for a particular guy, you usually find a pretty nice encapsulation of his Presidency’s Greatest Hits. Here are the top few for Lincoln:†

Lincoln MDWs

You start with his two signature issues, pick up his home states, roll through his political acts and opponents, and even capture his assassin and, three cells later, one after the other, the reason he was killed. Another good example is Andrew Jackson:

Andrew Jackson MDWs

You’ve got his famous battle (“orleans”), his refusal to understand finance (“banks”), and his penchant for genocide—rendered all the more striking when you realize that “creek” refers to the Creek tribe (now called Muscogee), who lost a brutal war against Jackson and years later were also victims of the Indian Removal Act.

Since the MDWs work pretty often, it’s pretty striking when they depart from expectations. For some guys, this means a focus on the pre-Presidency—Madison’s top word is “constitution”, Reagan’s are littered with California and Hollywood terms, and Eisenhower’s focus on war terminology for eight straight words until they arrive at “interstate”, before jumping back to “ii”. Ulysses S. Grant is similar—unsurprising, since his own memoir barely mentions that he was President.

In another case that surprised me a little, the focus is on the post-Presidency:

William Howard Taft MDWs

Taft was the only President who ever went on to become a Supreme Court justice. That’s distinguishing in either sense of the word, and a nice legacy for a guy who is probably best known to the public for being too fat to get out of a bathtub. (The article I have says that the evidence for this actually happening is unclear, but gives two sources for the distressingly ambiguous sentence “However, he once did overflow a bathtub.” I’m surprised and a little disappointed to say this whole sequence has been removed from the current version of the article.)

Another guy who surprised me was JFK. The word “assassination” is just 12th on his list; but on reflection, this may have something to do with the 8,000 word separate article on it, not to be confused with the 19,000 word “John F. Kennedy assassination conspiracy theories” article, which is longer than any Presidential article.††

Rules of Distinction

One feature of MDWs is that they privilege proper nouns. This makes sense when you consider just how specific (i.e., distinct) proper nouns are: all sorts of kids have dogs, but only Oblio has Arrow. This means there are a few things that define you if you get a Wikipedia page:

  • Your home. A President’s home state usually appears in his top few MDWs. If a guy has two home states, they both appear: Lincoln gets Illinois and Kentucky, Obama gets Illinois and Hawaii (and, even higher, Chicago). This isn’t a universal rule (JFK doesn’t have “massachusetts”), but it’s quite common.
  • Your wife. George has Martha, John has Abigail, Abe has Mary, Rutherford has Lucy, Herbert has Lou, Dwight has Mamie, Dick has Pat, Ron has Nancy, Bill has Hillary. You’re known by the person you love. But, there’s also:
  • Your enemy. The first word for Washington is “british”; “confederate” makes the top five for Lincoln and Grant; Polk has his “mexico” and Truman his “korea”. Booth, Guiteau, Czolgosz, and Oswald make their expected lists. LBJ has not just “vietnam” but “goldwater”. And look back at the Jackson list above: creek, indian, indians, calhoun, bank, banks, seminole, tribes. That’s eight enemies in just 16 words (and another, “orleans”, is the site of a battle). For everyone, but especially for bloodthirsty maniacs, distinction is conferred by who and what we choose to fight.

Eras, In So Many Words

Another cool option with these MDWs is approaching from the other direction. Once we have them, we can pick a word and see who it encompasses. For instance, take the word “gold”. This turns out to be an MDW for Grant, Hayes, Garfield, Cleveland, Harrison, and McKinley—in other words, every President but one (Arthur) from 1868-1901. This is probably a function of the currency debates that dominated that era (the last three guys also have “silver” as an MDW), but it’s also a nice, very literal way to capture the Gilded Age.
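
Going "from the other direction" is just an inverted index: flip the president-to-words lists into word-to-presidents lists. Here is a rough Python sketch. The `mdws` dictionary holds only the handful of word memberships mentioned in this post (nowhere near the full lists), and the function name `presidents_by_word` is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Partial MDW lists, limited to memberships mentioned in the text.
mdws: Dict[str, List[str]] = {
    "Lincoln": ["confederate"],
    "Grant": ["confederate", "gold"],
    "Hayes": ["gold"],
    "Garfield": ["gold"],
    "Cleveland": ["gold", "silver"],
    "Harrison": ["gold", "silver"],
    "McKinley": ["gold", "silver"],
}

def presidents_by_word(mdws: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert president -> words into word -> presidents."""
    index = defaultdict(list)
    for president, words in mdws.items():
        for word in words:
            index[word].append(president)
    return dict(index)

index = presidents_by_word(mdws)
print(index["gold"])
print(index["silver"])
```

With the full lists loaded, the same lookup works for any word, which makes it easy to go hunting for other era-defining terms.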

Or take another definitive American word: “slave”. That word and “slaves” appear as MDWs for Washington, Jefferson, Madison, Monroe, John Quincy Adams, and Jackson—six of the first seven Presidents, and all of the ones who owned slaves themselves. (JQA, like his father, didn’t own any slaves, and the two words appear in his article in the context of his fierce opposition to slavery; for the rest of them, the words are there mainly because they owned slaves.) After this crew, those two words largely disappear, with the exceptions of Fillmore (he had “moderate anti-slavery views”, according to the article) and Lincoln (for obvious reasons).

But the issue does not disappear. The words “slavery” or “antislavery” appear as MDWs for JQA, Jackson, Van Buren, Polk, Taylor, Fillmore, Pierce, and Buchanan, before coming to a close with Lincoln. That’s everyone between the Founding Fathers and the close of the Civil War with the exceptions of William Henry Harrison (who served one month) and John Tyler (who was in office, but didn’t exactly serve at all). Many of these Presidents were slave-owners themselves, but we see a shift away from personal ownership as the focus (with a few overlap cases), and toward the rise of a political cause—from slaves to slavery. It’s a striking lexical marker of the transition from one paradigm to another, maybe somehow indicating the point at which Wikipedia writers and readers feel that Presidents were “of their time” instead of responsible for it.

A Final Mystery

I want to end with something I noticed but can’t quite explain. The word “president” actually appears as an MDW in several cases. Here they are:

word frequency p value President
president 101 0.000131294 Tyler
president 102 0.001869553 Andrew Johnson
president 74 0.002524355 Taft
president 105 0.006078532 W
president 80 0.008887996 George HW Bush
president 52 0.00954079 WH Harrison
president 96 0.016850757 Nixon
president 86 0.018566542 Ford
president 98 0.038807297 Reagan

In some of these cases, it seems like the word might have to do with unique relationships to the office. Harrison died immediately, Tyler took over even though no one wanted him (he was known as “His Accidency”), while succession laws were still untested, and Johnson abused the office to veto Congress until they impeached him (note: if you include “presidential” in these results, you add Clinton to the mix, suggesting impeachment may play a role). Still, even if this is right, it only explains a few articles. I have no idea what any of this has to do with Taft.

And then there’s this: Every Republican President since 1968 has the word “president” as an MDW. What’s more, in this era it’s only Republicans: Carter, Clinton, and Obama are all missing from that list. Why is this happening? Is it some sort of conservative preference for hierarchy/authority? A right-wing love of the institution? The tendency of these Presidents to wield presidential authority in problematic ways (Watergate, the pardon of the guy who did Watergate, Iran-Contra, the Decider and his father)? Just a random tic from a prolific Wikipedia editor? (Even then, it might be interesting that the editor of these articles has that tic.)

I looked at the word’s usage in the articles in hope of clarity, but the answer wasn’t immediately obvious. I did notice that, in the George W. Bush article, for instance, there was a tendency to call him “President Bush” in photo captions (which are included in the articles I analyzed)—but this doesn’t explain why other articles don’t follow the same practice. This all put me in mind of a bumper sticker I used to see in Texas, that looked roughly like this:


I never knew how to interpret it. What’s the point of stating that the current President is the President? I am being completely honest when I say that I don’t know if this is supposed to be combative, reassuring, snarky, patriotic, a sign of the tribe, or something else I haven’t even thought of. So it’s interesting to see a sort of version of it replicated in these MDWs—105 uses of the word President††† in an article that tells you, right at the top, that it’s about a President. It’s an interesting form of distinction for the modern Republican President—the simple confirmation that they held the job.



*It was very tempting to use this as the title of the post, but I think you just can’t do that anymore. If you Google “what we talk about when we talk about” -love (the last part is so that you don’t get any actual references to Raymond Carver’s short story), you get 211,000 results. Based on those results, here are a few of the things about which we talk about what we talk about when we talk about them:

  • Apple and Compelled Speech
  • Gun Violence
  • “The Uyghurs” (quotation marks in original)
  • Indicators
  • Clone Club
  • Causality
  • GIFs
  • God
  • Minimalism

** I doubt he wins though; his name is too weird. My guess is Ben Harrison.

***Specifically, I used word frequencies from all articles to set expected values, and word frequencies in given articles to set observed values. I then used a Fisher’s exact test to determine which words were significantly more present than expected. I did not look for words that were missing (e.g., if a President’s article says “war” much less than ordinary). My thanks to Mark Algee-Hewitt for helping me write the R code used in this project, and for explaining MDWs to me in the first place.

† In all cases, the words are ordered by p-value, where lower is taken to mean “more distinctive”. Here and below, I’m pasting in partial lists for space purposes.

†† This makes it longer than Macbeth, as well as 7 other Shakespeare plays. See also the 2,800 word “Assassination of John F. Kennedy in Popular Culture” article.

††† W’s article has 105 occurrences of the word “president”, more than three times as many as George Washington, who not only has a roughly equal-length article, but practically invented the office.