Threat from the White House: Using corpus linguistics to look at White House press briefings

[Image: NSA word cloud]

Since Edward Snowden’s intelligence breach at the National Security Agency (NSA) in the United States, I’ve been interested in the messages coming out of the White House to counter the claims about the widespread surveillance that has been taking place.

For this study, I looked at a corpus of transcripts of White House press briefings from June 2013 until January 2014. This gave me a corpus of 1,142,774 tokens.

When I looked at the most frequent non-function words, I found the following:

president (8,315), people (3,122), house (2,335), congress (2,002), government (1,985), states (1,758), united (1,670), care (1,611), right (1,526), security (1,496), work (1,468), important (1,441), insurance (1,400), republicans (1,352), American (1,324), white (1,311), health (1,259), president’s (1,254), affordable (1,246), issue (1,171)

The most frequent non-function words suggest that the White House briefings contained a large amount of material on the health care insurance program that the Obama administration has been trying to implement. Interestingly, issues related to the NSA revelations were not more prominent, although the word security may have been used in relation to them.
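As a rough illustration of how a frequency list of non-function words can be produced (this is not the tool used for this study), here is a minimal Python sketch. The stop list FUNCTION_WORDS, the tokenizer pattern and the sample sentence are all assumptions made for the example; a real analysis would use a fuller function-word list and the full corpus.

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption); a real study would use a
# much fuller list of function words.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "that", "it",
    "on", "for", "with", "as", "has", "be", "this", "we", "i",
}

def content_word_frequencies(text, top_n=20):
    """Return the top_n most frequent tokens that are not function words."""
    # Lowercase, then keep alphabetic tokens with an optional apostrophe
    # part (straight or curly), so "president's" stays one token.
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())
    counts = Counter(t for t in tokens if t not in FUNCTION_WORDS)
    return counts.most_common(top_n)

sample = ("The president said the president's plan is important. "
          "The president spoke to Congress.")
print(content_word_frequencies(sample, top_n=3))
```

Note that this sketch counts word forms, not lemmas, which is why president and president’s would appear as separate entries, as they do in the list above.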



[Image: concordance lines for “security interests”]

Although security was prominent, I want to focus on the usage of another word within the corpus: threat. Forms of the lemma THREAT occurred 629 times.
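Counting the forms of a lemma can be sketched with a regular expression. The pattern below is an assumption: it covers only the noun forms threat and threats, and treats threaten and threatening as a separate lemma. The post does not say which forms were included in the count of 629.

```python
import re

# Hypothetical form list for the lemma THREAT: noun forms only.
THREAT_FORMS = re.compile(r"\bthreats?\b", re.IGNORECASE)

def count_lemma(text, pattern=THREAT_FORMS):
    """Count every occurrence of the lemma's forms in the text."""
    return len(pattern.findall(text))

sample = "The threat is real. Threats evolve, and threatening language is not counted."
print(count_lemma(sample))
```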



[Image: concordance lines for “threat that”]

Not all threats were associated with terrorism; economic issues and concerns were also discussed in terms of threat. However, the collocates of the lemma THREAT suggest that terrorism is a major concern.
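For readers unfamiliar with collocates, the idea is to count the words that occur within a few tokens of the node word. The sketch below uses raw co-occurrence counts and a window of two tokens either side; both choices are assumptions for illustration, and concordancing software typically ranks collocates with an association measure such as mutual information instead.

```python
import re
from collections import Counter

def collocates(tokens, node_forms, window=4):
    """Count tokens appearing within `window` positions of any node form."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok in node_forms:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts

# Invented sample sentence, not taken from the corpus.
sample = "the terrorist threat is real and the threat of terrorism is ongoing"
tokens = re.findall(r"[a-z']+", sample.lower())
top = collocates(tokens, {"threat", "threats"}, window=2)
print(top.most_common(5))
```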

[Image: collocates of THREAT]

[Image: concordance lines for “threat is”]

In the selected concordance lines above, threat is described as vague, current, changing, imminent, real, significant, ongoing, and global.

What the threat actually consists of is not apparent in these concordance lines, but the administration appears to consider these threats good reason to allow the NSA to continue its work.
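Concordance lines of this kind (keyword in context, or KWIC) can be approximated in a few lines of Python. This is only a sketch: the function name kwic, the character-based context width and the sample text are assumptions, and dedicated concordancers offer sorting and filtering that this does not.

```python
import re

def kwic(text, pattern, width=30):
    """Return keyword-in-context lines with `width` characters either side."""
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the node phrase lines up vertically.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}".rstrip())
    return lines

# Invented sample text, not taken from the corpus.
sample = "We know the threat is real. The threat is ongoing."
for line in kwic(sample, r"\bthreat is\b", width=10):
    print(line)
```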






“open and honest debate” A Corpus Linguistic Analysis of a White House Press Briefing during the NSA Prism Scandal

[Image: briefings screenshot]

As we are witnessing the current NSA Prism scandal, I thought it would be interesting to have a look at a White House press briefing (transcripts can be found at:

I used the transcript of the briefing held on June 11th, 2013. I cleaned the text very slightly by removing the words Mr. and Carney: the press secretary’s name appears regularly in the transcript to indicate the speaker, and leaving it in would distort the results. The text in question contains 7,294 words. I was interested in looking at the keywords of the text, and in order to obtain a keyword list, I used WordSmith. As a reference corpus, I used a section of the Westbury Lab Usenet Corpus, a 30 billion word corpus of Usenet newsgroup postings. (This corpus is available for download, but a BitTorrent client and patience are required!:

When I looked at the keywords, I found the following:


As can be seen, the keyword list brings out the principal themes of the text, which include not only the NSA controversy but also topics such as Syria and Nelson Mandela.

The word with the fourth highest level of keyness is debate, and I would like to spend some time on it.
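To give a sense of what a keyword tool does, the cleaning and keyness steps can be sketched in Python. This is an illustration under stated assumptions, not a reproduction of the analysis: it assumes the log-likelihood statistic (the post does not say which keyness statistic was selected), the function names are hypothetical, and the two sample texts are invented stand-ins for the briefing and the reference corpus.

```python
import math
import re
from collections import Counter

def clean_transcript(text):
    """Remove the speaker labels "Mr." and "Carney", as described above."""
    return re.sub(r"\bMr\.|\bCarney\b", "", text, flags=re.IGNORECASE)

def tokenize(text):
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())

def log_likelihood(a, b, c, d):
    """a, b: word frequency in study and reference text; c, d: text sizes."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study text
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference text
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(study_tokens, ref_tokens, top_n=10):
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    c, d = len(study_tokens), len(ref_tokens)
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only words relatively more frequent in the study text.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Invented miniature texts standing in for the briefing and the reference corpus.
briefing = "Mr. Carney: The president welcomes the debate. The debate is healthy."
reference = "The weather is fine. The game is on. The market is up. The president spoke."
study_tokens = tokenize(clean_transcript(briefing))
print(keywords(study_tokens, tokenize(reference), top_n=3))
```

The reason for comparing against a reference corpus is that very frequent words such as the score low on keyness, because they are about as common in the reference text as in the briefing.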

And just to remind ourselves before continuing, the Oxford dictionary defines debate as follows:


A search of the concordance lines of this word gives the following:


I find it quite fascinating that the NSA Prism project, which until very recently remained secret, is now being framed by the White House in terms such as:

he is interested and believes in a debate

spirited and animated debate

healthy debate

honest debate

important debate

merits debate

welcomes the debate

If this debate is so healthy, honest, important, merited and welcomed, why has it taken the actions of a whistleblower to make it happen?