This tutorial was written by Katherine Walden, Digital Liberal Arts Specialist at Grinnell College. The tutorial framework was created by Sarah Purcell (L.F. Parker Professor of History, Grinnell College) and Papa Ampim-Darko, a student research assistant at Grinnell College.
This tutorial was reviewed by Gina Donovan (Instructional Technologist, Grinnell College).
This tutorial is adapted from the Programming Historian’s Corpus Analysis with AntConc tutorial.
Text Analysis in AntConc is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Developed by Laurence Anthony, AntConc is a free, closed-source program that runs on Windows, macOS, and Linux. At the most basic level, AntConc is a concordancer: a program that constructs a concordance based on terms in a text or collection of texts. AntConc also allows users to visualize concordance results, generate word and keyword lists based on terms present in the text, and run cluster and collocation analyses.
With Voyant, we explored a graphical user interface option for conducting textual analysis. AntConc offers a somewhat more hands-on, customizable approach to analyzing a text.
The computers in the Dlab have AntConc pre-installed.
If you want to work with AntConc on your own computer, select the appropriate version for your operating system and follow the installation instructions.
1-Launch AntConc by double clicking on the Desktop icon or searching for the program in the Start menu.
2-AntConc allows you to open single files, as well as open an entire file directory. For this tutorial, we will be working with a large number of oral history text files, so opening the directory makes more sense than loading these files individually.
3-Select File->Open Dir and navigate to the cleaned_txt_files folder. Click OK.
4-The loaded files will be listed on the left-hand window in AntConc, and the total number of files will display at the bottom of that window.
5-The main AntConc screen gives you access to seven different textual analysis tools.
- Concordance searches for and displays keywords in context (KWIC).
- Concordance Plot presents a preliminary, basic visualization of a KWIC search.
- File View is like the Reader panel in Voyant—it shows you a full file view to see a search result in the larger context of a text.
- Clusters highlights terms that appear together frequently in the text.
- Collocates calculates the statistical likelihood of terms appearing together in the text. Where Clusters looks for term patterns exactly as they are represented in the text, Collocates estimates how likely terms are to appear near each other.
- Word List calculates how frequently words appear in your text.
- Keyword List compares two text sources (a reference text and an analysis text) to identify words that are unusually frequent in the analysis text.
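To make the Word List idea concrete, here is a minimal Python sketch of a frequency count. It is not how AntConc is implemented; the function name `word_list`, the tokenizing rule (lowercase runs of letters and apostrophes), and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

def word_list(text):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    # Assumed tokenizer: lowercase the text and keep runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

# Hypothetical sample text, not taken from the oral history files.
sample = "The school was small. The teachers at the school knew every family."
for word, count in word_list(sample)[:3]:
    print(word, count)
```

Notice that the function word "the" rises to the top of the list, which is typical of raw frequency counts.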
Searching Keywords in Context
6-AntConc (and other computing tools) excel at identifying patterns in language that are not always detected by the average reader. For example, function words like a, an, the, he, she, I, etc. (often called stopwords in textual analysis) don’t frequently catch our attention as readers. A computational tool focuses on analyzing the words as term objects, rather than interpreting them based on meaning, context, or function.
7-Type “the” in the Search Term box at the bottom of the Concordance window and click Start.
8-The Concordance tab shows key words in context (KWIC), with the search term highlighted.
9-The Kwic Sort options allow you to change how AntConc displays or sorts the context for your search term. 1R includes the term immediately to the right of your search term, 2R includes the second term to the right from your search term, etc. 1L includes the term immediately to the left of your search term, 2L includes the second term to the left from your search term, etc.
10-Change the Kwic Sort options, and click on the Sort icon. How did your search results change? What happens if you continue to customize or edit the Kwic Sort options? How do you understand a key word differently based on how you tell the program to calculate context?
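If it helps to see the logic behind the Kwic Sort options, the following Python sketch builds a small keyword-in-context list and sorts it by a position such as 1R or 2L. It is an illustration under stated assumptions, not AntConc's own code: the names `kwic` and `sort_hits`, the three-word context window, and the sample sentences are hypothetical.

```python
import re

def kwic(text, term, window=3):
    """Collect (left context, keyword, right context) for every hit of term."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == term.lower():
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, token, right))
    return hits

def sort_hits(hits, position):
    """Sort hits by the word at a position such as '1R' (first right) or '2L'."""
    distance = int(position[:-1])
    side = position[-1].upper()

    def key(hit):
        left, _, right = hit
        if side == "R":
            words, index = right, distance - 1
        else:
            words, index = left, len(left) - distance
        # Hits with no word at that position sort first (empty string).
        return words[index].lower() if 0 <= index < len(words) else ""

    return sorted(hits, key=key)

# Hypothetical sample text.
sample = "The school bus left. The old school closed. The teacher stayed."
for left, keyword, right in sort_hits(kwic(sample, "the"), "1R"):
    print(" ".join(left).rjust(20), keyword.upper(), " ".join(right))
```

Changing `"1R"` to `"1L"` reorders the same hits by the word immediately to the left, which mirrors what happens when you change the sort levels in AntConc.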
Visualizing Keywords in Context
11-Search for “school” in the Concordance Tab.
12-Once your search has loaded, click on the Concordance Plot tab to visualize your search results.
13-Each instance of the keyword is represented as a vertical black line. AntConc visualizes how keyword appearances are distributed across each file in the Corpus Files.
14-Clicking on a specific line takes you to that passage of the text in File View.
15-How useful do you find these preliminary visualizations? How do they compare to the types of visualizations generated on Voyant? How do these visualizations impact your understanding of the text? What questions do you have based on these visualizations?
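One way to think about what the Concordance Plot shows: each hit is placed according to how far into the file it occurs. The Python sketch below assumes that simple model (position measured in words, scaled to a fixed-width row); the names `hit_positions` and `plot_row` are hypothetical, and AntConc's actual drawing logic may differ.

```python
import re

def hit_positions(text, term):
    """Return each hit's position as a fraction of the file length (0.0 = start)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return []
    return [i / len(tokens) for i, token in enumerate(tokens) if token == term.lower()]

def plot_row(positions, width=40):
    """Draw a text-only plot: '|' marks a hit, '-' marks no hit."""
    row = ["-"] * width
    for position in positions:
        row[min(int(position * width), width - 1)] = "|"
    return "".join(row)

# Hypothetical sample text standing in for one corpus file.
sample = "school is out and school is in session"
print(plot_row(hit_positions(sample, "school"), width=8))
```

Running the same two functions over each file in a folder would give one row per file, similar in spirit to the stacked plots AntConc displays.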
16-If you’re familiar with Boolean searching, you know symbols can be used in a search to customize or focus your search results. AntConc uses a series of wildcard operators to allow you to further customize your search.
17-Go to Global Settings-> Wildcard Settings to view or edit the full list of available wildcard operators.
18-Search for m?n and wom?n and compare your results.
Note on operators:
The * operator, which is also common in database and library catalog searching, stands in for any number of characters. The ? operator is more specific because it stands in for exactly one character. For example, searching m*n will bring back results that include men, mean, mellon, etc. Searching m?n will return only three-letter matches such as men, man, and min. Similarly, wom?n will return woman and women.
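The difference between the two operators can be sketched in Python by translating each wildcard into a regular expression. This covers only * and ?, and assumes * means "zero or more word characters" and ? means "exactly one word character"; the helper names and the word list are hypothetical. Check Wildcard Settings for the definitions your copy of AntConc actually uses.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a pattern using * and ? into a compiled regular expression."""
    parts = []
    for character in pattern:
        if character == "*":
            parts.append(r"\w*")   # assumed: zero or more characters
        elif character == "?":
            parts.append(r"\w")    # assumed: exactly one character
        else:
            parts.append(re.escape(character))
    return re.compile("".join(parts), re.IGNORECASE)

def matching_words(pattern, words):
    """Return the words that match the whole pattern."""
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.fullmatch(word)]

# Hypothetical word list.
words = ["man", "men", "mean", "woman", "women", "moon"]
print(matching_words("m?n", words))    # three-letter matches only
print(matching_words("m*n", words))    # any length between m and n
print(matching_words("wom?n", words))
```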
Clusters and N-Grams
19-Click on the Clusters/N-Grams tab and search for sport.
20-AntConc ranks your search results and, for each cluster, displays its text, its frequency, and its range (the number of files in which the cluster appears).
21-The default Search Term Position places the search term on the left side of the cluster. Change the Search Term Position selection to On Right and click Start to re-run the search. How did your search results change?
22-Cluster Size determines the range for the number of terms AntConc searches and displays. How are your search results different when you change this range?
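The cluster search described in this section can be approximated with a short Python sketch that counts word sequences beginning (or ending) with the search term, along with the number of files each sequence appears in. The function name `clusters`, its parameters, and the two sample "files" are assumptions for illustration only.

```python
import re
from collections import Counter

def clusters(texts, term, size=2, term_on_left=True):
    """Return (cluster, frequency, range) rows, where range = number of files."""
    frequency = Counter()
    file_range = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        seen_in_file = set()
        for i in range(len(tokens) - size + 1):
            gram = tokens[i:i + size]
            anchor = gram[0] if term_on_left else gram[-1]
            if anchor == term:
                cluster = " ".join(gram)
                frequency[cluster] += 1
                seen_in_file.add(cluster)
        # Count each cluster at most once per file for the range column.
        file_range.update(seen_in_file)
    return [(c, n, file_range[c]) for c, n in frequency.most_common()]

# Two hypothetical files.
files = ["Sport teams met. Sport teams won.", "Every sport club closed."]
for row in clusters(files, "sport", size=2):
    print(row)
```

Passing `term_on_left=False` corresponds to choosing On Right, and changing `size` corresponds to changing Cluster Size.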
Exporting in AntConc
23-After you are satisfied with a search result, click File->Save Output to save the search result as a text file (*.txt).
24-Save the file as [SEARCH TERM]_cluster_search or another descriptive name.
25-Conduct another Cluster search for study, customize your results, and export as a text file.
26-Right click on the exported text files and open in Notepad or Notepad++ to compare search results.
Collocates and Word Lists
27-As mentioned earlier in the tutorial, Clusters analyzes what words appear most frequently alongside your search term.
28-Collocates calculates what terms are statistically probable to appear near your search term. Freq calculates overall frequency, Freq(L) looks at frequency for terms to the left of your search term, and Freq(R) calculates frequency for terms to the right of your search term. Stat uses the Mutual Information (MI) and T-score calculations outlined in Stubbs (1995) to calculate the statistical probability of term collocation.
29-Use family as your search term.
30-AntConc will display a pop-up window message about needing to generate a Word List. Click OK to have AntConc generate that list automatically.
31-What terms are statistically likely to appear in proximity to your search term? What happens when you change the Window Span (number of words to the right and left of your search term AntConc will include in the analysis)?
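For readers curious about what a collocation score involves, here is a minimal Python sketch using one common textbook form of Mutual Information: log2 of (observed co-occurrences × corpus size) divided by (frequency of the search term × frequency of the collocate). This is an assumption for illustration and may not match AntConc's exact calculation; the function name `collocates`, the `span` parameter, and the sample sentence are hypothetical.

```python
import math
import re
from collections import Counter

def collocates(text, term, span=1):
    """Return (word, freq, freq_left, freq_right, mi) rows for words near term."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    word_freq = Counter(tokens)   # plays the role of the Word List
    left, right = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token != term:
            continue
        left.update(tokens[max(0, i - span):i])
        right.update(tokens[i + 1:i + 1 + span])
    rows = []
    for word in set(left) | set(right):
        observed = left[word] + right[word]
        # Assumed MI form; AntConc's own statistic may be computed differently.
        mi = math.log2(observed * total / (word_freq[term] * word_freq[word]))
        rows.append((word, observed, left[word], right[word], mi))
    return sorted(rows, key=lambda row: (-row[4], row[0]))

# Hypothetical sample text.
sample = "my family farm and my family home"
for row in collocates(sample, "family", span=1):
    print(row)
```

The sketch also shows why AntConc asks for a Word List first: the score depends on each word's overall frequency, not just on how often it appears near the search term. Increasing `span` plays the same role as widening the Window Span.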
- What do you notice is similar about Voyant Tools and AntConc as digital tools for textual analysis? What are the differences?
- Which did you prefer working with? Why?
- How was your understanding of the text impacted by the analysis we did in AntConc? What questions do you still have?
- What would be your next step in analyzing this text, using Voyant or AntConc?
- What types of research questions can you see textual analysis being useful to answer or respond to?