What is JSTOR text analysis support?
JSTOR text analysis support accommodates text analysis and digital humanities research by providing full-text datasets of journals, books, research reports, and pamphlets on JSTOR. Text analysis, also known as text analytics or text mining, is the process of using technology to find valuable insights, trends, and patterns in text data and to create new information from them.
You can use text analysis support to download bibliographic metadata for local analysis or to request a custom, full-text dataset for intensive research.
Content available for text analysis
The items available to you for text analysis are independent of the content you can read through institutional or individual participation on JSTOR. You may have access to additional metadata and full-text formats of content for text analysis.
Content available for text analysis includes:
- Most journals on JSTOR
- Most Open Access books on JSTOR
- Items from Reveal Digital’s open collections that have OCR data
- Most pamphlets on JSTOR
- Most research reports on JSTOR
Research requirements
When you request the full text of items for analysis, JSTOR will review your request to confirm that your research complies with the following requirements:
- Your project is in support of research or teaching.
- Your project is not intended to train or otherwise enhance the output of an AI model. (Using a large language model (LLM) to perform text analysis is acceptable.)
- Your end goal is not to create a product.
- Your end goal is not to create a database that would substitute for JSTOR access or access to the publishers’ websites.
- You agree to limit access to the content for the duration of your project to yourself and to researchers supporting your project who need access to such content. You will delete the content when you are done.
- You are requesting fewer than 1.5 million JSTOR items in total.
- Your use of this data is governed by the terms and conditions of use.
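You can check the item-count requirement locally before submitting a request. The following is a minimal Python sketch, not a JSTOR tool. It assumes that your item ID list is a text file with one ID per line and that duplicate IDs count once; the function names and the file name `item_ids.txt` are hypothetical. See Working with JSTOR Bibliographic Metadata for the actual formatting requirements.

```python
# Limit taken from the requirements above: "under 1.5 million items".
MAX_ITEMS = 1_500_000


def count_unique_ids(lines):
    """Count distinct, non-blank IDs in an iterable of text lines.

    Assumption: one item ID per line; blank lines are ignored.
    """
    return len({line.strip() for line in lines if line.strip()})


def is_within_limit(item_count, limit=MAX_ITEMS):
    """Return True when the count is strictly under the limit."""
    return item_count < limit


# Example usage with a hypothetical file name:
#
#     with open("item_ids.txt", encoding="utf-8") as handle:
#         total = count_unique_ids(handle)
#     print(total, "items; within limit:", is_within_limit(total))
```

The comparison is strict because the requirement says "under" 1.5 million. Whether JSTOR counts repeated IDs separately is not stated here, so deduplicate your list if in doubt.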
Requesting full-text data for analysis
To request the full text of items for analysis, you must submit a dataset request that includes a list of the requested item IDs, taken from the JSTOR bibliographic metadata file.
To download the JSTOR bibliographic metadata file:
- Log in to the JSTOR text analysis support page with your personal JSTOR account.
  - If you don't already have a JSTOR account, you can register for one for free.
- Under Download metadata of JSTOR content available for text analysis, read and agree to the Terms and Conditions and then select the Download JSONL button.
- Using the downloaded file, create a list of the IDs of the items for which you want to download the full text.
  - See Working with JSTOR Bibliographic Metadata for item ID list formatting requirements and an example of how to create an item ID list with Python.
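The general approach to the last step can be sketched in Python as follows. This is an illustration only, not JSTOR's own example (which is on the linked page). The field names `item_id` and `content_type`, the filter value, and the file names are all hypothetical, and the one-ID-per-line output is an assumption; check the downloaded file and the formatting requirements for the real keys and format.

```python
import json
from pathlib import Path

# Hypothetical field names: inspect a record in the downloaded file
# to find the keys it actually uses.
ID_FIELD = "item_id"
TYPE_FIELD = "content_type"


def collect_item_ids(metadata_path, wanted_type):
    """Return unique IDs of records whose type matches, in file order."""
    ids = []
    seen = set()
    with open(metadata_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)  # JSONL: one JSON object per line
            item_id = record.get(ID_FIELD)
            if item_id is None or record.get(TYPE_FIELD) != wanted_type:
                continue
            if item_id not in seen:
                seen.add(item_id)
                ids.append(item_id)
    return ids


def write_id_list(ids, output_path):
    """Write one ID per line (assumed format) to a UTF-8 text file."""
    text = "\n".join(str(item_id) for item_id in ids) + "\n"
    Path(output_path).write_text(text, encoding="utf-8")


# Example usage with hypothetical file names and filter value:
#
#     ids = collect_item_ids("metadata.jsonl", "research_report")
#     write_id_list(ids, "item_ids.txt")
```

Reading the file line by line keeps memory use low, which matters if the metadata file is large. Any other filter (for example by date or journal) would replace the type comparison, again using whatever keys the file actually contains.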
To submit a full-text dataset request:
- Log in to the JSTOR text analysis support page with your personal JSTOR account.
- Under Request item full-text, select the Create request button.
- Fill out the request form, upload your item ID list by selecting the Upload TXT button, and agree to the Terms and Conditions.
- Select the Submit button to submit your dataset request to JSTOR.
You'll receive email updates on the status of your request and, if it is approved, a link to download your dataset file. See Working with JSTOR Full-Text Datasets.
Citing data or results from JSTOR text analysis
To cite the JSTOR bibliographic metadata file:
JSTOR Bibliographic Metadata, distributed by JSTOR, https://jstor.org/ta-support/metadata (Accessed: DATE_OF_ACCESS)
To cite a full-text dataset, we suggest you include:
- Creator of the dataset
- Title or description (We recommend you describe the dataset in the citation, in place of the title.)
- Date of creation
- Indication the dataset came from JSTOR
- The JSTOR text analysis support URL: https://jstor.org/ta-support/
We also recommend publishing the list of item IDs included in your final dataset, either as a standalone artifact or as supplementary material with your publication.