JSTOR Text Analysis Support: Working with JSTOR Full-Text Datasets

On this page:

Requesting full-text datasets for text analysis
Full-text dataset formatting

Requesting full-text datasets for text analysis

To request the full text of items for analysis, you must submit a dataset request that includes a list of the item IDs you want, taken from the JSTOR bibliographic metadata file. See Working with JSTOR Bibliographic Metadata for item ID list formatting requirements.
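As a rough illustration of preparing an item ID list, the Python sketch below validates candidate IDs as UUIDs (the type listed for item_id), drops duplicates, and renders them as plain text. The one-ID-per-line layout, the function names, and the output file name are assumptions made for this example, not JSTOR requirements; check Working with JSTOR Bibliographic Metadata for the actual list format before uploading.

```python
import uuid


def clean_item_ids(raw_ids):
    """Return unique, lowercase UUID strings in first-seen order.

    Blank entries and values that do not parse as UUIDs are skipped.
    """
    seen = set()
    cleaned = []
    for raw in raw_ids:
        candidate = raw.strip()
        if not candidate:
            continue
        try:
            canonical = str(uuid.UUID(candidate))
        except ValueError:
            continue  # not a well-formed UUID, so leave it out
        if canonical not in seen:
            seen.add(canonical)
            cleaned.append(canonical)
    return cleaned


def render_item_id_list(item_ids):
    """Render IDs one per line (ASSUMED layout, not confirmed by JSTOR)."""
    return "".join(f"{item_id}\n" for item_id in item_ids)


def write_item_id_list(item_ids, path):
    """Write the rendered list to a UTF-8 text file at `path`."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_item_id_list(item_ids))
```

For example, `write_item_id_list(clean_item_ids(candidates), "item_ids.txt")` would produce a TXT file from a list of candidate strings, where `item_ids.txt` is a placeholder name.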

To submit a full-text dataset request:

1. Log in to the JSTOR text analysis support page with your personal JSTOR account.
   - If you don't already have a JSTOR account, you can register for one for free.
2. Under Request item full-text, select the Create request button.
3. Fill out the request form, upload your item ID list by selecting the Upload TXT button, and agree to the Terms and Conditions.
4. Select the Submit button to submit your dataset request to JSTOR.

You'll receive email updates about the status of your request and, if it is approved, a link to download your full-text file. See JSTOR Text Analysis Support: Getting Started for requirements.

Note: The dataset file you request will be available to download on the related dataset request page for 60 days. To request the same items after a dataset expires, please create a new dataset request.

Full-text dataset formatting

The following fields are available in the JSTOR full-text file. Use the item_id to join the corresponding metadata for each item from the bibliographic metadata file.

Field	Type	Description	Example
item_id	UUID	Unique identifier for a JSTOR item.	2c0018ee-094b-3f3c-b677-2b56c0f73b7e
references	List of strings	If JSTOR has structured reference data associated with this item (such as endnotes), the references are included here as a list of strings.	["ref 1", "ref 2"]
full_text	List of strings	For most items, OCR data is split out into one string per page.	["page one of text…", "page two of text…"]
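
The join on item_id described above might look like the following Python sketch. It assumes both files are serialized as JSON Lines (one JSON object per line), which this page does not confirm; adjust the parsing if your files use another format. The function names and output keys are made up for this example. Only item_id, references, and full_text come from the table above.

```python
import json


def load_metadata_index(lines):
    """Map item_id -> metadata record from an iterable of JSON Lines strings."""
    index = {}
    for line in lines:
        if line.strip():
            record = json.loads(line)
            index[record["item_id"]] = record
    return index


def join_full_text(full_text_lines, metadata_index):
    """Yield one merged dict per full-text record.

    `metadata` is None when no bibliographic record shares the item_id.
    """
    for line in full_text_lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        # One string per page for most items, so this is usually a page list.
        segments = record.get("full_text") or []
        yield {
            "item_id": record["item_id"],
            "metadata": metadata_index.get(record["item_id"]),
            "segment_count": len(segments),
            "text": "\n".join(segments),
            "references": record.get("references") or [],
        }
```

Because open file objects iterate line by line, both functions can be passed file handles directly. Building the metadata index first keeps the full-text file streaming, so large datasets do not need to fit in memory at once.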
