Requesting full-text datasets for text analysis
To request the full text of items for analysis, you must submit a dataset request that includes a list of item IDs taken from the JSTOR bibliographic metadata file. See Working with JSTOR Bibliographic Metadata for item ID list formatting requirements.
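Before building the list, it can help to check the item IDs you collected. The following is a minimal sketch, assuming the IDs are already loaded as Python strings; it relies only on the fact that an item_id is a UUID. The function name clean_item_ids is hypothetical and not part of any JSTOR tool, and the layout of the uploaded list itself is still governed by the formatting requirements referenced above.

```python
import uuid


def clean_item_ids(raw_ids):
    """Return unique, well-formed UUID strings in first-seen order.

    Hypothetical helper: it only checks that each value parses as a UUID.
    It does not check that the ID exists in the JSTOR metadata file.
    """
    seen = set()
    cleaned = []
    for raw in raw_ids:
        candidate = raw.strip()
        try:
            # str(UUID) yields the lowercase, hyphenated form.
            normalized = str(uuid.UUID(candidate))
        except ValueError:
            continue  # skip values that are not valid UUIDs
        if normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned
```

Duplicates and malformed values are dropped silently here; you may prefer to log them so that you can trace them back to the metadata file.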
To submit a full-text dataset request:
- Log in to the JSTOR text analysis support page with your personal JSTOR account. If you don't already have a JSTOR account, you can register for one for free.
- Under Request item full-text, select the Create request button.
- Fill out the request form, upload your item ID list by selecting the Upload TXT button, and agree to the Terms and Conditions.
- Select the Submit button to submit your dataset request to JSTOR.
You'll receive email updates on the status of your request and, if it is approved, a link to download your full-text file. See JSTOR Text Analysis Support: Getting Started for requirements.
Note: The dataset file you request will be available to download on the related dataset request page for 60 days. To request the same items after a dataset expires, please create a new dataset request.
Full-text dataset formatting
The following fields are available in the JSTOR full-text file. Use the item_id to join the corresponding metadata for each item from the bibliographic metadata file.
Field | Type | Description | Example |
---|---|---|---|
item_id | UUID | Unique identifier for a JSTOR item. | 2c0018ee-094b-3f3c-b677-2b56c0f73b7e |
references | List of strings | If JSTOR has structured reference data for this item (such as endnotes), the references are included here as a list of strings. | [“ref 1”, “ref 2”] |
full_text | List of strings | For most items, OCR data is split out into one string per page. | [“page one of text…”, “page two of text…”] |
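
As an illustration of how these fields might be used together, here is a minimal sketch that joins full-text records to metadata on item_id and concatenates the full_text strings into one document per item. It assumes both files have already been parsed into lists of Python dictionaries, and that the metadata rows expose the identifier under the key item_id. The function name attach_metadata and the output keys are hypothetical. How the downloaded file is parsed depends on its actual format, which is not covered here.

```python
def attach_metadata(full_text_records, metadata_rows):
    """Join full-text records to metadata rows on item_id.

    Assumes both inputs are lists of dicts that were parsed elsewhere.
    """
    # Index the metadata once so each lookup is a dictionary access.
    metadata_by_id = {row["item_id"]: row for row in metadata_rows}

    joined = []
    for record in full_text_records:
        strings = record.get("full_text") or []
        joined.append({
            "item_id": record["item_id"],
            # None when the item has no matching metadata row.
            "metadata": metadata_by_id.get(record["item_id"]),
            # For most items this equals the page count (one string per page).
            "string_count": len(strings),
            "text": "\n".join(strings),
            "references": record.get("references") or [],
        })
    return joined
```

Items with no matching metadata row are kept with metadata set to None, so missing joins stay visible rather than being dropped.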