For Class
- Browse the online document/image collections listed in DH Toychest > Data Collections and Datasets > Document/Image Collections to get a sense of what digital texts are available. Concentrate on texts that are no longer in copyright or that can be used under a Creative Commons license, and that are available in plain-text or HTML format. Look especially at the larger, general-purpose text collections that contain downloadable plain-text, HTML, or XML files to see what is there, e.g.:
- EEBO-TCP Texts (see also catalog)
- Internet Archive (click on “Download” link on a book page for download format options)
- Open Library
- Oxford University Text Archive
- Project Gutenberg
———————-
- Examine the Eighteenth Century Collections Online corpus: 2,198 plain-text English documents from Eighteenth Century Collections Online [TCP-ECCO] (zip file). The zip file contains the spreadsheet metadata.xls and a folder containing the full text of all the novels in plain-text form.
- Instructions on unzipping (decompressing) zip files: Mac, Windows.
- See spreadsheet (metadata.xls) listing the authors and novels.
- Collect a list of 10 sample works (with links to them) that can be worked with in plain-text format, and leave it as a souvenir of your experimentation on the course site: go to the page “Practicum 2 – Finding Digital Texts” and create a post there called “Your Name – Finding Digital Texts”.
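If you would rather unzip the TCP-ECCO corpus with a script than with the Mac or Windows instructions, the step can be sketched in Python roughly as follows. This is only a minimal sketch using the standard library: the function names `extract_corpus` and `word_count` are made up for illustration, and it assumes the texts inside the zip file end in `.txt` and are readable as UTF-8, which you should check against the actual download. It does not read metadata.xls, since that needs a spreadsheet library.

```python
import zipfile
from pathlib import Path


def extract_corpus(zip_path, dest_dir):
    """Unzip the archive into dest_dir and return the sorted .txt paths found.

    Hypothetical helper: assumes the corpus texts use a .txt extension.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search every subfolder, since the texts sit in a folder inside the zip.
    return sorted(dest.rglob("*.txt"))


def word_count(text_path):
    """Return a rough word count by splitting the file on whitespace.

    Assumes UTF-8; undecodable bytes are replaced rather than raising.
    """
    text = Path(text_path).read_text(encoding="utf-8", errors="replace")
    return len(text.split())
```

For example, `extract_corpus("ecco-tcp.zip", "ecco-tcp")` would return the list of text files, where both names are placeholders for wherever you saved the download and wherever you want the files to go. Passing any one of the returned paths to `word_count` gives a quick sense of how long that text is.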