TANDEM

Building Blocks

OCR

Text recognition generates readable text.

Natural Language Processing

Light natural language processing toolkits generate text data.

Image Feature Extraction

Quantitative data generated from image values.

About

TANDEM is an online environment that generates quantitative image and text data from files submitted by the user. This output is intended to be used as source material for data visualization, quantitative analysis, and distant reading of multimodal print objects.

TANDEM compiles existing open source technologies, including a version of OCR, image feature extraction, and light natural language processing packages, to generate useful output. The output will be concatenated into a single document that can be saved as a .CSV file.
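As a rough illustration of what "concatenated into a single document" could look like, the sketch below combines a few text measures and a few image measures into one CSV row per page. It is not TANDEM's actual code: the function names, the column names, and the sample data are all hypothetical, the text is assumed to have already come out of an OCR step, and each image is assumed to be a small grid of grayscale values from 0 to 255.

```python
import csv
import io
from statistics import mean


def text_features(text):
    """Simple text measures for one page (hypothetical column names)."""
    words = text.split()
    return {
        "word_count": len(words),
        "avg_word_length": round(mean(len(w) for w in words), 2) if words else 0.0,
    }


def image_features(pixels):
    """Simple measures over a grid of grayscale values (0-255)."""
    flat = [value for row in pixels for value in row]
    return {
        "mean_brightness": round(mean(flat), 2),
        "min_value": min(flat),
        "max_value": max(flat),
    }


def build_csv(pages):
    """Merge text and image measures into one CSV row per page."""
    fieldnames = [
        "page", "word_count", "avg_word_length",
        "mean_brightness", "min_value", "max_value",
    ]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for name, text, pixels in pages:
        row = {"page": name}
        row.update(text_features(text))
        row.update(image_features(pixels))
        writer.writerow(row)
    return buffer.getvalue()


# Invented sample data: (page name, OCR text, grayscale pixel grid).
sample_pages = [
    ("page_01", "The cat sat on the mat", [[0, 128], [255, 129]]),
    ("page_02", "A dog ran", [[10, 20], [30, 40]]),
]
csv_text = build_csv(sample_pages)
print(csv_text)
```

Keeping one row per page and one column per measure means the resulting file can be opened directly in a spreadsheet or a visualization tool, which matches the stated purpose of the output.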

To explore the functionality of TANDEM, we will employ a test corpus of public domain picture books. The test corpus will illustrate how TANDEM streamlines the generation of the kinds of data needed to make informed distant readings of multimodal print artifacts. TANDEM's intended audience is scholars with a range of computational expertise and a need for quantitative insight into picture books, comics, illuminated manuscripts, and other images with overlaid text.

Team Members

Kelly Blanchat

UX/UI Designer

Stephen Real

Developer

Jojo Karlin

Outreach Coordinator

Christopher Vitale

Project Manager
