TANDEM in toto

TANDEM: A Web-Based Text and Image Data Generator
Kelly Blanchat, Jojo Karlin, Stephen Real, Christopher Vitale
DH Praxis Spring 2015

ABSTRACT

TANDEM is a Python-based Django web application that generates text and image data from files submitted by the user. TANDEM is for scholars seeking quantitative insight into a corpus consisting of picture books, comics, advertisements, and other images with overlaid text. The TANDEM application combines three existing open source technologies: Tesseract OCR, Open Source Computer Vision (OpenCV), and a natural language processing library called Natural Language Toolkit ...

Project Update Week 8

TANDEM 0.5 moves from its heavy development phase into a testing and forward-facing design phase this week. At the time of this posting, Steve and Chris are still working out the specifics of functioning unified code, but testing of the independent scripts has begun with some success. Text and image values are easily generated via independent processes. This week we also discussed the idea of data persistence in some depth. Simply put, would someone be able to access the ...