TANDEM in toto

TANDEM: A Web-Based Text and Image Data Generator
Kelly Blanchat, Jojo Karlin, Stephen Real, Christopher Vitale
DH Praxis Spring 2015

ABSTRACT

TANDEM is a Python-based Django web application that generates text and image data from files submitted by the user. TANDEM is for scholars seeking quantitative insight into a corpus consisting of picture books, comics, advertisements, and other images with overlaid text. The TANDEM application combines three existing open-source technologies: Tesseract OCR, Open Source Computer Vision (OpenCV), and a natural language processing library called Natural Language Toolkit ...

Week 5 Project Update

Excitement! Team TANDEM is working fast and furiously on all fronts. We've hit a few snags, but all told, we feel like we've got a handhold on the mountains we're climbing. Here's a brief overview of the ups and downs of the week: Our hope that we might springboard off Lev's tool proved to be something of a castle in the air. Lev's feature extractor was coded in a day, and when they later tried to run it again, they couldn't. Lev suggested we use OpenCV instead. OpenCV ...