TANDEM in toto

TANDEM: A Web-Based Text and Image Data Generator

Kelly Blanchat, Jojo Karlin, Stephen Real, Christopher Vitale
DH Praxis, Spring 2015

ABSTRACT

TANDEM is a Python-based Django web application that generates text and image data from files submitted by the user. TANDEM is for scholars seeking quantitative insight into a corpus consisting of picture books, comics, advertisements, and other images with overlaid text. The TANDEM application combines three existing open source technologies: Tesseract OCR, Open Source Computer Vision (OpenCV), and a natural language processing library called Natural Language Toolkit ... Read more

Week 13 Project Update

WEEK 13 TANDEM PROJECT UPDATE: We are happy to announce that the initial version of our near-polished UI is up and functioning on http://dhtandem.com/. This development means that you can now go to the site and walk through uploading files as well as review some early versions of our documentation. Immediate next steps for our team include replacing the text on the documentation pages with the more robust versions we have patiently waiting in the wings while we finalize the connection of the ... Read more

Tandem Git Repository

The Python script is available on GitHub. The repo is here. The core program is Tandem0.2.py in the tandem folder. Read more

Project Update Week 8

TANDEM 0.5 will be moving from its heavy development phase into a testing and forward-facing design phase this week. At the time of this posting, Steve and Chris are still working out the specifics of functioning unified code, but testing of the independent scripts has begun with a certain degree of success. Text and image values are easily generated via independent processes. This week we also discussed the idea of data persistence in some depth. Simply put, would someone be able to access the ... Read more

Week 7 Project Update

Things are barreling ahead on TANDEM development! With our corpus defined and development goals set, the team is taking a two-pronged approach to reaching the final project. While Chris and Steve focus on continuing to develop and code the working project, Kelly and Jojo have turned their attention to the work to be done with the corpus. Just as important as building TANDEM is the ability to show a proof of concept and illustrate the value of the output TANDEM generates. While the ... Read more

Week 6 Project Update

Development: On the image processing side of things, Chris has identified the syntax for generating our key values. Now we are working toward stitching the pieces together in a way that makes sense for our output. The extreme minimum of computer vision is accessible via OpenCV and while the possibilities are tantalizing, we have continued to keep a direct focus on the key pieces we need to access for the MVP. TANDEM is still on track. We have also begun to reevaluate our ... Read more