ISMAR 2014 - Sep 10-12 - Munich, Germany

Training Detectors and Recognizers in Python and OpenCV

SCHEDULE INFORMATION

Session Title: Training Detectors and Recognizers in Python and OpenCV
Room: TBA
Start: Tuesday 09 Sep, 2014 02:00 PM
End: 05:30 PM
Organizers: Joseph Howse, Nummist Media, Canada
Description

Abstract

Monty Python's Flying Circus had a "cat detector van", so in this tutorial we use Python and OpenCV to make our very own cat detector and recognizer. We also cover examples of human face detection and recognition. More generally, we cover a methodology for training a detector (based on Haar cascades) for any class of object and a recognizer (based on local binary pattern histograms (LBPH), Fisherfaces, or Eigenfaces) for any individual objects. We build a small GUI app that enables an LBPH-based recognizer to learn new objects interactively in real time. Although this tutorial uses Python, the project could be ported to Android and iOS using OpenCV's Java bindings and C++ API.
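
As a rough illustration of the detection half of this methodology, the sketch below runs a prebuilt Haar cascade on a single image. It is not the tutorial's own code. The function names detect_faces and largest_rect are hypothetical, the parameter defaults are arbitrary starting points, and the default cascade path assumes an OpenCV install (such as the pip wheels) that exposes the stock cascades under cv2.data.haarcascades.

```python
"""Sketch: detect faces with a prebuilt Haar cascade (assumes OpenCV's cv2)."""


def largest_rect(rects):
    """Return the (x, y, w, h) rectangle with the largest area, or None."""
    best = None
    for (x, y, w, h) in rects:
        if best is None or w * h > best[2] * best[3]:
            best = (int(x), int(y), int(w), int(h))
    return best


def detect_faces(bgr_image, cascade_path=None,
                 scale_factor=1.1, min_neighbors=5, min_size=(40, 40)):
    """Run a Haar cascade on a BGR image; return a list of (x, y, w, h)."""
    import cv2  # imported lazily so largest_rect works without OpenCV

    if cascade_path is None:
        # Assumption: this install ships the stock cascades under cv2.data.
        cascade_path = (cv2.data.haarcascades +
                        "haarcascade_frontalface_default.xml")
    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise IOError("Could not load cascade: " + cascade_path)

    # Haar cascades operate on grayscale; equalizing the histogram
    # makes detection less sensitive to lighting.
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)

    rects = cascade.detectMultiScale(
        gray, scaleFactor=scale_factor, minNeighbors=min_neighbors,
        minSize=min_size)
    return [tuple(int(v) for v in r) for r in rects]
```

A caller might pass a frame read with cv2.imread or from a camera, then use largest_rect to pick the most prominent face as input for a recognizer. A smaller scale_factor or min_neighbors tends to find more candidates at the cost of speed or false positives.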

Attendees will gain experience in using OpenCV to detect and recognize visual subjects, especially human and animal faces. GUI development will also be emphasized. Attendees will be guided toward additional information in books and online. There is no formal evaluation of attendees' work, but attendees are invited to demonstrate their work and discuss the results they achieve during the session with different detectors, recognizers, and parameters.

Schedule

  • 14:00 - 14:25 1. Setting up OpenCV and related libraries (25 minutes)
    Windows XP/Vista/7/8
    Mac OS X 10.6+ using MacPorts
    Debian Linux and its derivatives, including Ubuntu
  • 14:25 - 14:50 2. Building a GUI app that processes and displays a live camera feed (25 minutes)
  • 14:50 - 15:10 3. Detecting human faces (and other subjects) using prebuilt Haar cascades (20 minutes)
    Concept of Haar cascades
    Implementation of a Haar-based detector in OpenCV
    Our GUI for detection
  • 15:10 - 15:25 4. Break (15 minutes)
  • 15:25 - 16:15 5. Training a custom Haar cascade to detect cat faces (50 minutes)
    Obtaining annotated training images
    Parsing annotation data and preprocessing the training images
    Using OpenCV's training tools
  • 16:15 - 16:40 6. Recognizing faces of individual humans and individual cats (25 minutes)
    Local binary pattern histograms (LBPH) – concept and OpenCV implementation
    Our GUI for incrementally training and testing an LBPH recognizer
    Fisherfaces – concept and OpenCV implementation
    Eigenfaces – concept and OpenCV implementation
  • 16:40 - 17:00 7. Demos and discussion of attendees' work (optional) or discussion of the project's portability to Android and iOS (20 minutes)
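
To give a feel for the LBPH concept in item 6, here is a minimal pure-Python sketch of the basic local binary pattern operator and a histogram comparison. It is a teaching simplification, not OpenCV's implementation: it uses a fixed 3x3 neighbourhood and a single histogram for the whole image, whereas a practical recognizer typically concatenates histograms over a grid of cells. The function names and the choice of chi-square distance are assumptions made for illustration.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of the pixel at (r, c) from its 3x3 neighbourhood."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dr, dc in offsets:
        # Each neighbour contributes one bit: 1 if it is at least as
        # bright as the centre pixel, else 0.
        bit = 1 if img[r + dr][c + dc] >= center else 0
        code = (code << 1) | bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


def chi_square(h1, h2):
    """Chi-square distance between histograms (lower = more similar)."""
    return sum((a - b) ** 2 / float(a + b)
               for a, b in zip(h1, h2) if a + b > 0)
```

For the 3x3 patch [[1, 2, 3], [4, 5, 6], [7, 8, 9]], lbp_code(patch, 1, 1) is 30 (binary 00011110), because only the right, bottom-right, bottom, and bottom-left neighbours are at least as bright as the centre. Recognition then amounts to finding the stored histogram closest to the query's, which is why new subjects can be added incrementally without retraining on the old ones.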

Form of Presentation

The tutorial will include a presentation of PDF slides, videos featuring detection/recognition of real cats, and a live demo featuring detection/recognition of humans and artificial cats. The project and documentation will be available for download during the tutorial. Because of the long processing time required to train a detector, attendees will not be able to fully execute this step during the tutorial. However, a variety of pre-trained detectors will be provided and attendees will be able to parameterize and combine detectors in original ways and train their own recognizers. Training of human recognizers is a good opportunity for attendees to mingle. The tutorial allows time for attendees to demonstrate and discuss their work if they wish.
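
Although full detector training is too slow to run during the session, the preparation step can be sketched briefly. The example below assumes annotations have already been parsed into (image path, list of face rectangles) pairs, which is a hypothetical intermediate format and not necessarily the one used in the tutorial. It formats them as the plain-text description files that OpenCV's training tools read: one line per positive image giving the path, the object count, and an x y width height group per object, plus a list of background image paths.

```python
def positive_description_line(image_path, rects):
    """Format one line: '<path> <count> <x> <y> <w> <h> ...'."""
    parts = [image_path, str(len(rects))]
    for (x, y, w, h) in rects:
        parts.extend(str(int(v)) for v in (x, y, w, h))
    return " ".join(parts)


def build_positive_description(annotations):
    """Build the positive description text from (path, rects) pairs.

    Images without any annotated rectangle are skipped, since they
    contribute no positive samples.
    """
    lines = [positive_description_line(path, rects)
             for path, rects in annotations if rects]
    return "".join(line + "\n" for line in lines)


def build_negative_description(image_paths):
    """Build the background description text: one image path per line."""
    return "".join(path + "\n" for path in image_paths)
```

In OpenCV releases that include the training tools, the positive text would typically be saved to a file and passed to opencv_createsamples through its -info option, and the background list passed to opencv_traincascade through -bg. Paths containing spaces would break this space-separated format, so the sketch assumes they contain none.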

Intended Audience

This tutorial is intended for any developer who wants an introduction to detection and recognition in OpenCV. No expertise in computer vision or algorithms is assumed. Familiarity with Python and shell scripting would be helpful.

Instructor Background

Joseph (Joe) Howse has worked in the AR industry since 2011. He is President of Nummist Media Corporation Limited (http://nummist.com), providing software development and training services to clients worldwide. His publications include OpenCV for Secret Agents (Packt Publishing, forthcoming), OpenCV Application Programming for Android (Packt Publishing, 2013), OpenCV Computer Vision with Python (Packt Publishing, 2013), and “Illusion SDK: An Augmented Reality Engine for Flash 11” (ISMAR Workshop on Authoring Solutions for Augmented Reality, 2012). Joe holds a Master of Computer Science, an MA in International Development Studies, and an MBA from Dalhousie University. He has no difficulty detecting or recognizing any of his four splendid cats.

Further Information

http://nummist.com/opencv