Abstract:
A smart camera performs real-time analysis to recognize scenic elements. Smart cameras are useful in a variety of scenarios, such as surveillance and medicine. We have built a real-time system for recognizing gestures. Our smart camera uses novel algorithms to recognize gestures based on low-level analysis of body parts as well as hidden Markov models for the moves that comprise the gestures. These algorithms run on a Trimedia processor. Our system can recognize gestures at the rate of 20 frames/second. The camera can also fuse the results of multiple cameras.

Recent technological advances are enabling a new generation of smart cameras that represent a quantum leap in sophistication.
While today's digital cameras capture images, smart cameras capture high-level descriptions of the scene and analyze what they see. These devices could support a wide variety of applications including human and animal detection, surveillance, motion analysis, and facial identification.

Video processing has an insatiable demand for real-time performance. Fortunately, Moore's law provides an increasing pool of available computing power to apply to real-time analysis. Smart cameras leverage very large-scale integration (VLSI) to provide such analysis in a low-cost, low-power system with substantial memory. Moving well beyond pixel processing and compression, these systems run a wide range of algorithms to extract meaning from streaming video. Because they push the design space in so many dimensions, smart cameras are a leading-edge application for embedded system research.
Detection and Recognition Algorithms
Although there are many approaches to real-time video analysis, we chose to focus initially on human gesture recognition: identifying whether a subject is walking, standing, waving his arms, and so on. Because much work remains to be done on this problem, we sought to design an embedded system that can incorporate future algorithms as well as use those we created exclusively for this application.

Our algorithms use both low-level and high-level processing. The low-level component identifies different body parts and categorizes their movement in simple terms. The high-level component, which is application-dependent, uses this information to recognize each body part's action and the person's overall activity based on scenario parameters. The human detection and activity/gesture recognition algorithm thus has two major parts: low-level processing (blue blocks in Figure 1) and high-level processing.
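
The split between low-level and high-level processing can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the movement codes, the `quantize_motion` helper, the threshold, and every probability in `MODELS` are hypothetical values chosen for illustration. The low-level step reduces a body part's per-frame displacement to a simple symbol, and the high-level step scores the symbol sequence against one small hidden Markov model per gesture using the standard forward algorithm, then picks the best-scoring gesture.

```python
import math

# Hypothetical movement codes for one body part (an arm, say).
STILL, UP, DOWN = 0, 1, 2

def quantize_motion(dy, threshold=2.0):
    """Low-level step (assumed): map vertical displacement in pixels to a
    movement code. Image y grows downward, so negative dy means 'up'."""
    if abs(dy) < threshold:
        return STILL
    return UP if dy < 0 else DOWN

def forward_log_likelihood(obs, start, trans, emit):
    """High-level step: scaled forward algorithm, returning
    log P(obs | model) for a discrete-emission HMM."""
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale to avoid underflow on long sequences.
        alpha = [a / scale for a in alpha]
        log_like += math.log(scale)
    return log_like

# Illustrative models only: (start, transition, emission) per gesture.
MODELS = {
    "standing": ([1.0], [[1.0]], [[0.9, 0.05, 0.05]]),
    "waving": (
        [0.5, 0.5],
        [[0.3, 0.7], [0.7, 0.3]],
        [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
    ),
}

def classify(obs, models=MODELS):
    """Return the gesture whose HMM gives the observations the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Per-frame vertical displacements of an arm (made-up numbers).
displacements = [-5.0, 6.0, -4.5, 5.5, -6.0, 5.0]
symbols = [quantize_motion(dy) for dy in displacements]
gesture = classify(symbols)
```

With these made-up displacements, the symbols alternate between up and down, so the "waving" model scores higher than "standing". In a real system the models would be trained on labeled gesture data rather than written by hand.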