Wachsmuth, Sven: Multi-modal scene understanding using probabilistic models. 2001
Contents
- Contents
- Introduction
- Problem Statement
- Robust Processing in Human-Computer Interaction
- Basic Principles of Computer Vision
- Basic Principles of Automatic Speech Understanding
- Integration of Speech and Image Processing -- An Overview
- Psychological experiments and the level of information processing
- Linguistics and the symbol grounding problem
- Spatial cognition
- A categorization of computational systems
- The Correspondence Problem
- Other Related Work
- Contributions
- A Model for Uncertainty
- Intensional and extensional models
- Bayesian Networks
- Definition of Bayesian networks
- Modeling in Bayesian networks
- How to get those numbers? Some simplification
- Modeling corresponding variables
- Inference in Bayesian Networks
- I-maps, moral graphs, and d-separation
- Singly connected networks
- Coping with loops
- A conditional bucket elimination scheme
- Relation to Graph Matching
- Applications of Bayesian Networks
- Bayesian networks for integration of speech and images
- Modeling
- Scenario and Domain Description
- Experimental Data
- The General System Architecture
- The speech understanding and dialog components
- The object recognition component
- Speech understanding and vision results
- Spatial Modeling
- A model for 3-d projective relations
- The spatial model in two dimensions
- The neighborhood graph
- Localization attributes
- Summary
- Object Identification using Bayesian Networks
- Previous work
- Starting points for improvements
- An extended Bayesian model for object classes
- A Bayesian model for spatial relations
- Modeling structural relationships
- Integrating the what and where
- Summary
- Inference and Learning
- Establishing referential links
- Interaction of speech and image understanding
- The most probable class of the intended object
- Interpretation of structural descriptions
- Unknown object names
- Disambiguating alternative interpretations of an utterance
- Disambiguating the selected reference frame
- Detection of neighborhood relations
- Further Learning Capabilities
- Results
- Test Sets
- Classification of System Answers
- Results on the Select-Obj test set
- Results on the Select-Rel test set
- Object Classification using Speech and Image Features
- Summary
- Summary and Conclusion
- The Integration of Speech and Images as a Probabilistic Decoding Process
- Contributions
- Future Work
- Final Remarks
- The elementary objects of baufix
- Bibliography
- Index
