Multimodal Rooms: "Smart Rooms" / "Intelligent Environments" (Seminar SS 03)

User Interfaces. In the beginning: "Wimpy" computing (Windows, Icons, Menus, Pointing).

2nd Generation: Human-Machine Interaction. Speaking, pointing, gesturing, handwriting, drawing, presence/focus of attention; combination of speech + handwriting + gesture; repair; multimodal NLP & dialog. "Please show me… hm… all hotels in THIS area… er… part of the city."

"Perceptual" User Interfaces. Perceptive: human-like perceptual capabilities (what is the user saying, who is the user, where is the user, what is he doing?). Multimodal: people use multiple modalities to communicate (speech, gestures, facial expressions, …). Multimedia: text, graphics, audio and video. (Matthew Turk (Ed.), Proceedings of the 1998 Workshop on Perceptual User Interfaces)

Next: Pervasive Computing. Human-computer interaction is not the only exchange: humans want to interact with other humans, with computers in the human interaction loop (CHIL). The transparent, invisible computer: computers need to be context aware, should require little or no learning or attention, should be proactive rather than command driven, should produce little or no distraction, and should permit a mix of HCI and CHIL.

Smart/Intelligent Rooms: Use computation to enhance everyday activity; integrate computers seamlessly into the real world (e.g. offices, homes); use "natural" interfaces for communication (voice, gesture, etc.). The computer should adapt to the human, not vice versa!

Perception: In order to respond appropriately, objects and the room need to pay attention to people and context. Machines have to be aware of their environment: who, what, when, where and why? Interfaces must adapt to the overall situation and to the individual user.

Intelligent Environments: Classroom 2000 (Georgia Tech), Mozer's Adaptive House, enhanced meeting rooms, the Kids Room (MIT), …; enhanced objects such as whiteboards, desks, chairs, … See also the Intelligent Environments Resource Page (http://www.research.microsoft.com/ierp/).

Intelligent Rooms, Univ. California, San Diego

Classroom 2000: Capturing activity in a classroom: the speaker's voice, video, slides, handwritten notes.

Classroom 2000: Presenting recorded lectures through a web-based interface; integration of slides, notes, audio and video; searching; adding additional material.

Microsoft Easy Living Project: XML-based distributed agent system. Computer vision for person tracking and visual user interaction. Multiple sensor modalities combined. Use of a geometric model of the world to provide context. Automatic or semi-automatic sensor calibration and model building. Fine-grained events and adaptation of the user interface. Device-independent communication and data protocols. Ability to extend the system in many ways.

Mozer's Adaptive House: Operated as an ordinary home, with the usual light switches, thermostats, doors, etc. Adjustments are measured and used to train the house to automatically adjust temperature, adjust lighting, and choose music or TV channels. The house infers the users' desires from their actions and behaviours.

Adaptive House (Mozer). Sensors: light level, sound level, temperature, motion, door status, window status, light settings, fans, heaters, … (M. Mozer, Univ. of Colorado, Boulder)

Issues in Perception. Visual: face detection / tracking, body tracking, face recognition, gesture recognition, action recognition, gaze tracking / tracking focus of attention. Auditory: speech recognition, speaker tracking, auditory scene analysis, speaker identification. Other: haptic, olfactory, … ?

Enhanced Meeting Rooms: capturing of meetings, transcription, summarization, dialog processing. Who was there? Who talked to whom?

Work at ISL: face tracking, facial feature tracking (eyes, nose, mouth), head pose estimation / gaze tracking, lip-reading (audio-visual speech recognition), 3D person tracking, pointing gesture tracking. Other modalities: speech (!!!, see John), dialogue, translation, handwriting, ...

Tracking of Human Faces. A face provides different functions: identification, perception of emotional expressions. Human-computer interaction requires tracking of faces: lip-reading, eye/gaze tracking, facial action analysis / synthesis. Video conferencing / video telephony application: tracking the speaker, achieving low-bit-rate transmission.

Demo: FaceTracker

Color-Based Face Tracking. Human skin colors cluster in a small area of a color space; skin colors of different people mainly differ in intensity! The variance can be reduced by color normalization, and the distribution can be characterized by a Gaussian model. Chromatic colors: r = R/(R+G+B), g = G/(R+G+B).
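A minimal sketch of such a skin-color model, assuming rg chromatic coordinates and a single 2D Gaussian fitted to labelled skin pixels (function names and the normalization details are illustrative, not taken from the original system):

```python
import numpy as np

def chromatic(image):
    """Convert an RGB image (H x W x 3) to chromatic (r, g) coordinates."""
    img = image.astype(np.float64)
    s = img.sum(axis=2, keepdims=True) + 1e-6      # R+G+B, avoid division by zero
    return img[..., :2] / s                         # r = R/(R+G+B), g = G/(R+G+B)

def fit_skin_model(skin_pixels_rg):
    """Fit a 2D Gaussian (mean, covariance) to labelled skin pixels in rg space."""
    mu = skin_pixels_rg.mean(axis=0)
    cov = np.cov(skin_pixels_rg, rowvar=False)
    return mu, cov

def skin_likelihood(image, mu, cov):
    """Per-pixel Gaussian likelihood of being skin; threshold to get a face mask."""
    d = chromatic(image) - mu
    inv = np.linalg.inv(cov)
    m = np.einsum('...i,ij,...j->...', d, inv, d)   # squared Mahalanobis distance
    return np.exp(-0.5 * m)
```

In use, the likelihood map would be thresholded and the largest skin-colored blob tracked from frame to frame; the model parameters can be re-estimated on the tracked region to adapt to changing lighting.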

Color Model. Advantages: very fast; orientation invariant; stable object representation; not person-dependent; model parameters can be quickly adapted. Disadvantages: environment dependent (light sources heavily affect the color distribution).

Tracking Gaze and Focus of Attention. In meetings: to determine the addressee of a speech act, to track the participants' attention, to analyse who was in the center of focus, for meeting indexing / retrieval. In interactive rooms: to guide the environment's focus to the right application, to suppress unwanted responses. Also: virtual collaborative workspaces (CSCW), human-robot cooperation, cars (driver monitoring).

Tracking a User's Focus of Attention. Focus-of-attention tracking: to detect a person's interest, to know what a user is interacting with, to understand his actions/intentions, to know whether a user is aware of something. In meetings: to determine the addressee of a speech act, to understand the dynamics of interaction, for meeting indexing / retrieval. Other areas: smart environments, video conferencing, human-robot interaction.

Head Pose Estimation. Model-based approaches: locate and track a number of facial features, then compute the head pose from 2D-3D correspondences (Gee & Cipolla '94, Stiefelhagen et al. '96, Jebara & Pentland '97, Toyama '98). Example-based approaches: estimate a new pose with a function approximator such as an ANN (Beymer et al. '94, Schiele & Waibel '95, Rae & Ritter '98), or use a face database to encode images (Pentland et al. '94).

Model-based Head Pose Estimation: Find correspondences between points in a 3D model and points in the image; iteratively solve a linear equation system to find the pose parameters (rx, ry, rz, tx, ty, tz). [Diagram: feature tracking and pose estimation connect the 3D model to the real-world image.]
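As a hedged illustration of the 2D-3D correspondence step, the sketch below uses OpenCV's solvePnP as a modern stand-in for the iterative linear solver described above; the 3D model points and feature names are placeholders, not the ISL face model:

```python
import numpy as np
import cv2

# Generic 3D face model points (nose tip, eye corners, mouth corners) in model
# coordinates (millimetres) -- placeholder values for illustration only.
MODEL_POINTS = np.array([
    [0.0,    0.0,   0.0],    # nose tip
    [-30.0,  35.0, -30.0],   # left eye corner
    [30.0,   35.0, -30.0],   # right eye corner
    [-25.0, -35.0, -30.0],   # left mouth corner
    [25.0,  -35.0, -30.0],   # right mouth corner
], dtype=np.float64)

def estimate_head_pose(image_points, camera_matrix):
    """Recover rotation and translation from tracked 2D facial feature points."""
    dist = np.zeros(4)  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points, camera_matrix,
                                  dist, flags=cv2.SOLVEPNP_ITERATIVE)
    return rvec, tvec   # rotation (Rodrigues vector) and translation (tx, ty, tz)
```

The rotation vector can be converted to (rx, ry, rz) with cv2.Rodrigues; as the slide on problems below notes, the quality of the estimate depends directly on correct feature localization.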

Demo: Facial Feature Tracking

Demo: Model-based Head Pose

Model-based Head Pose: Pose estimation accuracy depends on correct feature localization! Problems: choice of good features; occlusion due to strong head rotation; fast head movement; detection of tracking failure / re-initialization; requires good image resolution. [Video]

Estimating Head Pose with ANNs: Train a neural network to estimate head orientation; a preprocessed image of the face is used as input.

Network Architecture. Output: pan (or tilt) angle. Hidden layer: 40 to 150 units. Input retina: up to 3 × 20×30 pixels = 1,800 units.
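A minimal numpy sketch of such a network, assuming one tanh hidden layer, a single linear output for the pan (or tilt) angle, and standard backpropagation with a momentum term as mentioned on the training slide below; layer sizes, learning rate and momentum are placeholders:

```python
import numpy as np

class PoseNet:
    """One-hidden-layer net: preprocessed face retina -> pan (or tilt) angle."""

    def __init__(self, n_in=1800, n_hidden=80, lr=1e-3, momentum=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.01, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.01, (n_hidden, 1))
        self.b2 = np.zeros(1)
        self.lr, self.momentum = lr, momentum
        self.vel = {p: 0.0 for p in ("W1", "b1", "W2", "b2")}

    def forward(self, x):
        self.h = np.tanh(x @ self.W1 + self.b1)        # hidden activations
        return self.h @ self.W2 + self.b2               # linear output: angle

    def train_step(self, x, target):
        """One step of standard backprop with a momentum term (MSE loss).
        x: (batch, n_in) retinas; target: (batch, 1) angles."""
        err = self.forward(x) - target
        grads = {"W2": self.h.T @ err, "b2": err.sum(0)}
        dh = (err @ self.W2.T) * (1.0 - self.h ** 2)    # backprop through tanh
        grads["W1"] = x.T @ dh
        grads["b1"] = dh.sum(0)
        for name, g in grads.items():
            v = self.momentum * self.vel[name] - self.lr * g / len(x)
            self.vel[name] = v
            setattr(self, name, getattr(self, name) + v)
        return float((err ** 2).mean())
```

Separate instances would be trained for pan and for tilt, as described on the training slide.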

Tracking People in a Panoramic View. [Images: camera view, perspective view, panoramic view.]

Training: Separate nets for pan and tilt, trained with standard backprop with a momentum term. Datasets: training on 6,100 images from 12 users; cross-evaluation on 750 images from the same users; testing on 750 images from the same users. Additional user-independent test set: 1,500 images from two new users.

Results. histo: histogram-normalized image used as input; edges: horizontal and vertical edge images used as input; both: histogram image plus edge images used.
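A sketch of how such input retinas could be built with OpenCV, assuming histogram equalization for the "histo" condition and Sobel filters for the edge images (the exact preprocessing of the original system is not specified on the slide):

```python
import cv2
import numpy as np

def retina_input(face_gray, size=(20, 30)):
    """Build the three 20x30 input retinas: histogram-normalized intensity plus
    horizontal and vertical edge images, flattened to ~1,800 network inputs."""
    small = cv2.resize(face_gray, size)                  # dsize is (width, height)
    histo = cv2.equalizeHist(small)                      # histogram normalization
    dx = cv2.Sobel(small, cv2.CV_32F, 1, 0, ksize=3)     # vertical edges
    dy = cv2.Sobel(small, cv2.CV_32F, 0, 1, ksize=3)     # horizontal edges
    stack = np.stack([histo.astype(np.float32), np.abs(dx), np.abs(dy)])
    return (stack / (stack.max() + 1e-6)).ravel()        # rough [0, 1] scaling
```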

Demo

Spatial Awareness in Smart Rooms. Motivation: tracking people indoors to focus sensors on people, to resolve spatial relationships, to avoid bumping into humans, and to analyze activity.

Person Tracking: Vision-based localization of people/objects. Single perspective: Pfinder, W3S, Hydra, etc. Multiple perspectives: AVIARY, Easy Living.

Person Tracking in the ISL Smart Room. [Diagram: Cam0-Cam3 each feed a feature extractor; the extracted features are sent to a tracking agent that tracks people.]

Person Tracking with Multiple Cameras. Goal: 3D tracking of people in rooms. Segmentation of foreground objects in each image; "3D intersection" of the rays through the object centers; Kalman filter.
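A minimal sketch of the "3D intersection" step, assuming each calibrated camera contributes a ray from its center through the detected blob center; outlier gating and the Kalman filter are omitted:

```python
import numpy as np

def intersect_rays(origins, directions):
    """Least-squares 3D point closest to a set of camera rays.

    origins:    (N, 3) camera centers
    directions: (N, 3) direction vectors through each blob center
    Returns the 3D person position estimate fed to the Kalman filter.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(np.asarray(origins, float), np.asarray(directions, float)):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ o
    return np.linalg.solve(A, b)
```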

Adaptive Silhouette Extraction. Background subtraction with an adaptive multi-Gaussian background model [Stauffer et al., CVPR 1998]; morphological operators smooth the foreground output; connected components form the silhouettes.
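A rough modern equivalent of that pipeline, sketched with OpenCV's MOG2 background subtractor (the original work used its own multi-Gaussian implementation; the history, threshold and minimum-area values here are guesses):

```python
import cv2

bg = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                        detectShadows=False)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

def silhouettes(frame, min_area=500):
    """Return (stats, centroid) for each large foreground silhouette in a frame."""
    fg = bg.apply(frame)                                  # per-pixel foreground mask
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)     # remove speckle noise
    fg = cv2.morphologyEx(fg, cv2.MORPH_CLOSE, kernel)    # fill small holes
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(fg)
    return [(stats[i], centroids[i]) for i in range(1, n)
            if stats[i, cv2.CC_STAT_AREA] >= min_area]
```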

Locating People: 1) extract a reference point (the silhouette centroid); 2) use calibrated sensors to calculate the absolute position; 3) create a list of location hypotheses. [Diagram: location hypotheses i) (X,Y) and ii) (X,Y) derived from cameras a and b.]

Tracking People. Best-hypothesis tracking: match location hypotheses to tracks; smooth the tracks with a Kalman filter. [Diagram: hypotheses i) (X,Y) and ii) (X,Y) assigned to Track 1 and Track 2.]
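A minimal sketch of such a tracker, assuming a constant-velocity Kalman filter over (X, Y) and greedy nearest-hypothesis assignment; the noise parameters and the matching strategy are illustrative, not the original implementation:

```python
import numpy as np

class Track:
    """Constant-velocity Kalman track over (X, Y); noise values are placeholders."""
    F = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]], float)
    H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
    Q = np.eye(4) * 0.01            # process noise
    R = np.eye(2) * 0.1             # measurement noise

    def __init__(self, xy):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])
        self.P = np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, float) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P

def assign_best(tracks, hypotheses):
    """Greedy best-hypothesis matching: each track takes its nearest hypothesis."""
    for t in tracks:
        pred = t.predict()
        if hypotheses:
            z = min(hypotheses, key=lambda h: np.linalg.norm(np.asarray(h) - pred))
            t.update(z)
```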

Tracking Problems: imperfect and merged silhouettes. Counterstrategies: better vision algorithms; probabilistic multi-hypothesis tracking; using the head as the reference point.

Reference Point: Head. Use the head as the reference point instead of the centroid. The head tracker has a significantly lower tracking error and false-alarm rate. [Charts: tracking error and false-alarm rate.]

Demo

Pointing Gesture Recognition. Goals: recognize human pointing gestures; extract the pointing direction in 3D. Application areas: human-robot interaction, smart rooms. Requirements: person-independent, real-time operation, camera motion possible.

Pointing Gesture Recognition. Stereo camera: left/right image.

3D Tracker: Processing Steps. Camera image → skin color → disparity. 3D clustering of skin-color pixels provides cues for the positions of the head and hands.
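A minimal sketch of that clustering step, using plain k-means over the 3D positions of skin-color pixels; the slide does not name the actual clustering algorithm, and the head/hand assignment heuristic below is an assumption:

```python
import numpy as np

def head_hand_candidates(points_3d, k=3, iters=20):
    """Cluster 3D skin-pixel positions into k blobs (head + two hands)."""
    pts = np.asarray(points_3d, float)
    rng = np.random.default_rng(0)
    centers = pts[rng.choice(len(pts), k, replace=False)]
    for _ in range(iters):
        # assign each 3D point to its nearest cluster center
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pts[labels == j].mean(axis=0)
    # heuristic: the highest cluster is the head (assuming Y is the vertical axis)
    order = np.argsort(centers[:, 1])
    return {"head": centers[order[-1]], "hands": centers[order[:-1]]}
```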

Gesture Recognition: Movement Phases. Pointing gestures consist of three intuitively distinguishable movement phases: begin, hold, end. Exact localization of the hold phase is important for determining the pointing direction.

Mean duration of the movement phases:
                    μ [sec]   σ [sec]
  Complete gesture   1.75      0.48
  Begin              0.52      0.17
  Hold               0.76      0.40
  End                0.47      0.12

Gesture Recognition: Models. The three phases are modeled with separate models: continuous HMMs with 2 Gaussians per state; a null model serves as a threshold for the phase models; training on hand-labeled data.
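A sketch of how such phase models could be trained today with the hmmlearn library (not the toolkit used in the original work); the number of states per phase and the feature layout are assumptions:

```python
import numpy as np
from hmmlearn.hmm import GMMHMM  # continuous HMM with Gaussian mixtures per state

def train_phase_model(sequences, n_states=3):
    """Train one continuous HMM (2 Gaussians per state) for a gesture phase.

    sequences: list of (T_i, 3) feature arrays, e.g. (r, d_theta, d_y) per frame,
    hand-labelled as belonging to this phase (begin, hold, or end).
    """
    X = np.vstack(sequences)
    lengths = [len(s) for s in sequences]
    model = GMMHMM(n_components=n_states, n_mix=2,
                   covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    return model

def phase_score(model, null_model, seq):
    """Log-likelihood of a feature window minus the null-model threshold."""
    return model.score(seq) - null_model.score(seq)
```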

Gesture Recognition: Detection. A pointing gesture is detected when three time points tB < tH < tE are found such that: PE(tE) > PB(tE) and PE(tE) > 0; PB(tB) > PE(tB) and PB(tB) > 0; PH(tH) > 0.
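The detection rule translates directly into a search over per-frame scores of the begin/hold/end models relative to the null model (e.g. the phase_score values from the previous sketch); the brute-force scan below is only a sketch, a real system would scan incrementally over a sliding window:

```python
def detect_pointing(PB, PH, PE):
    """Find time points t_B < t_H < t_E satisfying the detection rule above.

    PB, PH, PE: sequences of per-frame scores of the begin, hold and end models.
    Returns the first (t_B, t_H, t_E) triple found, or None.
    """
    T = len(PB)
    for tB in range(T):
        if not (PB[tB] > PE[tB] and PB[tB] > 0):     # begin-phase condition
            continue
        for tH in range(tB + 1, T):
            if PH[tH] <= 0:                           # hold-phase condition
                continue
            for tE in range(tH + 1, T):
                if PE[tE] > PB[tE] and PE[tE] > 0:    # end-phase condition
                    return tB, tH, tE
    return None
```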

Gesture Recognition: Features. Feature vector: (r, Δθ, Δy). Experiments: cylindrical coordinates work better than spherical and Cartesian; measuring the hand relative to the head makes the features independent of the position in the room; using Δθ and Δy avoids adaptation to the pointing targets seen in training. Spline interpolation of the feature sequences to a constant 40 Hz.
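A small sketch of computing that feature vector, assuming head-centered cylindrical coordinates with Y as the vertical axis (the axis convention and function names are assumptions):

```python
import numpy as np

def pointing_features(head_xyz, hand_xyz, prev_theta, prev_y):
    """Compute (r, d_theta, d_y) in head-centered cylindrical coordinates."""
    rel = np.asarray(hand_xyz, float) - np.asarray(head_xyz, float)
    r = np.hypot(rel[0], rel[2])          # horizontal distance hand-head
    theta = np.arctan2(rel[2], rel[0])    # azimuth of the hand around the head
    y = rel[1]                            # height of the hand relative to the head
    d_theta = theta - prev_theta          # deltas make the features independent
    d_y = y - prev_y                      # of the pointing targets seen in training
    return np.array([r, d_theta, d_y]), theta, y
```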

Pointing Direction: candidate estimates are the head-hand line, the forearm line, and the eye-hand line of sight. The head-hand line is easy to measure; the forearm line is potentially superior when the arm is bent, but harder to measure.

Audio-Visual Speech Recognition

Lip Tracking Module: feature-based; tracks facial features (pupils, nostrils, lips); detects localization failures and automatically recovers from them.

Audio-Visual Recognition. Combined hypothesis score: hyp_c = λ_a · hyp_a + λ_v · hyp_v, with λ_a + λ_v = 1. Combination methods for choosing the weights: SNR-based weights, entropy-based weights, trained weights.
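A minimal sketch of that weighted combination, assuming per-class scores from the audio and visual streams and entropy-based weights (one of the methods listed above); the exact weighting formula is an illustrative assumption:

```python
import numpy as np

def combine_streams(audio_scores, video_scores):
    """Combine per-class scores: hyp_c = la * hyp_a + lv * hyp_v, la + lv = 1."""
    def entropy(p):
        p = np.asarray(p, float)
        p = p / p.sum()
        return -(p * np.log(p + 1e-12)).sum()

    # lower entropy = more confident stream = higher weight
    conf_a = 1.0 / (entropy(audio_scores) + 1e-6)
    conf_v = 1.0 / (entropy(video_scores) + 1e-6)
    la = conf_a / (conf_a + conf_v)
    lv = 1.0 - la                          # enforce la + lv = 1
    return la * np.asarray(audio_scores) + lv * np.asarray(video_scores)
```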

Fusion Levels: word level (vote, decide based on audio and visual scores); phoneme level (combine by different weighting schemes); feature level (combine the features).

Audio-Visual Speech

Possible Topics: person tracking, gesture recognition, attentive interfaces, face detection, lip reading (audio-visual speech recognition), audio-visual tracking, emotion recognition, person identification, microphone arrays, sensor fusion, smart room infrastructure, intelligent camera control, self-calibration, other smart room projects (MIT, Georgia Tech, IM2), other sensors (pressure, IR, etc.), speech recognition in meetings (far-field), efficient microphone arrays.