GI/ACM Regionalgruppe Rhein-Main: "Suche ist nicht gleich Suche!" (Not All Search Is Alike!), June 23, 2016, on the premises of Fraunhofer IGD in Darmstadt. Chris Biemann

Presentation transcript:

GI/ACM Regionalgruppe Rhein-Main: "Suche ist nicht gleich Suche!" (Not All Search Is Alike!), June 23, 2016, on the premises of Fraunhofer IGD in Darmstadt. Chris Biemann: Adaptive Methods in Language Technology

2 Elements of Cognitive Computing: contextualized, iterative, adaptive, interactive

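3 Why Language Is Hard
Er saß auf der Bank und zählte seine Kohle. (He sat on the bench and counted his cash.)
Sie ging zur Bank und hob Geld ab. (She went to the bank and withdrew money.)
Lexical level vs. concept level: "Bank" is polysemous (bench, financial institution); "Kohle" and "Geld" are synonymous (money).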

4 Why Not Use Only Dictionaries or Ontologies?
"Give a man a fish and you feed him for a day …"
Advantages:
- Sense inventory given
- Linking to concepts
- Full control
Disadvantages:
- Dictionaries have to be created
- Dictionaries are incomplete
- Language changes constantly: new words, new meanings …
Photo by zeh fernando under Creative Commons licence

5 Structure Discovery Paradigm
"… teach a man to fish and you feed him for a lifetime."
Consequences:
- Only raw text input required
- Corpus-driven
- Language/domain independent
Diagram: Text Data; SD Algorithms (find regularities, annotate regularities in data); Machine Learning Task (use annotations as features)

6 CORPUS-ADAPTIVE SEMANTICS
Diagram: Text Data; SD Algorithms (find regularities, annotate regularities in data); Machine Learning Task (use annotations as features)

7 'Holing' operation: producing pairs of words and features
Extracted dependency relations, each listed with count 1: SB(fing, er), OA(Fisch, fing), NK(Fisch, den), MNR(Fisch, im), NK(im, Netz), --(Netz, .)
C. Biemann, M. Riedl (2013): Text: Now in 2D! A Framework for Lexical Expansion with Contextual Similarity. Journal of Language Modelling 1(1)
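
A minimal sketch of how a holing step could turn the dependency relations above into word-feature pairs. The `@` placeholder, the function name `holing`, and the decision to skip the punctuation relation are assumptions made for illustration; the slide itself only shows the relations and their counts.

```python
from collections import Counter

# Dependency relations as listed on the slide: (relation, first, second).
# The punctuation relation --(Netz, .) is left out here (an assumption).
DEPENDENCIES = [
    ("SB", "fing", "er"),
    ("OA", "Fisch", "fing"),
    ("NK", "Fisch", "den"),
    ("MNR", "Fisch", "im"),
    ("NK", "im", "Netz"),
]

HOLE = "@"  # hypothetical placeholder marking the position of the removed word


def holing(dependencies):
    """Turn each relation into two (word, feature) pairs.

    Each argument of a relation is replaced by a hole in turn; what is
    left of the relation becomes the context feature of the removed word.
    """
    pairs = Counter()
    for rel, first, second in dependencies:
        pairs[(first, f"{rel}({HOLE}, {second})")] += 1
        pairs[(second, f"{rel}({first}, {HOLE})")] += 1
    return pairs


pairs = holing(DEPENDENCIES)
for (word, feature), count in sorted(pairs.items()):
    print(word, feature, count)
```

Counting the pairs over a whole corpus, rather than one sentence, would give the word-feature statistics that a distributional thesaurus can be built from.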

8 Distributional Thesaurus (DT)
- Computed from distributional similarity statistics
- Entry for a target word consists of a ranked list of neighbors
Entry "Netz": Netz#NN 1000, Netzwerk#NN 102, Stromnetz#NN 69, Infrastruktur#NN 61, Geflecht#NN 58, Mobilfunknetz#NN 55, Schienennetz#NN 52, Angebot#NN 46, Streckennetz#NN 45, System#NN 42, Internet#NN 42, Datennetz#NN 39, Vernetzung#NN 38, Festnetz#NN 37, Faden#NN 37, Telefonnetz#NN 37, ...
Entry "äußern": äußern 1000, sprechen#VV 368, warnen#VV 344, betonen#VV 330, erklären#VV 300, bekräftigen#VV 281, plädieren#VV 274, sagen#VV 272, kündigen#VV 266, mahnen#VV 265, kritisieren#VV 264, verweisen#VV 255, räumen#VV 241, reagieren#VV
Figure labels: first order (komplett, vollständig with context features Abschaffung#NN#-NK, Renovierung#NN#-NK, Verzicht#NN#-NK, Genesung#NN#-NK, Spielzeit#NN#-NK, Unsinn#NN#-NK); second order (komplett, vollständig: 3)
Z. Harris (1954): Distributional Structure. Word 10 (2/3)
G. A. Miller, W. G. Charles (1991): Contextual Correlates of Semantic Similarity. Language and Cognitive Processes 6 (1), 1-28
D. Lin (1998): Automatic Retrieval and Clustering of Similar Words. Proceedings of COLING '98, pp. 768-774
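
The following sketch shows one simple way to get ranked neighbor lists of this kind: two words count as similar in proportion to the number of context features they share. The assignment of features to words below is invented toy data (only the feature strings are taken from the slide), the word "neu" is hypothetical, and plain overlap counting is a simplification that ignores any frequency weighting or pruning a real system would apply.

```python
from collections import defaultdict
from itertools import combinations

# Toy first-order data: word -> set of context features.
# Which word carries which feature is invented for illustration.
OBSERVATIONS = {
    "komplett": {"Abschaffung#NN#-NK", "Renovierung#NN#-NK",
                 "Verzicht#NN#-NK", "Unsinn#NN#-NK"},
    "vollständig": {"Abschaffung#NN#-NK", "Renovierung#NN#-NK",
                    "Verzicht#NN#-NK", "Genesung#NN#-NK"},
    "neu": {"Spielzeit#NN#-NK", "Renovierung#NN#-NK"},
}


def build_thesaurus(observations):
    """Rank the neighbors of each word by the number of shared features."""
    shared = defaultdict(dict)
    for a, b in combinations(sorted(observations), 2):
        overlap = len(observations[a] & observations[b])
        if overlap:
            shared[a][b] = overlap
            shared[b][a] = overlap
    # Highest overlap first; alphabetical order breaks ties.
    return {
        word: sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
        for word, neighbors in shared.items()
    }


thesaurus = build_thesaurus(OBSERVATIONS)
print(thesaurus["komplett"])
```

In this toy data "komplett" and "vollständig" share three features, so each is the other's top neighbor.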

9 DT entry “paper#NN” with contexts

10 Clustering of DT entries: Sense Induction
Figure labels: bright#JJ, paper#NN
C. Biemann (2006): Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems. Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06, New York, USA.
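
A compact sketch of label-propagation clustering in the style of Chinese Whispers, applied to a small neighbor graph. The graph (hypothetical neighbors of "paper" with made-up edge weights), the function name, and the deterministic tie-break are assumptions for illustration; ties are commonly broken at random, and the smallest-label rule is used here only to make the result reproducible.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, iterations=10, seed=0):
    """Cluster an undirected weighted graph given as (u, v, weight) triples."""
    neighbors = defaultdict(dict)
    for u, v, w in edges:
        neighbors[u][v] = w
        neighbors[v][u] = w

    nodes = sorted(neighbors)
    # Every node starts in its own class.
    label = {node: i for i, node in enumerate(nodes)}
    rng = random.Random(seed)

    for _ in range(iterations):
        rng.shuffle(nodes)
        changed = False
        for node in nodes:
            # Sum the edge weights per class among the node's neighbors.
            score = defaultdict(float)
            for other, w in neighbors[node].items():
                score[label[other]] += w
            # The strongest class wins; ties go to the smallest class id
            # (a simplification chosen for reproducibility).
            best = min(score, key=lambda c: (-score[c], c))
            if best != label[node]:
                label[node] = best
                changed = True
        if not changed:
            break

    clusters = defaultdict(set)
    for node, c in label.items():
        clusters[c].add(node)
    return sorted(clusters.values(), key=sorted)


# Hypothetical neighbor graph for "paper": two tight groups, one weak bridge.
EDGES = [
    ("newspaper", "magazine", 1.0), ("newspaper", "journal", 1.0),
    ("magazine", "journal", 1.0),
    ("plastic", "cardboard", 1.0), ("plastic", "wood", 1.0),
    ("cardboard", "wood", 1.0),
    ("journal", "cardboard", 0.1),
]

clusters = chinese_whispers(EDGES)
print(clusters)
```

Each resulting cluster can be read as one induced sense of the target word: here a "publication" group and a "material" group.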

11 Symbolic Distributional Model: example “beetle”
C. Biemann, M. Riedl (2013): Text: Now in 2D! A Framework for Lexical Expansion with Contextual Similarity. Journal of Language Modelling 1(1)

12 CORPUS- AND TEXT-ADAPTIVE SEMANTICS
Diagram: Text Data; SD Algorithms (find regularities, annotate regularities in data); Machine Learning Task (use annotations as features)

13 2D Text in a Question-Answering Scenario
T. Miller, C. Biemann, T. Zesch, I. Gurevych (2012): Using Distributional Similarity for Lexical Expansion in Knowledge-based Word Sense Disambiguation. Proceedings of COLING-12, Mumbai, India

14 Semantic Enterprise Search: MWU and Senses
Phrases: Apple mouse, mouse genome, USB mouse, cat and mouse, ...
Senses:
- Animal, like rat, rodent, pig
- Device, like keyboard, joystick

15 Contextualization: Virus on Twitter
Tweets:
- Setman Virus Programs and Generator #VirusPrograms&Generator
- Zeus Trojan Virus Found On Facebook
- Virus Graveyard. sav in Borderlands 2 ( Xbox 360 ): Borderlands 2 für die Xbox 360 ist von einem Virus namens... #Telmi
- Virus stiehlt gezielt technische Zeichnungen
- Chinesische Behörden weiten Kampf gegen Virus H7N9 aus - Newsticker - Die aktuellsten Nachrichten - News - Bild.de
- Freut mich, dass ich einfach so viele von meinen Freunden wieder mit dem Lego Virus infiziert habe :D
- Infektiologie : Der unterschätzte Virus – #Herpes hat viele Gesichter
- Danke :) Magen-Darm Virus waren gestern zur kontrolle im Krankenhaus :o Super Weihnachten :/
Sense cluster 1 (malware): Trojaner, Malware, Schadprogramm, Datei, Schadsoftware, Hacker, Schadcode, Sicherheitslücke, Spyware, Spam, Schwachstelle, Programm, Rootkit, Software, Phishing, Dokuments, Tool, Dialer, Exploit, Code, Keylogger, Hacker
Sense cluster 2 (disease): Seuche, Krankheit, Infektion, Vogelgrippe, Grippe, Epidemie, Tumor, Vogelgrippevirus, Erkrankung, Infektionskrankheit, Schweinegrippe, Fieber, Lungenentzündung, Tierseuche, Hiv, Vogelgrippe-Virus, Viruserkrankung, Virustyp, Lungenkrankheit, Variante, Pest, Aids, Influenza
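
One simple way to assign a sense to "Virus" in context is to count how many words of a tweet also occur in each sense cluster. The sketch below does exactly that and nothing more; the sense names `malware` and `disease`, the reduced cluster word lists, and the first two example sentences are assumptions made for illustration, not the method used in the talk.

```python
import re

# A few cluster terms taken from the slide, lowercased.
# The sense names are labels introduced here for readability.
SENSE_CLUSTERS = {
    "malware": {"trojaner", "malware", "schadprogramm", "hacker",
                "spyware", "software"},
    "disease": {"seuche", "krankheit", "infektion", "grippe",
                "epidemie", "fieber"},
}


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def pick_sense(text, clusters=SENSE_CLUSTERS):
    """Return the sense whose cluster overlaps most with the text, or None."""
    tokens = tokenize(text)
    scores = {sense: len(tokens & terms) for sense, terms in clusters.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# The first two sentences are invented; the third is a tweet from the slide.
print(pick_sense("Hacker verbreiten Virus über gefälschte Software"))
print(pick_sense("Grippe oder Virus? Fieber seit drei Tagen"))
print(pick_sense("Virus stiehlt gezielt technische Zeichnungen"))
```

The third call finds no overlap at all, which illustrates why exact word matching is brittle on short texts and why expanding the context words with their distributional neighbors can help.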

16 ADAPTIVE APPLICATIONS
Diagram: Text Data; SD Algorithms (find regularities, annotate regularities in data); Adaptive Machine Learning (use annotations as features)

17 WebAnno Automation Mode
Yimam, S.M., Biemann, C., Majnaric, L., Šabanović, Š., Holzinger, A. (2015): Interactive and Iterative Annotation for Biomedical Entity Recognition, International Conference on Brain Informatics and Health (BIH’15), London, UK

18 After Annotating 5 Abstracts

19 After Annotating 9 More Abstracts

20 Adaptive Writing Aid: Paraphrasing
- offer paraphrases from various sources in a text editor
- improve through usage: train system on user’s signals
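
To make "improve through usage" concrete, here is a toy sketch of a ranker that reorders paraphrase candidates according to which suggestions a user accepted or rejected. Everything in it is hypothetical: the class `ParaphraseRanker`, the accept/reject signal, the smoothed acceptance rate, and the example phrases are assumptions, not a description of the system shown in the talk.

```python
from collections import defaultdict


class ParaphraseRanker:
    """Rank paraphrase candidates by a smoothed user acceptance rate."""

    def __init__(self, candidates):
        # candidates: phrase -> list of paraphrases from any static source
        self.candidates = candidates
        self.shown = defaultdict(int)
        self.accepted = defaultdict(int)

    def record(self, phrase, paraphrase, accepted):
        """Store one user signal for a suggestion that was displayed."""
        key = (phrase, paraphrase)
        self.shown[key] += 1
        if accepted:
            self.accepted[key] += 1

    def rank(self, phrase):
        """Return the candidates for a phrase, best acceptance rate first."""
        def score(paraphrase):
            key = (phrase, paraphrase)
            # Add-one smoothing: candidates without feedback score 0.5.
            return (self.accepted[key] + 1) / (self.shown[key] + 2)

        return sorted(self.candidates.get(phrase, []), key=score, reverse=True)


ranker = ParaphraseRanker({"make use of": ["use", "utilize", "employ"]})
before = ranker.rank("make use of")
ranker.record("make use of", "employ", accepted=True)
ranker.record("make use of", "use", accepted=False)
after = ranker.rank("make use of")
print(before, after)
```

The static candidate list plays the role of the "various sources"; the recorded signals are the part that adapts to the individual user.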

21 Investigative Data-Driven Journalism

22 Conclusion
Adaptive Natural Language Processing
- makes use of static AND dynamically generated resources
- is driven by (text) data that defines its application domain
- loops the user into the equation
- goes beyond NLP pipelines
“Surely, it will be hard to understand such a system in detail. But who would want to meticulously control every piece of such a system, when one can simply let it emerge?”

23 Thank you for your ... and your ...

24 Size matters
Figure: corpora and tasks placed along a corpus-size axis (# words; ticks k, 10k, 100k, 1M, 10M, 100M, 1G, 10G, 100G)
Corpora: Brown, Gigaword, BNC, CCAE, CPSAE, Susanne, Wacky, Wikipedia, Encarta, Google Books, ClueWeb, 1T, EuroParl, twitter2011, SemCor
Tasks: language separation, POS induction, sense induction, morphology induction, language ID, POS tagging, word sense disambiguation, morphology, two-dimensional text, topic segmentation