Die Präsentation wird geladen. Bitte warten

Die Präsentation wird geladen. Bitte warten

Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections June 21 ACL 2011 Slav Petrov Google Research Dipanjan Das Carnegie Mellon University.

Ähnliche Präsentationen


Präsentation zum Thema: "Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections June 21 ACL 2011 Slav Petrov Google Research Dipanjan Das Carnegie Mellon University."—  Präsentation transkript:

1 Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections June 21 ACL 2011 Slav Petrov Google Research Dipanjan Das Carnegie Mellon University

2 Part-of-Speech Tagging Portland has a thriving music scene. NOUN VERB DET ADJ NOUN. 2

3 Supervised POS Tagging Supervised setting: average accuracy is 96.2% with TnT 3 (Brants, 2000)

4 Resource-Poor Languages Several major languages with no or little annotated data Oriya Indonesian-Malay Azerbaijani e.g. See Haitian However, lots of parallel and unannotated data! Basic NLP tools like POS tagging essential for development of language technologies 4 Punjabi Vietnamese Polish 32 million 37 million 20 million Native speakers 7.7 million 109 million 69 million 40 million

5 (Nearly) Universal Part-of-Speech Tags VERBDET NOUNCONJ PRONNUM ADJPRT ADV. ADPX 5

6 (Nearly) Universal Part-of-Speech Tags Example Penn Treebank tag maps: Example Spanish Treebank tag maps: See Petrov, Das and McDonald (2011)

7 (Nearly) Universal Part-of-Speech Tags Portland has a thriving music scene. NOUN VERB DET ADJ NOUN. Portland hat eine prächtig gedeihende Musikszene. NOUN VERBDETADJ NOUN. | ADP NOUN ADJ. 7

8 State of the Art in Unsupervised POS Tagging 8

9 Unsupervised Part-of-Speech Tagging Portland hat eineprächtig gedeihende Musikszene. ?? ?? ?? ? Hidden Markov Model (HMM) estimated with the Expectation-Maximization algorithm : observation sequence : state sequence 9 Merialdo (1994)

10 Unsupervised Part-of-Speech Tagging Portland hat eineprächtig gedeihende Musikszene. ?? ?? ?? ? Hidden Markov Model (HMM) estimated with the Expectation-Maximization algorithm one of the 12 coarse tags : observation sequence : state sequence 10 Merialdo (1994)

11 Unsupervised Part-of-Speech Tagging Portland hat ?? Hidden Markov Model (HMM) estimated with the Expectation-Maximization algorithm transition multinomials : observation sequence : state sequence 11 Merialdo (1994)

12 Unsupervised Part-of-Speech Tagging Portland hat ?? Hidden Markov Model (HMM) estimated with the Expectation-Maximization algorithm emission multinomials : observation sequence : state sequence 12 Merialdo (1994)

13 Unsupervised Part-of-Speech Tagging Portland hat eineprächtig gedeihende Musikszene. ?? ?? ?? ? Hidden Markov Model (HMM) estimated with the Expectation-Maximization algorithm DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Poor average result 13 Johnson (2007)

14 Unsupervised Part-of-Speech Tagging Hidden Markov Model (HMM) with locally normalized log-linear models : observation sequence Portland hat ?? emission multinomials : state sequence 14 Berg-Kirkpatrick et al. (2010)

15 Unsupervised Part-of-Speech Tagging Hidden Markov Model (HMM) with locally normalized log-linear models : observation sequence Portland hat ?? emission multinomials suffix hyphen capital letters numbers... : state sequence 15 Berg-Kirkpatrick et al. (2010)

16 Unsupervised Part-of-Speech Tagging Hidden Markov Model (HMM) with locally normalized log-linear models : observation sequence Portland hat ?? emission multinomials suffix hyphen capital letters numbers... Estimated using gradient-based methods : state sequence 16 Berg-Kirkpatrick et al. (2010)

17 Unsupervised Part-of-Speech Tagging Hidden Markov Model (HMM) with locally normalized log-linear models Portland hat ?? emission multinomials DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Feature-HMM Estimated using gradient-based methods Improvements across all languages 17

18 Portland hat eineprächtig gedeihende Musikszene. NOUN VERB PRON DET ADJ NUM ADJ ADV ADJ NOUN. Unsupervised POS Tagging with Dictionaries Hidden Markov Model (HMM) with locally normalized log-linear models State space constrained by possible gold tags 18

19 Portland hat eineprächtig gedeihende Musikszene. NOUN VERB PRON DET ADJ NUM ADJ ADV ADJ NOUN. Unsupervised POS Tagging with Dictionaries Hidden Markov Model (HMM) with locally normalized log-linear models State space constrained by possible gold tags DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Feature-HMM w/ gold dictionary 19

20 For most languages, access to high-quality tag dictionaries is not realistic. Ideas: 1)Use supervision in resource-rich languages 2)Use parallel data 3)Construct projected tag lexicons 20 Morphologically rich languages only have base forms in dictionaries

21 Bilingual Projection Portland has a thriving music scene. NOUN VERB DET ADJ NOUN. automatic labels from supervised tagger, 97% accuracy 21

22 Bilingual Projection Portland has a thriving music scene. NOUN VERB DET ADJ NOUN. Portland hat eine prächtig gedeihende Musikszene. Automatic unsupervised alignments from translation data (available for more than 50 languages) 22

23 Bilingual Projection Portland has a thriving music scene. NOUN VERB DET ADJ NOUN. Portland hat eine prächtig gedeihende Musikszene. Baseline1: direct projection unaligned word NOUN (most frequent tag) 23 Yarowsky and Ngai (2001)

24 Bilingual Projection Portland hat eine prächtig gedeihende Musikszene. NOUN VERBDETNOUNADJNOUN. + more projected tagged sentences supervised training tagger 24 (Brants, 2000) Yarowsky and Ngai (2001) Baseline1: direct projection

25 Bilingual Projection Baseline 1: direct projection 25 DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Direct projection Feature-HMM Yarowsky and Ngai (2001)

26 Bilingual Projection Baseline 1: direct projection 26 Yarowsky and Ngai (2001) consistent improvements over unsupervised models DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Direct projection Feature-HMM

27 Bilingual Projection Baseline 2: lexicon projection 27

28 Bilingual Projection Baseline 2: lexicon projection NOUN Portland VERB has DET a ADJ thriving NOUN music NOUN scene.. Portlandhateine prächtiggedeihende Musikszene. 28

29 Bilingual Projection Baseline 2: lexicon projection NOUN Portland ADJ thriving gedeihende prächtig VERB has hat DET a eine NOUN scene Musikszene NOUN music... ignore unaligned word 29

30 Bilingual Projection Baseline 2: lexicon projection NOUN Portland ADJ thriving gedeihende VERB has hat DET a eine NOUN scene Musikszene NOUN music... Bag of alignments 30

31 Bilingual Projection Baseline 2: lexicon projection NOUN Portland ADJ thriving gedeihende VERB has hat eine NOUN scene Musikszene NOUN music... DET a 31

32 Bilingual Projection Baseline 2: lexicon projection NOUN Portland ADJ thriving gedeihende VERB has hat eine NOUN scene NOUN music... DET a NUM one PRON one Musikszene 32

33 Bilingual Projection Baseline 2: lexicon projection NOUN Portland ADJ thriving gedeihende VERB has hat eine NOUN scene NOUN music... DET a NUM one PRON one Musikszene VERB thriving 33

34 Bilingual Projection Baseline 2: lexicon projection Portland gedeihende hat eine Musikszene. After scanning all the parallel data: = probability of a tag given a word 34

35 Bilingual Projection Baseline 2: lexicon projection Feature HMM constrained with projected dictionary Improvements over simple projection for majority of the languages 35 DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Direct projection Projected Dictionary Feature-HMM

36 Can coverage be improved? Idea: Projected lexicon expansion and refinement using label propagation No information about unaligned words 36 Portland hat eine prächtig gedeihende Musikszene. Portland has a thriving music scene. NOUN VERB DET ADJ NOUN.

37 How can label propagation help? Subramanya, Petrov and Pereira (2010) Our Model: Graph-Based Projections 37

38 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, Example Graph in German 38

39 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, Example Graph in German 39 NOUN VERB

40 How can label propagation help? Plug in auto-tagged words from a source language Links between source and target language units are word alignments 40 Our Model: Graph-Based Projections

41 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ Bilingual Graph 41

42 How can label propagation help? For a target language: Plug in auto-tagged words from a source language Links between source and target language units are word alignments 42 Our Model: Graph-Based Projections

43 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ First Stage of Label Propagation 43

44 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ First Stage of Label Propagation 44

45 How can label propagation help? For a target language: Plug in auto-tagged words from a source language Links between source and target language units are word alignments 45 Our Model: Graph-Based Projections

46 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ Second Stage of Label Propagation 46

47 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ Second Stage of Label Propagation 47

48 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ Second Stage of Label Propagation 48

49 ist gut bei ist lebhafter bei ist wichtig bei ist fein bei gutem Essen zugetan fuers Essen drauf 1000 Essen pro schlechtes Essen und zum Essen niederlassen zu realisieren, zu erreichen, zu stecken, zu essen, eat food eat eating NOUN VERB good ADJ nicely ADV fine ADJ important ADJ Second Stage of Label Propagation 49 Continues till convergence...

50 feinlebhafterrealisieren Portland gedeihende hat eine Musikszene. End result? 50 Our Model: Graph-Based Projections

51 feinlebhafterrealisieren Portland gedeihende hat eine Musikszene. 51 Our Model: Graph-Based Projections

52 52 Lexicon Expansion thousands of words Our Model: Graph-Based Projections

53 Brief Overview: Graph-Based Learning with Labeled and Unlabeled Data 53

54 labeled datapoints unlabeled datapoints supervised label distributions to be found Zhu, Ghahramani and Lafferty (2003) = symmetric weight matrix 0.05

55 Label Propagation Zhu, Ghahramani and Lafferty (2003)

56 set of distributions over unlabeled vertices 56 Label Propagation 0.05

57 unlabeled vertices 57 Label Propagation 0.05

58 brings the distributions of similar vertices closer 58 Label Propagation 0.05

59 brings the distributions of uncertain neighborhoods close to the uniform distribution Size of the label set 59 Label Propagation 0.05

60 Iterative updates for optimization 60 Label Propagation 0.05

61 61 Final Results

62 Feature HMM constrained with graph-based dictionary DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Feature-HMM Direct projection Projected Dictionary Graph-Based Projections 62 Our Model: Graph-Based Projections

63 Feature HMM constrained with graph-based dictionary DanishDutchGermanGreekItalianPortugueseSpanishSwedishAverage EM-HMM Feature-HMM Direct projection Projected Dictionary Graph-Based Projections w/ gold dictionary supervised 63 Our Model: Graph-Based Projections

64 Concluding Notes Reasonably accurate POS taggers without direct supervision – Evaluated on major European languages Towards a standard of universal POS tags Traditional evaluation of unsupervised POS taggers done using greedy metrics that use labeled data – Our presented models avoid these evaluation methods 64

65 Future Directions Scaling up the number of nodes in the graph from 2M to billions may help create larger lexicons Including penalties in the graph objective that induce sparse tag distributions at each graph vertex Inclusion of multiple languages in the graph may further improve results – Label propagation in one huge multilingual graph 65

66 66 Projected POS Tagged data available at:

67 Portland has a thriving music scene. NOUN ADJ Portland hat eine prächtig gedeihende Musikszene. | NOUN VERB DET ADJNOUN. Portland tiene una escena musical vibrante. Portland a une scène musicale florissante. ADJ NOUN ADP. NOUN VERB DET ADJ. NOUN VERB DET ADJ NOUN VERB DET NOUN ADJ. Questions? 67


Herunterladen ppt "Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections June 21 ACL 2011 Slav Petrov Google Research Dipanjan Das Carnegie Mellon University."

Ähnliche Präsentationen


Google-Anzeigen