PATR II Compiler. Advanced Prolog course (Prolog Aufbaukurs), summer semester 2000, Heinrich-Heine-Universität Düsseldorf, Christof Rumpf.


1 PATR II Compiler
Advanced Prolog course, summer semester 2000, Heinrich-Heine-Universität Düsseldorf, Christof Rumpf

2 Notation conventions (22.05.2000)
Instantiation mode of arguments
– Blue: input arguments
– Red: output arguments
Cut
– red cut !
– green cut !
Predicate definitions
– complete
– to be continued

3 Directives
    % external resources
    :- [tokenize].                 % load tokenizer
    % operators
    :- op(510, xfy, : ).           % attr:val
    :- op(600, xfx, ===).          % path equation
    :- op(1100,xfx,'--->').        % syntax rule, lexical entry
    :- op(1200,xfx,'::').          % description annotation

4 Three compiler components
Tokenizer
– Input: PATR II grammar
– Output: token lines
Preprocessor
– Input: token lines
– Output: token sentences
Syntax compiler
– Input: token sentences
– Output: Prolog clauses
    compile_grammar(File):-
        clear_grammar,
        tokenize_file(File),
        read_sentences,
        compile_sentences.

5 Tokenizer input
    ; Shieb1.ptr
    ; Sample grammar one from Shieber 1986

    ; Grammar Rules
    ; ------------------------------------------------------------

    Rule {sentence formation}
      S --> NP VP:
        <S head> = <VP head>
        <VP head subject> = <NP head>.

    Rule {trivial verb phrase}
      VP --> V:
        <VP head> = <V head>.

    ; Lexicon
    ; ----------------------------------------------------------------

    Word uther:
      <cat> = NP
      <head agreement gender> = masculine
      <head agreement person> = third
      <head agreement number> = singular.

6 Tokenizer output = preprocessor input
    line(1,[o($;$),b(1),u($Shieb1$),o($.$),l($ptr$)]).
    line(2,[o($;$),b(1),u($Sample$),b(1),l($grammar$),b(1),l($one$),b(1),l($from$),b(1),...
    line(3,[ ]).
    line(4,[ ]).
    line(5,[o($;$),b(1),u($Grammar$),b(1),u($Rules$)]).
    line(6,[o($;$),b(1),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),...
    line(7,[ ]).
    line(8,[u($Rule$),b(1),o(${$),l($sentence$),b(1),l($formation$),o($}$)]).
    line(9,[b(2),u($S$),b(1),o($-$),o($-$),o($>$),b(1),u($NP$),b(1),u($VP$),o($:$)]).
    line(10,[b(1),o($<$),u($S$),b(1),l($head$),o($>$),b(1),o($=$),b(1),o($<$),u($VP$),b(1),...
    line(11,[b(1),o($<$),u($VP$),b(1),l($head$),b(1),l($subject$),o($>$),b(1),o($=$),b(1),...
    line(12,[b(1)]).
    line(13,[u($Rule$),b(1),o(${$),l($trivial$),b(1),l($verb$),b(1),l($phrase$),o($}$)]).
    line(14,[b(2),u($VP$),b(1),o($-$),o($-$),o($>$),b(1),u($V$),o($:$)]).
    line(15,[b(1),o($<$),u($VP$),b(1),l($head$),o($>$),b(1),o($=$),b(1),o($<$),u($V$),b(1),...
    ...
    line(41,[b(1),o($<$),l($head$),b(1),l($subject$),b(1),l($agreement$),b(1),l($number$),...
    line(42,[eof]).
Token types: u = word starting with an upper-case letter, l = word starting with a lower-case letter, o = operator character, b(N) = N blanks.

7 Preprocessor output = compiler input
    sentence( 1,11,[u($Rule$),o(${$),l($sentence$),l($formation$),o($}$),...
    sentence(12,15,[u($Rule$),o(${$),l($trivial$),l($verb$),l($phrase$),o($}$),...
    sentence(16,24,[u($Word$),l($uther$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
    sentence(25,30,[u($Word$),l($knights$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
    sentence(31,36,[u($Word$),l($sleeps$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
    sentence(37,41,[u($Word$),l($sleep$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
    sentence(42,42,[eof]).
The preprocessor removes comments and blanks and merges each sentence, which is terminated by a period and may span several lines, into a single token list. The first two arguments of sentence/3 are the first and last source line of the sentence. The compiler proper can then concentrate on the essentials.

8 Preprocessor: main loop
    read_sentences:-
        abolish(cnt/1),
        write('preprocessing...'), nl,
        repeat,
            count(I),
            read_sentence(N,M,S),
            assert(sentence(N,M,S)),
            put(13), tab(3), write(I), write(' sentences preprocessed'),
        S = [eof], !, nl.

    read_sentence(N,M,S):-
        retract(line(N,L)),
        read_sentence(L,N,M,S), !.
Backtracking: as long as S = [eof] fails, execution backtracks to repeat and the next sentence is read.

9 Preprocessor: reading a sentence
    read_sentence([eof],N,N,[eof]):- !.          % end of file
    read_sentence([o($.$)|_],N,N,[]):- !.        % end of sentence
    read_sentence([o($;$)|_],N,M,S):- !,         % skip comment
        N1 is N+1,
        retract(line(N1,L)),                     % next line
        read_sentence(L,N1,M,S).
    read_sentence([],N,M,S):- !,                 % end of line
        N1 is N+1,
        retract(line(N1,L)),                     % next line
        read_sentence(L,N1,M,S).
    read_sentence([b(_)|T1],N,M,T2):- !,         % skip blanks
        read_sentence(T1,N,M,T2).
    read_sentence([H|T1],N,M,[H|T2]):-           % collect tokens
        read_sentence(T1,N,M,T2).
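The sentence reader can be tried out on a tiny input. The following is only a sketch under assumptions: it targets SWI-Prolog rather than Arity Prolog, writes tokens as quoted atoms (o('.') instead of o($.$)), uses made-up line/2 facts, and factors the repeated "fetch next line" step into a hypothetical helper next_line/3.

```prolog
:- dynamic line/2.

% Made-up tokenizer output for a three-line grammar fragment.
line(1, [o(';'), b(1), u('Demo')]).
line(2, [u('Rule'), b(1), u('VP'), b(1), o('-'), o('-'), o('>'), b(1), u('V'), o(':')]).
line(3, [b(2), o('<'), u('VP'), b(1), l(head), o('>'), b(1), o('='), b(1),
         o('<'), u('V'), b(1), l(head), o('>'), o('.')]).
line(4, [eof]).

% read_sentence(-First, -Last, -Tokens): consume line/2 facts up to the
% next period (or eof) and return the collected tokens.
read_sentence(N, M, S) :-
    retract(line(N, L)),
    read_sentence(L, N, M, S), !.

read_sentence([eof], N, N, [eof]) :- !.                  % end of file
read_sentence([o('.')|_], N, N, []) :- !.                % end of sentence
read_sentence([o(';')|_], N, M, S) :- !,                 % skip comment
    next_line(N, N1, L), read_sentence(L, N1, M, S).
read_sentence([], N, M, S) :- !,                         % end of line
    next_line(N, N1, L), read_sentence(L, N1, M, S).
read_sentence([b(_)|T], N, M, S) :- !,                   % skip blanks
    read_sentence(T, N, M, S).
read_sentence([H|T], N, M, [H|S]) :-                     % collect tokens
    read_sentence(T, N, M, S).

% next_line(+N, -N1, -L): hypothetical helper, fetches line N+1.
next_line(N, N1, L) :- N1 is N + 1, retract(line(N1, L)).
```

Calling read_sentence(N, M, S) twice on this input should first yield the rule spanning lines 1 to 3 (the leading comment line counts as part of the sentence, as in the sentence/3 facts of the transcript) and then the [eof] sentence for line 4.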

10 Compiler: main loop
    compile_sentences:-
        abolish(cnt/1),
        write('compiling...'), nl,
        retract(sentence(N,M,S)),
            compile_sentence((N,M),C,S,[]),
            assert(C),
            count(I),
            put(13), tab(3), write(I), write(' sentences compiled'),
        S = [eof], !, nl.
Backtracking: as long as S = [eof] fails, execution backtracks to retract(sentence(N,M,S)), which fetches the next sentence.

11 Compiler: sentence types
    % compile_sentence(Position,Clause,Sentence,Rest)
    compile_sentence(_,C) --> [eof], !, {C = finished}.
    compile_sentence(_,C) --> syntax_rule(C), !.
    compile_sentence(_,C) --> lex_entry(C), !.
    compile_sentence(_,C) --> template(C), !.
    compile_sentence(P,_,_,_):-
        P = (N,M), nl,
        write(' error in sentence between lines '),
        write(N), write(' and '), write(M), nl,
        fail.

12 Syntax rules
    syntax_rule(C) -->
        rs('Rule'), !,
        syntax_rule_cont(C).

    syntax_rule_cont((Expansion :: Descr)) -->
        rule_name,
        sr_expansion(Expansion,Sugar),
        rs(:), !,
        sr_path_equations(Equations,Sugar),
        {sr_sugar_cats(Sugar,Equations,Descr)}.

13 Reserved symbols
    rs(=)      --> [o($=$)], !.
    rs(:)      --> [o($:$)], !.
    rs(<)      --> [o($<$)], !.
    rs(>)      --> [o($>$)], !.
    rs('{')    --> [o(${$)], !.
    rs('}')    --> [o($}$)], !.
    rs('Rule') --> [u($Rule$)], !.
    rs('Word') --> [u($Word$)], !.
    rs('Let')  --> [u($Let$)], !.
    rs('be')   --> [l($be$)], !.
    rs('-->')  --> [o($-$),o($-$),o($>$)], !.
Alternative: define a separate predicate for each reserved symbol, e.g. colon instead of rs(:).

14 Further terminal symbols
    uatom(A) --> [u(S)], {atom_string(A,S)}.
    latom(A) --> [l(S)], {atom_string(A,S)}.
    satom(A) --> [s(S)], {atom_string(A,S)}.
    int(I)   --> [i(I)].

    atom(A) --> uatom(A), !.
    atom(A) --> latom(A), !.
    atom(A) --> satom(A), !.

    atomic(A) --> atom(A), !.
    atomic(A) --> int(A), !.

15 Rule names
    rule_name -->
        rs('{'), !,                            % start of rule name
        curley_braces_terminated_string.
    rule_name --> [].                          % rule names are optional

    curley_braces_terminated_string -->
        rs('}'), !.                            % end of rule name
    curley_braces_terminated_string -->
        [_],                                   % read any symbol
        curley_braces_terminated_string.
Rule names are skipped and are not carried over into the Prolog representation of the rules.
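How a rule name is skipped can be checked in isolation. This is a minimal sketch, assuming SWI-Prolog and tokens written as quoted atoms instead of Arity $...$ strings; only the two rs//1 clauses needed here are included, and the predicate name is shortened to braces_terminated//0 for the example.

```prolog
% Reserved symbols needed for rule names (atom tokens are an assumption).
rs('{') --> [o('{')], !.
rs('}') --> [o('}')], !.

% rule_name//0: skip an optional {...} rule name.
rule_name --> rs('{'), !, braces_terminated.
rule_name --> [].                        % rule names are optional

% braces_terminated//0: consume tokens up to and including '}'.
braces_terminated --> rs('}'), !.
braces_terminated --> [_], braces_terminated.
```

With the token list [o('{'),l(sentence),l(formation),o('}'),u('S')], phrase(rule_name, Tokens, Rest) should leave Rest = [u('S')]; a list without a leading brace is passed through unchanged.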

16 Rule expansion
    sr_expansion((LHS ---> RHS),[LSugar|RSugar]) -->
        sr_lhs(LHS,LSugar),
        rs('-->'),
        sr_rhs(RHS,RSugar).

    sr_lhs(LHS,Sugar) --> fsd(LHS,Sugar).
    sr_rhs(RHS,Sugar) --> ne_fsd_seq(RHS,Sugar).

    ne_fsd_seq((FSD,FSDs),[Sugar|Sugars]) -->
        fsd(FSD,Sugar),
        ne_fsd_seq(FSDs,Sugars).
    ne_fsd_seq(FSD,[Sugar]) -->
        fsd(FSD,Sugar).

    fsd(Var,(FSD,Var)) --> uatom(FSD).

17 Syntax rules: path equations
    sr_path_equations((E,Es),Sugar) -->
        sr_path_equation(E,Sugar),
        sr_path_equations(Es,Sugar).
    sr_path_equations(E,Sugar) -->
        sr_path_equation(E,Sugar).

    sr_path_equation((LHS === RHS),Sugar) -->
        sr_path(LHS,Sugar),
        rs(=),
        sr_val(RHS,Sugar).

    sr_val(V,Sugar) --> sr_path(V,Sugar).
    sr_val(V,_)     --> atomic(V).

18 Syntax rules: paths
    sr_path(Var,Sugar) -->
        rs(<), fsd(FSD), rs(>),
        {member((FSD,Var),Sugar)}, !.
    sr_path(Var:P,Sugar) -->
        rs(<), fsd(FSD), ne_feature_seq(P), rs(>),
        {member((FSD,Var),Sugar)}, !.

    ne_feature_seq(F)   --> feature(F).
    ne_feature_seq(F:P) --> feature(F), ne_feature_seq(P).

    fsd(FSD)   --> uatom(FSD).
    feature(F) --> atomic(F).

19 Syntactic sugar
    sr_sugar_cats([(Cat,Var)|Sugar],Equations,
                  ((Var:cat === Cat),Descr)):-
        sr_sugar_cats(Sugar,Equations,Descr).
    sr_sugar_cats([],Descr,Descr).
The sugared rule

    Rule {sentence formation}
      S --> NP VP:
        <S head> = <VP head>
        <VP head subject> = <NP head>.

abbreviates

    Rule {sentence formation}
      X0 --> X1 X2:
        <X0 cat> = S
        <X1 cat> = NP
        <X2 cat> = VP
        <X0 head> = <X2 head>
        <X2 head subject> = <X1 head>.
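The pieces for syntax rules (expansion, path equations, paths, sugar) can be combined into one small runnable sketch. It is a simplification under assumptions, not the course code: it targets SWI-Prolog, writes tokens as quoted atoms, matches reserved symbols directly as tokens, omits rule names, cuts and integers, and uses memberchk/2 and a helper features//1 in place of member/2 and ne_feature_seq//1.

```prolog
:- op(510, xfy, (:)).
:- op(600, xfx, ===).
:- op(1100, xfx, '--->').
:- op(1200, xfx, '::').

syntax_rule((Expansion :: Descr)) -->
    [u('Rule')],
    sr_expansion(Expansion, Sugar),
    [o(':')],
    sr_path_equations(Equations, Sugar),
    { sr_sugar_cats(Sugar, Equations, Descr) }.

% Sugar is a list of (Category, Variable) pairs, one per constituent.
sr_expansion((LHS ---> RHS), [LSugar|RSugar]) -->
    fsd(LHS, LSugar), [o('-'), o('-'), o('>')], ne_fsd_seq(RHS, RSugar).

ne_fsd_seq((FSD, FSDs), [S|Ss]) --> fsd(FSD, S), ne_fsd_seq(FSDs, Ss).
ne_fsd_seq(FSD, [S])            --> fsd(FSD, S).

fsd(Var, (Cat, Var)) --> [u(Cat)].

sr_path_equations((E, Es), Sugar) -->
    sr_path_equation(E, Sugar), sr_path_equations(Es, Sugar).
sr_path_equations(E, Sugar) -->
    sr_path_equation(E, Sugar).

sr_path_equation((LHS === RHS), Sugar) -->
    sr_path(LHS, Sugar), [o('=')], sr_val(RHS, Sugar).

sr_val(V, Sugar) --> sr_path(V, Sugar).
sr_val(V, _)     --> [l(V)].
sr_val(V, _)     --> [u(V)].

% A path <Cat f1 ... fn> becomes Var:f1:...:fn, where Var is the
% variable that the sugar list associates with Cat.
sr_path(Var, Sugar) -->
    [o('<'), u(Cat), o('>')],
    { memberchk((Cat, Var), Sugar) }.
sr_path(Var:P, Sugar) -->
    [o('<'), u(Cat)], features(P), [o('>')],
    { memberchk((Cat, Var), Sugar) }.

features(F:P) --> [l(F)], features(P).
features(F)   --> [l(F)].

% Prefix the equations with one Var:cat === Cat equation per constituent.
sr_sugar_cats([(Cat, Var)|Sugar], Equations, ((Var:cat === Cat), Descr)) :-
    sr_sugar_cats(Sugar, Equations, Descr).
sr_sugar_cats([], Descr, Descr).
```

Applied with phrase/2 to the tokens of the sentence-formation rule (without the rule name and the final period, which the preprocessor has already removed), this should produce the clause A ---> B, C :: A:cat === 'S', B:cat === 'NP', C:cat === 'VP', A:head === C:head, C:head:subject === B:head. Because memberchk/2 picks the first pair for a category, a rule that mentions the same category twice is not handled.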

20 Lexical entries
    lex_entry(C) -->
        rs('Word'), !,
        lex_entry_cont(C).

    lex_entry_cont((FS ---> L :: Descr)) -->
        lexeme(L),
        rs(:), !,
        lex_definition(FS, Descr).

    lexeme(L) --> atom(L).

21 Lexicon: feature structures
    lex_definition(FS,(LDef,LDefs)) -->
        lexdef(FS,LDef),
        lex_definition(FS,LDefs).
    lex_definition(FS,LDef) -->
        lexdef(FS,LDef).

    lexdef(FS,LDef) --> template_name(FS,LDef), !.
    lexdef(FS,LDef) --> lex_path_equation(FS,LDef), !.

22 Lexicon: path equations
    lex_path_equation(FS, (LHS === RHS)) -->
        lex_path(FS, LHS),
        rs(=), !,
        lex_val(FS, RHS).

    lex_path(FS,FS:P) --> rs(<), ne_feature_seq(P), rs(>), !.

    lex_val(FS,V) --> lex_path(FS,V).
    lex_val(_,V)  --> atomic(V).
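Lexical entries follow the same pattern as syntax rules, except that every path is rooted in the single feature structure of the word. The sketch below makes the same assumptions as before (SWI-Prolog, quoted-atom tokens, reserved symbols matched directly, no cuts) and additionally leaves out template names, so lexdef//2 only accepts path equations; features//1 is a stand-in for ne_feature_seq//1.

```prolog
:- op(510, xfy, (:)).
:- op(600, xfx, ===).
:- op(1100, xfx, '--->').
:- op(1200, xfx, '::').

lex_entry((FS ---> L :: Descr)) -->
    [u('Word')], lexeme(L), [o(':')],
    lex_definition(FS, Descr).

lexeme(L) --> [l(L)].

lex_definition(FS, (D, Ds)) --> lexdef(FS, D), lex_definition(FS, Ds).
lex_definition(FS, D)       --> lexdef(FS, D).

% Simplification: only path equations, no template names.
lexdef(FS, (LHS === RHS)) -->
    lex_path(FS, LHS), [o('=')], lex_val(FS, RHS).

% A path <f1 ... fn> becomes FS:f1:...:fn.
lex_path(FS, FS:P) --> [o('<')], features(P), [o('>')].

lex_val(FS, V) --> lex_path(FS, V).
lex_val(_, V)  --> [l(V)].
lex_val(_, V)  --> [u(V)].

features(F:P) --> [l(F)], features(P).
features(F)   --> [l(F)].
```

For the first two equations of the entry for uther, phrase(lex_entry(C), Tokens) should give C = (FS ---> uther :: FS:cat === 'NP', FS:head:agreement:gender === masculine).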

23 Templates
    template(C) -->
        rs('Let'), !,
        template_cont(C).

    template_cont((N :- TDef)) -->
        template_name(FS,N),
        rs('be'),
        template_definition(FS,TDef),
        {assert(template(N))}.

24 Templates: head and body
    template_name(FS,N) -->
        atom(A),
        {N =.. [A,FS]}.

    template_definition(FS,TDef) -->
        lex_definition(FS,TDef).
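The head of a compiled template is built with the univ operator =.. from the template's name and the feature-structure variable. A minimal sketch, assuming SWI-Prolog and quoted-atom tokens; word//1 is a simplified stand-in for atom//1, and the template name 'ThirdSing' used below is made up for illustration.

```prolog
% word//1: simplified stand-in for atom//1 from the slides.
word(A) --> [u(A)].
word(A) --> [l(A)].

% template_name(?FS, -N): N is the unary term Name(FS).
template_name(FS, N) --> word(A), { N =.. [A, FS] }.
```

phrase(template_name(FS, N), [u('ThirdSing')]) should bind N to 'ThirdSing'(FS). Since the same nonterminal is used inside lexdef//2, a template name occurring in a lexical entry compiles to a call of that unary predicate on the word's feature structure.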

25 Clearing a grammar
    clear_templates:-
        template(T),
        T =.. [F,_],
        abolish(F/1),
        fail.
    clear_templates:-
        abolish(template/1).

    clear_grammar:-
        abolish('::'/2),
        abolish(line/2),
        abolish(sentence/3),
        clear_templates.

26 Compiler output
    A ---> B, C ::
        A : cat === 'S',
        B : cat === 'NP',
        C : cat === 'VP',
        A : head === C : head,
        C : head : subject === B : head.

    A ---> uther ::
        A : cat === 'NP',
        A : head : agreement : gender === masculine,
        A : head : agreement : person === third,
        A : head : agreement : number === singular.
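Because of the operator declarations, each compiled clause is an ordinary Prolog fact for '::'/2 and can be queried directly. The sketch below assumes SWI-Prolog; conj_member/2 and lexical_category/2 are hypothetical helpers written for this example and are not part of the course code.

```prolog
:- op(510, xfy, (:)).
:- op(600, xfx, ===).
:- op(1100, xfx, '--->').
:- op(1200, xfx, '::').

% Compiled lexical entry, copied from the slide.
(A ---> uther ::
    A:cat === 'NP',
    A:head:agreement:gender === masculine,
    A:head:agreement:person === third,
    A:head:agreement:number === singular).

% conj_member(?X, +Conj): X unifies with one conjunct of the ','-term Conj.
conj_member(X, (X, _)).
conj_member(X, (_, Rest)) :- conj_member(X, Rest).
conj_member(X, X) :- X \= (_, _).

% lexical_category(?Word, ?Cat): look up the cat equation of a word.
lexical_category(Word, Cat) :-
    ((FS ---> Word) :: Descr),
    conj_member(FS:cat === Cat, Descr).
```

The query lexical_category(uther, Cat) should answer Cat = 'NP'. The path equations themselves are only data at this point; giving === a meaning is the job of the PATR II interpreters listed under Resources.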

27 Resources
Grammars PATR II / Prolog
– shieb1.ptr / shieb1.ari
– shieb2.ptr / shieb2.ari
– shieb3.ptr / shieb3.ari
– shieb4.ptr / shieb4.ari
Tokens
– shieb1.tok (tokenizer)
– shieb1.snt (preprocessor)
PATR II interpreters
– patrlcl.ari: left-corner with linking
– patrlclc.ari: left-corner with linking and syntax trees
– patr-ii.ari: DCG
PATR II compiler
– patrcomp.ari
– patr-ii.ari: DCG

28 Open problems and extensions
– Syntactic sugar of the form VP_1 VP_2 X
– Lexical rules
– Templates in syntax rules
– Negation and disjunction
– Default inheritance (priority union)
– ...

29 References
Shieber, Stuart (1986): An Introduction to Unification-based Approaches to Grammar. CSLI Lecture Notes.
Gazdar, Gerald & Chris Mellish (1989): Natural Language Processing in Prolog. Addison Wesley.
Covington, Michael A. (1994): Natural Language Processing for Prolog Programmers. Chap. 6: Parsing Algorithms. Prentice-Hall.

