

Computing Kafka
What corpus-stylistic measures can tell us about Franz Kafka's prose
Dr. J. Berenike Herrmann, Dept. German Philology, Göttingen University
DigiTaWG, 3 September 2013


Methodology & Theoretical Background “Toolboxes”

“Computing Kafka”: Start of a Project
“Corpus Linguists’ Toolbox”
- Digitized texts
- Frequency profiling
- Concordancing
- Collocation analysis -> “lexical bundles” (Biber, Conrad, & Cortes, 2004), n-grams, “clusters” (Mahlberg), etc.
- Key word analysis (e.g., Rayson, Scott, Scott & Tribble, Mahlberg, Anthony, Stubbs)

“Computing Kafka”: Start of a Project
“NLP Toolbox” / “Programmer’s Toolbox”
- Sentiment analysis
- Topic modeling
- Stylometric clustering (literary history/genre)
- … MD analysis [Biber]
- by means of Python, R […] -> forthcoming
“Psycholinguist’s Toolbox”
- test effects of style [features to be determined in textual analysis] on readers, ideally with a battery of experiments and different participant groups

“Computing Kafka”: Start of a Project
“Philologist’s Toolbox”
- A hundred years’ worth of study of different aspects of Kafka’s prose
- Religion and culture (Christian / Jewish): a “Jewish author”?
- Epoch (Modernism, Prague Modernism, Kafka = solitary phenomenon?)
- Genre (Gothic novel, realistic narration, fairy tale, grotesque …)
- Culture and science (psychoanalysis, modern rationalization/alienation …)
- Reader response -> “uncertainty” / “unsettledness” [no empirical studies]
- Historic reception (esp. comparison w/ Robert Walser)
- State of publication (while alive / from estate; whole texts / fragments)
- Narratological study: “heterogeneous prose” (different phases, formats)
- “Formalist” stylistic analysis [no quantitative studies]

“Computing Kafka”: Start of a Project
“Formalist” stylistic analysis
- perspective/focalization (predecessors in world literature: Austen, Flaubert, James; in German literature: Stifter, Kleist)
- first/third-person narrative voice: limited perspective and neutral perspective
- “showing, not telling” (gesture, scene)
- lexical precision, lexical “scantiness” (Oschmann, 2010)
- depiction of external events, situations -> concrete, sensuous phenomena
- “progressive abstractness in narration” (Oschmann, 2010)
- plot vs. reflection (development: less plot, more reflection)
- plot: order of events relatively arbitrary (Engel, 2010) -> structural homology
- deviation from “reality principle” (Engel, 2010) -> ca. one per text
- events not motivated -> ca. one per text
- types, not figures (flat characters), generalizations, general constellations (Engel, 2010; Oschmann, 2010)
- time: iterative, not singulative narration
- overall few details; if details are present -> meaningful

“Computing Kafka”: Step 1
Corpus linguistics & literature studies
Corpus stylistic approach

Digitized Texts
Kafka in Zeno.org:
- ca. 425,000 tokens (counted words)
- ca. 26,500 types (distinct word forms)
3 novels (264,669 tokens; 18,344 types; averaged TTR = 6.9%)
- Amerika (83,805; 9,741)
- Der Process (71,773; 7,879)
- Das Schloß (109,091; 10,623)
58 stories and other types of prose (ca. 160,000 tokens; ca. 8,100 types; averaged TTR ≈ 5.1%)
Zwei Gespräche
- Gespräch mit dem Beter
- Gespräch mit dem Betrunkenen
Betrachtung
- Kinder auf der Landstraße
- Entlarvung eines Bauernfängers
- Der plötzliche Spaziergang
- Entschlüsse
- Der Ausflug ins Gebirge
- Das Unglück des Junggesellen
- Der Kaufmann
- Zerstreutes Hinausschaun
- Der Nachhauseweg
- Die Vorüberlaufenden
- Der Fahrgast
- Kleider
- Die Abweisung
- Zum Nachdenken für Herrenreiter
- Das Gassenfenster
- Wunsch, Indianer zu werden
- Die Bäume
- Unglücklichsein

Digitized Texts
Das Urteil
Die Verwandlung
In der Strafkolonie
Der Kübelreiter
Ein Hungerkünstler
- Erstes Leid
- Eine kleine Frau
- Ein Hungerkünstler
- Josefine, die Sängerin
Ein Landarzt
- [Widmung]
- Der neue Advokat
- Ein Landarzt
- Auf der Galerie
- Ein altes Blatt
- Vor dem Gesetz
- Schakale und Araber
- Ein Besuch im Bergwerk
- Das nächste Dorf
- Eine kaiserliche Botschaft
- Die Sorge des Hausvaters
- Elf Söhne
- Ein Brudermord
- Ein Traum
- Ein Bericht für eine Akademie

Digitized Texts
Prosa aus dem Nachlaß
- Hochzeitsvorbereitungen auf dem Lande
- Beim Bau der Chinesischen Mauer
- Der Jäger Gracchus
- Die Brücke
- Der Schlag ans Hoftor
- Eine Kreuzung
- Der Nachbar
- Betrachtungen über Sünde, Leid, Hoffnung und den wahren Weg
- Brief an den Vater
- Zur Frage der Gesetze
- Das Stadtwappen
- Poseidon
- Kleine Fabel
- Von den Gleichnissen
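The type-token ratios (TTR) quoted for the corpus are consistent with dividing the number of types by the number of tokens. A minimal sketch of that arithmetic follows; the regex tokenizer and the function names are illustrative assumptions, not the procedure used for the slides.

```python
import re

def tokenize(text):
    """Lowercase and split into word tokens (simple regex tokenizer, an assumption)."""
    return re.findall(r"\w+", text.lower())

def type_token_ratio(tokens):
    """Distinct word forms (types) divided by running words (tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Figures taken from the "Digitized Texts" slide
novels_ttr = 18344 / 264669        # 3 novels
short_prose_ttr = 8100 / 160000    # 58 stories (approximate counts)

print(f"novels: {novels_ttr:.1%}")            # novels: 6.9%
print(f"short prose: {short_prose_ttr:.1%}")  # short prose: 5.1%
```

Note that TTR depends strongly on text length, so values for corpora of different sizes are not directly comparable.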

Key Word Analysis
Three stages
1. Compute a word frequency list for each of the two corpora that we wish to compare:
   - the different word forms (types) and their occurrences (tokens) in each text
   - the number of running words in each corpus

Key Word Analysis
2. Compare the two resulting frequency lists:
   - build a contingency table for each word
   - apply the chosen statistic to calculate a keyness value (most widely used: log-likelihood and chi-squared, cf. Rayson et al., 2009)
   - the larger the difference in relative frequencies, the larger the value of “keyness”
3. Sort the words by keyness
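The three stages can be sketched in a few lines of Python. This is a sketch under stated assumptions: it uses the two-term log-likelihood formulation that is common in keyness tools, and it is not claimed to reproduce AntConc's output exactly. The function names and the toy token lists are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Keyness of a word with frequency a in the target corpus (size c)
    and frequency b in the reference corpus (size d)."""
    e1 = c * (a + b) / (c + d)   # expected frequency in target
    e2 = d * (a + b) / (c + d)   # expected frequency in reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(target_tokens, reference_tokens):
    """Stages 1-3: frequency lists, per-word statistic, sort by keyness.
    Only positive key words (overrepresented in the target) are returned."""
    tf, rf = Counter(target_tokens), Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    rows = []
    for word, a in tf.items():
        b = rf.get(word, 0)
        if a / c > b / d:
            rows.append((word, a, log_likelihood(a, b, c, d)))
    return sorted(rows, key=lambda row: row[2], reverse=True)

# Toy data (hypothetical, for illustration only)
target = "gregor sah die schwester und gregor schwieg".split()
reference = "der mann sah die frau und der hund sah die katze".split()
for word, freq, ll in keywords(target, reference)[:3]:
    print(word, freq, round(ll, 2))
```

Sorting by the statistic rather than testing each word for significance matches the rank-order approach described under the caveats.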

Key Word Analysis
Columns: Rank, Freq, LL, Word (numeric values not preserved in this transcript)
Key words in rank order: gregor, josefine, hungerkünstler, reisende, offizier, gregors, samsa, er, es, aber, verurteilte, sie, nicht, schwester, sich, …
Kafka ShortProse corpus; key word list obtained with AntConc V3.2.4 (Anthony, 2011: da.ac.jp/antconc_index.html)

Key Word Analysis: Caveats
Caveats (cf. Rayson, 2012):
- Chi-squared and LL tests assume that samples are random with independent observations -> this is not so! (Evert etc.)
  - sidestep: place key words in rank order, rather than determine significance for each word
- A word can be key if it occurs in just one part of the corpus
  - examination of dispersion is important (-> pruning???)
- Often too many key words for a researcher to analyze (Berber Sardinha, 1999)
- Of prime importance: careful choice of reference corpus
- Three types of keywords are often found (Scott):
  - proper nouns;
  - keywords that human beings would recognize as key and that are indicators of the “aboutness” of a particular text;
  - high-frequency words such as because, shall, or already, which may be indicators of style rather than aboutness
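One simple way to examine dispersion is the range of a word, that is, the number of texts in which it occurs at least once. The sketch below prunes key words that fall under a minimum range. The threshold `min_texts`, the function names, and the toy corpus are assumptions for illustration; the slides leave open whether and how to prune.

```python
def text_range(word, texts):
    """Number of texts that contain the word at least once."""
    return sum(1 for tokens in texts.values() if word in tokens)

def prune_by_range(key_words, texts, min_texts=2):
    """Keep only key words that occur in at least min_texts texts."""
    return [w for w in key_words if text_range(w, texts) >= min_texts]

# Toy corpus (hypothetical): text name -> list of tokens
texts = {
    "text_a": ["gregor", "nicht", "sich"],
    "text_b": ["nicht", "er", "sich"],
    "text_c": ["er", "nicht"],
}
print(prune_by_range(["gregor", "nicht", "er", "sich"], texts))
```

A character name such as gregor is concentrated in a single text and is dropped here, while function words spread across texts survive.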

Key Word Analysis: What to look for
“Frequent nouns may indicate superficial topics […], but not its underlying themes”
“Verbs are often a better candidate for stylistically relevant words” (Stubbs, 2005, p. 11)
Negatives/negations: pragmatic functions (Hidalgo-Downing, 2000)
- imply more than is literally said,
- deny expectations,
- challenge background propositions;
- a way of questioning reality, therefore an alienation device
Lexis indicating Involved Production (Biber, 1988), frequent in English romantic fiction (Tribble, 2000)
- “personal pronouns, the past forms had and was, the negative particle n't and, last but not least, Nigel [= proper noun]”

Key Word Analysis: First Results
Amerika, The Trial, and The Castle -> statistically overrepresented:
- proper nouns indicating the characters (e.g., K, Karl, Frieda),
- common nouns indicating elements of interior spaces (Tür, Zimmer), generic person types (Onkel, Mann), and a multitude of occupational categories (Diener, Oberköchin)
-> can be related to Kafka’s typical fictional world and characters
A great variety of “small words”
- adverbs (nur, vielleicht) -> negotiating certainty and evaluation
- modal constructions (hätte, konnte, wollte) and negations (nicht, niemals) -> dealing with aspects of permission, ability, and obligation
- pronouns -> related to the narrative perspective (er, ihn), also to self-reference (sich, mich) and to impersonal and generalizing constructions (man, alle)

Key Word Analysis: More Results See handouts (Key words for two corpora: Novels and ShortProse, each compared to reference corpus zeno.org; including negative key words)

Further Analyses
Key word analysis serves not only to examine the propositional text base, but also for stylistic analysis
But: it is limited to single words (e.g., more complex adverbials, phraseological units, idioms etc. are not identified)
One possibility -> multi-word units -> n-grams, collocations

Collocation Analysis: Lexical bundles
Biber et al. (1999) found that idioms and fixed formulas (kick the bucket, a slap in the face) occur rarely in natural speech and writing
However, they are more frequent in fiction!
- kick the bucket, a slap in the face occur ca. 5 times per million words in the fiction corpus (cf. Biber et al., 1999, pp. 1025ff.), much less in conversation and news -> stereotyped dialogue in fiction
Types of “idioms” (Biber et al., 1999, pp. 1024ff.)
- Wh-questions (How do you do? What’s up?)
- Complete noun phrases (a piece of cake)
- Prepositional phrases (as a matter of fact, in a nutshell, up to date)
- Verb + prep phrases (bear in mind, fall in love)
- Verb + noun phrases (kick the bucket, take the bull by the horns)

Collocation Analysis: Lexical bundles
Question: What kinds of lexical bundles does Kafka’s prose involve? Is there any “stereotyped dialogue” at all?
See handouts (4-grams for two corpora: Novels and ShortProse)
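As a first approximation, 4-grams like those in the handouts can be obtained by counting every contiguous sequence of four tokens. The sketch below assumes a pre-tokenized text; the function names and the sample sentence are hypothetical, and no frequency or dispersion thresholds are applied.

```python
from collections import Counter

def ngrams(tokens, n=4):
    """All contiguous n-token sequences, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(tokens, n=4, k=10):
    """The k most frequent n-grams with their counts."""
    return Counter(ngrams(tokens, n)).most_common(k)

# Toy token list (hypothetical)
tokens = "er sah sich nicht um und er sah sich nicht satt".split()
for gram, freq in top_ngrams(tokens, n=4, k=3):
    print(" ".join(gram), freq)
```

Recurring sequences found this way are only candidates; whether they count as lexical bundles depends on the thresholds chosen.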

Collocation Analysis: Exploration
Religion and culture (Christian / Jewish): a “Jewish author”?
Preliminary indicators on text surface (cf. Engel, 2010, p. 423), e.g.,
- Dom (*dom*; N=19), Kirche (*kirch*; N=50)
- Synagoge (N=0), (schwarzer) Bart
- *bart* (N=59), *bärt* (N=19)
(more possible indicators should be found in religion dictionaries; compare with Jewish literature; cf. Robertson, 1985)
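Wildcard searches such as *bart* or *kirch* can be reproduced by translating the pattern into a regular expression and counting matching tokens. The helper below is a hypothetical sketch: it assumes that * stands for any run of characters and that matching is case-insensitive. The token list is invented and does not reproduce the counts on the slide.

```python
import re
from collections import Counter

def wildcard_hits(pattern, tokens):
    """Count tokens matching a concordancer-style wildcard (* = any characters)."""
    regex = re.compile(
        ".*".join(re.escape(part) for part in pattern.split("*")),
        re.IGNORECASE,
    )
    return Counter(t for t in tokens if regex.fullmatch(t))

# Toy token list (hypothetical)
tokens = ["Bart", "Vollbart", "bärtig", "Barthaare", "Dom", "Kirche", "Kirchturm"]
for pattern in ("*bart*", "*bärt*", "*kirch*"):
    hits = wildcard_hits(pattern, tokens)
    print(pattern, sum(hits.values()), sorted(hits))
```

Such counts need manual checking, because a pattern like *bart* also matches unrelated word forms such as offenbart.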

Exploration
“Stereotypical Jewish” (cf. Engel, 2010, p. 423): search for collocates of *bart* / *bärt*
Problem: adequate statistics for sparse data: MI, t-score, LL?
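To make the choice of statistic concrete, here is a sketch of MI and t-score in one common formulation. Both compare the observed co-occurrence frequency O with the frequency E expected by chance. How E is computed (here scaled by the span size) differs between tools, so the definitions below are assumptions, and the input numbers are invented.

```python
import math

def expected(f_node, f_coll, n_tokens, span=1):
    """Chance co-occurrence frequency of node and collocate within the span."""
    return f_node * f_coll * span / n_tokens

def mutual_information(observed, exp):
    """MI = log2(O / E)."""
    return math.log2(observed / exp)

def t_score(observed, exp):
    """t = (O - E) / sqrt(O)."""
    return (observed - exp) / math.sqrt(observed)

# Hypothetical counts: a rare collocate seen once vs. a frequent one seen 20 times
for label, o, e in (("rare", 1, 0.01), ("frequent", 20, 5.0)):
    print(label, round(mutual_information(o, e), 2), round(t_score(o, e), 2))
```

With sparse data the two measures pull in opposite directions: MI rewards rare collocates that co-occur even once, while the t-score favours frequent ones. This is one reason the ranking of collocates depends heavily on the statistic chosen.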

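Columns: Rank, Freq, FreqL, FreqR, Stat, Collocate (numeric values not preserved in this transcript)
Collocates in rank order: seines, seinen, seinem, rötlichen, langem, buschigem, Balkon, weißen, weißem, verdeckte, unverwandt, unter, Tränen, tatarischen, tartarischen, tartarische, strich, starker, stand, speien, solchen, schwarzer, schwarze

Columns: Rank, Freq, FreqL, FreqR, Stat, Collocate (numeric values not preserved in this transcript)
Collocates in rank order (continued): riesenhaften, nassen, namens, krumme, knollennasigen, knochigen, kleinen, im, herabhängende, grau, gepflegte, gebräunter, fremdartige, flehte, fast, erschien, erschauert, erlaubt, dünnen, buschigen, blonden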

Next steps
- Python …
- R …
- lemmatizer …

Computing Kafka Thank you!

Literature
Adorno, T. W. (1981). “Notes on Kafka.” In Prisms. Cambridge, Massachusetts: MIT Press, 243–71.
Berber Sardinha, T. (1999). Using keywords in text analysis: Practical aspects. DIRECT Working Papers, 42.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and textbooks. Applied Linguistics, 25(3),
Bondi, M., & Scott, M. (2010). Keyness in texts. [Studies in Corpus Linguistics, 41]. Amsterdam: John Benjamins.
Engel, M. (2010). Kafka lesen - Verstehensprobleme und Forschungsparadigmen. In M. Engel & B. Auerochs (Eds.), Kafka-Handbuch. Leben, Werk, Wirkung (pp ). Stuttgart: Metzler.
Mahlberg, M. (2007). Clusters, key clusters and local textual functions in Dickens. Corpora, 2(1), doi: /cor
Oschmann, D. (2010). Kafka als Erzähler. In M. Engel & B. Auerochs (Eds.), Kafka-Handbuch (pp ). Stuttgart.
Rayson, P. (2012). Corpus analysis of key words. In The Encyclopedia of Applied Linguistics. Blackwell.
Robertson, R. (1985). Kafka: Judaism, politics, and literature. Oxford: Clarendon Press.
Scott, M. & Tribble, C. (2006). Textual patterns. Key word and corpus analysis in language education. Amsterdam: John Benjamins.