Text complexity in and for literary studies: foundations

1 text complexity in and for literary studies

2 foundations

3 complexity – a definition
"Complexity is generally used to characterize something with many parts where those parts interact with each other in multiple ways." (Wikipedia)
"the only consensus among researchers is that there is no agreement about the specific definition of complexity" (Wikipedia)

4 organized complexity
non-random interaction between the parts of a system
these correlated parts create a differentiated structure
the system manifests emergent properties (i.e. properties not reducible to the parts of the system)

5 text complexity and readability
readability research describes a large number of stylistic features
how strongly each feature correlates with readability is not entirely clear

6 Vocabulary
Type-Token Ratio
Root Type-Token Ratio
Corrected Type-Token Ratio
Bilogarithmic Type-Token Ratio
Uber Index = $\log(Typ^2) / \log(Tok/Typ)$
Measure of Textual Lexical Diversity (McCarthy, 2005)
Lexical Density = $Tok_{Lex} / Tok$
Lexical Word Variation = $Typ_{Lex} / Tok_{Lex}$
Noun Variation = $Typ_{Noun} / Tok_{Lex}$
Adjective Variation, Adverb Variation
Modifier Variation = $(Typ_{Adj} + Typ_{Adv}) / Tok_{Lex}$
Verb Variation 1 = $Typ_{Verb} / Tok_{Verb}$
Verb Variation 2 = $Typ_{Verb} / Tok_{Lex}$
Squared Verb Variation 1 = $Typ_{Verb}^2 / Tok_{Verb}$
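As a rough illustration, the vocabulary measures above could be computed as in the following sketch. Several things here are assumptions and not taken from the slides: the function names `ttr_measures` and `lexical_measures` are hypothetical, the input is assumed to be already tokenized and POS-tagged, lexical words are assumed to be those tagged `NOUN`, `VERB`, `ADJ` or `ADV`, and the formulas for the Root, Corrected and Bilogarithmic Type-Token Ratio follow the definitions commonly given in the literature, since the slide lists only their names. The Uber Index and MTLD are left out.

```python
import math

# Assumption: a coarse tag set; lexical words are nouns, verbs, adjectives, adverbs.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def ttr_measures(tokens):
    """Type-token based measures for a list of word tokens (no case folding)."""
    tok = len(tokens)
    typ = len(set(tokens))
    return {
        "ttr": _ratio(typ, tok),
        "root_ttr": _ratio(typ, math.sqrt(tok)),           # Typ / sqrt(Tok)
        "corrected_ttr": _ratio(typ, math.sqrt(2 * tok)),  # Typ / sqrt(2 * Tok)
        # log(Tok) is zero for a one-token text, so guard that case.
        "bilog_ttr": math.log(typ) / math.log(tok) if tok > 1 else float("nan"),
    }


def lexical_measures(tagged):
    """POS-based measures for a list of (token, tag) pairs."""
    tok = len(tagged)
    lex = [(word, tag) for word, tag in tagged if tag in LEXICAL_TAGS]
    tok_lex = len(lex)
    typ_lex = len(set(lex))

    def types(tag):
        return len({word for word, t in lex if t == tag})

    tok_verb = sum(1 for _, tag in lex if tag == "VERB")
    typ_verb = types("VERB")
    return {
        "lexical_density": _ratio(tok_lex, tok),
        "lexical_word_variation": _ratio(typ_lex, tok_lex),
        "noun_variation": _ratio(types("NOUN"), tok_lex),
        "modifier_variation": _ratio(types("ADJ") + types("ADV"), tok_lex),
        "verb_variation_1": _ratio(typ_verb, tok_verb),
        "verb_variation_2": _ratio(typ_verb, tok_lex),
        "squared_verb_variation_1": _ratio(typ_verb ** 2, tok_verb),
    }


if __name__ == "__main__":
    # Invented toy sentence, only to show the call pattern.
    tagged = [("der", "DET"), ("Hund", "NOUN"), ("jagt", "VERB"),
              ("die", "DET"), ("Katze", "NOUN"), ("und", "CONJ"),
              ("die", "DET"), ("Katze", "NOUN"), ("jagt", "VERB"),
              ("schnell", "ADV")]
    print(ttr_measures([word for word, _ in tagged]))
    print(lexical_measures(tagged))
```

All of the ratio measures depend on text length, so values are only comparable across texts of similar size or across fixed-size samples.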

7 Syntax

8 Language Model

9 Morphology

10 Most predictive features

11 text complexity in literary studies
style: syntax, vocabulary, registers, figurative language
aesthetics
form and content
depicted world
symbolic elements / aspects of the fictional world
intertextuality
polyvalence: inexhaustible for interpretation

12 Readability and text complexity
cognitive load vs. interpretability

13 gold standard
agreement is very difficult to achieve
the very trivial literature in particular may be hard to obtain
difficult to do for different periods and languages
three levels: highbrow, middlebrow, lowbrow
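One way to put a number on how hard agreement is would be a chance-corrected agreement coefficient such as Cohen's kappa over the three levels. The sketch below is an illustration only: the slides do not say how agreement was or should be measured, the function name `cohens_kappa` is a hypothetical choice, and the two label lists are invented toy data, not actual ratings.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labelled independently
    # according to their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented judgments for six hypothetical novels.
    annotator_1 = ["highbrow", "highbrow", "middlebrow",
                   "middlebrow", "lowbrow", "lowbrow"]
    annotator_2 = ["highbrow", "middlebrow", "middlebrow",
                   "middlebrow", "lowbrow", "middlebrow"]
    print(cohens_kappa(annotator_1, annotator_2))
```

Kappa treats every disagreement alike; since highbrow, middlebrow and lowbrow are ordered, a weighted variant that penalizes highbrow/lowbrow confusions more heavily may be a better fit.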

14 baby steps
some experiments

15 corpus - novels
highbrow:
  C. Einstein: Bebuquin
  R. M. Rilke: Malte Laurids Brigge
  F. Kafka: Der Prozess
  R. Müller: Tropen
  R. Musil: Der Mann ohne Eigenschaften
  P. Scheerbart: Lesabéndio
  R. Walser: Der Gehülfe
middlebrow:
  Baum: Menschen im Hotel
  Bettauer: Stadt ohne Juden
  Fallada: Kleiner Mann, was nun?
  Kellermann: Der Tunnel
  Perutz: Der Meister des jüngsten Tags
  Wassermann: Das Gänsemännchen
  Zobeltitz: Aus tiefem Schacht
lowbrow:
  H. Courths-Mahler
  R. Huch: Der Fall Deruga
  N. Jacques: Dr. Mabuse, der Spieler
  R. Kraft: Nobody's Erlebnisse
  E. Wallace: Die toten Augen von London







22 Literature
Julia Hancke, Sowmya Vajjala, Detmar Meurers: Readability Classification for German using Lexical, Syntactic, and Morphological Features. In: Proceedings of COLING 2012: Technical Papers, pp. 1063–1080.

