Analysis, Topic 9 / Alexandra Grand



Analysis
1. Data Preparation: organizing the data
2. Descriptive Statistics: describing the data
3. Inferential Statistics: testing hypotheses and models

Conclusion Validity and Internal Validity
Conclusion validity: Is there a relationship between two variables (between cause and effect)? The conclusion is either "there is a relationship" or "there is no relationship". Is the conclusion about the relationship reasonable?
Internal validity: Assuming that there is a relationship in this study, is the relationship a causal one?
Example: "The more television sets there are, the worse the PISA performance becomes." (Presse, 10.12.2010. PISA-Sieger: Weiblich und ohne TV) Could a "third variable" explain the relationship?

Threats to conclusion validity
An incorrect conclusion about a relationship in the observations can take two forms:
1. Concluding that there is no relationship when in fact there is one ("missing the needle in the haystack"). This is a signal-to-noise ratio problem: the "signal" is the relationship you are trying to see, and the "noise" consists of the factors that make it hard to see the relationship.
2. Concluding that there is a relationship when in fact there is none ("seeing things that aren't there").

Threats to conclusion validity
"Finding no relationship when there is one" (conclusion: no relationship; reality: relationship). Threats:
- low reliability of measures
- low reliability of treatment implementation
- random irrelevancies in the setting
- random heterogeneity of respondents
- low statistical power (the "noise"-producing factors above add variability and thereby lower the power)
- violation of assumptions of statistical tests
"Finding a relationship when there is not one" (conclusion: relationship; reality: no relationship). Threats:
- fishing and the error rate problem
- violation of assumptions of statistical tests

Improving Conclusion Validity
- good statistical power (should be > 0.8); power = "the odds of saying that there is a relationship, when in fact there is one"
- factors that affect power:
  sample size: use a larger sample size
  effect size: increase the effect size (e.g. increase the dosage of the program), i.e. increase the signal and decrease the noise
  α-level: raise the alpha-level (this also raises the risk of a Type I error)
- good reliability: reduces "noise"
- good implementation
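How sample size, effect size and α-level move the power can be sketched numerically. The following is a minimal illustration, assuming a two-sided one-sample z-test with known standard deviation; the function name z_test_power and the effect sizes used are hypothetical and do not come from the slides.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the standardized shift d = (true mean - H0 mean) / sd.
    Assumption: the standard deviation is known (z-test, not t-test).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = effect_size * n ** 0.5              # expected z statistic under HA
    # Probability that |Z| > z_crit when Z ~ N(shift, 1)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Larger sample, larger effect or larger alpha each raise the power.
for n in (15, 30, 60):
    print("n =", n, "power =", round(z_test_power(0.5, n), 3))
print("d = 0.8:", round(z_test_power(0.8, 15), 3))
print("alpha = 0.01:", round(z_test_power(0.5, 15, alpha=0.01), 3))
```

With an effect size of zero the function returns α itself, which is the Type I error rate of the test.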

Statistical Inference: Decision Matrix
Two mutually exclusive hypotheses (H0, HA); decision: which hypothesis to accept and which to reject.
Conclusion "accept H0", reality "H0 is true": decision right, 1-α (e.g. 0.95), confidence level
Conclusion "accept H0", reality "HA is true": decision wrong, β (e.g. 0.20), β-error (Type II Error)
Conclusion "accept HA", reality "H0 is true": decision wrong, α (e.g. 0.05), α-error (Type I Error), significance level
Conclusion "accept HA", reality "HA is true": decision right, 1-β (e.g. 0.80), power
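The four cells of the decision matrix can be estimated by simulation. The sketch below is only an illustration under stated assumptions: data are drawn from a normal distribution with known standard deviation 1, H0 says the mean is 0, and the true mean under HA is set to a hypothetical 0.5. None of these numbers come from the slides.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, n=25, alpha=0.05, trials=4000, seed=1):
    """Share of simulated samples in which a two-sided z-test rejects H0: mean = 0.

    Each sample holds n values from N(true_mean, 1); sd = 1 is treated as known.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # (sample mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


type_1_rate = rejection_rate(true_mean=0.0)  # H0 true, HA accepted: alpha-error
power = rejection_rate(true_mean=0.5)        # HA true, HA accepted: correct
type_2_rate = 1 - power                      # HA true, H0 accepted: beta-error

print("estimated alpha:", type_1_rate)
print("estimated power:", power)
print("estimated beta:", type_2_rate)
```

The estimated α should land near the chosen significance level, while the estimated power depends on the assumed true mean and sample size.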

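Statistical Inference: Decision
Reality "H0 right": 1-α (correct decision) and α (Type I Error). Reality "HA right": 1-β (power, correct decision) and β (Type II Error).
What we want: high power and a low Type I Error.
Problem: with sample size and effect size held fixed, the higher the power, the higher the Type I Error.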

Practical
A child who is "in reality" highly gifted is diagnosed as not highly gifted. Which error is this?
- α-error (Type I Error)
- β-error (Type II Error)
The result of a study: WU students with a HAK diploma achieve a higher score in the multiple-choice exam in accounting. In reality, however, there is no difference between HAK graduates and non-HAK graduates in the score achieved. Which error is this?
- α-error (Type I Error)
- β-error (Type II Error)

Practical
Tick the correct answer and correct the wrong answers. By raising the α-error from 0.01 to 0.05 ...
- the power decreases (wrong; corrected: the power increases)
- the chances of making a Type I error decrease (wrong; corrected: they increase)
- the chances of making a β-error decrease (correct)
- the test is more restrictive (wrong; corrected: less restrictive)

Analysis: example data set "Arbeitszufriedenheit" (job satisfaction), AZ
Data set: AZ.sav
Note: The data were selected arbitrarily from a data set* for illustration purposes. Any results should therefore not be taken too seriously.
Sample size: n = 15
Variables:
- dichotomous: SEX; items for the constructs job satisfaction** (AZ_...), working atmosphere** (BK_...), workload** (AB_...)
- ordinal: POSITION (position in the company)
- metric: MITARB (number of employees), NETTO (monthly net income in €)
- new variable: AZ "Arbeitszufriedenheit" (job satisfaction; assumption: interval-scaled), the sum score of the individual variables AZ_...
* Böhnisch, B., Grand, A., Rechberger, R., Wimmer, W. (2006). Berufliche Zufriedenheit. Seminararbeit aus Empirische Forschungsmethoden.
** Items were adopted from: Giegler, H. (1985). Rasch-Skalen zur Messung von „Arbeits- und Berufszufriedenheit“, „Betriebsklima“ und „Arbeits- und Berufsbelastung“ auf Seiten der Betroffenen.

1. Data Preparation
- Logging the data
- Checking the data for accuracy
- Developing a database structure: codebook (Kodierungsschema)
- Entering the data into the computer (once-only entry or double entry)
- Data transformation:
  missing values
  item reversals (example: transform reversal items, e.g. BK_2: old value 1 "agree", 2 "disagree" -> new value 2 "agree", 1 "disagree")
  recode variables (example: transform items "AZ_...", "AB_...", "BK_...": old value 1 "agree", 2 "disagree" -> new value 1 "agree", 0 "disagree")
  scale totals (example: generate the new variable "AZ" (Arbeitszufriedenheit); to get a total score for AZ, add across the individual items AZ_...)
  categories
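The three transformations named above (item reversal, recoding, scale total) can be sketched in a few lines. This is an illustration only: the record values are invented, the helper names are hypothetical, and applying the reversal before the recoding is an assumption about the order, not something the slides state.

```python
def reverse_item(value):
    """Reverse a 1/2-coded item (1 <-> 2); missing values (None) stay missing."""
    return None if value is None else 3 - value


def recode_agree(value):
    """Recode 1 'agree' -> 1 and 2 'disagree' -> 0; None stays None."""
    if value is None:
        return None
    return 1 if value == 1 else 0


def scale_total(record, prefix):
    """Sum all items whose name starts with prefix; None if any item is missing."""
    items = [v for name, v in record.items() if name.startswith(prefix)]
    if any(v is None for v in items):
        return None
    return sum(items)


# Hypothetical respondent; AB_3 is a missing value.
raw = {"AZ_1": 1, "AZ_4": 2, "BK_2": 1, "AB_3": None}

record = dict(raw)
record["BK_2"] = reverse_item(record["BK_2"])                   # item reversal
record = {name: recode_agree(v) for name, v in record.items()}  # 1/2 -> 1/0
record["AZ"] = scale_total(record, "AZ_")                       # scale total

print(record)
```

Returning None for a scale total with missing items is one possible policy; other treatments of missing values are equally defensible.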

1. Data Preparation - Codebook
The codebook should include:
- variable name
- variable description
- variable format
- instrument/method of collection
- date collected
- respondent or group
- variable location in database
[Figure: excerpt of the data matrix with the columns ID, SEX, MITARB, NETTO, POSITION, AZ_1, BK_2, AB_3, AZ_4, BK_5]

1. Data Preparation - Checking data for accuracy
Summarize the data (e.g. in a frequency table) and check them:
- Are the listed values reasonable? ("wild codes", outliers/Ausreißer)
- Are there missing values?
Outlier (Ausreißer): either it actually is an outlier or it is an error in data entry.
"Wild code": a value that is not a valid code for the variable.
"Missing values": no data exist or the data weren't entered.
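A frequency count already exposes wild codes and missing values. The sketch below assumes that missing values are stored as None and that the valid codes of a variable are known from the codebook; the data and the function name check_variable are hypothetical.

```python
from collections import Counter


def check_variable(values, valid_codes):
    """Summarize one variable: frequencies, number of missing values, wild codes."""
    freq = Counter(values)
    missing = freq.pop(None, 0)  # assumption: None marks a missing value
    wild = {code: count for code, count in freq.items() if code not in valid_codes}
    return {"frequencies": dict(freq), "missing": missing, "wild_codes": wild}


# Hypothetical SEX column: 1 and 2 are valid codes, 3 is a wild code.
sex = [1, 2, 2, 1, None, 1, 3, 2]
report = check_variable(sex, valid_codes={1, 2})
print(report)
```

Outliers in metric variables need a separate check (for example a boxplot), because an extreme value can still be a legitimate observation.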

2. Descriptive Statistics
"Quantitative description in a manageable form": describe basic features of the data, provide simple summaries, simple graphics analysis.
Univariate analysis: analysis of one variable at a time. Description of a single variable:
- distribution
- central tendency (Lagemaß)
- dispersion (Streuungsmaß)
Bivariate analysis: analysis of two variables at a time.
Multivariate analysis: analysis of multiple variables at a time.

2. Descriptive Statistics - Distribution
Frequency distribution:
- as a table: absolute frequencies, relative frequencies, crosstab
- as a graph: pie chart, bar chart, boxplot, histogram, (stem-and-leaf diagram), ...
Frequency table: sex (Geschlecht)
male (männlich): absolute frequency 8, relative frequency 53%
female (weiblich): absolute frequency 7, relative frequency 47%
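The frequency table for sex can be reproduced from its counts. The sketch below rebuilds a list of 15 category labels from the absolute frequencies given in the table (8 and 7); the individual records are not in the transcript, so the list itself is a reconstruction, and frequency_table is a hypothetical helper name.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (absolute frequency, relative frequency)."""
    counts = Counter(values)
    n = len(values)
    return {category: (count, count / n) for category, count in counts.items()}


# Labels rebuilt from the counts in the frequency table (n = 15).
sex = ["male"] * 8 + ["female"] * 7
table = frequency_table(sex)

for category, (absolute, relative) in table.items():
    print(f"{category:<8}{absolute:>3}{relative:>6.0%}")
```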

2. Descriptive Statistics - Distribution
[Graphs: pie chart of sex; bar chart of position; histogram of monthly net income; boxplot of number of employees]

2. Descriptive Statistics - Central Tendency
Central tendencies (Lagemaße):
Mean (Mittelwert)
  computation: "sum of values xi / number n of values"
  data: metric data
  adequacy: appropriate if the distribution is approximately normal; not robust against single extreme values ("outliers")
Median
  computation: "center of the sample"
  data: ordinal data, metric data
  adequacy: robust against outliers
Mode (Modus)
  computation: "most frequently occurring value"
  data: nominal data, ordinal data, metric data
  adequacy: robust against outliers
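The different robustness of the three measures shows up as soon as one extreme value enters the data. The numbers below are hypothetical and are not taken from AZ.sav; the sketch uses Python's standard statistics module.

```python
from statistics import mean, median, mode

values = [7, 7, 10, 18, 25]         # hypothetical metric data
with_outlier = [7, 7, 10, 18, 400]  # same data with one extreme value

for label, data in (("original", values), ("with outlier", with_outlier)):
    print(label, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```

Only the mean moves when the largest value is replaced by an outlier; median and mode stay the same.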

2. Descriptive Statistics – Central Tendency / Practical
Compute the mean, median and mode of the variables SEX, MITARB (number of employees) and POSITION. Make sure that each measure is applied in a meaningful way.
Hint: sort the variables MITARB and POSITION in ascending order.

2. Descriptive Statistics – Distribution / Practical_Solution
Employees (MITARB): mean 48.5, median 18, mode 7
Position (POSITION): mean -, median 2, mode 1
Sex (SEX): no values given

2. Descriptive Statistics - Dispersion
Dispersion measures (Streuungsmaße):
Range (Spannweite)
  computation: "highest value minus lowest value"
  data: ordinal data, metric data
Variance s²
  computation: "average of the sum of the squared deviations"
  s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
  data: metric data
Standard deviation s
  computation: "square root of the variance"
  s = \sqrt{s^2}
  data: metric data

2. Descriptive Statistics - Dispersion
Dispersion measures (Streuungsmaße): interquartile range (IQR)
  computation: "difference between third and first quartile"
  1st quartile (Q1): 25% of the cases fall below this value
  median (Q2): 50% of the cases fall above and below this value
  3rd quartile (Q3): 75% of the cases fall below this value
  data: metric data
  adequacy: robust against outliers
[Figure: boxplot from min to max, with 25% of the cases in each of the four segments min to Q1, Q1 to Q2, Q2 to Q3 and Q3 to max; the IQR spans Q1 to Q3]
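The robustness of the IQR compared with the range can be checked on a small example. The data are hypothetical, and the quartiles come from statistics.quantiles with its default "exclusive" method; other quartile conventions (SPSS offers several) can give slightly different values.

```python
from statistics import quantiles


def iqr(data):
    """Interquartile range Q3 - Q1, using the default 'exclusive' quartile method."""
    q1, _q2, q3 = quantiles(data, n=4)
    return q3 - q1


def value_range(data):
    """Range (Spannweite): highest value minus lowest value."""
    return max(data) - min(data)


values = [1, 2, 3, 4, 5, 6, 7]          # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 700]  # same data with one extreme value

print("IQR:", iqr(values), "range:", value_range(values))
print("IQR:", iqr(with_outlier), "range:", value_range(with_outlier))
```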

2. Descriptive Statistics – Dispersion / Practical_Solution
Computation of the variance, standard deviation and range of the variable NETTO (net income): n = 15, mean = 1553.33, min = 200, max = 2800
Steps (variance):
1. compute the distance between each value and the mean
2. square each discrepancy
3. sum the squares to get the Sum of Squares (SS) value
4. divide the SS by n - 1
Net income (NETTO): variance 471595.238, standard deviation 686.728, range (Spannweite) 2600, min 200, max 2800
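The four variance steps translate directly into code. Because the raw NETTO values are not part of the transcript, the sketch below uses five hypothetical incomes; only the procedure, not the numbers, mirrors the slide.

```python
from math import sqrt
from statistics import variance


def sample_variance(data):
    """Sample variance following the four steps listed above."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]  # step 1: distance to the mean
    squares = [d ** 2 for d in deviations] # step 2: square each discrepancy
    sum_of_squares = sum(squares)          # step 3: Sum of Squares (SS)
    return sum_of_squares / (n - 1)        # step 4: divide SS by n - 1


netto = [200, 1200, 1500, 1800, 2800]  # hypothetical values, not the AZ.sav data

var = sample_variance(netto)
print("variance:", var)
print("standard deviation:", round(sqrt(var), 3))
print("range:", max(netto) - min(netto))
print("matches statistics.variance:", abs(var - variance(netto)) < 1e-6)
```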

Correlation
"A correlation is a single number that describes the degree of relationship between two variables"
- the correlation coefficient lies in the range -1 ≤ r ≤ 1
- the higher the absolute r-value, the stronger the relationship between the variables
- uncorrelated: r = 0
- positive correlation: r > 0, positive relationship; the higher the x-values, the higher the y-values on average
- negative correlation: r < 0, negative relationship; the higher the x-values, the lower the y-values on average, and vice versa
- exact linear correlation: r = 1 (positive), r = -1 (negative)
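The sign and the bounds of r can be seen on tiny examples. The following is a minimal sketch of the product-moment (Pearson) coefficient on hypothetical data; pearson_r is an illustrative helper, not an SPSS function.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]
r_exact_positive = pearson_r(x, [2, 4, 6, 8, 10])  # exact positive linear relationship
r_exact_negative = pearson_r(x, [10, 8, 6, 4, 2])  # exact negative linear relationship
r_positive = pearson_r(x, [2, 1, 4, 3, 5])         # positive, but not exact

print(r_exact_positive, r_exact_negative, r_positive)
```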

Correlation - Example
Example: Is there a relationship between the variable "Nettoverdienst" (net income) and the variable "Arbeitszufriedenheit" (job satisfaction)? If yes:
- Which type of relationship?
- How strong is the relationship?
- Is the correlation significant?
Descriptive statistics for "Nettoverdienst" and "Arbeitszufriedenheit":
Nettoverdienst: mean 1553.33, standard deviation 686.728, variance 471595.238, sum 23300, min 200, max 2800, range 2600
Arbeitszufriedenheit: mean 5.20, standard deviation 2.178, variance 4.743, sum 78, min 1, max 9, range 8

Example - Descriptive Statistics
[Figures: boxplot of Arbeitszufriedenheit (AZ); boxplot of monthly net income in €]

Example – 1. Which type of relationship?

Example – 2. How strong is the relationship?
Product-moment correlation (Pearson): the variables (x, y) are metric and normally distributed.
Calculating the correlation: [SPSS output: correlation AZ/NETTO]

Example – Q-Q Plot
[Figures: Q-Q plot of AZ (Arbeitszufriedenheit); Q-Q plot of monthly net income in €]

Example – 3. Is the correlation significant?
Testing the significance of a correlation
Null hypothesis: r = 0
Alternative hypothesis: r ≠ 0
Steps:
1. Determine the significance level (alpha-level): α = 0.05
2. Compute the degrees of freedom: df = N - 2 = 15 - 2 = 13
3. Decide between a one-tailed and a two-tailed test: two-tailed test
4. Look up the critical value

Example – 3. Is the correlation significant?
Excerpt: t-distributions for product-moment correlations (table of critical values)
[SPSS output: correlation AZ/NETTO]
The correlation is significant: r (0.692) > r_crit (0.514)
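The critical value 0.514 can be cross-checked from the t-distribution, since a correlation converts to a t statistic with df = N - 2 via t = r * sqrt(df) / sqrt(1 - r²). The sketch below takes the critical t value 2.160 (two-tailed, α = 0.05, df = 13) from a standard t-table as an assumption, because the Python standard library has no t-distribution; the function names are illustrative.

```python
from math import sqrt


def critical_r(t_crit, df):
    """Critical correlation that corresponds to a critical t value."""
    return t_crit / sqrt(t_crit ** 2 + df)


def t_statistic(r, df):
    """t statistic for testing H0: r = 0."""
    return r * sqrt(df) / sqrt(1 - r ** 2)


n = 15
df = n - 2      # 13 degrees of freedom
t_crit = 2.160  # assumption: standard t-table, two-tailed, alpha = 0.05, df = 13
r = 0.692       # correlation AZ/NETTO reported on the slide

r_crit = critical_r(t_crit, df)
t_value = t_statistic(r, df)

print("critical r:", round(r_crit, 3))
print("t statistic:", round(t_value, 2))
print("significant:", abs(r) > r_crit)
```

Comparing r with r_crit and comparing t with t_crit are equivalent decisions; both lead to rejecting the null hypothesis here.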

Correlation Matrix
- symmetric matrix
- relationships between all possible pairs of variables
- e.g. between C1, ..., C10: N*(N-1)/2 = 10*9/2 = 45 unique correlations
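The pair count and the symmetry can be made concrete with a small sketch. The three columns C1 to C3 hold invented values, and both helpers are hypothetical; the matrix is stored as a nested dictionary to stay within the standard library.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def correlation_matrix(columns):
    """Nested dict matrix[a][b] with r for every pair of variables."""
    names = list(columns)
    matrix = {name: {name: 1.0} for name in names}  # diagonal: r = 1
    for a, b in combinations(names, 2):             # each unique pair once
        r = pearson_r(columns[a], columns[b])
        matrix[a][b] = matrix[b][a] = r             # symmetric entries
    return matrix


def unique_pairs(n_variables):
    """Number of unique correlations among n variables: N*(N-1)/2."""
    return n_variables * (n_variables - 1) // 2


columns = {  # hypothetical data
    "C1": [1, 2, 3, 4, 5],
    "C2": [2, 1, 4, 3, 5],
    "C3": [5, 3, 4, 1, 2],
}
matrix = correlation_matrix(columns)

print(unique_pairs(10))  # 45
print(matrix["C1"]["C2"], matrix["C2"]["C1"])
```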

Other correlations
- Pearson product-moment correlation (bivariate normal distribution, variables on interval scale)
- Spearman rank-order correlation (rho) (two ordinal variables)
- Kendall rank-order correlation (tau)
- Point-biserial correlation (one variable is on a continuous interval level and the other is dichotomous)

Literature
Basic literature:
Trochim, W. & Donnelly, J.: The Research Methods Knowledge Base (3rd edition). Atomic Dog. Internet WWW page, URL: http://www.socialresearchmethods.net/kb/ (version current as of October 20, 2006).
Bortz, J., Döring, N. (2006). Forschungsmethoden und Evaluation. Heidelberg: Springer Verlag.
Hatzinger, R. (2006). Angewandte Statistik mit SPSS. Wien: Facultas.
Hatzinger, R., Nagel, H. (2009). PASW Statistics. Statistische Methoden und Fallbeispiele. München: Pearson Studium.
Nagel, H. (2003). Empirische Sozialforschung.