
Machine vs. Human Translation of SNOMED CT Terms
Stefan SCHULZ a,b, Johannes BERNHARDT-MELISCHNIG a, Markus KREUZTHALER a, Philipp DAUMKE b, Martin BOEKER c
a Medical University of Graz, Austria
b University Medical Center Freiburg, Germany
c Averbis GmbH, Freiburg, Germany

Background
SNOMED CT: ontology-based, international terminology with over 300,000 concepts and over 700,000 English terms (fully specified names (FSNs), preferred terms and synonyms)
IHTSDO maintains English (US / UK) and (Latin American) Spanish versions; English is considered the reference
Localised versions are important for non-English-speaking countries; their creation and maintenance are cost-intensive

Current translation projects
Danish and Swedish versions completed (only FSNs)
Ongoing translations: Canadian French
Other European countries: translation of subsets
Special situation for German-speaking countries:
– 2004: SNOMED CT completely translated by a translation company, effort: 11.5 person years
– Never released (copyright issues pending)
– No IHTSDO member among the 7 countries in which German has the status of a primary or secondary official language

What should be translated?
Fully Specified Names (FSNs): standardized, self-explaining, lengthy
Synonyms: represent clinical jargon, close to the user, short; include abbreviations and acronyms; ambiguous
Translations of FSNs only do not address important use cases (user-friendly interfaces, natural language processing, layperson interfaces, …)

Fully Specified Name (FSN)                            Synonyms
Computerized axial tomography of brain (procedure)    Brain CT
Cerebrovascular accident (disorder)                   CVA, Stroke
Sodium chloride solution (substance)                  Saline, NaCl
Automobile, device (physical object)                  Car
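
The FSN examples above all end in a parenthesised hierarchy tag such as "(procedure)". As a minimal sketch of how such a name could be separated into its term and its tag, assuming the tag is always the last parenthesised group, the hypothetical helper `split_fsn` below uses a regular expression. It is an illustration only, not part of the study's tooling.

```python
import re

# Assumption: the hierarchy tag is the final "(...)" group of the FSN.
_FSN_PATTERN = re.compile(r"^(?P<term>.+?)\s*\((?P<tag>[^()]+)\)$")


def split_fsn(fsn):
    """Split a Fully Specified Name into (term, hierarchy tag)."""
    match = _FSN_PATTERN.match(fsn.strip())
    if match is None:
        raise ValueError(f"no trailing hierarchy tag found in: {fsn!r}")
    return match.group("term"), match.group("tag")


# Examples taken from the table on this slide.
for name in (
    "Computerized axial tomography of brain (procedure)",
    "Automobile, device (physical object)",
):
    print(split_fsn(name))
```

Keeping the tag separate makes it possible to translate only the term and to leave the tag in English.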

Alternative approaches
Maintain English Fully Specified Names as the ultimate reference for meaning (together with logical and (English) free-text definitions)
Use low-cost translation methods for all terms (FSNs, synonyms)
– crowdsourcing targeting end users
– non-expert translators
– machine translation
What about quality?

Study Objective
To compare three kinds of SNOMED CT translations from English to German:
– Professional medical translators
– Free Web-based machine translation service Google Translate
– Medical students

Materials and Methods
[Flow diagram; recoverable labels:]
International SNOMED CT release 2004, including unreleased German FSN translation
International SNOMED CT release 2012
active concepts
random sample (n=1000)
test / training
English FSNs
German FSNs
translated by two medical students
translated by Google Translate
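
The sampling step named on this slide (a random sample of n=1000 active concepts, divided into a training and a test part) can be sketched as follows. The slide does not give the size of the training part, the random seed, or the identifier format, so `training_size`, `seed` and the integer IDs below are assumptions chosen purely for illustration.

```python
import random


def sample_and_split(concept_ids, sample_size=1000, training_size=100, seed=0):
    """Draw a random sample without replacement and split it in two.

    training_size and seed are hypothetical; the slide does not state them.
    """
    if training_size > sample_size:
        raise ValueError("training_size must not exceed sample_size")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = rng.sample(list(concept_ids), sample_size)
    return sample[:training_size], sample[training_size:]


# Hypothetical stand-in for the identifiers of the active concepts.
active_concepts = range(300_000)
training, test = sample_and_split(active_concepts)
print(len(training), len(test))
```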

Scoring of the translations
Blinded review by two domain experts
Rating scale: fully acceptable, marginally acceptable, unacceptable
Criteria: fidelity of translation, linguistic correctness
Semantic distance: modified Jaccard distance between sets of "semantic atoms" created by morphosemantic indexing *
* Daumke P, Schulz S, Müller ML, Dzeyk W, Prinzen L, Pacheco EJ. Subword-based semantic retrieval of clinical and bibliographic documents. Methods of Information in Medicine, 49:141–147.
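
To make the semantic distance concrete, here is a minimal sketch of the plain Jaccard distance between two sets. The slide speaks of a modified Jaccard distance and does not describe the modification, so the unmodified textbook form is shown instead; the "semantic atoms" in the example are made up by hand, whereas the study obtains them by morphosemantic indexing.

```python
def jaccard_distance(atoms_a, atoms_b):
    """Plain Jaccard distance: 1 - |A intersect B| / |A union B|."""
    a, b = set(atoms_a), set(atoms_b)
    union = a | b
    if not union:
        return 0.0  # convention: two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical atom sets for two translations of the same term.
translation_1 = {"brain", "tomograph", "comput"}
translation_2 = {"brain", "tomograph"}
print(jaccard_distance(translation_1, translation_2))
```

A distance of 0 means both translations yield the same atoms, and 1 means they share none.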

Scoring Criteria
Both content correctness and linguistic correctness are rated.
Medical dictionaries, LEO and Wikipedia should be used as external aids, as well as the (English) SNOMED browser.
Content correctness
Green: The translation conveys the meaning of the original without restriction, so that it can be used without restriction for clinical data entry, e.g. in pick lists.
Yellow: The translation conveys the meaning of the original with restrictions. It should be revised manually before use in clinical documentation.
Red: The translation is unusable.
Linguistic correctness (only the linguistic expression is rated, independently of the translation)
Green: The translation is orthographically and grammatically flawless, according to the standards used by German medical publishers.
Yellow: The translation has minor orthographic or grammatical flaws that would have to be corrected before use in clinical documents.
Red: The translation has serious orthographic or grammatical flaws.
Requirements for Green under linguistic correctness:
– the k/c/z rule ("A. cerebralis", but "Zerebralarterie"; "Ulcus ventriculi", but "Magenulkus")
– correct use of hyphens and compounds (e.g. not "Antibiotika Therapie", but "Antibiotikatherapie" or "Antibiotika-Therapie")
– correct capitalisation (optional at the beginning of a term)
The "hierarchy tags" (parenthesised expressions) were deliberately not translated.

Results: Translation
Translation performance (student translators): 90 sec / term → 6.3 person years for complete SNOMED CT
Inter-translator agreement
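
The extrapolation from seconds per term to person years is simple arithmetic, sketched below. The slide does not state the number of terms or the working hours per person year behind the 6.3 figure, so both inputs in the example are assumptions and the printed value is not meant to reproduce it.

```python
def person_years(n_terms, seconds_per_term, hours_per_person_year):
    """Convert a per-term translation time into total person years."""
    total_hours = n_terms * seconds_per_term / 3600
    return total_hours / hours_per_person_year


# Assumed inputs: 300,000 terms (one per concept) and 1,500 working hours
# per person year. Only the 90 seconds per term comes from the slide.
print(person_years(300_000, 90, 1_500))
```

Varying `n_terms` and `hours_per_person_year` shows how sensitive such an estimate is to these two unstated inputs.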

Results: Quality of translation
Inter-rater reliability (exact Fleiss' kappa):
– Content fidelity: 0.24
– Linguistic correctness: 0.40
Comparison of methods (rating categories: fully acceptable, marginally acceptable, unacceptable)
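
For readers unfamiliar with the agreement statistic, the following is a minimal sketch of the standard Fleiss' kappa for categorical ratings. The slide reports the "exact" variant, which differs in how chance agreement is estimated and is not reproduced here; the ratings in the example are invented and are not the study's data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Standard Fleiss' kappa.

    ratings: list of items, each a list with one category label per rater.
    """
    if not ratings:
        raise ValueError("need at least one rated item")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of agreeing rater pairs for this item.
        squares = sum(c * c for c in counts.values())
        agreement_sum += (squares - n_raters) / (n_raters * (n_raters - 1))

    p_observed = agreement_sum / len(ratings)
    total = len(ratings) * n_raters
    p_expected = sum((c / total) ** 2 for c in category_totals.values())
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented ratings by two raters on the three-level scale.
example = [
    ["fully", "fully"],
    ["fully", "marginally"],
    ["unacceptable", "unacceptable"],
    ["marginally", "fully"],
]
print(fleiss_kappa(example))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.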

Summary of Outcome
No difference between professional and untrained translators
Automated term translation weaker, especially regarding linguistic correctness (word endings, word order)
In terms of content fidelity, automated term translation better than expected
Inter-rater agreement low, particularly regarding content fidelity (despite preceding training phase)
Semantic proximity lowest for professional translators (tendency towards more idiomatic translations?)

Limitations
Small sample size, especially for stratifying the results along SNOMED CT semantic tags (disorders, procedures, substances, organisms etc.)
Small number of raters does not represent the variety of medical professions
Criteria for judging content correctness still too weak (despite commonly agreed rating guidelines prior to the experiment)

Final remarks
Both lay translators and machine translation should be considered when translating SNOMED CT content
Human review of machine-translated content is necessary
Depending on the expected level of consistency and quality (e.g. conformance with naming conventions), expert review is also necessary for lay translations
Interesting approach for harvesting synonyms or entry terms
Results suggest the feasibility of a combined crowdsourcing / machine translation approach

Acknowledgements
International Health Terminology Standards Development Organisation (IHTSDO) for the provision of the unreleased German SNOMED CT version