Embedded System Hardware - Reconfigurable Hardware -

Slides:



Advertisements
Ähnliche Präsentationen
Digital Output Board and Motherboard
Advertisements

Cadastre for the 21st Century – The German Way
Service Oriented Architectures for Remote Instrumentation
Finding the Pattern You Need: The Design Pattern Intent Ontology
E-Solutions mySchoeller.com for Felix Schoeller Imaging
Service Discovery in Home Environments
Embedded System Hardware
P. Marwedel Informatik 12, U. Dortmund
Peter Marwedel Informatik 12 TU Dortmund Germany
PPTmaster_BRC_ pot Rexroth Inline compact I/O technology in your control cabinet SERCOS III Components Abteilung; Vor- und Nachname.
Herzlich Willkommen zum Informations-Forum: SAP Interoperabilität
Steinbeis Forschungsinstitut für solare und zukunftsfähige thermische Energiesysteme Nobelstr. 15 D Stuttgart WP 4 Developing SEC.
fakultät für informatik informatik 12 technische universität dortmund Test Peter Marwedel TU Dortmund Informatik 12 Germany 2009/01/17 Graphics: © Alexandra.
Embedded & Real-time Operating Systems
Fakultät für informatik informatik 12 technische universität dortmund Optimizations Peter Marwedel TU Dortmund Informatik 12 Germany 2009/01/17 Graphics:
fakultät für informatik informatik 12 technische universität dortmund Optimizations Peter Marwedel TU Dortmund Informatik 12 Germany 2009/01/10 Graphics:
Fakultät für informatik informatik 12 technische universität dortmund Mapping of Applications to Platforms Peter Marwedel TU Dortmund, Informatik 12 Germany.
Fakultät für informatik informatik 12 technische universität dortmund Optimizations Peter Marwedel TU Dortmund Informatik 12 Germany 2010/01/13 Graphics:
Embedded System Hardware - Processing -
Fakultät für informatik informatik 12 technische universität dortmund Universität Dortmund Middleware Peter Marwedel TU Dortmund, Informatik 12 Germany.
Fakultät für informatik informatik 12 technische universität dortmund Specifications Peter Marwedel TU Dortmund, Informatik 12 Graphics: © Alexandra Nolte,
Peter Marwedel TU Dortmund, Informatik 12
Fakultät für Informatik Informatik 12 technische universität dortmund FPGA-Programming P. Marwedel Informatik 12, U. Dortmund.
Fakultät für informatik informatik 12 technische universität dortmund Hardware/Software Partitioning Peter Marwedel Informatik 12 TU Dortmund Germany Chapter.
Embedded System Hardware - Processing -
Fakultät für informatik informatik 12 technische universität dortmund Optimizations - Compilation for Embedded Processors - Peter Marwedel TU Dortmund.
Technische universität dortmund fakultät für informatik informatik 12 Embedded System Hardware - Processing - Peter Marwedel Informatik 12 TU Dortmund.
Hier wird Wissen Wirklichkeit Computer Architecture – Part 4 – page 1 of 35 – Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt Part 4 Fundamentals.
Hier wird Wissen Wirklichkeit Computer Architecture – Part 10 – page 1 of 31 – Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt Part 10 Thread and.
Hier wird Wissen Wirklichkeit Computer Architecture – Part 5 – page 1 of 25 – Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin Betting Part 5 Fundamentals in.
Lehrstuhl Technische Informatik - Computer Engineering Brandenburgische Technische Universität Cottbus Architectures and Diagnosis Methods for Self Repairing.
Thomas Herrmann Software - Ergonomie bei interaktiven Medien Step 6: Ein/ Ausgabe Instrumente (Device-based controls) Trackball. Joystick.
Hochschulteam der Agentur für Arbeit Trier Preventing the Brainware Crisis Workshop Schloss Dagstuhl Student Enrollment in Computer Science.
CCNA Exploration Network Fundamentals
Deutsche Gesellschaft für Technische Zusammenarbeit GmbH Integrated Experts as interface between technical cooperation and the private sector – An Example.
Seminar Telematiksysteme für Fernwartung und Ferndiagnose Basic Concepts in Control Theory MSc. Lei Ma 22 April, 2004.
Institut für Umweltphysik/Fernerkundung Physik/Elektrotechnik Fachbereich 1 SADDU June 2008 S. Noël, K.Bramstedt,
Institut für Umweltphysik/Fernerkundung Physik/Elektrotechnik Fachbereich 1 Pointing Meeting Nov 2006 S. Noël IFE/IUP Elevation and Azimuth Jumps during.
Integration of renewable energies: competition between storage, the power grid and flexible demand Thomas Hamacher.
Laurie Clarcq The purpose of language, used in communication, is to create a picture in the mind and/or the heart of another.
Case Study Session in 9th GCSM: NEGA-Resources-Approach
Machen Sie sich schlau am Beispiel Schizophrenie.
Institut AIFB, Universität Karlsruhe (TH) Forschungsuniversität gegründet 1825 Towards Automatic Composition of Processes based on Semantic.
Lehrstuhl Technische Informatik - Computer Engineering Brandenburgische Technische Universität Cottbus 1 Hierarchical Test Technology for Systems on a.
Sanjay Patil Standards Architect – SAP AG April 2008
| DC-IAP/SVC3 | © Bosch Rexroth Pneumatics GmbH This document, as well as the data, specifications and other information set forth in.
A good view into the future Presented by Walter Henke BRIT/SLL Schweinfurt, 14. November 2006.
BAS5SE | Fachhochschule Hagenberg | Daniel Khan | S SPR5 MVC Plugin Development SPR6P.
INTAKT- Interkulturelle Berufsfelderkundungen als ausbildungsbezogene Lerneinheiten in berufsqualifizierenden Auslandspraktika DE/10/LLP-LdV/TOI/
Fusszeilentext – bitte in (Ansicht – Master – Folienmaster, 1. Folie oben) individuell ändern! Danach wieder zurück in Normalansicht gehen! 1 OTR Shearography.
Berner Fachhochschule Hochschule für Agrar-, Forst- und Lebensmittelwissenschaften HAFL Recent activities on ammonia emissions: Emission inventory Rindvieh.
4th Symposium on Lidar Atmospheric Applications
Ein Projekt des Technischen Jugendfreizeit- und Bildungsvereins (tjfbv) e.V. kommunizieren.de Blended Learning for people with disabilities.
Image Processing and Analysis Introduction. How do we see things ?
Fakultät für informatik informatik 12 technische universität dortmund Memory-architecture aware compilation - Sessions Peter Marwedel TU Dortmund.
3rd Review, Vienna, 16th of April 1999 SIT-MOON ESPRIT Project Nr Siemens AG Österreich Robotiker Technische Universität Wien Politecnico di Milano.
Adjectiv Endungen Lite: Adjective following articles and pre-ceeding nouns. Colors and Clothes.
Berner Fachhochschule Hochschule für Agrar-, Forst- und Lebensmittelwissenschaften HAFL 95% der Ammoniakemissionen aus der Landwirtschaft Rindvieh Pflanzenbau.
Wind Energy in Germany 2004 Ralf Christmann, BMU Joachim Kutscher, PTJ
Globale Plattform-Entwicklung für steigende Nachhaltigkeit
1 Intern | ST-IN/PRM-EU | | © Robert Bosch GmbH Alle Rechte vorbehalten, auch bzgl. jeder Verfügung, Verwertung, Reproduktion, Bearbeitung,
Fakultät für informatik informatik 12 technische universität dortmund Memory architecture description languages - Session 20 - Peter Marwedel TU Dortmund.
1 Stevens Direct Scaling Methods and the Uniqueness Problem: Empirical Evaluation of an Axiom fundamental to Interval Scale Level.
Selectivity in the German Mobility Panel Tobias Kuhnimhof Institute for Transport Studies, University of Karlsruhe Paris, May 20th, 2005.
EN/FAD Ericsson GmbH EDD/ Information im 21. Jahrundert muss Erwünscht Relevant Erreichbar Schnell Kostenlos!?
Technische Universität München 1 CADUI' June FUNDP Namur G B I The FUSE-System: an Integrated User Interface Design Environment Frank Lonczewski.
TUM in CrossGrid Role and Contribution Fakultät für Informatik der Technischen Universität München Informatik X: Rechnertechnik und Rechnerorganisation.
Technische universität dortmund fakultät für informatik informatik 12 Communication Peter Marwedel Informatik 12 TU Dortmund Germany 2012 年 11 月 21 日 These.
Fakultät für informatik informatik 12 technische universität dortmund Memory-architecture aware compilation - Session 13 - Peter Marwedel TU Dortmund Informatik.
Institut für Angewandte Mikroelektronik und Datentechnik Phase 5 Architectural impact on ASIC and FPGA Nils Büscher Selected Topics in VLSI Design (Module.
 Präsentation transkript:

Embedded System Hardware - Reconfigurable Hardware - Peter Marwedel Informatik 12 TU Dortmund Germany 2008/11/15

Energy Efficiency of FPGAs © Hugo De Man, IMEC, 2007 Courtesy: Philips GOPs/J

Reconfigurable Logic Full custom chips may be too expensive, software too slow. Combine the speed of HW with the flexibility of SW HW with programmable functions and interconnect. Use of configurable hardware; common form: field programmable gate arrays (FPGAs) Applications: bit-oriented algorithms like encryption, fast “object recognition“ (medical and military) Adapting mobile phones to different standards. Very popular devices from XILINX (XILINX Vertex II are recent devices) Actel, Altera and others

Floor-plan of VIRTEX II FPGAs

Virtex II Configurable Logic Block (CLB)

Virtex II Slice (simplified) Example: Look-up tables LUT F and G can be used to compute any Boolean function of  4 variables. a b c d G 1

Virtex II (Pro) Slice [© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

Number of resources available in Virtex II Pro devices [© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

Interconnect Hierarchical Routing Resources

Virtex II Pro Devices include up to 4 PowerPC processor cores [© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

Peter Marwedel Informatik 12 TU Dortmund Germany Memory Peter Marwedel Informatik 12 TU Dortmund Germany 2008/11/15

Memory Oops! Memories! For the memory, efficiency is again a concern: speed (latency and throughput); predictable timing energy efficiency size cost other attributes (volatile vs. persistent, etc)

Access times and energy consumption increases with the size of the memory "Currently, the size of some applications is doubling every 10 months" [STMicroelectronics, Medea+ Workshop, Stuttgart, Nov. 2003] Example (CACTI Model):

Access times and energy consumption for multi-ported register files Cycle Time (ns) Area (l2x106) Power (W) We have used the Rixner model to evaluate cycle time, area and power for these register files. These graphs show the figures for the 0.18 mm technology as the register file size is increased. Blue lines are for 2 memory ports, the others are for 3 memory poerts. On a VLIW processor, the cycle time can be bounded by the register file, since in these processors the register file is the main central structure. From now on, we will assume that the access time to register file determines the processor cycle time. Source and © H. Valero, 2001 Rixner’s et al. model [HPCA’00], Technology of 0.18 mm

How much of the energy consumption of a system is memory-related? Mobile PC Average System Power 600/500 MHz uP 13% LCD 10" 30% HDD 19% Memory+Graphics 15% Power Supply 10% Other Mobile PC Thermal Design (TDP) System Power Note: Based on Actual Measurements 600/500 MHz uP 37% LCD 10" 19% HDD 9% Memory+Graphics 12% Power Supply 10% Other 13% CPU Dominates Thermal Design Power Multiple Platform Components Comprise Average Power [Courtesy: N. Dutt; Source: V. Tiwari]

Energy consumption in mobile devices [O. Vargas (Infineon Technologies): Minimum power consumption in mobile-phone memory subsystems; Pennwell Portable Design - September 2005;] Thanks to Thorsten Koch (Nokia/ Univ. Dortmund) for providing this source. 16

Access-times will be a problem Speed gap between processing and main DRAM increases Performance early 60ties (Atlas): page fault ~ 2500 instructions 2002 (2 GHz µP): access to DRAM ~ 500 instructions  penalty for cache miss about same as for page fault in Atlas Similar problems for PCs and MPSoCs 8 4 CPU (1.5-2 p.a.)  2x every 2 years 2 DRAM (1.07 p.a.) 1 [P. Machanik: Approaches to Addressing the Memory Wall, TR Nov. 2002, U. Brisbane] years 1 2 3 4 5

Hierarchical memories using scratch pad memories (SPM) Example SPM is a small, physically separate memory mapped into the address space Address space scratch pad memory FFF.. ARM7TDMI cores, well-known for low power consumption Hierarchy no tag memory main SPM processor SPM select Selection is by an appropriate address decoder (simple!)

Comparison of currents using measurements E.g.: ATMEL board with ARM7TDMI and ext. SRAM /3

Why not just use a cache ? (1) Energy for parallel access of sets, in comparators, muxes. [R. Banakar, S. Steinke, B.-S. Lee, 2001]

Influence of the associativity Parameters different from previous slides [P. Marwedel et al., ASPDAC, 2004]

Peter Marwedel Informatik 12 TU Dortmund Germany Communication Peter Marwedel Informatik 12 TU Dortmund Germany

Communication

Communication: Hierarchy Inverse relation between volume and urgency quite common: Sensor/actuator busses

Communication - Requirements - Real-time behavior Efficient, economical (e.g. centralized power supply) Appropriate bandwidth and communication delay Robustness Fault tolerance Maintainability Diagnosability Security Safety

Basic techniques: Electrical robustness Single-ended vs. differential signals ground Voltage at input of Op-Amp positive  '1'; otherwise  '0' Local ground Local ground Combined with twisted pairs; Most noise added to both wires.

Evaluation Advantages: Subtraction removes most of the noise Changes of voltage levels have no effect Reduced importance of ground wiring Higher speed Disadvantages: Requires negative voltages Increased number of wires and connectors Applications: USB, FireWire, ISDN Ethernet (STP/UTP CAT 5 cables) differential SCSI High-quality analog audio signals

Real-time behavior Carrier-sense multiple-access/collision-detection (CSMA/CD, Standard Ethernet) no guaranteed response time. Alternatives: token rings, token busses Carrier-sense multiple-access/collision-avoidance (CSMA/CA) WLAN techniques with request preceding transmission Each partner gets an ID (priority). After each bus transfer, all partners try setting their ID on the bus; partners detecting higher ID disconnect themselves from the bus. Highest priority partner gets guaranteed response time; others only if they are given a chance.

Other basic techniques Fault tolerance: error detecting and error correcting bus protocols Privacy: encryption, virtually private networks

Sensor/actuator busses Sensor/actuator busses: Real-time behavior very important; different techniques: Many wires less wires expensive & flexible

Field busses: Profibus More powerful/expensive than sensor interfaces; mostly serial. Emphasis on transmission of small number of bytes. Examples: Process Field Bus (Profibus) Designed for factory and process automation. Focus on safety; comprehensive protocol mechanisms. Claiming 20% market share for field busses. Token passing. ≦93.75 kbit/s (1200 m);1500 kbits/s (200m); 12 Mbit/s (100m) Integration with Ethernet via Profinet. [http://www.profibus.com/]

Controller area network (CAN) Designed by Bosch and Intel in 1981; used in cars and other equipment; differential signaling with twisted pairs, arbitration using CSMA/CA, throughput between 10kbit/s and 1 Mbit/s, low and high-priority signals, maximum latency of 134 µs for high priority signals, coding of signals similar to that of serial (RS-232) lines of PCs, with modifications for differential signaling. See //www.can.bosch.com

Time-Triggered-Protocol (TTP) The Time-Triggered-Protocol (TTP) [Kopetz et al.] for fault-tolerant safety systems like airbags in cars.

FlexRay FlexRay: developed by the FlexRay consortium (BMW, Ford, Bosch, DaimlerChrysler, …) Combination of a variant of the TTP and the Byteflight [Byteflight Consortium, 2003] protocol. Specified in SDL. Improved error tolerance and time-determinism Meets requirements with transfer rates >> CAN std. High data rate can be achieved: initially targeted for ~ 10Mbit/sec; design allows much higher data rates TDMA (Time Division Multiple Access) protocol: Fixed time slot with exclusive access to the bus Cycle subdivided into a static and a dynamic segment.

TDMA in FlexRay Exclusive bus access enabled for short time in each case. Dynamic segment for transmission of variable length information. Fixed priorities in dynamic segment: Minislots for each potential sender. Bandwidth used only when it is actually needed. http://www.tzm.de/FlexRay/FlexRay_Introduction.html

Time intervals in Flexray © Prof. Form, TU Braunschweig, 2007 Microtick (µt) = Clock period in partners, may differ between partners Macrotick (mt) = Basic unit of time, synchronized between partners (=riµt, ri varies between partners i) Slot=Interval allocated per sender in static segment (=pmt, p: fixed (configurable)) Minislot = Interval allocated per sender in dynamic segment (=qmt, q: variable) Short minislot if no transmission needed; starts after previous minislot. Cycle = Static segment + dynamic segment + network idle time show flexray animation from dortmund

Structure of Flexray networks Bus guardian protects the system against failing processors, e.g. so-called “babbling idiots” http://www.ixxat.de/index.php?seite=introduction_flexray_en&root=5873&system_id=5875&com=formular_suche_treffer&markierung=flexray

http://www.computer.org/micro/mi2002/pdf/m4010.pdf

Other field busses LIN MAP:MAP is a bus designed for car factories. EIB:The European Installation Bus (EIB) is a bus designed for smart homes. European Installation Bus (EIB) Designed for smart buildings; CSMA/CA; low data rate. IEEE 488: Designed for laboratory equipment. Attempts to use standard Ethernet. However, timing predictability remains a serious issue.

Wireless communication

Wireless communication: Examples IEEE 802.11 a/b/g/n UMTS DECT Bluetooth ZigBee Timing predictability of wireless communication?

Summary FPGAs Memories “Small is beautiful” (in terms of energy consumption, access times, size) Communication structures