CSC 2231: Parallel Computer Architecture and ... - cs.toronto.edu

CSC 2231: Parallel Computer Architecture and Programming
GPUs

Prof. Gennady Pekhimenko
University of Toronto
Fall 2017

The content of this lecture is adapted from the slides of Tor Aamodt (UBC).

Project Progress Report
Due next week Friday (Nov. 3rd)
Ask questions after the class

Review #7
"GPUs and the Future of Parallel Computing", Steve Keckler et al., IEEE Micro 2011
Due Nov. 10

Review #5 Results
Grades (out of 10). Mean: 9.05
[Figure: histogram of grades, binned 6s through 10s]

What is a GPU?
GPU = Graphics Processing Unit
Accelerator for raster-based graphics (OpenGL, DirectX)
Highly programmable (Turing complete)
Commodity hardware
100s of ALUs; 10s of 1000s of concurrent threads
NVIDIA Volta: V100

The GPU is Ubiquitous [APU13 keynote]

Early GPU History
1981: IBM PC Monochrome Display Adapter (2D)
1996: 3D graphics (e.g., 3dfx Voodoo)
1999: register combiner (NVIDIA GeForce 256)
2001: programmable shaders (NVIDIA GeForce 3)
2002: floating-point (ATI Radeon 9700)
2005: unified shaders (ATI R520 in Xbox 360)
2006: compute (NVIDIA GeForce 8800)

GPU: The Life of a Triangle [David Kirk / Wen-mei Hwu]
Host / Front End / Vertex Fetch: process commands
Vertex Processing: transform vertices to screen-space
Primitive Assembly, Setup: generate per-triangle equations
Rasterize & Zcull: generate pixels, delete pixels that cannot be seen
Pixel Shader, Texture: determine the colors, transparencies and depth of the pixel
Pixel Engines (ROP): do final hidden surface test, blend and write out color and new depth
Frame Buffer Controller

Pixel color: result of running shader program
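The stages in "The Life of a Triangle" can be mimicked in a few lines of software. The following is a minimal sketch, not how any real GPU or graphics API implements the pipeline: all function names (to_screen, edge, shade, draw_triangle, make_buffers) are hypothetical, input vertices are assumed to be in normalized coordinates from -1 to 1 with smaller z meaning closer, and a single depth test stands in for both Zcull and the final ROP hidden-surface test.

```python
def to_screen(v, width, height):
    """Vertex processing: map (x, y) in [-1, 1] to screen-space pixels."""
    x, y, z = v
    return ((x + 1.0) * 0.5 * width, (1.0 - y) * 0.5 * height, z)


def edge(a, b, px, py):
    """Setup: per-triangle edge equation for edge a->b, evaluated at (px, py)."""
    return (b[0] - a[0]) * (py - a[1]) - (b[1] - a[1]) * (px - a[0])


def shade(weights, colors):
    """Pixel shader: blend the three vertex colors by barycentric weight."""
    return tuple(sum(w * c[i] for w, c in zip(weights, colors)) for i in range(3))


def make_buffers(width, height):
    """Frame buffer: a black color buffer and a depth buffer set to 'far'."""
    color = [[(0.0, 0.0, 0.0)] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    return color, depth


def draw_triangle(verts, colors, color_buf, depth_buf):
    """Run one triangle through the stages; return the number of pixels written."""
    height, width = len(depth_buf), len(depth_buf[0])
    v0, v1, v2 = (to_screen(v, width, height) for v in verts)
    area = edge(v0, v1, v2[0], v2[1])
    if area == 0:
        return 0  # degenerate triangle, nothing to rasterize
    written = 0
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            # Rasterize: dividing by area makes the test independent of winding.
            w = (edge(v1, v2, px, py) / area,
                 edge(v2, v0, px, py) / area,
                 edge(v0, v1, px, py) / area)
            if min(w) < 0:
                continue  # pixel center lies outside the triangle
            z = sum(wi * v[2] for wi, v in zip(w, (v0, v1, v2)))
            if z >= depth_buf[y][x]:
                continue  # hidden behind what is already in the frame buffer
            color_buf[y][x] = shade(w, colors)  # run the shader for this pixel
            depth_buf[y][x] = z                 # write out the new depth
            written += 1
    return written
```

Every pixel in the inner loop is processed independently of the others, which is the property that lets hardware run the same shader program on many pixels at once.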

Why use a GPU for computing?
GPU uses larger fraction of silicon for computation than CPU.
At peak performance, GPU uses an order of magnitude less energy per operation than CPU.
CPU (2 nJ/op) -> rewrite application -> GPU (200 pJ/op)
Order of magnitude more energy efficient.
However, the application must perform well.

GPU uses larger fraction of silicon for computation than CPU?
[Figure (NVIDIA): CPU and GPU chip layouts side by side, with blocks labeled Control, ALU, Cache and DRAM]
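The energy-per-operation figures quoted under "Why use a GPU for computing?" can be checked with a short calculation. This is only a sketch: the two constants come from the slide and apply at peak performance, while the function names and the workload size of 10^12 operations are assumptions chosen for illustration.

```python
CPU_JOULES_PER_OP = 2e-9     # 2 nJ/op, from the slide
GPU_JOULES_PER_OP = 200e-12  # 200 pJ/op, from the slide (peak performance)


def total_energy_joules(num_ops, joules_per_op):
    """Energy to execute num_ops operations at a fixed cost per operation."""
    return num_ops * joules_per_op


def efficiency_ratio():
    """How many times less energy per operation the GPU uses than the CPU."""
    return CPU_JOULES_PER_OP / GPU_JOULES_PER_OP


# Hypothetical workload size, not taken from the lecture.
HYPOTHETICAL_OPS = 10**12
cpu_energy = total_energy_joules(HYPOTHETICAL_OPS, CPU_JOULES_PER_OP)
gpu_energy = total_energy_joules(HYPOTHETICAL_OPS, GPU_JOULES_PER_OP)
```

For that assumed workload the CPU figure works out to about 2000 J and the GPU figure to about 200 J, a ratio of 10. The ratio only holds if the rewritten application actually runs near peak on the GPU, which is the caveat the slide ends on.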
