CSC 2231: Parallel Computer Architecture and Programming
GPUs

Prof. Gennady Pekhimenko
University of Toronto
Fall 2017

The content of this lecture is adapted from the slides of Tor Aamodt (UBC).

Project Progress Report
Due next week Friday (Nov. 3rd)
Ask questions after the class

Review #7
"GPUs and the Future of Parallel Computing", Steve Keckler et al., IEEE Micro 2011
Due Nov. 10

Review #5 Results
Grades (out of 10). Mean: 9.05
[Histogram of grades, binned as 6s, 7s, 8s, 9s, 10s]

What is a GPU?
GPU = Graphics Processing Unit
Accelerator for raster-based graphics (OpenGL, DirectX)
Highly programmable (Turing complete)
Commodity hardware
100s of ALUs; 10s of 1000s of concurrent threads
NVIDIA Volta: V100

The GPU is Ubiquitous
[APU13 keynote]

Early GPU History
1981: IBM PC Monochrome Display Adapter (2D)
1996: 3D graphics (e.g., 3dfx Voodoo)
1999: register combiner (NVIDIA GeForce 256)
2001: programmable shaders (NVIDIA GeForce 3)
2002: floating-point (ATI Radeon 9700)
2005: unified shaders (ATI R520 in Xbox 360)
2006: compute (NVIDIA GeForce 8800)

GPU: The Life of a Triangle
Host / Front End / Vertex Fetch: process commands
Vertex Processing: transform vertices to screen space
Primitive Assembly, Setup: generate per-triangle equations
Rasterize & Zcull: generate pixels, delete pixels that cannot be seen
Pixel Shader, Texture: determine the colors, transparencies, and depth of the pixel
Pixel Engines (ROP): do final hidden-surface test, blend, and write out color and new depth
Frame Buffer Controller
[David Kirk / Wen-mei Hwu]

Pixel color: result of running the shader program
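
The triangle pipeline above can be sketched in software to make the stage boundaries concrete. The following is a toy model only, not how the hardware is organized: the function names (`vertex_processing`, `edge`, `rasterize`, `draw`), the normalized [-1, 1] input coordinates, and the "smaller depth is nearer" convention are all assumptions made for illustration. The early Zcull and the final ROP depth test are merged into a single check, and blending is omitted.

```python
# Toy software model of the "Life of a Triangle" stages (illustrative assumptions only).

def vertex_processing(vertices, width, height):
    """Vertex Processing: map (x, y, z) with x, y in [-1, 1] to screen space."""
    return [((x + 1) * 0.5 * width, (y + 1) * 0.5 * height, z)
            for x, y, z in vertices]

def edge(a, b, px, py):
    """Setup: per-triangle edge equation, the signed area of (a, b, p) times two."""
    return (b[0] - a[0]) * (py - a[1]) - (b[1] - a[1]) * (px - a[0])

def rasterize(tri, width, height):
    """Rasterize: yield (x, y, depth) for each pixel centre inside the triangle."""
    v0, v1, v2 = tri
    area = edge(v0, v1, v2[0], v2[1])
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5
            # Barycentric weights; dividing by area makes either winding work.
            w0 = edge(v1, v2, px, py) / area
            w1 = edge(v2, v0, px, py) / area
            w2 = edge(v0, v1, px, py) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                yield x, y, w0 * v0[2] + w1 * v1[2] + w2 * v2[2]

def draw(triangles, width, height, shader):
    """Run every triangle through the stages and return (color, depth) buffers."""
    color = [[None] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for vertices in triangles:                      # front end: process commands
        tri = vertex_processing(vertices, width, height)
        for x, y, z in rasterize(tri, width, height):
            if z >= depth[y][x]:                    # depth test: pixel cannot be seen
                continue
            color[y][x] = shader(x, y, z)           # pixel shader decides the color
            depth[y][x] = z                         # write out color and new depth
    return color, depth

# Usage: a far triangle covering a 4x4 screen and a small near triangle in one corner.
far_tri = [(-1, -1, 0.5), (3, -1, 0.5), (-1, 3, 0.5)]
near_tri = [(-1, -1, 0.2), (0, -1, 0.2), (-1, 0, 0.2)]
label = lambda x, y, z: "near" if z < 0.3 else "far"
color, depth = draw([far_tri, near_tri], 4, 4, label)
```

Because of the depth test, submitting the two triangles in the opposite order produces the same color buffer, which is the point of the hidden-surface stages.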

Why use a GPU for computing?
A GPU uses a larger fraction of its silicon for computation than a CPU.
At peak performance, a GPU uses an order of magnitude less energy per operation than a CPU.
CPU: 2 nJ/op -> (rewrite application) -> GPU: 200 pJ/op, an order of magnitude more energy efficient
However, the application must perform well.

GPU uses a larger fraction of silicon for computation than a CPU?
[Figure: block diagrams of a CPU and a GPU, with labels Control, ALU, Cache, DRAM] [NVIDIA]
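
The energy figures quoted for the CPU and GPU can be checked with a few lines of arithmetic. The sketch below uses the two per-operation numbers from the slide. The `efficiency` parameter is an assumption added here, not something from the lecture: it models an application that reaches only a fraction of peak throughput at roughly constant power, so the energy per useful operation grows by the inverse of that fraction.

```python
import math

CPU_ENERGY_PER_OP_J = 2e-9      # 2 nJ/op, from the slide
GPU_ENERGY_PER_OP_J = 200e-12   # 200 pJ/op, from the slide

def energy_joules(num_ops, energy_per_op_j, efficiency=1.0):
    """Energy for num_ops useful operations.

    efficiency is the fraction of peak performance achieved (assumed model:
    power stays roughly constant, so energy per useful op scales as 1/efficiency).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return num_ops * energy_per_op_j / efficiency

# At peak, the GPU needs about one tenth of the CPU's energy per operation.
peak_ratio = CPU_ENERGY_PER_OP_J / GPU_ENERGY_PER_OP_J

# Under the assumed model, a GPU port running at 10% of peak only breaks even.
ops = 1e9
cpu_energy = energy_joules(ops, CPU_ENERGY_PER_OP_J)
gpu_energy_poor_port = energy_joules(ops, GPU_ENERGY_PER_OP_J, efficiency=0.1)
breaks_even = math.isclose(cpu_energy, gpu_energy_poor_port)
```

Under this simplified model, the order-of-magnitude advantage disappears once the rewritten application falls to about one tenth of the GPU's peak, which is one way to read the slide's caveat that the application must perform well.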