CUDA optimization
FABRIC DRAPING
CS 750 HIGH PERFORMANCE COMPUTING – COURSE PROJECT
Contents
Draping
Rendering process
CAD model of the target
Target feature points
Grid of points for fabric
Physical simulation
Fabric geometry
Fully Rendered Fabric on Target
Our scope
Scope of project
Grid of points for fabric
Fabric geometry
B-Spline surface
Blossoming polynomials
Bi-variate Blossom: Quadratic B-spline
Tri-variate Blossom: Cubic B-spline
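To make the tri-variate blossom concrete, here is a minimal sketch of one cubic B-spline segment evaluated through its blossom with the de Boor recurrence. The function names `blossom3` and `bsplinePoint3`, the scalar control values, and the argument layout are assumptions made for illustration, not the project's code. The curve point is the blossom on the diagonal, b(u,u,u), and each control value is the blossom at three consecutive knots.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: blossom b(u1,u2,u3) of one cubic B-spline segment.
// knots : knot vector; the segment is [knots[k], knots[k+1]], with k >= 3.
// ctrl  : the 4 control values d[k-3..k] that influence this segment.
// Each de Boor level uses its own argument; the result is symmetric in
// (u1,u2,u3) and affine in each argument.
__host__ __device__ inline float blossom3(const float* knots, const float* ctrl,
                                          int k, float u1, float u2, float u3)
{
    const int p = 3;                                  // cubic
    float d[4] = {ctrl[0], ctrl[1], ctrl[2], ctrl[3]};
    const float args[3] = {u1, u2, u3};

    for (int r = 1; r <= p; ++r) {
        const float u = args[r - 1];
        // Walk j downwards so d[j-1] still holds the previous level's value.
        for (int j = p; j >= r; --j) {
            const float lo = knots[j + k - p];
            const float hi = knots[j + 1 + k - r];
            const float alpha = (u - lo) / (hi - lo);
            d[j] = (1.0f - alpha) * d[j - 1] + alpha * d[j];
        }
    }
    return d[p];
}

// Curve point = blossom evaluated on the diagonal.
__host__ __device__ inline float bsplinePoint3(const float* knots, const float* ctrl,
                                               int k, float u)
{
    return blossom3(knots, ctrl, k, u, u, u);
}
```

With uniform knots 0..7 and k = 3, calling `blossom3` with the arguments (1, 2, 3) recovers the first control value, and `bsplinePoint3` at u = 3 gives the familiar (d0 + 4 d1 + d2) / 6 weighting.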
B-spline Construction
Blossom Construction
Surface blossoms
Grid of Control Points + u,v grid
Inputs
Grid of points for fabric
Fabric geometry
u, v coordinates
CUDA Intro
CUDA virtualizes the physical hardware into threads and blocks
Threads
Blocks
B-spline basis Construction
Thread(i,j): Across domain
Block(ib,jb): 16 threads per block = 1 grid point (u,v)
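The mapping above (one block per (u,v) sample, 16 threads per block) can be sketched as follows. This is an illustrative reading, not the project's kernel: it assumes a uniform bicubic B-spline, so the 16 threads correspond to the 4 x 4 control points that influence a sample, and it uses scalar control values. The names `cubicBasis`, `evalSurface`, and `evalSurfaceOnGpu`, and the row-major control grid of size `cw` x `ch`, are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Uniform cubic B-spline basis weight i (0..3) at local parameter t in [0,1].
__host__ __device__ inline float cubicBasis(int i, float t)
{
    const float t2 = t * t, t3 = t2 * t;
    switch (i) {
        case 0:  return (1.0f - 3.0f * t + 3.0f * t2 - t3) / 6.0f;
        case 1:  return (4.0f - 6.0f * t2 + 3.0f * t3) / 6.0f;
        case 2:  return (1.0f + 3.0f * t + 3.0f * t2 - 3.0f * t3) / 6.0f;
        default: return t3 / 6.0f;
    }
}

// One block per sample (ib,jb); thread (i,j) handles one of the 4x4 control points.
__global__ void evalSurface(const float* ctrl, int cw, int ch,
                            float* out, int nu, int nv)
{
    __shared__ float term[16];
    const int ib = blockIdx.x, jb = blockIdx.y;
    const int i = threadIdx.x, j = threadIdx.y;

    // Map the block index to a parameter inside the valid span range.
    const float u = (ib / (float)nu) * (cw - 3);
    const float v = (jb / (float)nv) * (ch - 3);
    const int su = min((int)u, cw - 4);
    const int sv = min((int)v, ch - 4);

    term[j * 4 + i] = cubicBasis(i, u - su) * cubicBasis(j, v - sv)
                    * ctrl[(sv + j) * cw + (su + i)];
    __syncthreads();                       // all 16 terms must be written first

    if (i == 0 && j == 0) {                // simple serial sum of the 16 terms
        float sum = 0.0f;
        for (int k = 0; k < 16; ++k) sum += term[k];
        out[jb * nu + ib] = sum;
    }
}

// Host wrapper (error checking omitted for brevity); requires cw >= 4 and ch >= 4.
std::vector<float> evalSurfaceOnGpu(const std::vector<float>& ctrl,
                                    int cw, int ch, int nu, int nv)
{
    float *dCtrl = nullptr, *dOut = nullptr;
    const size_t outBytes = (size_t)nu * nv * sizeof(float);
    cudaMalloc(&dCtrl, ctrl.size() * sizeof(float));
    cudaMalloc(&dOut, outBytes);
    cudaMemcpy(dCtrl, ctrl.data(), ctrl.size() * sizeof(float), cudaMemcpyHostToDevice);

    evalSurface<<<dim3(nu, nv), dim3(4, 4)>>>(dCtrl, cw, ch, dOut, nu, nv);

    std::vector<float> out((size_t)nu * nv);
    cudaMemcpy(out.data(), dOut, outBytes, cudaMemcpyDeviceToHost);
    cudaFree(dCtrl);
    cudaFree(dOut);
    return out;
}
```

Because the basis weights sum to one, a control grid filled with a constant value should evaluate to that same constant at every sample, which is a convenient sanity check.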
Parallel Reduction: Sequential Addressing
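A minimal sketch of a shared-memory sum reduction with sequential addressing is given below. In each step, the active threads are the contiguous range `tid < stride`, which keeps shared-memory accesses free of bank conflicts and keeps whole warps either active or idle. The names `reduceSum` and `sumOnGpu`, the block size of 256, and the final host-side sum of per-block partials are assumptions for illustration.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each block reduces blockDim.x elements; blockDim.x must be a power of two.
__global__ void reduceSum(const float* in, float* partial, int n)
{
    extern __shared__ float s[];
    const unsigned tid = threadIdx.x;
    const unsigned i = blockIdx.x * blockDim.x + threadIdx.x;

    s[tid] = (i < (unsigned)n) ? in[i] : 0.0f;   // pad the tail with zeros
    __syncthreads();

    // Sequential addressing: halve the stride; thread tid adds element tid + stride.
    for (unsigned stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) s[tid] += s[tid + stride];
        __syncthreads();
    }
    if (tid == 0) partial[blockIdx.x] = s[0];
}

// Host wrapper (error checking omitted): per-block partial sums are added on the CPU.
float sumOnGpu(const std::vector<float>& h)
{
    const int n = (int)h.size();
    if (n == 0) return 0.0f;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *dIn = nullptr, *dPartial = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dPartial, blocks * sizeof(float));
    cudaMemcpy(dIn, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    reduceSum<<<blocks, threads, threads * sizeof(float)>>>(dIn, dPartial, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dPartial, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dPartial);

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```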
Warps (Scheduling unit)
Each warp runs its threads in lockstep
Data transfer rates
Memory hierarchy
Code walk through
Thread and Block IDs obtained
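As a small illustration of how thread and block IDs are combined, the sketch below computes a global 2D index from `blockIdx`, `blockDim`, and `threadIdx` and writes the flattened index into an output array. The kernel and wrapper names (`writeGlobalIndex`, `globalIndexGrid`) and the 16 x 16 block shape are assumptions chosen for the example.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread derives its global (x, y) position from the built-in ID variables.
__global__ void writeGlobalIndex(int* out, int width, int height)
{
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)          // guard threads past the domain edge
        out[y * width + x] = y * width + x;
}

// Host wrapper (error checking omitted): rounds the grid up to cover the domain.
std::vector<int> globalIndexGrid(int width, int height)
{
    const size_t bytes = (size_t)width * height * sizeof(int);
    int* d = nullptr;
    cudaMalloc(&d, bytes);

    const dim3 block(16, 16);
    const dim3 grid((width + block.x - 1) / block.x,
                    (height + block.y - 1) / block.y);
    writeGlobalIndex<<<grid, block>>>(d, width, height);

    std::vector<int> h((size_t)width * height);
    cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;
}
```

The bounds check matters whenever the domain size is not a multiple of the block size, since the last row and column of blocks then contain threads that fall outside the domain.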
Barriers and Thread Synchronization
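The role of a block-level barrier can be shown with a short sketch: every thread stores one element into shared memory, then reads an element written by a different thread. Without `__syncthreads()` between the two steps, a thread could read a slot before its owner has written it. The names `reverseInBlock`, `reverseOnGpu`, and `kBlock` are hypothetical, and the example uses a single block of 64 threads.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlock = 64;   // assumed block size for this example

// Reverses kBlock elements in place using one thread block.
__global__ void reverseInBlock(int* data)
{
    __shared__ int tile[kBlock];
    const int t = threadIdx.x;

    tile[t] = data[t];
    __syncthreads();                 // barrier: the whole tile is now populated
    data[t] = tile[kBlock - 1 - t];  // safe to read another thread's slot
}

// Host wrapper (error checking omitted); expects exactly kBlock elements.
std::vector<int> reverseOnGpu(std::vector<int> h)
{
    int* d = nullptr;
    cudaMalloc(&d, kBlock * sizeof(int));
    cudaMemcpy(d, h.data(), kBlock * sizeof(int), cudaMemcpyHostToDevice);
    reverseInBlock<<<1, kBlock>>>(d);
    cudaMemcpy(h.data(), d, kBlock * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;
}
```

Note that `__syncthreads()` synchronizes only the threads of one block; it is not a barrier across blocks.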
CUDA memory management
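The basic device memory life cycle (allocate, copy in, compute, copy out, free) can be sketched as below, with the return code of each runtime call checked. The kernel `scale` and the wrapper `scaleOnGpu` are hypothetical names used only for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <vector>

__global__ void scale(float* x, int n, float s)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

// Scales h in place on the GPU; returns false if any CUDA call fails.
bool scaleOnGpu(std::vector<float>& h, float s)
{
    if (h.empty()) return true;
    const int n = (int)h.size();
    const size_t bytes = h.size() * sizeof(float);

    float* d = nullptr;
    cudaError_t err = cudaMalloc(&d, bytes);              // device allocation
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return false;
    }

    err = cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        scale<<<(n + 255) / 256, 256>>>(d, n, s);
        err = cudaGetLastError();                         // launch errors
    }
    if (err == cudaSuccess)                               // blocking copy back
        err = cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d);                                          // always release
    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    return err == cudaSuccess;
}
```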
Nsight: Visual Studio
NVVP
Results
Sample duration: (without yarn info)
CUDA streams for Asynchronous data transfer
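A minimal sketch of asynchronous transfers with CUDA streams follows. The input is split into chunks, and each chunk's host-to-device copy, kernel, and device-to-host copy are queued on its own stream, so that copies for one chunk can overlap with work on another when the hardware allows it. Asynchronous copies need page-locked host memory, hence `cudaMallocHost`. The names `addOne` and `addOneWithStreams` and the choice of two streams are assumptions for illustration; the actual overlap achieved is something to confirm in the profiler.

```cuda
#include <algorithm>
#include <cuda_runtime.h>
#include <vector>

__global__ void addOne(float* x, int n)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

// Adds 1 to every element, processing the data in per-stream chunks.
// Error checking is omitted for brevity.
std::vector<float> addOneWithStreams(const std::vector<float>& input)
{
    const int n = (int)input.size();
    if (n == 0) return {};
    const int kStreams = 2;
    const int chunk = (n + kStreams - 1) / kStreams;

    float* hPinned = nullptr;                     // page-locked host buffer
    float* d = nullptr;
    cudaMallocHost(&hPinned, n * sizeof(float));
    cudaMalloc(&d, n * sizeof(float));
    std::copy(input.begin(), input.end(), hPinned);

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) cudaStreamCreate(&streams[s]);

    for (int s = 0; s < kStreams; ++s) {
        const int offset = s * chunk;
        const int count = std::min(chunk, n - offset);
        if (count <= 0) continue;
        const size_t bytes = count * sizeof(float);
        // Operations within one stream run in order; different streams may overlap.
        cudaMemcpyAsync(d + offset, hPinned + offset, bytes,
                        cudaMemcpyHostToDevice, streams[s]);
        addOne<<<(count + 255) / 256, 256, 0, streams[s]>>>(d + offset, count);
        cudaMemcpyAsync(hPinned + offset, d + offset, bytes,
                        cudaMemcpyDeviceToHost, streams[s]);
    }

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamSynchronize(streams[s]);        // wait for this stream's work
        cudaStreamDestroy(streams[s]);
    }

    std::vector<float> out(hPinned, hPinned + n);
    cudaFreeHost(hPinned);
    cudaFree(d);
    return out;
}
```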
Profiler