A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Sharc + Synergy Lab Reading Group Schedule 2022 Fall | |||||||||||||||||||||||||
2 | Time: Friday 11:00 am - 12:30 pm | |||||||||||||||||||||||||
3 | Location: https://gatech.zoom.us/my/callie.hao | |||||||||||||||||||||||||
4 | Note: There is no need to specify a paper title yet -- you can just reserve a slot and fill in the details later. | |||||||||||||||||||||||||
5 | ||||||||||||||||||||||||||
6 | Date | Slot | Presenter | Title | Authors & Affiliations | Publication Venue | Year | Abstract | Link to Slides | |||||||||||||||||
7 | 9/2/2022 | 1 | Stefan Abi-Karam | Self-Supervised Learning For Graphs | ||||||||||||||||||||||
8 | 2 | |||||||||||||||||||||||||
9 | 9/9/2022 | 1 | No Presentation Today! Let's meet next week. :) Sorry! | |||||||||||||||||||||||
10 | 2 | Feel free to sign up for other slots! | ||||||||||||||||||||||||
11 | 9/16/2022 | 1 | Divya Kiran Kadiyala | Clio: a hardware-software co-designed disaggregated memory system | Zhiyuan Guo et al., UCSD | ASPLOS'22 | 2022 | Clio: a hardware-software co-designed disaggregated memory system | Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems | Presentation Slides: https://gtvault-my.sharepoint.com/:p:/g/personal/dkadiyala3_gatech_edu/EfzX0yOQOmpBuw3hl6t_3h4B0EcIh7csQV1EzPLcN3H5sA?e=tHUZha Additional: DirectCXL Paper: https://www.usenix.org/conference/atc22/presentation/gouk DirectCXL Video: https://www.youtube.com/watch?v=6a5NSMH-7hY&ab_channel=KAISTCAMELab | |||||||||||||||||
12 | 2 | Akshat Ramachandran | PositIV: A Configurable Posit processor Architecture for Image and Video Processing | Akshat Ramachandran, John Gustafson et al., VJTI & NUS, Singapore | Euromicro DSD | 2022 | Image processing is essential for applications such as robot vision, remote sensing, computational photography, augmented reality etc. In the design of dedicated hardware for such applications, IEEE Std 754™ floating point (float) arithmetic units have been widely used. While float-based architectures have achieved favorable results, their hardware is complicated and requires a large silicon footprint. In this paper we propose a Posit-based Image and Video processor (PositIV), a completely pipelined, configurable, image processor using posit arithmetic that guarantees lower power use and smaller silicon footprint than floats. PositIV is able to effectively overlap computation with memory access and supports multidimensional addressing, virtual border handling, prefetching and buffering. It is successfully able to integrate configurability, flexibility, and ease of development with real-time performance characteristics. The performance of PositIV is validated on several image processing algorithms for different configurations and compared against state-of-the-art implementations. Additionally, we empirically demonstrate the superiority of posits in processing images for several conventional algorithms, achieving at least 35–40% improvement in image quality over standard floats. | Paper Link: https://ieeexplore.ieee.org/document/9996725 | ||||||||||||||||||
13 | 9/23/2022 | 1 | Hanqiu Chen | Sampling methods for efficient training of graph convolutional networks: A survey | Xin Liu et al., Chinese Academy of Sciences | IEEE/CAA Journal of Automatica Sinica'21 | https://arxiv.org/abs/2103.05872 | TBD | ||||||||||||||||||
14 | 2 | |||||||||||||||||||||||||
15 | 9/30/2022 | 1 | Zhihan Xu | N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based Heterogeneous Computing Cores | Zhihan Xu et al., Shanghai Jiao Tong Univ. | FPGA 2022 | https://dl.acm.org/doi/10.1145/3490422.3502367 | |||||||||||||||||||
16 | 2 | |||||||||||||||||||||||||
17 | 10/7/2022 | 1 | Yuhong Li | What Makes Convolutional Models Great on Long Sequence Modeling? | Yuhong Li | |||||||||||||||||||||
18 | 2 | |||||||||||||||||||||||||
19 | 10/14/2022 | 1 | Rishov Sarkar | APack: Off-Chip, Lossless Data Compression for Efficient Deep Learning Inference | Alberto Delmas Lascorz et al., University of Toronto | arXiv:2201.08830 [cs.AR] | 2022 | https://arxiv.org/abs/2201.08830 | https://gtvault.sharepoint.com/:p:/s/SharcLab/EbC07wsTREdNmxwHBMhl9GIB0OHkk6Ek2rK5pF4yM6meBA?e=iRGhha | |||||||||||||||||
20 | 2 | |||||||||||||||||||||||||
21 | 10/21/2022 | 1 | Akshay Kamath | Towards Grand Unification of Object Tracking | Bin Yan et al. | ECCV 2022 | ||||||||||||||||||||
22 | 2 | |||||||||||||||||||||||||
23 | 10/28/2022 | 1 | Hamed Seyedroudbari | Nightcore: Efficient and Scalable Serverless Computing for Latency-Sensitive, Interactive Microservices | Zhipeng Jia et al. | ASPLOS '21 | ||||||||||||||||||||
24 | 2 | |||||||||||||||||||||||||
25 | 11/4/2022 | 1 | Shang Yang | Heuristic Adaptability to Input Dynamics for SpMM on GPUs | Guohao Dai et al., Tsinghua Univ. | DAC 2022 | 2022 | https://arxiv.org/abs/2202.08556 Sparse Matrix-Matrix Multiplication (SpMM) has served as fundamental components in various domains. Many previous studies exploit GPUs for SpMM acceleration because GPUs provide high bandwidth and parallelism. We point out that a static design does not always improve the performance of SpMM on different input data (e.g., >85% performance loss with a single algorithm). In this paper, we consider the challenge of input dynamics from a novel auto-tuning perspective, while following issues remain to be solved: (1) Orthogonal design principles considering sparsity. Orthogonal design principles for such a sparse problem should be extracted to form different algorithms, and further used for performance tuning. (2) Nontrivial implementations in the algorithm space. Combining orthogonal design principles to create new algorithms needs to tackle with new challenges like thread race handling. (3) Heuristic adaptability to input dynamics. The heuristic adaptability is required to dynamically optimize code for input dynamics. To tackle these challenges, we first propose a novel three-loop model to extract orthogonal design principles for SpMM on GPUs. The model not only covers previous SpMM designs, but also comes up with new designs absent from previous studies. We propose techniques like conditional reduction to implement algorithms missing in previous studies. We further propose DA-SpMM, a Data-Aware heuristic GPU kernel for SpMM. DA-SpMM adaptively optimizes code considering input dynamics. Extensive experimental results show that, DA-SpMM achieves 1.26x~1.37x speedup compared with the best NVIDIA cuSPARSE algorithm on average, and brings up to 5.59x end-to-end speedup to applications like Graph Neural Networks. | ||||||||||||||||||
26 | 2 | |||||||||||||||||||||||||
27 | 11/11/2022 | 1 | ||||||||||||||||||||||||
28 | 2 | |||||||||||||||||||||||||
29 | 11/18/2022 | 1 | Asmer Hamid Ali | An FPGA Implementation of Deep Spiking Neural Networks for Low-Power and Fast Classification | Xiping Ju et al. | |||||||||||||||||||||
30 | 2 | |||||||||||||||||||||||||
31 | 11/25/2022 | 1 | ||||||||||||||||||||||||
32 | 2 | |||||||||||||||||||||||||
33 | 12/2/2022 | 1 | ||||||||||||||||||||||||
34 | 2 | |||||||||||||||||||||||||
35 | 12/9/2022 | 1 | ||||||||||||||||||||||||
36 | 2 | |||||||||||||||||||||||||
37 | 12/16/2022 | 1 | Ninad Jangle | A Hardware/Software Co-Design Vision for Deep Learning at the Edge | F. Ponzina, S. Machetti, M. Rios, B. W. Denkinger, A. Levisse, G. Ansaloni, M. Peón-Quirós, D. Atienza | IEEE Micro | 2022 | https://www.researchgate.net/publication/362502122_A_hardwaresoftware_Co-Design_Vision_for_Deep_Learning_At_the_Edge | https://docs.google.com/presentation/d/1XTZIozqQCn4qeSHD78GazNbA73nwPjqH/edit?usp=sharing&ouid=114294662162462639944&rtpof=true&sd=true | |||||||||||||||||
38 | 2 | |||||||||||||||||||||||||