Yunquan Zhang
Title | Cited by | Year
AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs
Q Wang, X Zhang, Y Zhang, Q Yi
SC'13: Proceedings of the International Conference on High Performance …, 2013
Cited by: 152 | Year: 2013
Model-driven level 3 BLAS performance optimization on Loongson 3A processor
Z Xianyi, W Qian, Z Yunquan
2012 IEEE 18th International Conference on Parallel and Distributed Systems …, 2012
Cited by: 120 | Year: 2012
yaSpMV: yet another SpMV framework on GPUs
S Yan, C Li, Y Zhang, H Zhou
ACM SIGPLAN Notices 49 (8), 107-118, 2014
Cited by: 100 | Year: 2014
StreamScan: fast scan algorithms for GPUs without global barrier synchronization
S Yan, G Long, Y Zhang
ACM SIGPLAN Notices 48 (8), 229-238, 2013
Cited by: 71 | Year: 2013
The BLIS framework: Experiments in portability
FGV Zee, TM Smith, B Marker, TM Low, RA Geijn, FD Igual, ...
ACM Transactions on Mathematical Software (TOMS) 42 (2), 12, 2016
Cited by: 66 | Year: 2016
Parallel processing systems for big data: a survey
Y Zhang, T Cao, S Li, X Tian, L Yuan, H Jia, AV Vasilakos
Proceedings of the IEEE 104 (11), 2114-2136, 2016
Cited by: 51 | Year: 2016
Models of parallel computation: a survey and classification
Y Zhang, G Chen, G Sun, Q Miao
Frontiers of Computer Science in China 1 (2), 156-165, 2007
Cited by: 38 | Year: 2007
Accelerating Viola-Jones face detection algorithm on GPUs
H Jia, Y Zhang, W Wang, J Xu
2012 IEEE 14th International Conference on High Performance Computing and …, 2012
Cited by: 33 | Year: 2012
MPFFT: An auto-tuning FFT library for OpenCL GPUs
Y Li, YQ Zhang, YQ Liu, GP Long, HP Jia
Journal of Computer Science and Technology 28 (1), 90-105, 2013
Cited by: 32 | Year: 2013
GPURoofline: a model for guiding performance optimizations on GPUs
H Jia, Y Zhang, G Long, J Xu, S Yan, Y Li
European Conference on Parallel Processing, 920-932, 2012
Cited by: 30 | Year: 2012
Study on parallel computing
GL Chen, GZ Sun, YQ Zhang, ZY Mo
Journal of Computer Science and Technology 21 (5), 665-673, 2006
Cited by: 30 | Year: 2006
A parallel shortest path algorithm based on graph-partitioning and iterative correcting
Y Tang, Y Zhang, H Chen
2008 10th IEEE International Conference on High Performance Computing and …, 2008
Cited by: 26 | Year: 2008
Optimizing and scaling HPCG on Tianhe-2: early experience
X Zhang, C Yang, F Liu, Y Liu, Y Lu
International Conference on Algorithms and Architectures for Parallel …, 2014
Cited by: 24 | Year: 2014
Performance evaluation of allgather algorithms on terascale Linux cluster with Fast Ethernet
J Chen, L Zhang, Y Zhang, W Yuan
Eighth International Conference on High-Performance Computing in Asia …, 2005
Cited by: 22 | Year: 2005
Performance evaluation of multithreaded sparse matrix-vector multiplication using OpenMP
S Liu, Y Zhang, X Sun, RR Qiu
2009 11th IEEE International Conference on High Performance Computing and …, 2009
Cited by: 21 | Year: 2009
DRAM (h): a parallel computation model for high performance numerical computing
Z Yun-Quan
Chinese Journal of Computers 26 (12), 1660-1670, 2003
Cited by: 21 | Year: 2003
Optimizing SpMV for diagonal sparse matrices on GPU
X Sun, Y Zhang, T Wang, X Zhang, L Yuan, L Rao
2011 International conference on parallel processing, 492-501, 2011
Cited by: 20 | Year: 2011
Parallelization and performance optimization on face detection algorithm with OpenCL: A case study
W Wang, Y Zhang, S Yan, Y Zhang, H Jia
Tsinghua Science and Technology 17 (3), 287-295, 2012
Cited by: 17 | Year: 2012
pVOCL: Power-aware dynamic placement and migration in virtualized GPU environments
P Lama, Y Li, AM Aji, P Balaji, J Dinan, S Xiao, Y Zhang, W Feng, ...
2013 IEEE 33rd International Conference on Distributed Computing Systems …, 2013
Cited by: 16 | Year: 2013
623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores
Y Liu, C Yang, F Liu, X Zhang, Y Lu, Y Du, C Yang, M Xie, X Liao
The International Journal of High Performance Computing Applications 30 (1 …, 2016
Cited by: 15 | Year: 2016