Deepseekmoe: Towards ultimate expert specialization in mixture-of-experts language models D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen, J Li, W Zeng, X Yu, Y Wu, ... arXiv preprint arXiv:2401.06066, 2024 | 17 | 2024 |
Deepseek llm: Scaling open-source language models with longtermism X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ... arXiv preprint arXiv:2401.02954, 2024 | 16 | 2024 |
Critique of “Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility” by SCC Team From Tsinghua University C Zhang, C Zhao, J He, S Chen, L Zheng, K Huang, W Han, J Zhai IEEE Transactions on Parallel and Distributed Systems 32 (11), 2631-2634, 2021 | 2 | 2021 |
Canvas: End-to-End Kernel Architecture Search in Neural Networks C Zhao, G Zhang, M Gao arXiv preprint arXiv:2304.07741, 2023 | 1 | 2023 |
Student Cluster Competition 2018, Team Tsinghua University: Reproducing performance of multi-physics simulations of the Tsunamigenic 2004 Sumatra megathrust earthquake on the … J He, C Zhao, J Yu, X Yu, L Zheng, C Lou, S Tang, W Han, J Zhai Parallel Computing 90, 102570, 2019 | | 2019 |