0%

Last year, I wrote a post about how to install Ubuntu 18.04 Server automatically. The major reason why I choose to install the older version is I failed to make Ubuntu 20.04 install without pressing any key at that time while the approach for the offline installation recommended by the official is not working.

SC21又又又又是在线上打的。第二年痛失美帝免费旅游机会了!!!第二年了!!!没有机票,酒店,和大吃大喝的比赛能叫比赛吗!不过结果还是不错的,远远的超出我的预期(原因请看下文分解)。

MPI allows to create a new communicator by splitting an existing one into a sub-communicator, which can make our program dynamically select a subset of computing nodes to involve in the collective communication operations, such as all-reduce and all-gather operations. NCCL also has a similar feature, but it is not well-documented yet.

网上几乎所有的文章都直接忽悠上CDN的车,难道上CDN就是提升速度的最优解?在CDN这条弯路上折腾了快两年,玩了一圈免备案的CDN,踩了各种各样的坑以后,恍然大悟,茅厕顿开,便在此大放厥词写下此文。本文主要介绍CDN的正确用法,以及性价比爆炸,便宜又效果好的网站加速方案。

Recently, a SOTA sharding approach, GSPMD/GShard, was proposed and it provides an intuitive interface to partition a large array on arbitrary dimensions, while utilizing sharding propagation algorithms to automatically infer the partitioning strategy for tensors without user-specified sharding specifications. This document introduces the design and the implementation of XLA Sharding System.

It would be easier to read the source code if we are aware of the runtime information, including call stacks and variable values. This tutorial introduces how to utilize our powerful VSCode to trace XLA Compiler.

ASC20好死不死决赛场地就在妮可,痛失旅游机会。祸不单行,ASC20取消,合并为ASC20-21,公费旅游机会-=2。雪上加霜,疫情还顺手把晚宴给整没了。没有机票,酒店,和大吃大喝的比赛能叫比赛吗!

It is always hard to debug distributed programs. Not only the concurrency is extremely naughty, but we don’t have enough tools, or don’t know there are several tools to debug the distributed programs. But I found that tmux is capable of handling multiple windows, which means it’s possible to control numerous nodes without GUI.