NekoDaemon's Blog

Enable L3 PFC + DCQCN for RoCE on Edgecore SONiC

Posted on 2024-04-01

This article is for reference only, as Edgecore SONiC, a customized variant with many proprietary commands, differs considerably from community SONiC. Also, some commands and configurations are specific to certain switch ASICs, such as the Intel Tofino I currently use. Thus, I still suggest that you do not throw the official guidebook away: read it carefully, and it will save your life.

Fastest way to Install WireGuard-Go on Ubuntu 22.04

Posted on 2024-02-10

I am surprised that there is no documentation on how to install WireGuard-Go from the official Ubuntu repo via APT, and the installation process itself is a little tricky.

Fix corrupted NVIDIA Driver after Upgrading / Downgrading Ubuntu Kernel

Posted on 2023-09-19 Last updated on 2023-09-20

One day, you rebooted your server and suddenly found your cute GPUs had all disappeared. Then you executed nvidia-smi to see what was going on, but you only got this error message.

Enable L3 PFC + DCQCN for RoCE on Mellanox ConnectX NICs

Posted on 2023-07-24 Last updated on 2024-04-01

RoCE networks, a high-performance implementation of RDMA networks, offload flow control and congestion control algorithms to hardware to achieve high performance. However, these algorithms assume lossless networks so that they can be simple enough to implement in hardware. Thus, we should mitigate packet loss and keep the network as close to lossless as we can. DCQCN (congestion control) + PFC (flow control) is a common choice in many data centers. We observed that our system suffered severe performance fluctuations when they were disabled.

Quick way to fix wrong font size and location exported by PowerPoint Save as Picture on Mac

Posted on 2023-05-07

Microsoft is always busy producing bugs...

How to properly setup GPUDirect RDMA

Posted on 2023-03-29 Last updated on 2023-06-09

GPUDirect RDMA (GDR) is an incredible technology that allows remote machines to directly manipulate the local GPU's memory. However, there are not many online resources discussing this technology. So, I felt very confused when I encountered issues related to RDMA, especially GDR.

Quickly Understanding Parallel Makefile

Easy ways to setup Reverse Proxy for NAT-Passthrough

Slurm Quick Installation for Cluster on Ubuntu 20.04

Tips of configuring InfiniBand adapters