I'm writing this down because explanations of this command are hard to find online.
I used it to stop the driver from recognizing a specific GPU.
#nvidia-smi
root@user:~# nvidia-smi
#nvidia-smi drain
root@user:~# nvidia-smi drain
drain -- Display drain state information about the system as well as remove and discover devices.
Usage: nvidia-smi drain [options]
Options include:
[-p | --pciid]: GPU PCI ID in the format XXXX:YY.Z.a
where XXXX = domain
YY = bus
Z = device
a = function
[-m | --modify]: Modify the drain state of a GPU specified by -p.
Persistence mode must be disabled to perform this function.
0 = not draining
1 = draining
[-q | --query]: Query the drain state of a GPU specified by -p.
[-r | --remove]: If possible, remove the GPU specified by -p.
The GPU must have already been put in a drain state using -m and
persistence mode for this GPU must be disabled.
[-d | --discover]: Discover all GPUs on the system that had previously been removed.
[-h | --help]: Display help information
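To find the value to pass to -p, the PCI bus ID can be read from `nvidia-smi --query-gpu=pci.bus_id`. That query prints an 8-digit domain (for example 00000000:18:00.0), while the commands in this post pass the 4-digit form, so the sketch below trims four leading zeros to match. I have not confirmed whether `drain -p` also accepts the 8-digit form. `short_pci_id`, `list_gpu_pci_ids`, and the `NVIDIA_SMI` override variable are hypothetical names made up for this example.

```shell
#!/bin/sh
# Convert a PCI bus ID with an 8-digit domain (as printed by nvidia-smi)
# to the 4-digit-domain form used with `drain -p` in this post.
# IDs that are already short, or whose upper domain digits are non-zero,
# are passed through unchanged.
short_pci_id() {
    pci_full=$1
    pci_domain=${pci_full%%:*}   # text before the first ':'
    pci_rest=${pci_full#*:}      # text after the first ':'
    case $pci_domain in
        0000????) pci_domain=${pci_domain#0000} ;;   # drop four leading zeros
    esac
    printf '%s:%s\n' "$pci_domain" "$pci_rest"
}

# Print the short PCI ID of every GPU the driver currently exposes.
# NVIDIA_SMI is an assumed override so a stub can stand in for the real binary.
list_gpu_pci_ids() {
    "${NVIDIA_SMI:-nvidia-smi}" --query-gpu=pci.bus_id --format=csv,noheader |
    while IFS= read -r bus_id; do
        short_pci_id "$bus_id"
    done
}
```

Calling `list_gpu_pci_ids` on a machine with the NVIDIA driver loaded should print one ID per GPU, which can then be used as the -p argument.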
# Apply drain
root@user:~# nvidia-smi drain -p 0000:18:00.0 -m 1
Successfully set GPU 00000000:18:00.0 drain state to: draining.
# Result 1
root@user:~# nvidia-smi
No devices were found
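If this check needs to go into a script, one option is to count the lines printed by `nvidia-smi -L` instead of reading the full table. This is only a rough sketch: it assumes `-L` prints one line beginning with "GPU <index>" per visible device and prints "No devices were found" otherwise. `visible_gpu_count` and the `NVIDIA_SMI` override are hypothetical names for this example.

```shell
#!/bin/sh
# Count the GPUs the driver currently exposes, based on `nvidia-smi -L`.
# NVIDIA_SMI is an assumed override so a stub can stand in for the real binary.
visible_gpu_count() {
    # grep -c prints the number of matching lines; it exits non-zero when the
    # count is 0, so `|| true` keeps the function's exit status at 0.
    "${NVIDIA_SMI:-nvidia-smi}" -L 2>/dev/null | grep -c '^GPU [0-9]' || true
}
```

On a single-GPU machine like the one above, the count would be expected to drop to 0 while the GPU is draining.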
# Release drain
root@user:~# nvidia-smi drain -p 0000:18:00.0 -m 0
Successfully set GPU 00000000:18:00.0 drain state to: not draining.
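The apply and release steps above can be wrapped in a small helper. This is a minimal sketch, not a tested procedure: it assumes `nvidia-smi -i <PCI ID> -pm 0` is enough to turn off persistence mode before draining (if a persistence daemon is running, that may need to be stopped separately), and that `drain -q` still answers for a GPU that is draining. `set_drain`, `run`, `NVIDIA_SMI`, and `DRY_RUN` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch: apply or release the drain state for one GPU.
# NVIDIA_SMI and DRY_RUN are assumed knobs added for this example.
NVIDIA_SMI=${NVIDIA_SMI:-nvidia-smi}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

set_drain() {
    pci_id=$1   # e.g. 0000:18:00.0
    state=$2    # 1 = draining, 0 = not draining
    case $state in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    # The help text says persistence mode must be disabled before -m,
    # so turn it off first when applying the drain.
    if [ "$state" = 1 ]; then
        run "$NVIDIA_SMI" -i "$pci_id" -pm 0 || return 1
    fi
    run "$NVIDIA_SMI" drain -p "$pci_id" -m "$state" || return 1
    # Query the resulting drain state as a confirmation.
    run "$NVIDIA_SMI" drain -p "$pci_id" -q
}
```

Running `DRY_RUN=1 set_drain 0000:18:00.0 1` first prints the commands without touching the GPU, which is a cheap way to check the PCI ID before running it for real as root.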
# Result 2
root@user:~# nvidia-smi