A Python module for getting the GPU status from NVIDIA GPUs programmatically using nvidia-smi


GPUtil

GPUtil is a Python module for getting the GPU status from NVIDIA GPUs using nvidia-smi. GPUtil locates all GPUs on the computer, determines their availability and returns an ordered list of available GPUs. Availability is based upon the current memory consumption and load of each GPU. The module is written with GPU selection for Deep Learning in mind, but it is not task or library specific and can be applied to any task where it is useful to identify available GPUs.

Table of Contents

  1. Requirements
  2. Installation
  3. Usage
    1. Main functions
    2. Helper functions
  4. Examples
    1. Select first available GPU in Caffe
    2. Occupy only 1 GPU in TensorFlow
    3. Monitor GPU in a separate thread
  5. License

Requirements

An NVIDIA GPU with the latest NVIDIA driver installed. GPUtil uses the program nvidia-smi to get the GPU status of all available NVIDIA GPUs. nvidia-smi should be installed automatically when you install your NVIDIA driver.

Supports both Python 2.X and 3.X.

Python libraries:

Tested on CUDA driver version 390.77 with Python 2.7 and 3.5.
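
Because GPUtil relies on the nvidia-smi program, it can help to confirm that the executable is reachable before using the module. The following is a minimal sketch using only the standard library (Python 3.3 or newer, since it uses shutil.which). The function name find_nvidia_smi is hypothetical and is not part of GPUtil.

```python
import shutil


def find_nvidia_smi(executable="nvidia-smi"):
    """Return the full path of the executable if it is on PATH, otherwise None."""
    return shutil.which(executable)


path = find_nvidia_smi()
if path is None:
    # Assumption: without nvidia-smi on PATH, GPU queries are likely to fail
    print("nvidia-smi was not found on PATH; GPUtil may not be able to query GPUs.")
else:
    print("Found nvidia-smi at: " + path)
```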

Installation

  1. Open a terminal (Ctrl+Shift+T)
  2. Type pip install gputil
  3. Test the installation
    1. Open a terminal in a folder other than the GPUtil folder
    2. Start a python console by typing python in the terminal
    3. In the newly opened python console, type:
      import GPUtil
      GPUtil.showUtilization()
    4. Your output should look something like the following, depending on your number of GPUs and their current usage:
       ID  GPU  MEM
      --------------
        0    0%   0%
      

Old way of installation

  1. Download or clone repository to your computer
  2. Add GPUtil folder to ~/.bashrc
    1. Open a new terminal (Press Ctrl+Alt+T)
    2. Open bashrc:
      gedit ~/.bashrc
      
    3. Add your GPUtil folder to the environment variable PYTHONPATH (replace <path_to_gputil> with your folder path):
      export PYTHONPATH="$PYTHONPATH:<path_to_gputil>"
      
      Example:
      export PYTHONPATH="$PYTHONPATH:/home/anderskm/github/gputil"
      
    4. Save ~/.bashrc and close gedit
    5. Restart your terminal
  3. Test the installation
    1. Open a terminal in a folder other than the GPUtil folder
    2. Start a python console by typing python in the terminal
    3. In the newly opened python console, type:
      import GPUtil
      GPUtil.showUtilization()
    4. Your output should look something like the following, depending on your number of GPUs and their current usage:
       ID  GPU  MEM
      --------------
        0    0%   0%
      

Usage

To use GPUtil in your Python code, all you have to do is import it at the beginning of your script:

import GPUtil

Once imported, all functions are available. The functions, along with a short description of their inputs, outputs and functionality, can be found in the following two sections.

Main functions

deviceIDs = GPUtil.getAvailable(order = 'first', limit = 1, maxLoad = 0.5, maxMemory = 0.5, includeNan=False, excludeID=[], excludeUUID=[])

Returns a list of ids of available GPUs. Availability is determined based on current memory usage and load. The order, maximum number of devices, their maximum load and maximum memory consumption are determined by the input arguments.

  • Inputs
    • order - Determines the order in which the available GPU device ids are returned. order should be specified as one of the following strings:
      • 'first' - orders available GPU device ids by ascending id (default)
      • 'last' - orders available GPU device ids by descending id
      • 'random' - orders the available GPU device ids randomly
      • 'load' - orders the available GPU device ids by ascending load
      • 'memory' - orders the available GPU device ids by ascending memory usage
    • limit - Limits the number of GPU device ids returned to the specified number. Must be a positive integer. (default = 1)
    • maxLoad - Maximum current relative load for a GPU to be considered available. GPUs with a load larger than maxLoad are not returned. (default = 0.5)
    • maxMemory - Maximum current relative memory usage for a GPU to be considered available. GPUs with a current memory usage larger than maxMemory are not returned. (default = 0.5)
    • includeNan - True/false flag indicating whether to include GPUs where either load or memory usage is NaN (indicating usage could not be retrieved). (default = False)
    • excludeID - List of IDs, which should be excluded from the list of available GPUs. See GPU class description. (default = [])
    • excludeUUID - Same as excludeID except it uses the UUID. (default = [])
  • Outputs
    • deviceIDs - list of all available GPU device ids. A GPU is considered available, if the current load and memory usage is less than maxLoad and maxMemory, respectively. The list is ordered according to order. The maximum number of returned device ids is limited by limit.
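
The filtering, ordering and limiting rules described above can be sketched in plain Python. This is an illustrative re-implementation under stated assumptions, not GPUtil's own code: select_available is a hypothetical name, and the stand-in objects only mimic the id, load and memoryUtil attributes of the GPU class with made-up values.

```python
import random
from types import SimpleNamespace


def select_available(gpus, order="first", limit=1, max_load=0.5, max_memory=0.5):
    """Illustrative version of the selection rules described above."""
    # Keep only GPUs whose load and memory usage are below the thresholds
    candidates = [g for g in gpus if g.load < max_load and g.memoryUtil < max_memory]

    if order == "first":
        candidates.sort(key=lambda g: g.id)
    elif order == "last":
        candidates.sort(key=lambda g: g.id, reverse=True)
    elif order == "random":
        random.shuffle(candidates)
    elif order == "load":
        candidates.sort(key=lambda g: g.load)
    elif order == "memory":
        candidates.sort(key=lambda g: g.memoryUtil)
    else:
        raise ValueError("Unknown order: " + repr(order))

    # Return at most `limit` device ids
    return [g.id for g in candidates[:limit]]


# Stand-in objects (hypothetical values) instead of real GPUs
fake_gpus = [
    SimpleNamespace(id=0, load=0.90, memoryUtil=0.80),
    SimpleNamespace(id=1, load=0.10, memoryUtil=0.30),
    SimpleNamespace(id=2, load=0.05, memoryUtil=0.40),
]

print(select_available(fake_gpus, order="first", limit=2))   # [1, 2]
print(select_available(fake_gpus, order="load", limit=1))    # [2]
print(select_available(fake_gpus, order="memory", limit=2))  # [1, 2]
```

With GPUtil installed on a machine with NVIDIA GPUs, the corresponding call would be, for example, GPUtil.getAvailable(order = 'load', limit = 1).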
deviceID = GPUtil.getFirstAvailable(order = 'first', maxLoad=0.5, maxMemory=0.5, attempts=1, interval=900, verbose=False, includeNan=False, excludeID=[], excludeUUID=[])

Returns the first available GPU. Availability is determined based on current memory usage and load, and the ordering is determined by the specified order. If no available GPU is found, an error is raised. When using the default values, it is the same as getAvailable(order = 'first', limit = 1, maxLoad = 0.5, maxMemory = 0.5).

  • Inputs
    • order - See the description for GPUtil.getAvailable(...)
    • maxLoad - Maximum current relative load for a GPU to be considered available. GPUs with a load larger than maxLoad are not returned. (default = 0.5)
    • maxMemory - Maximum current relative memory usage for a GPU to be considered available. GPUs with a current memory usage larger than maxMemory are not returned. (default = 0.5)
    • attempts - Number of attempts the function should make before giving up finding an available GPU. (default = 1)
    • interval - Interval in seconds between each attempt to find an available GPU. (default = 900 --> 15 mins)
    • verbose - If True, prints the attempt number before each attempt and the GPU id if an available GPU is found. (default = False)
    • includeNan - See the description for GPUtil.getAvailable(...). (default = False)
    • excludeID - See the description for GPUtil.getAvailable(...). (default = [])
    • excludeUUID - See the description for GPUtil.getAvailable(...). (default = [])
  • Outputs
    • deviceID - list with 1 element containing the first available GPU device id. A GPU is considered available if the current load and memory usage are less than maxLoad and maxMemory, respectively. The ordering follows order, and the limit is fixed to 1.
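
The attempts and interval arguments describe a simple retry loop. The sketch below illustrates that idea under stated assumptions: first_available_with_retry and its find_ids and sleep parameters are hypothetical and not part of GPUtil, and the details of GPUtil's own loop (for example, whether it waits after the last attempt) may differ.

```python
import time


def first_available_with_retry(find_ids, attempts=1, interval=900, verbose=False,
                               sleep=time.sleep):
    """Call find_ids() up to `attempts` times, waiting `interval` seconds in between."""
    for attempt in range(1, attempts + 1):
        if verbose:
            print("Attempt " + str(attempt) + " of " + str(attempts))
        ids = find_ids()
        if ids:
            if verbose:
                print("GPU " + str(ids[0]) + " is available")
            # Mirror the documented output: a list with exactly one id
            return ids[:1]
        if attempt < attempts:
            sleep(interval)
    raise RuntimeError("Could not find an available GPU after "
                       + str(attempts) + " attempts.")


# Simulated results: no GPU is free on the first two attempts, GPU 3 on the third
responses = iter([[], [], [3]])
result = first_available_with_retry(lambda: next(responses), attempts=3, interval=0)
print(result)  # [3]
```

In real use, find_ids could be a call such as lambda: GPUtil.getAvailable(order = 'first', limit = 1).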
GPUtil.showUtilization(all=False, attrList=None, useOldCode=False)

Prints the current status (id, memory usage, uuid, load) of all GPUs.

  • Inputs
    • all - True/false flag indicating if all info on the GPUs should be shown. Overrides attrList.
    • attrList - List of lists of GPU attributes to display. See code for more information/example.
    • useOldCode - True/false flag indicating if the old code to display GPU utilization should be used.
  • Outputs
    • None

Helper functions

class GPU

Helper class that handles the attributes of each GPU. Quoted descriptions are copied from the corresponding descriptions by nvidia-smi.

  • Attributes for each GPU
    • id - "Zero based index of the GPU. Can change at each boot."
    • uuid - "This value is the globally unique immutable alphanumeric identifier of the GPU. It does not correspond to any physical label on the board. Does not change across reboots."
    • load - Relative GPU load. 0 to 1 (100%, full load). "Percent of time over the past sample period during which one or more kernels was executing on the GPU. The sample period may be between 1 second and 1/6 second depending on the product."
    • memoryUtil - Relative memory usage from 0 to 1 (100%, full usage). "Percent of time over the past sample period during which global (device) memory was being read or written. The sample period may be between 1 second and 1/6 second depending on the product."
    • memoryTotal - "Total installed GPU memory."
    • memoryUsed - "Total GPU memory allocated by active contexts."
    • memoryFree - "Total free GPU memory."
    • driver - "The version of the installed NVIDIA display driver."
    • name - "The official product name of the GPU."
    • serial - "This number matches the serial number physically printed on each board. It is a globally unique immutable alphanumeric value."
    • display_mode - "A flag that indicates whether a physical display (e.g. monitor) is currently connected to any of the GPU's connectors. "Enabled" indicates an attached display. "Disabled" indicates otherwise."
    • display_active - "A flag that indicates whether a display is initialized on the GPU's (e.g. memory is allocated on the device for display). Display can be active even when no monitor is physically attached. "Enabled" indicates an active display. "Disabled" indicates otherwise."
GPUs = GPUtil.getGPUs()
  • Inputs
    • None
  • Outputs
    • GPUs - list of all GPUs. Each GPU corresponds to one GPU in the computer and contains a device id, relative load and relative memory usage.
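
As an illustration of how the attributes listed above might be read, the sketch below turns one GPU object into a small dictionary. summarize_gpu is a hypothetical helper, and the stand-in object uses made-up values with the same attribute names as the GPU class.

```python
from types import SimpleNamespace


def summarize_gpu(gpu):
    """Collect a few GPU attributes into a plain dictionary."""
    return {
        "id": gpu.id,
        "name": gpu.name,
        # load and memoryUtil are relative values between 0 and 1
        "load_percent": round(gpu.load * 100, 1),
        "memory_percent": round(gpu.memoryUtil * 100, 1),
        "memory_free": gpu.memoryFree,
    }


# Stand-in object with hypothetical values instead of a real GPU
example = SimpleNamespace(id=0, name="Example GPU", load=0.25, memoryUtil=0.5,
                          memoryTotal=8192, memoryUsed=4096, memoryFree=4096)
print(summarize_gpu(example))
```

On a machine with GPUtil and NVIDIA GPUs, the same helper could be applied to every GPU with [summarize_gpu(gpu) for gpu in GPUtil.getGPUs()].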
GPUavailability = GPUtil.getAvailability(GPUs, maxLoad = 0.5, maxMemory = 0.5, includeNan=False, excludeID=[], excludeUUID=[])

Given a list of GPUs (see GPUtil.getGPUs()), returns an equally sized list of ones and zeroes indicating which corresponding GPUs are available.

  • Inputs
    • GPUs - List of GPUs. See GPUtil.getGPUs()
    • maxLoad - Maximum current relative load for a GPU to be considered available. GPUs with a load larger than maxLoad are not marked as available. (default = 0.5)
    • maxMemory - Maximum current relative memory usage for a GPU to be considered available. GPUs with a current memory usage larger than maxMemory are not marked as available. (default = 0.5)
    • includeNan - See the description for GPUtil.getAvailable(...). (default = False)
    • excludeID - See the description for GPUtil.getAvailable(...). (default = [])
    • excludeUUID - See the description for GPUtil.getAvailable(...). (default = [])
  • Outputs
    • GPUavailability - binary list indicating if GPUs are available or not. A GPU is considered available, if the current load and memory usage is less than maxLoad and maxMemory, respectively.
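
The effect of includeNan, excludeID and excludeUUID on the binary list can be sketched as follows. This is an illustrative approximation of the documented behavior, not GPUtil's source: availability_flags is a hypothetical name, and the UUID strings and usage values are made up.

```python
import math
from types import SimpleNamespace


def availability_flags(gpus, max_load=0.5, max_memory=0.5, include_nan=False,
                       exclude_id=(), exclude_uuid=()):
    """Return a list of ones and zeroes, one entry per GPU."""
    flags = []
    for gpu in gpus:
        # A NaN value only counts as acceptable when include_nan is True
        load_ok = gpu.load < max_load or (include_nan and math.isnan(gpu.load))
        memory_ok = (gpu.memoryUtil < max_memory
                     or (include_nan and math.isnan(gpu.memoryUtil)))
        not_excluded = gpu.id not in exclude_id and gpu.uuid not in exclude_uuid
        flags.append(1 if load_ok and memory_ok and not_excluded else 0)
    return flags


nan = float("nan")
fake_gpus = [
    SimpleNamespace(id=0, uuid="GPU-aaaa", load=0.1, memoryUtil=0.2),
    SimpleNamespace(id=1, uuid="GPU-bbbb", load=nan, memoryUtil=0.2),
    SimpleNamespace(id=2, uuid="GPU-cccc", load=0.9, memoryUtil=0.2),
]

print(availability_flags(fake_gpus))                                    # [1, 0, 0]
print(availability_flags(fake_gpus, include_nan=True))                  # [1, 1, 0]
print(availability_flags(fake_gpus, include_nan=True, exclude_id=[0]))  # [0, 1, 0]
```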

See demo_GPUtil.py for examples and more details.

Examples

Select first available GPU in Caffe

In the Deep Learning library Caffe, the user can switch between using the CPU or GPU through its Python interface. This is done by calling the methods caffe.set_mode_cpu() and caffe.set_mode_gpu(), respectively. Below is a minimal working example for selecting the first available GPU with GPUtil to run a Caffe network.

# Import os (to set environment variables), caffe and GPUtil
import os
import caffe
import GPUtil

# Set CUDA_DEVICE_ORDER so the IDs assigned by CUDA match those from nvidia-smi
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

# Get the first available GPU
DEVICE_ID_LIST = GPUtil.getFirstAvailable()
DEVICE_ID = DEVICE_ID_LIST[0] # grab first element from list

# Select GPU mode
caffe.set_mode_gpu()
# Select GPU id
caffe.set_device(DEVICE_ID)

# Initialize your network here

Note: At the time of writing this example, the Caffe Python wrapper only supports 1 GPU, although the underlying code supports multiple GPUs. Calling Caffe directly from the terminal allows for using multiple GPUs.

Occupy only 1 GPU in TensorFlow

By default, TensorFlow will occupy all available GPUs when using a GPU as a device (e.g. tf.device('/gpu:0')). By setting the environment variable CUDA_VISIBLE_DEVICES, the user can mask which GPUs should be visible to TensorFlow via CUDA (see CUDA_VISIBLE_DEVICES - Masking GPUs). Using GPUtil, CUDA_VISIBLE_DEVICES can be set programmatically based on the available GPUs. Below is a minimal working example of how to occupy only 1 GPU in TensorFlow using GPUtil. To run the code, copy it into a new Python file (e.g. demo_tensorflow_gputil.py) and run it (e.g. enter python demo_tensorflow_gputil.py in a terminal).

Note: Even if you set the device you run your code on to a CPU, TensorFlow will occupy all available GPUs. To avoid this, all GPUs can be hidden from TensorFlow with os.environ["CUDA_VISIBLE_DEVICES"] = ''.

# Import os to set the environment variable CUDA_VISIBLE_DEVICES
import os
import tensorflow as tf
import GPUtil

# Set CUDA_DEVICE_ORDER so the IDs assigned by CUDA match those from nvidia-smi
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

# Get the first available GPU
DEVICE_ID_LIST = GPUtil.getFirstAvailable()
DEVICE_ID = DEVICE_ID_LIST[0] # grab first element from list

# Set CUDA_VISIBLE_DEVICES to mask out all other GPUs than the first available device id
os.environ["CUDA_VISIBLE_DEVICES"] = str(DEVICE_ID)

# Since all other GPUs are masked out, the first available GPU will now be identified as GPU:0
device = '/gpu:0'
print('Device ID (unmasked): ' + str(DEVICE_ID))
print('Device ID (masked): ' + str(0))

# Run a minimum working example on the selected GPU
# Start a session
with tf.Session() as sess:
    # Select the device
    with tf.device(device):
        # Declare two numbers and add them together in TensorFlow
        a = tf.constant(12)
        b = tf.constant(30)
        result = sess.run(a+b)
        print('a+b=' + str(result))

Your output should look something like the code block below. Notice how only one of the GPUs is found and created as a TensorFlow device.

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Device: /gpu:0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:02:00.0
Total memory: 11.90GiB
Free memory: 11.76GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:02:00.0)
a+b=42

Comment out the os.environ["CUDA_VISIBLE_DEVICES"] = str(DEVICE_ID) line and compare the two outputs. Depending on your number of GPUs, your output should look something like the code block below. Notice how all 4 GPUs are found and created as TensorFlow devices, whereas when CUDA_VISIBLE_DEVICES was set, only 1 GPU was found and created.

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Device: /gpu:0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:02:00.0
Total memory: 11.90GiB
Free memory: 11.76GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2c8e400
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 1 with properties: 
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:03:00.0
Total memory: 11.90GiB
Free memory: 11.76GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2c92040
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 2 with properties: 
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:83:00.0
Total memory: 11.90GiB
Free memory: 11.76GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2c95d90
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 3 with properties: 
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:84:00.0
Total memory: 11.90GiB
Free memory: 11.76GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 0 and 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 0 and 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 1 and 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 1 and 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 2 and 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 2 and 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 3 and 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 3 and 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 1 2 3 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y Y N N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 1:   Y Y N N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 2:   N N Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 3:   N N Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: TITAN X (Pascal), pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:2) -> (device: 2, name: TITAN X (Pascal), pci bus id: 0000:83:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:3) -> (device: 3, name: TITAN X (Pascal), pci bus id: 0000:84:00.0)
a+b=42
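
The same masking idea extends to several GPUs, because CUDA_VISIBLE_DEVICES accepts a comma-separated list of device ids. The sketch below shows one way this might be done. The helper names visible_devices_value and mask_gpus are hypothetical, and a plain dictionary stands in for os.environ so the example has no side effects. In real use the variable has to be set before the framework initializes CUDA.

```python
import os


def visible_devices_value(device_ids):
    """Build the comma-separated string expected by CUDA_VISIBLE_DEVICES."""
    return ",".join(str(device_id) for device_id in device_ids)


def mask_gpus(device_ids, environ=os.environ):
    """Expose only the given GPU ids to CUDA; an empty list hides all GPUs."""
    # Match the ordering of nvidia-smi, as in the example above
    environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    environ["CUDA_VISIBLE_DEVICES"] = visible_devices_value(device_ids)
    return environ["CUDA_VISIBLE_DEVICES"]


# Demonstration on a plain dictionary instead of the real environment
env = {}
print(repr(mask_gpus([2, 0], environ=env)))  # '2,0'
print(repr(mask_gpus([], environ=env)))      # ''
```

With GPUtil installed, the ids could come from a call such as GPUtil.getAvailable(order = 'memory', limit = 2).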

Monitor GPU in a separate thread

If using GPUtil to monitor GPUs during training, it may show 0% utilization. A way around this is to use a separate monitoring thread.

import GPUtil
from threading import Thread
import time

class Monitor(Thread):
    def __init__(self, delay):
        super(Monitor, self).__init__()
        self.stopped = False
        self.delay = delay # Time between calls to GPUtil
        self.start()

    def run(self):
        while not self.stopped:
            GPUtil.showUtilization()
            time.sleep(self.delay)

    def stop(self):
        self.stopped = True
        
# Instantiate monitor with a 10-second delay between updates
monitor = Monitor(10)

# Train, etc.

# Close monitor
monitor.stop()
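
A possible variation of the monitor above uses threading.Event, so that stop() interrupts the wait immediately instead of letting the current sleep run to completion, and takes the function to call as an argument. This is a sketch of an alternative design, not part of GPUtil; the class name PollingMonitor and its callback parameter are hypothetical.

```python
import threading
import time


class PollingMonitor(threading.Thread):
    """Call `callback` every `delay` seconds until stop() is called."""

    def __init__(self, delay, callback):
        super(PollingMonitor, self).__init__()
        self.daemon = True  # do not keep the interpreter alive on exit
        self.delay = delay
        self.callback = callback
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            self.callback()
            # wait() returns early as soon as the stop event is set
            self._stop_event.wait(self.delay)

    def stop(self):
        self._stop_event.set()
        self.join()


# Demonstration with a stand-in callback that records timestamps
samples = []
monitor = PollingMonitor(0.01, lambda: samples.append(time.time()))
monitor.start()
time.sleep(0.05)
monitor.stop()
print("Collected " + str(len(samples)) + " samples")
```

To monitor GPUs, the callback could be GPUtil.showUtilization, for example PollingMonitor(10, GPUtil.showUtilization).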

License

See LICENSE

Comments
  • GPU util stuck at 0%?

    I'm having a strange issue on various machine where every call of showUtilization() shows 0% GPU util, even though nvidia-smi at the same time returns 100%. It does, however, correctly show memory usage. Any idea why this might occur?

    Thanks for writing this utility!

    opened by jfainberg 9
  • Values of '[Not Supported]' are not handled properly.

    Values of '[Not Supported]' are not handled properly.

    In [1]: import GPUtil
    
    In [2]: g = GPUtil.getGPUs()
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-2-871afb3451f3> in <module>()
    ----> 1 g = GPUtil.getGPUs()
    
    ~\AppData\Local\Continuum\Anaconda3\envs\tensorflow\lib\site-packages\GPUtil\__init__.py in getGPUs()
         80                 deviceIds[g] = int(vals[i])
         81             elif (i == 1):
    ---> 82                 gpuUtil[g] = float(vals[i])/100
         83             elif (i == 2):
         84                 memTotal[g] = int(vals[i])
    
    ValueError: could not convert string to float: '[Not Supported]'
    
    bug 
    opened by tasptz 7
  • NameError: name 'unicode' is not defined

    Hi,

    On Windows 10 (64 bit), I'm getting the following error:

    Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)]
    Type 'copyright', 'credits' or 'license' for more information
    IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

    In [1]: import GPUtil
    In [2]: GPUtil.showUtilization()


    NameError Traceback (most recent call last)
    in ()
    ----> 1 GPUtil.showUtilization()
    ~\Anaconda3\lib\site-packages\GPUtil\GPUtil.py in showUtilization(all, attrList, useOldCode)
    248 elif (isinstance(attr,str)):
    249 attrStr = attr;
    --> 250 elif (isinstance(attr,unicode)):
    251 attrStr = attr.encode('ascii','ignore')
    252 else:
    NameError: name 'unicode' is not defined

    Any idea how to fix this?

    Thanks a lot!

    opened by janroden 6
  • GPUtil CPU Usage

    Hi I notice when using GPUtil that the CPU usage is much higher than pynvml, can anyone explain why or assist me?

    Using GPUtil

    #!/usr/bin/python
    import GPUtil
    gpu = GPUtil.getGPUs()[0]
    gpu_util = int(gpu.load * 100)
    gpu_temp = int(gpu.temperature)
    
    $ /usr/bin/time -v ./GPUtil-test.py
            Command being timed: "./GPUtil-test.py"
            User time (seconds): 0.21
            System time (seconds): 0.43
            Percent of CPU this job got: 481%
            Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.13
            Average shared text size (kbytes): 0
            Average unshared data size (kbytes): 0
            Average stack size (kbytes): 0
            Average total size (kbytes): 0
            Maximum resident set size (kbytes): 26088
            Average resident set size (kbytes): 0
            Major (requiring I/O) page faults: 0
            Minor (reclaiming a frame) page faults: 7978
            Voluntary context switches: 32
            Involuntary context switches: 769
            Swaps: 0
            File system inputs: 0
            File system outputs: 0
            Socket messages sent: 0
            Socket messages received: 0
            Signals delivered: 0
            Page size (bytes): 4096
            Exit status: 0
    

    Using pynvml

    #!/usr/bin/python
    import pynvml as nv
    nv.nvmlInit()
    handle = nv.nvmlDeviceGetHandleByIndex(0)
    gpu_util = nv.nvmlDeviceGetUtilizationRates(handle).gpu
    gpu_temp = nv.nvmlDeviceGetTemperature(handle, nv.NVML_TEMPERATURE_GPU)
    nv.nvmlShutdown()
    
    $ /usr/bin/time -v ./pynvml-test.py 
            Command being timed: "./pynvml-test.py "
            User time (seconds): 0.02
            System time (seconds): 0.01
            Percent of CPU this job got: 84%
            Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.03
            Average shared text size (kbytes): 0
            Average unshared data size (kbytes): 0
            Average stack size (kbytes): 0
            Average total size (kbytes): 0
            Maximum resident set size (kbytes): 15732
            Average resident set size (kbytes): 0
            Major (requiring I/O) page faults: 0
            Minor (reclaiming a frame) page faults: 2454
            Voluntary context switches: 2
            Involuntary context switches: 2
            Swaps: 0
            File system inputs: 0
            File system outputs: 0
            Socket messages sent: 0
            Socket messages received: 0
            Signals delivered: 0
            Page size (bytes): 4096
            Exit status: 0
    
    opened by konomikitten 5
  • Added Windows Support

    If the platform is Windows and nvidia-smi could not be found from the environment path, try to find it from system drive with default installation path

    opened by daun-io 4
  • Avoid deadlock when reading stdout of subprocess

    https://docs.python.org/2/library/subprocess.html Do not use stdout=PIPE or stderr=PIPE with this function as that can deadlock based on the child process output volume. Use Popen with the communicate() method when you need pipes.

    opened by bashbug 4
  • FileNotFoundError shows up whenever I try to use this package

    This is the error it gives me whenever I try to run it:

    GPUtil 1.3.0
    Traceback (most recent call last):
      File "demo_GPUtil.py", line 10, in <module>
        GPU.showUtilization()
      File "C:\Users\dylan\AppData\Local\Programs\Python\Python36\Lib\site-packages\GPUtil\GPUtil.py", line 193, in showUtilization
        GPUs = getGPUs()
      File "C:\Users\dylan\AppData\Local\Programs\Python\Python36\Lib\site-packages\GPUtil\GPUtil.py", line 64, in getGPUs
        p = Popen(["nvidia-smi","--query-gpu=index,uuid,utilization.gpu,memory.total,memory.used,memory.free,driver_version,name,gpu_serial,display_active,display_mode", "--format=csv,noheader,nounits"], stdout=PIPE)
      File "C:\Users\dylan\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 709, in __init__
        restore_signals, start_new_session)
      File "C:\Users\dylan\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 997, in _execute_child
        startupinfo)
    FileNotFoundError: [WinError 2] The system cannot find the file specified

    opened by evaherrada 4
  • GPUtil.showUtilization does not work for individual attrList

    The showUtilization function offers the possibility to restrict the output given an attrList list in the parameters.

    However, if such attrList is defined in the parameter list, it will never make it to its processing. The function decides first between "all" is set or not. In both cases, either the output ("oldCode") is directly printed or the attrList parameter is overwritten, regardless whether it has been set or not.

    It's just a small thing, but it would be convenient to be able to restrict the output to only the few fields one needs for debugging ...

    Thanks, Andre

    opened by askusa 3
  • FileNotFoundError

    Hi, I pip installed the package and tried running GPUtil.getAvailable() but got the message listed below. Any thoughts?

    thank you very much for this package.

    GPUtil.getAvailable()
    Traceback (most recent call last):
      File "", line 1, in <module>
      File "C:\Users\dkarl\AppData\Local\conda\conda\envs\dudy_test\lib\site-packages\GPUtil\GPUtil.py", line 123, in getAvailable
        GPUs = getGPUs()
      File "C:\Users\dkarl\AppData\Local\conda\conda\envs\dudy_test\lib\site-packages\GPUtil\GPUtil.py", line 64, in getGPUs
        p = Popen(["nvidia-smi","--query-gpu=index,uuid,utilization.gpu,memory.total,memory.used,memory.free,driver_version,name,gpu_serial,display_active,display_mode", "--format=csv,noheader,nounits"], stdout=PIPE)
      File "C:\Users\dkarl\AppData\Local\conda\conda\envs\dudy_test\lib\subprocess.py", line 709, in __init__
        restore_signals, start_new_session)
      File "C:\Users\dkarl\AppData\Local\conda\conda\envs\dudy_test\lib\subprocess.py", line 997, in _execute_child
        startupinfo)
    FileNotFoundError: [WinError 2] The system cannot find the file specified

    opened by davidkarl 3
  • Added try-except around getgpu so the function does not crash on non-gpu hosts

    I am writing unit test on some deep learning code that might run on cpu-only hosts. Currently all functions crash if you run them on a host without nvidia-smi. With this change the function returns an empty array instead of crash.

    opened by ifeherva 3
  • include LICENSE.txt in distributions

    The MIT license "shall be included in all copies or substantial portions of the Software", but you're currently not including it in your sdist or bdist_wheel files. The MANIFEST.in puts it in sdists, and setup.cfg for wheels.

    opened by djsutherland 2
  • showUtilization causes GPU stuttering

    Running a simple looped call to this function (showUtilization) causes stuttering in games (recordable in frametimes) and shown in 3rd party testing below:

    [GIF not shown] (The GIF is taken from another project, but the script below gives the same issue.)

    To Reproduce
    Steps to reproduce the behavior:

    1. Open a browser page to https://www.testufo.com/animation-time-graph
    2. Allow the test to settle
    3. Run test script

    Test Script

    import time
    import GPUtil
    
    while True:
        GPUtil.showUtilization()
        time.sleep(1)
    
    opened by Cyruz143 0
  • GPUtil doesn't find GPU

    I am having an issue with this module. It doesn't find my GPU, but when I go in Command Line and write "nvidia-smi" everything seems to work. I already reinstalled my NVIDIA drivers and the module, but nothing works.

    opened by ByOle1307 0
  • Over 60 times slower than nvidia-smi to assess resource usage

    Easiest way to replicate would be:

    time:

    import nvidia_smi
    import numpy as np
    
    
    nvidia_smi.nvmlInit()
    
    for _ in range(50):
            gpus = [nvidia_smi.nvmlDeviceGetHandleByIndex(i) for i in range(nvidia_smi.nvmlDeviceGetCount())]
            res_arr = [nvidia_smi.nvmlDeviceGetUtilizationRates(handle) for handle in gpus]
            print('Usage with nivida-smi: ', np.sum([res.gpu for res in res_arr]), '%')
    
    

    Then time:

    import GPUtil
    import numpy as np
    
    for _ in range(50):
            res_arr = GPUtil.getGPUs()
            print('Usage with GPUtil: ', np.sum([res.load for res in res_arr])*100, '%')
    

    YMMV here but for the first one I get constant reports of 1% GPU utilization and runtime is:

    real    0m0,179s
    user    0m0,688s
    sys     0m0,818s
    

    For the second one, GPU utilization climbs to a whopping 93% by the 6th call and the runtime is:

    real    0m11,267s
    user    0m0,605s
    sys     0m11,449s
    

    The getGPUs() call seems to be fairly close to what nvidia-smi does with nvmlDeviceGetUtilizationRates, and quite frankly it being 63 times slower and consuming ~100% of my GPU (2080RTX) to run, as opposed to 1%, seems a bit unreasonable.

    Since many people use this library to figure out GPU utilization, it might be reasonable to try and have a more efficient version of getGPUs for that or, if it provides some "extra" features (e.g. it samples 100 calls and averages them out), a way to control the settings on that might be welcome.

    Or maybe I'm doing something completely wrong here, in which case, let me know.

    opened by George3d6 0
  • ValueError when nvidia-smi finds no GPU

    In Line 90:

    lines = output.split(os.linesep)
    

    returns [''] instead of [] when nvidia-smi finds no GPU, which then causes ValueError by the parser.

    Suggested update:

    lines = list(filter(None, output.split(os.linesep)))
    
    opened by kuangdai 0
Releases(v1.4.0)
  • v1.4.0(Dec 18, 2018)

    Added

    • Added automatic detection of nvidia-smi on Windows
    • Added GPU temperature
    • Added example for monitoring GPU usage from a separate thread

    Removed

    • Removed numpy dependency

    Fixed

    • Fixed crashing when GPUs returned by nvidia-smi
    • Fixed potential deadlock when calling nvidia-smi
    • Fixed issue with Python 2 unicode
    • Fixed various spelling errors in readme
  • v1.3.0(Apr 9, 2018)

    • Added handling of GPU fields (such as memory and load) that are not supported by the GPU.
    • Added UUID to GPUs.
    • Added support for excluding GPUs based on ID or UUID
    • Added __version__
    • Moved main part of code from __init__.py to GPUtil.py
    • Updated showUtilization() with increased flexibility.
    • Updated readme
  • v1.2.3(Feb 9, 2017)

    A minor but very important bug has been fixed in the calculation of memory utilization, which is used to determine whether a GPU is available. The bug meant that in Python 2.X the memory utilization would always show as 0%. The bug was not present in 3.X, because dividing two integers in 3.X automatically yields a float.
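
The effect can be reproduced in Python 3 with floor division, which behaves like Python 2's `/` on two integers. The values and variable names below are hypothetical and only illustrate the arithmetic; they are not taken from the GPUtil source.

```python
# Hypothetical readings in MiB, as integers parsed from nvidia-smi output.
memory_used = 500
memory_total = 8000

# Floor division reproduces what Python 2's `/` did with two ints.
py2_style = memory_used // memory_total

# Converting one operand to float gives the same result on 2.X and 3.X.
portable = float(memory_used) / memory_total

print(py2_style)  # 0
print(portable)   # 0.0625
```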

  • v1.2.2(Feb 9, 2017)

  • v1.2.0(Feb 9, 2017)

    The project has been renamed to GPUtil to avoid confusion with the slightly similar GPUstats Python script here on GitHub. The project has also been prepared for PyPI, so that it can easily be installed and upgraded through pip install.

  • v1.1.0(Feb 1, 2017)

    GPUstats

    • Added new functionality to getFirstAvailable(...) to support multiple attempts to find available GPU.
      • Added 4 new input arguments:
        • order - specifies the ordering of the GPUs. Same functionality as for getAvailable(...)
        • attempts - number of attempts for locating available GPU.
        • interval - interval in seconds between each attempt.
        • verbose - print attempt counter
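
How attempts, interval and verbose interact can be sketched with a plain retry loop. This is not the GPUtil implementation: `first_available` and its `probe` argument are hypothetical, the default values are illustrative, and the ordering step is left out.

```python
import time


def first_available(probe, attempts=1, interval=900, verbose=False):
    """Return the first non-empty result of probe(), trying up to `attempts` times.

    `probe` is a hypothetical callable returning a list of available GPU ids.
    """
    for attempt in range(1, attempts + 1):
        if verbose:
            print("Attempting (%d/%d) to locate available GPU." % (attempt, attempts))
        available = probe()
        if available:
            return available[:1]
        if attempt < attempts:
            time.sleep(interval)  # wait before the next attempt
    raise RuntimeError("Could not find an available GPU after %d attempts." % attempts)


# Fake probe: nothing is free on the first two calls, GPU 1 frees up on the third.
responses = iter([[], [], [1, 0]])
result = first_available(lambda: next(responses), attempts=3, interval=0.01, verbose=True)
print(result)  # [1]
```

With the real module, the probe would be replaced by a call that queries nvidia-smi on each attempt.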
  • v1.0.1(Jan 16, 2017)

  • v1.0.0(Jan 16, 2017)

    GPUstats:

    • Get memory usage and utility of all CUDA compatible GPUs using Python (and nvidia-smi)
    • Get the first, last and random available GPU from all available GPUs
    • Example showing how to programmatically only allocate memory on 1 GPU when using TensorFlow
Owner
Anders Krogh Mortensen
Postdoc
cuDF - GPU DataFrame Library

cuDF - GPU DataFrames NOTE: For the latest stable README.md ensure you are on the main branch. Resources cuDF Reference Documentation: Python API refe

RAPIDS 5.2k Jan 08, 2023
📊 A simple command-line utility for querying and monitoring GPU status

gpustat Just less than nvidia-smi? NOTE: This works with NVIDIA Graphics Devices only, no AMD support as of now. Contributions are welcome! Self-Promo

Jongwook Choi 3.2k Jan 04, 2023
General purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases.

Vulkan Kompute The general purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabl

The Institute for Ethical Machine Learning 1k Dec 26, 2022
ArrayFire: a general purpose GPU library.

ArrayFire is a general-purpose library that simplifies the process of developing software that targets parallel and massively-parallel architectures i

ArrayFire 4k Dec 29, 2022
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

NVIDIA DALI The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provi

NVIDIA Corporation 4.2k Jan 08, 2023
A NumPy-compatible array library accelerated by CUDA

CuPy : A NumPy-compatible array library accelerated by CUDA Website | Docs | Install Guide | Tutorial | Examples | API Reference | Forum CuPy is an im

CuPy 6.6k Jan 05, 2023
cuSignal - RAPIDS Signal Processing Library

cuSignal The RAPIDS cuSignal project leverages CuPy, Numba, and the RAPIDS ecosystem for GPU accelerated signal processing. In some cases, cuSignal is

RAPIDS 646 Dec 30, 2022
Python interface to GPU-powered libraries

Package Description scikit-cuda provides Python interfaces to many of the functions in the CUDA device/runtime, CUBLAS, CUFFT, and CUSOLVER libraries

Lev E. Givon 924 Dec 26, 2022
CUDA integration for Python, plus shiny features

PyCUDA lets you access Nvidia's CUDA parallel computation API from Python. Several wrappers of the CUDA API already exist-so what's so special about P

Andreas Klöckner 1.4k Jan 02, 2023
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

Introduction This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code her

NVIDIA Corporation 6.9k Dec 28, 2022
cuGraph - RAPIDS Graph Analytics Library

cuGraph - GPU Graph Analytics The RAPIDS cuGraph library is a collection of GPU accelerated graph algorithms that process data found in GPU DataFrames

RAPIDS 1.2k Jan 01, 2023
jupyter/ipython experiment containers for GPU and general RAM re-use

ipyexperiments jupyter/ipython experiment containers and utils for profiling and reclaiming GPU and general RAM, and detecting memory leaks. About Thi

Stas Bekman 153 Dec 07, 2022
Library for faster pinned CPU <-> GPU transfer in Pytorch

SpeedTorch Faster pinned CPU tensor - GPU Pytorch variable transfer and GPU tensor - GPU Pytorch variable transfer, in certain cases. Update 9-29-1

Santosh Gupta 657 Dec 19, 2022
Python 3 Bindings for NVML library. Get NVIDIA GPU status inside your program.

py3nvml Documentation also available at readthedocs. Python 3 compatible bindings to the NVIDIA Management Library. Can be used to query the state of

Fergal Cotter 212 Jan 04, 2023
A Python function for Slurm, to monitor the GPU information

Gpu-Monitor A Python function for Slurm, where I couldn't use nvidia-smi to monitor the GPU information. The whole repo is not finished. Installation TODO Mo

Squidward Tentacles 2 Feb 11, 2022
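QPT - Quick packaging tool: a forward-style quick packaging tool for Python environments

QPT - Quick packaging tool GitHub homepage | Gitee homepage QPT is a multi-purpose packaging tool that can "simulate" a development environment. A single command packages an ordinary Python script into an EXE executable, and it can also easily bundle deep learning acceleration libraries such as CUDA, reproducing your development environment as closely as possible on the user's machine.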

GT-Zhang 545 Dec 28, 2022
Conda package for artifact creation that enables offline environments. Ideal for air-gapped deployments.

Conda-Vendor Conda Vendor is a tool to create local conda channels and manifests for vendored deployments Installation To install with pip, run: pip i

MetroStar - Tech 13 Nov 17, 2022
cuML - RAPIDS Machine Learning Library

cuML - GPU Machine Learning Algorithms cuML is a suite of libraries that implement machine learning algorithms and mathematical primitives functions t

RAPIDS 3.1k Jan 04, 2023
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.

A lightweight, GPU accelerated, SQL engine built on the RAPIDS.ai ecosystem. Get Started on app.blazingsql.com Getting Started | Documentation | Examp

BlazingSQL 1.8k Jan 02, 2023