Intuition for yolort

[1]:
import cv2
import torch

from yolort.utils import cv2_imshow, get_image_from_url, read_image_to_tensor
from yolort.utils.image_utils import color_list, plot_one_box
[2]:
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Read an image

Let’s fetch and pre-process the image to be detected.

[3]:
img_source = "https://huggingface.co/spaces/zhiqwang/assets/resolve/main/bus.jpg"
# img_source = "https://huggingface.co/spaces/zhiqwang/assets/resolve/main/zidane.jpg"
img_raw = get_image_from_url(img_source)
img = read_image_to_tensor(img_raw)
img = img.to(device)

images = [img]
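For intuition, ``read_image_to_tensor`` is assumed here to convert an HWC uint8 image into a float CHW tensor scaled to [0, 1] (the real helper operates on NumPy arrays / torch tensors and also handles the BGR-to-RGB swap). A pure-Python sketch of just the layout and scale conversion, for illustration only:

```python
def hwc_uint8_to_chw_float(image):
    """Convert an HWC uint8 image (nested lists) to CHW floats in [0, 1].

    Illustrative sketch of the layout/scale step that yolort's
    read_image_to_tensor is assumed to perform; channel-order (BGR/RGB)
    handling is omitted here.
    """
    h = len(image)
    w = len(image[0])
    c = len(image[0][0])
    return [
        [[image[y][x][ch] / 255.0 for x in range(w)] for y in range(h)]
        for ch in range(c)
    ]

# A tiny 1x2 three-channel "image"
tiny = [[[0, 128, 255], [255, 0, 64]]]
chw = hwc_uint8_to_chw_float(tiny)  # indexed as chw[channel][row][col]
```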

Model Definition and Initialization

[4]:
score_thresh = 0.55
stride = 64
img_h, img_w = 640, 640
[5]:
from yolort.models import yolov5n6
[6]:
model = yolov5n6(pretrained=True, score_thresh=score_thresh, size=(img_h, img_w))
[7]:
model = model.eval()
model = model.to(device)
[8]:
# Perform inference on an image tensor
model_out = model(images)
/opt/conda/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

Verify the PyTorch backend inference results

[9]:
model_out[0]['boxes'].cpu().detach()
[9]:
tensor([[ 32.27846, 225.15259, 811.47729, 740.91077],
        [ 50.42178, 387.48911, 241.54393, 897.61035],
        [219.03334, 386.14346, 345.77686, 869.02582],
        [678.05023, 374.65341, 809.80341, 874.80621]])

Target outputs of boxes (``yolov5n6``, with shape of 1080x810):

tensor([[ 32.27846, 225.15259, 811.47729, 740.91077],
        [ 50.42178, 387.48911, 241.54393, 897.61035],
        [219.03334, 386.14346, 345.77686, 869.02582],
        [678.05023, 374.65341, 809.80341, 874.80621]])
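The boxes above are in ``(x1, y1, x2, y2)`` pixel coordinates on the original image. A small helper (hypothetical, for intuition) converting a box to the center/size form often used elsewhere in the YOLO family:

```python
def xyxy_to_cxcywh(box):
    """Convert an (x1, y1, x2, y2) box to (cx, cy, w, h).

    Standard center/size conversion; not part of yolort's public API,
    shown only to clarify the coordinate convention of the outputs.
    """
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)

# Second detection from the output above
cx, cy, w, h = xyxy_to_cxcywh([50.42178, 387.48911, 241.54393, 897.61035])
```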
[10]:
model_out[0]['scores'].cpu().detach()
[10]:
tensor([0.88238, 0.84486, 0.72629, 0.70077])

Target outputs of scores (``yolov5n6``, with shape of 1080x810):

tensor([0.88238, 0.84486, 0.72629, 0.70077])
[11]:
model_out[0]['labels'].cpu().detach()
[11]:
tensor([5, 0, 0, 0])

Target outputs of labels:

tensor([5, 0, 0, 0])

Detection output visualisation

First, let’s get the label names of the COCO dataset.

[12]:
# Get label names
import requests

response = requests.get("https://huggingface.co/spaces/zhiqwang/assets/resolve/main/coco.names")
labels = response.text.strip().split('\n')

The labels can also be obtained by

from yolort.utils.image_utils import load_names
labels = load_names('../notebooks/assets/coco.names')
[13]:
colors = color_list()
for box, label in zip(model_out[0]['boxes'].tolist(), model_out[0]['labels'].tolist()):
    img_raw = plot_one_box(box, img_raw, color=colors[label % len(colors)], label=labels[label])
[14]:
cv2_imshow(img_raw, imshow_scale=0.5)
../_images/notebooks_inference-pytorch-export-libtorch_22_0.png

Scripting YOLOv5

[15]:
# TorchScript export
print(f'Starting TorchScript export with torch {torch.__version__}...')
export_script_name = 'yolov5n6.torchscript.pt'  # filename
Starting TorchScript export with torch 1.10.1+cu113...
[16]:
model_script = torch.jit.script(model)
model_script.eval()
model_script = model_script.to(device)
/opt/conda/lib/python3.8/site-packages/torch/jit/_recursive.py:614: LightningDeprecationWarning: The `LightningModule.loaded_optimizer_states_dict` property is deprecated in v1.4 and will be removed in v1.6.
  item = getattr(mod, name, None)
/opt/conda/lib/python3.8/site-packages/torch/jit/_recursive.py:614: LightningDeprecationWarning: The `LightningModule.model_size` property was deprecated in v1.5 and will be removed in v1.7. Please use the `pytorch_lightning.utilities.memory.get_model_size_mb`.
  item = getattr(mod, name, None)
[17]:
# Save the scripted model file for subsequent use (Optional)
model_script.save(export_script_name)
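The saved file can later be reloaded with ``torch.jit.load`` (from Python, or via ``torch::jit::load`` in C++). A minimal round-trip sketch with a toy module, assuming the same pattern applies to the exported ``yolov5n6.torchscript.pt`` above:

```python
import torch

class Doubler(torch.nn.Module):
    """A toy module standing in for the scripted YOLOv5 model."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2

# Script, save, and reload: the same save/load round trip used
# for the exported detection model above.
scripted = torch.jit.script(Doubler())
scripted.save("doubler.torchscript.pt")

reloaded = torch.jit.load("doubler.torchscript.pt")
out = reloaded(torch.tensor([1.0, 2.0]))
```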

Inference on TorchScript Backend

[18]:
out = model(images)
out_script = model_script(images)
/opt/conda/lib/python3.8/site-packages/yolort/models/yolo.py:179: UserWarning: YOLO always returns a (Losses, Detections) tuple in scripting.
  warnings.warn("YOLO always returns a (Losses, Detections) tuple in scripting.")
/opt/conda/lib/python3.8/site-packages/yolort/models/yolo_module.py:142: UserWarning: YOLOv5 always returns a (Losses, Detections) tuple in scripting.
  warnings.warn("YOLOv5 always returns a (Losses, Detections) tuple in scripting.")

Verify the Inference Output on LibTorch backend

[19]:
for k, v in out[0].items():
    torch.testing.assert_allclose(out_script[1][0][k], v, rtol=1e-07, atol=1e-09)

print("Exported model has been tested with libtorch, and the result looks good!")
Exported model has been tested with libtorch, and the result looks good!
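The check above applies an elementwise tolerance rule; in pure Python it amounts to the following (a sketch of the semantics, not torch's actual implementation):

```python
def allclose(a, b, rtol=1e-07, atol=1e-09):
    """Elementwise tolerance rule as used by torch.testing:
    each pair must satisfy |x - y| <= atol + rtol * |y|."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))

# The scores from the two backends should match under this rule
scores_eager = [0.88238, 0.84486, 0.72629, 0.70077]
scores_script = [0.88238, 0.84486, 0.72629, 0.70077]
ok = allclose(scores_eager, scores_script)
```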

Comparing Inference Time between PyTorch and LibTorch

Time consumed on PyTorch backend

[20]:
%%timeit
with torch.no_grad():
    out = model(images)
13.7 ms ± 412 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Time consumed on LibTorch backend

[21]:
# Dummy run once for torchscript
with torch.no_grad():
    out_script = model_script(images)
[22]:
%%timeit
with torch.no_grad():
    out_script = model_script(images)
8.7 ms ± 310 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
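From the two measured means above, the TorchScript backend is roughly 1.6x faster in this particular run (timings will of course vary with hardware and input size):

```python
# Mean per-loop times measured above, in milliseconds
pytorch_ms = 13.7
libtorch_ms = 8.7

speedup = pytorch_ms / libtorch_ms  # ratio of eager to scripted latency
```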

View this document as a notebook: https://github.com/zhiqwang/yolov5-rt-stack/blob/main/notebooks/inference-pytorch-export-libtorch.ipynb