ai-forever/CerberusDet
CerberusDet: Unified Multi-Dataset Object Detection

[Paper][🤗 HuggingFace Models]


The code is based on:

Install

Python>=3.8.0 is required.

$ git clone https://github.com/ai-forever/CerberusDet.git
$ cd CerberusDet
$ pip install -e .

Docker

Run the Docker container:

sudo docker-compose up -d
sudo docker attach cerberusdet_cerber_1

Data

  • Use the script voc.py to download the VOC dataset

For information about the VOC dataset and its creators, visit the PASCAL VOC dataset website.

  • Use the script objects365_part.py to download a subset of the Objects365 dataset with 19 animal categories:
['Monkey', 'Rabbit', 'Yak', 'Antelope', 'Pig',  'Bear', 'Deer', 'Giraffe', 'Zebra', 'Elephant',
'Lion', 'Donkey', 'Camel', 'Jellyfish', 'Other Fish', 'Dolphin', 'Crab', 'Seal', 'Goldfish']

Along with an Objects365 subset with 12 tableware categories:

  [ 'Cup', 'Plate', 'Wine Glass', 'Pot', 'Knife', 'Fork', 'Spoon', 'Chopsticks',
    'Cutting/chopping Board', 'Tea pot', 'Kettle', 'Tong']

To download the full Objects365 dataset, set DOWNLOAD_SUBSETS = False in the script objects365_part.py.

The Objects365 dataset is available for academic purposes only. For information about the dataset and its creators, visit the Objects365 dataset website.

Train

  • Download YOLOv8 weights pretrained on COCO
  • Run the training process with 1 GPU:
$ python3 cerberusdet/train.py \
--img 640 --batch 32 \
--data data/voc_obj365.yaml \
--weights pretrained/yolov8x_state_dict.pt \
--cfg cerberusdet/models/yolov8x_voc_obj365.yaml \
--hyp data/hyps/hyp.cerber-voc_obj365.yaml \
--name voc_obj365_v8x --device 0
  • OR run the training process with several GPUs:
$ CUDA_VISIBLE_DEVICES="0,1,2,3" \
python -m torch.distributed.launch --nproc_per_node 4 cerberusdet/train.py \
--img 640 --batch 32 \
--data data/voc_obj365.yaml \
--weights pretrained/yolov8x_state_dict.pt \
--cfg cerberusdet/models/yolov8x_voc_obj365.yaml \
--hyp data/hyps/hyp.cerber-voc_obj365.yaml \
--name voc_obj365_v8x \
--sync-bn

By default, logging is done with TensorBoard, but you can use MLflow by setting --mlflow-url, e.g. --mlflow-url localhost.

CerberusDet model config details

Example of the model's config for 2 tasks: yolov8x_voc_obj365.yaml

  • The model config is based on the YOLO configs, except that the head is split into two sections (neck and head)
  • The layers of the neck section can be shared between tasks or be unique
  • The head section defines the head architecture for all tasks, but each task always has its own unique head parameters
  • The from parameter of the first neck layer must be a positive ordinal number specifying the layer, counted from the beginning of the entire architecture, from which to take features.
  • The cerber section is optional and defines which neck layers are shared among tasks. If it is not specified, all neck layers are shared among tasks, and only the heads are unique.
  • The CerberusDet configuration is constructed as follows:
    cerber: List[OneBranchConfig], where
      OneBranchConfig = List[cerber_layer_number, SharedTasksConfig], where
          cerber_layer_number - the layer number (counting from the end of the backbone) after which branching should occur
          SharedTasksConfig = List[OneBranchGroupedTasks], where
                OneBranchGroupedTasks = [number_of_task1_head, number_of_task2_head, ...] - the task head numbers (essentially task IDs) that should be in the same branch and share layers thereafter

    The head numbers will correspond to tasks according to the sequence in which they are listed in the data configuration.

    Example for YOLOv8x:
    [[2, [[15], [13, 14]]], [6, [[13], [14]]]] - a configuration for 3 tasks. Task id=15 gets its own task-specific layers starting from the 3rd. Tasks id=13 and id=14 share layers 3-6; after the 6th layer, each continues in its own separate branch with all remaining layers.
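The branching rules encoded by such a config can be made concrete with a short sketch. This is plain illustrative Python, not part of the CerberusDet codebase; the `describe_cerber_config` helper is hypothetical:

```python
# Illustrative sketch (not the CerberusDet API): walk a `cerber` config and
# list which task heads branch off or keep sharing layers at each split point.
# Format per the README: List[[layer_number, [[task_head_ids, ...], ...]], ...]

def describe_cerber_config(cerber):
    """Return human-readable branching rules from a cerber config."""
    rules = []
    for layer_number, groups in cerber:
        for group in groups:
            if len(group) == 1:
                rules.append(
                    f"after layer {layer_number}: task head {group[0]} "
                    f"continues in its own branch"
                )
            else:
                shared = ", ".join(str(t) for t in group)
                rules.append(
                    f"after layer {layer_number}: task heads {shared} "
                    f"keep sharing the following layers"
                )
    return rules

# The 3-task example from above: task 15 branches off after layer 2;
# tasks 13 and 14 stay together until layer 6, then split.
config = [[2, [[15], [13, 14]]], [6, [[13], [14]]]]
for rule in describe_cerber_config(config):
    print(rule)
```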

Evaluation

Inference

You can run inference using either the provided bash script or directly via the Python API.

1. Using Bash Script

First, download the CerberusDet checkpoint trained on VOC and parts of the Objects365 dataset (see the Pretrained Checkpoints section below).

Then, run the detection script:

./bash_scripts/detect.sh

2. Using Python API

You can also integrate CerberusDet into your own code. Below is an example of how to initialize the model, preprocess images, and visualize the results.

import cv2
from cerberusdet.cerberusdet_inference import CerberusDetInference, CerberusVisualizer
from cerberusdet.cerberusdet_preprocessor import CerberusPreprocessor

# 1. Configuration
weights_path = 'weights/voc_obj365_v8x_best.pt'
img_path = 'data/images/bus.jpg'
device = 'cuda:0'

# 2. Initialize model, preprocessor, and visualizer
inferencer = CerberusDetInference(
    weights=weights_path,
    device=device,
    conf_thres=0.3,
    iou_thres=0.45,
    half=True
)

# Note: Pass the model's stride to the preprocessor
preprocessor = CerberusPreprocessor(
    img_size=640,
    stride=inferencer.stride,
    half=inferencer.half,
    auto=True
)

visualizer = CerberusVisualizer(line_thickness=2, text_scale=0.5)

# 3. Load images
# The preprocessor expects a list of numpy arrays (BGR)
images = [cv2.imread(img_path)]
original_shapes = [img.shape[:2] for img in images]

# 4. Run inference
img_tensor = preprocessor.preprocess(images, device=inferencer.device)
detections = inferencer.predict(img_tensor, original_shape=original_shapes)

# Visualization
res_image = visualizer.draw_detections(
    images[0],
    detections[0],
    hide_task=False,  # Show task name (VOC, O365, etc.)
    hide_conf=False   # Show confidence score
)

# 5. Output / Save results
print(f"Found objects: {len(detections[0])}")
for det in detections[0]:
    print(f"{det['label_name']} ({det['score']:.2f}) - Task: {det['task']}")

cv2.imshow("CerberusDet Result", res_image)
cv2.imwrite("result.jpg", res_image)

cv2.waitKey(0)
cv2.destroyAllWindows()
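The stride passed to the preprocessor matters because YOLO-style models require input sides divisible by the model stride. The following is a minimal sketch of that size computation under standard letterbox assumptions; it is illustrative only and not the actual CerberusPreprocessor implementation:

```python
import math

def letterbox_shape(h, w, img_size=640, stride=32):
    """Compute the scaled, stride-padded (h, w) of a YOLO-style letterbox.

    The image is scaled to fit within img_size while keeping its aspect
    ratio, then each side is padded up to the nearest multiple of `stride`.
    """
    r = min(img_size / h, img_size / w)          # scale ratio
    new_h, new_w = round(h * r), round(w * r)    # resized (unpadded) shape
    pad_h = math.ceil(new_h / stride) * stride   # pad up to stride multiple
    pad_w = math.ceil(new_w / stride) * stride
    return pad_h, pad_w

# A 1080x810 image scaled so its long side is 640, padded to stride 32:
print(letterbox_shape(1080, 810))  # -> (640, 480)
```

Both returned sides are guaranteed to be multiples of the stride, which is why the stride from the loaded checkpoint (rather than a hard-coded value) should be forwarded to the preprocessor.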

NOTE: To run inference using standard YOLOv8 checkpoints, use the cerberusdet.yolo_wrapper.YOLOV8ForObjectDetection class. Please ensure the following requirements are met:

pip install ultralytics==8.1.0 torch==2.5.1

Tip: Class names for specific datasets can be found in the corresponding YAML configuration files located in the data/ directory.
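For instance, the names can be read with PyYAML. The fragment below is a hypothetical YAML snippet in the style of such configs; the actual files under data/ may use different keys or layout:

```python
import yaml  # pip install pyyaml

# Hypothetical fragment mimicking a data/*.yaml config (illustration only).
yaml_text = """
nc: 3
names: ['aeroplane', 'bicycle', 'bird']
"""

cfg = yaml.safe_load(yaml_text)
class_names = cfg["names"]
print(class_names)  # -> ['aeroplane', 'bicycle', 'bird']
```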

Example using the VOC_07_12_best_state_dict.pt checkpoint:
import torch
from PIL import Image
from cerberusdet.yolo_wrapper import YOLOV8ForObjectDetection, YoloV8Config

image_path = 'images/image1.png'
model_path = 'weights/VOC_07_12_best_state_dict.pt'

# 1. Load model
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Define class names (or load them from data/voc.yaml)
voc_class_names = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
                   'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
                   'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

model_config = YoloV8Config(
      num_classes=len(voc_class_names),
      names={str(i): voc_class_names[i] for i in range(len(voc_class_names))},
)

model = YOLOV8ForObjectDetection(config=model_config).from_pretrained(
      pretrained_model_path=model_path, device=device, dtype=torch_dtype
)

# 2. Perform inference
image = Image.open(image_path)
results = model.predict(image, conf=0.4, iou=0.7, half=torch_dtype is torch.float16, return_dict=True)

# 3. Print results
for result in results:
      boxes = result.boxes
      print("Found objects:", [(result.names[int(c)], f"{float(conf):.2f}") for c, conf in zip(boxes.cls, boxes.conf)])

Pretrained Checkpoints

| Model | Train set | size (pixels) | mAPval 50-95 | mAPval 50 | Speed V100 b32, fp16 (ms) | params (M) | FLOPs @640 (B) |
|-------|-----------|---------------|--------------|-----------|---------------------------|------------|----------------|
| YOLOv8x | VOC | 640 | 0.758 | 0.916 | 5.6 | 68 | 257.5 |
| YOLOv8x | Objects365_animals | 640 | 0.43 | 0.548 | 5.6 | 68 | 257.5 |
| YOLOv8x | Objects365_tableware | 640 | 0.56 | 0.68 | 5.6 | 68 | 257.5 |
| YOLOv8x | Objects365_full | 640 | 0.291 | 0.381 | 5.6 | 70 | 267.0 |
| CerberusDet_v8x | VOC, Objects365_animals | 640 | 0.751, 0.432 | 0.918, 0.556 | 7.2 | 105 | 381.3 |
| CerberusDet_v8x | VOC, Objects365_animals, Objects365_tableware | 640 | 0.762, 0.421, 0.56 | 0.927, 0.541, 0.68 | 10 | 142 | 505.1 |
| CerberusDet_v8x | VOC, Objects365_full | 640 | 0.767, 0.355 | 0.932, 0.464 | 7.2 | 107 | 390.8 |

YOLOv8x models were trained with the commit: https://github.com/ultralytics/ultralytics/tree/2bc36d97ce7f0bdc0018a783ba56d3de7f0c0518

Hyperparameter Evolution

See the launch example in bash_scripts/evolve.sh.

Notes
  • To evolve hyperparameters specific to each task, specify initial parameters separately per task and append --evolve_per_task
  • To evolve a specific set of hyperparameters, specify their names separated by commas via the --params_to_evolve argument, e.g. --params_to_evolve 'box,cls,dfl'
  • Use absolute paths to configs.
  • Specify the search algorithm via --evolver. You can use the search algorithms of the Ray library (see available values in predefined_evolvers.py) or 'yolov5'

License

CerberusDet is released under the GNU AGPL v.3 license.

See the file LICENSE for more details.

Citing

If you use our models, code or dataset, we kindly request that you cite our paper and give the repository a ⭐

@article{cerberusdet,
   Author = {Irina Tolstykh and Michael Chernyshov and Maksim Kuprashevich},
   Title = {CerberusDet: Unified Multi-Dataset Object Detection},
   Year = {2024},
   Eprint = {arXiv:2407.12632},
}
