deepvisiontools.config

class Configuration[source]

Configuration class for deepvisiontools. It is a singleton: it can be instantiated once, and every later instantiation points to the same object. It stores all configuration information for the library. To change a parameter later on, simply modify the corresponding attribute.

Parameters:

device (Literal["cpu", "cuda"], optional) – Device to be used by default when creating objects, running models etc. Defaults to “cpu”.
data_type (Literal["instance_mask", "bbox", "keypoint", "semantic_mask"], optional) – Default format to use in dataset, models, prediction etc. Defaults to “bbox”.
num_classes (int, optional) – Number of classes in model. Defaults to 1.
mask_min_size (int, optional) – Minimum mask size to be considered: annotations below this threshold are ignored. Defaults to 15.
semantic_mask_logits_combination (Literal["avg", "min", "max"]) – How to combine logits in patchification or adding semantic masks. avg takes the mean, min takes the minimum and max takes the maximum.
splitted_mask_handling (bool, optional) – If set to True, masks that have been split (after cropping, for example) are redefined so that each part belongs to an independent object. Defaults to False.
model_nms_threshold (float, optional) – Default nms iou threshold used in models. Defaults to 0.45.
model_confidence_threshold (float, optional) – Default model confidence threshold to consider it a valid object prediction. Defaults to 0.5.
model_max_detection (int, optional) – Maximum number of objects output by a model (useful for some models, such as YOLO-type models). Defaults to 300.
metrics_matcher_type (Literal["bbox", "instance_mask"], optional) – Object matcher data type used in metrics (note that instance_mask is slower because it requires a transfer to cpu to save gpu memory). If data_type is instance_mask and the matcher type is bbox, the data is converted to bbox for matching in metrics. Defaults to “bbox”.
metrics_match_iou_threshold (float, optional) – Metrics matcher iou threshold. Defaults to 0.45.
patchifier_mode (Literal["bbox", "instance_mask"], optional) – Patchifier data type used for nms and duplicate suppression. If data_type is a mask type and patchifier_mode is bbox, the data is converted accordingly for these operations. Defaults to “bbox”.
seed (Union[False, int], optional) – Manual seed used to enforce reproducibility (you probably also want to switch deterministic to True in that case). If False, no manual seed is used and runs are not reproducible. Defaults to False.
deterministic (bool, optional) – Use deterministic algorithms. Further helps reproducibility (see also seed). Be careful: some models cannot be deterministic, so you sometimes need to switch it to False even if you are seeding manually. Defaults to False.

Example:

>>> from deepvisiontools import Configuration
>>> config = Configuration(data_type = "instance_mask") # can instantiate with given parameter
>>> config.device = "cuda"  # can modify parameters by modifying attributes / properties

Attributes

data_type (Literal["instance_mask", "bbox", "keypoint", "semantic_mask"], optional): Default format to use in dataset, models, prediction etc. Defaults to “bbox”.
num_classes (int): Number of classes in model. Defaults to 1.
mask_min_size (int): Minimum mask size to be considered: annotations below this threshold are ignored. Defaults to 15.
semantic_mask_logits_combination (Literal["avg", "min", "max"]): How to combine logits in patchification or adding semantic masks. avg takes the mean, min takes the minimum and max takes the maximum.
splitted_mask_handling (bool): If set to True, masks that have been split (after cropping, for example) are redefined so that each part belongs to an independent object. Defaults to False.
model_nms_threshold (float): Default nms iou threshold used in models. Defaults to 0.45.
model_confidence_threshold (float): Default model confidence threshold to consider it a valid object prediction. Defaults to 0.5.
model_max_detection (int): Maximum number of objects output by a model (useful for some models, such as YOLO-type models). Defaults to 300.
metrics_matcher_type (Literal["bbox", "instance_mask"]): Object matcher data type used in metrics. If data_type is instance_mask and the matcher type is bbox, the data is converted to bbox for matching in metrics. Defaults to “bbox”.
metrics_match_iou_threshold (float): Metrics matcher iou threshold. Defaults to 0.45.
patchifier_mode (Literal["bbox", "instance_mask"]): Patchifier data type used for nms and duplicate suppression. If data_type is a mask type and patchifier_mode is bbox, the data is converted accordingly for these operations. Defaults to “bbox”.

Properties

device (Literal["cpu", "cuda"]): Device to be used by default when creating objects, running models etc. Defaults to “cpu”.
seed (Union[False, int], optional): Manual seed used to enforce reproducibility (you probably also want to switch deterministic to True in that case). If False, no manual seed is used and runs are not reproducible. Defaults to False.
deterministic (bool, optional): Use deterministic algorithms. Further helps reproducibility (see also seed). Be careful: some models cannot be deterministic, so you sometimes need to switch it to False even if you are seeding manually. Defaults to False.

Notes

1) If you use instance_mask, bboxes are included when needed, derived from the masks.
2) You can change model_nms_threshold and model_confidence_threshold for the entire library by modifying the attributes.
3) In instance mode, by default the target Format removes small objects whose masks contain fewer pixels than mask_min_size (see the attribute above for its default). Change the attribute to modify this behaviour.
4) The option splitted_mask_handling is False by default. If you set it to True, then whenever a transformation splits an object mask into discontinuous sub-masks, the library creates a new object for every sub-mask. Otherwise the sub-masks still describe the same unique object.
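
The singleton behaviour described above can be illustrated without the library itself. The sketch below is a stand-alone illustration, not the deepvisiontools implementation: `SketchConfig` and its two fields are hypothetical names chosen for the example.

```python
class SketchConfig:
    """Hypothetical stand-in showing the singleton idea (not the library's code)."""

    _instance = None

    def __new__(cls, **kwargs):
        # Create the single instance on first use, then always hand it back.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.device = "cpu"      # illustrative defaults
            cls._instance.data_type = "bbox"
        return cls._instance

    def __init__(self, **kwargs):
        # Keyword arguments overwrite attributes on the shared instance.
        for name, value in kwargs.items():
            setattr(self, name, value)


first = SketchConfig(data_type="instance_mask")
second = SketchConfig()   # same object as `first`
second.device = "cuda"    # the change is visible through `first` as well
```

Because both names point to the same object, a change made anywhere in a program is seen everywhere, which is why modifying an attribute is enough to reconfigure the library.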

deepvisiontools.data

class AbstractBatchAugmenter[source]: Abstract class for augmentation within the DataLoader (combines elements of a batch together, as in mosaic-type augmentation). Note: these augmentations always come after the normal augmentations, which are implemented in the Dataset rather than in the DataLoader.

class Augmentation[source]

Class that handles augmentation in dataset. Calls the methods specific to each Format (data_type).

Parameters:: augmentations (List[Transform]) – List of torchvision.transforms.v2 Transform classes (or from deepvisiontools.data.additional_augmentations).

class DeepVisionDataset[source]

Detection dataset class for deepvisiontools : load and return image, annotation, image name.

Parameters:

dataset_path (Union[str, Path]) – path to dataset folder.
reader (BaseReader, optional) – Class to read data from dataset folder. Defaults to CocoReader.
preprocessing (Callable, optional) – Preprocessing images (normalization). Defaults to build_preprocessing().
augmentation (List[Transform], optional) – Augmentation to apply to images / annotations. Must be from torchvision.transforms.v2.Transform. Defaults to None.
label_converter (Dict[int, int], optional) – Convert labels to another value, e.g. {0: 2, 1: 5}. Defaults to None.
category_ids (Union[Dict[int, str], None])

Example:

>>> from deepvisiontools import DeepVisionDataset
>>> data_path = "path/to/data"
>>> dataset = DeepVisionDataset(data_path)
>>> image, target, image_name = dataset[1]
>>> print(type(image), type(target), type(image_name))
<class 'torch.Tensor' >, <class 'BboxFormat' >, <class 'str'>
>>> print(image.shape, target.size, image_name)
torch.Size([3,512,512]), 5, 'img_01.png'

Attributes

dataset_path (Path): path to dataset folder.
reader (BaseReader): Class to read data from dataset folder. Defaults to CocoReader.
preprocessing (Callable): Preprocessing images (normalization). Defaults to build_preprocessing().
augmentation (List[Transform]): Augmentation to apply to images / annotations. Must be from torchvision.transforms.v2.Transform. Defaults to None.
category_ids (Dict[int, str]): Dict that associates a name with a category label index. Defaults to self.reader.category_ids.
label_converter (Dict[int, int]): Convert labels to another value, e.g. {0: 2, 1: 5}. Defaults to None.

Methods:

export_dataset(destination_folder, number_visu='all', file_extension='')[source]

Export the dataset according to the BaseReader class. For example, CocoReader exports in the following structure: Dataset Name -> Image_dir, coco_annotations.json

Parameters:

destination_folder (Union[str, Path]) – Path to new dataset folder.
number_visu (Union[Literal["all"], int], optional) – number of visualizations to create. If “all”, all of them are created. Defaults to “all”.
file_extension (str, optional) – specific file extension, if one is required. If “”, the BaseReader’s extension is used. Defaults to “”.

keep_indexes(indexes)[source]

Filter dataset by keeping only indices given in arg.

Parameters:: indexes (Union[list, slice, Tensor]) – can be slice, Tensor or list. To use slice please use : slice(i, j) with i, j desired slice indexes in arg.
Return type:: DeepVisionDataset

split(sequence)[source]

Split the dataset into 3 new datasets according to the given proportions.

Parameters:: sequence (Sequence[float, float, float]) – proportions to split the dataset into. Sum must be 1.
Return type:: Tuple[DeepVisionDataset, DeepVisionDataset, DeepVisionDataset]

Example:

>>> dataset = DeepVisionDataset("path/to/dataset")
>>> train_dataset, valid_dataset, test_dataset = dataset.split((0.6, 0.2, 0.2))
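
The arithmetic behind a three-way split can be sketched in plain Python. This is only an illustration under stated assumptions: `split_indices` is a hypothetical helper, it cuts the indices in order without shuffling, and the documentation does not say whether the library shuffles or how it rounds.

```python
from typing import List, Sequence, Tuple


def split_indices(
    n_items: int, proportions: Sequence[float]
) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical helper: cut range(n_items) into three consecutive chunks."""
    if len(proportions) != 3:
        raise ValueError("exactly three proportions are expected")
    if abs(sum(proportions) - 1.0) > 1e-6:
        raise ValueError("proportions must sum to 1")
    first_end = round(n_items * proportions[0])
    second_end = first_end + round(n_items * proportions[1])
    indices = list(range(n_items))
    # The last chunk takes whatever remains, so no sample is lost to rounding.
    return indices[:first_end], indices[first_end:second_end], indices[second_end:]


train_idx, valid_idx, test_idx = split_indices(10, (0.6, 0.2, 0.2))
```

Index lists of this kind are the sort of argument that keep_indexes accepts, so a split can be thought of as three filtered copies of the same dataset.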

class DeepVisionLoader[source]

Child class of DataLoader that batches images and BaseFormats. DeepVisionLoader supports any feature of torch DataLoaders (Sampler, etc.).

Parameters:

*args
**kwargs

Example:

>>> from deepvisiontools import DeepVisionLoader
>>> loader = DeepVisionLoader(dataset, batch_size=2)
>>> for batch in loader:
...     img, target, img_name = batch

Methods:

collate_fn(batch)[source]

Parameters:

batch (List[Tuple[Tensor, BaseFormat]]) – List of pairs image/target.

Returns:

Batch images (N, 3, H, W).
BaseFormats wrapped into BatchedFormats class.

Return type:

Tuple[Tensor, BatchedFormats]

pad_to_larger(images, targets)[source]

Pad images and targets to larger image size.

Parameters:

images (List[Tensor]) – Images.
targets (List[BaseFormat]) – Targets.

Return type:

Tuple[List[Tensor], List[BaseFormat]]
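
Padding a batch to a common size can be sketched as follows. This is a simplified illustration on nested lists rather than tensors; `pad_to_common_size` is a hypothetical name, and placing the padding at the bottom and right is an assumption, since the documentation does not state where the library adds it.

```python
from typing import List, Tuple

Image = List[List[int]]  # a single-channel image as rows of pixel values


def pad_to_common_size(
    images: List[Image], fill: int = 0
) -> Tuple[List[Image], Tuple[int, int]]:
    """Pad every image to the largest height and width found in the batch."""
    target_h = max(len(img) for img in images)
    target_w = max(len(img[0]) for img in images)
    padded = []
    for img in images:
        # Extend each row to the right, then append full rows at the bottom.
        rows = [row + [fill] * (target_w - len(row)) for row in img]
        rows += [[fill] * target_w for _ in range(target_h - len(img))]
        padded.append(rows)
    return padded, (target_h, target_w)


small = [[1, 1], [1, 1]]   # 2 x 2
wide = [[2, 2, 2]]         # 1 x 3
batch, size = pad_to_common_size([small, wide])
```

Once every image shares the same (H, W), the batch can be stacked into a single (N, 3, H, W) tensor as described for collate_fn.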

visualize(dir_path)[source]

Generate visualization through DeepVisionLoader. Can be useful to test batch_augmenter effect.

Parameters:: dir_path (str | Path)

class MosaicBatchAugmenter[source]

This batch augmentation generates a mosaic containing n images from a batch (it mixes patches of images / targets into one image). If the number of images to mix is larger than the batch size, it shifts to a smaller possibility (e.g. n = 4, batch_size = 3 -> n becomes 2). If the number of images to mix is smaller than the batch size, new mosaics are created where possible: e.g. with batch_size = 5 and n = 2, 2 mosaics are generated from the first 2 images, then 2 more from the next 2 images, and 1 image remains. The remaining images are left untouched.

Parameters:

mixed_img_numb (Literal[1, 2, 4, 6, 8, 9, 12], optional) – Number of images per mosaic. Defaults to 2.
probability (float, optional) – Defaults to 0.5.
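
The grouping rule described for MosaicBatchAugmenter can be sketched as index arithmetic. This is an interpretation of the description, not the library's code: `plan_mosaics` is a hypothetical helper, and it only decides which images are mixed together, not how the mosaic pixels are assembled.

```python
from typing import List, Tuple

ALLOWED_SIZES = (1, 2, 4, 6, 8, 9, 12)  # values listed for mixed_img_numb


def plan_mosaics(
    batch_size: int, mixed_img_numb: int
) -> Tuple[int, List[List[int]], List[int]]:
    """Return (effective n, groups of image indices, untouched indices)."""
    # Shift to the largest allowed size that fits in the batch,
    # e.g. n = 4 with batch_size = 3 becomes n = 2.
    limit = min(mixed_img_numb, batch_size)
    n = max(size for size in ALLOWED_SIZES if size <= limit)
    n_groups = batch_size // n
    groups = [list(range(g * n, (g + 1) * n)) for g in range(n_groups)]
    # Images that do not fill a complete group are left untouched.
    leftover = list(range(n_groups * n, batch_size))
    return n, groups, leftover


shrunk = plan_mosaics(batch_size=3, mixed_img_numb=4)
paired = plan_mosaics(batch_size=5, mixed_img_numb=2)
```

With batch_size = 5 and n = 2, the plan contains two groups of two images and one untouched image, matching the example in the class description.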

class RandomCenterCropAndResize[source]

With a given probability, apply CenterCrop and Resize from torchvision.transforms.v2. NB: the resize is applied only, and always, when the crop is applied.

Parameters:

crop (Union[int, Sequence[int]]) – Size to crop
resize (Union[int, Sequence[int]]) – Size to resize
p (float, optional) – probability. Defaults to 0.5.

class RandomChangeBackground[source]

With a given probability p, swap the image background. The new background is taken from an image folder whose path is provided.

Note 1: it is implemented only for the instance_mask, semantic_mask and bbox data types.
Note 2: the new background image type must be one of .jpg, .jpeg, .png, .tif, .tiff, .PNG, .JPG, .JPEG, .TIF, .TIFF.

Parameters:

background_dir_path (Union[str, Path]) – Path to background folder.
p (float, optional) – Probability. Defaults to 0.5.

class RandomCropAndResize[source]

With a given probability, apply RandomCrop and Resize from torchvision.transforms.v2. NB: the resize is applied only, and always, when the crop is applied.

Parameters:

crop (Union[int, Sequence[int]]) – Size to crop
resize (Union[int, Sequence[int]]) – Size to resize
p (float, optional) – probability. Defaults to 0.5.

class RandomPadAndResize[source]

With a given probability, apply Pad and Resize from torchvision.transforms.v2. This looks like a zoom-out effect, since it decreases the spatial resolution. NB: the resize is applied only, and always, when the padding is applied.

Parameters:

maxpad (Union[int, Sequence[int]]) – maximum padding bounds; can be an int for a padding bound common to all borders, or a sequence of 4 ints for (t, l, b, r).
resize (Tuple[int, int]) – Size to resize
p (float, optional) – probability to apply transformation. Defaults to 0.5.

deepvisiontools.data.data_reader

class BaseReader[source]: Base class for readers. __len__ and __getitem__ methods must be implemented in concrete class. Your concrete class must implement concrete category_id property that returns Dict[int, str] where int is label and str category name. Your concrete class must have a class attribute describing annotation file type (“json” for json file, “png” for image etc.) You must implement export_annotation and group_export methods in concrete classes See CocoReader class for concrete implementation

class CocoReader[source]

Child class of BaseReader. Coco format reader class. Handles dataset with structure:

Dataset Name -> Image_dir, coco_annotations.json

Note : bboxes must be in XYWH format

Parameters:: annotation_path (Union[str, Path]) – path to json file or to dataset directory.

Attributes

annot_dict (Dict[Any, Any]): coco dict loaded.

Properties

category_ids (Dict[int, str]): label / category correspondence

Methods

export_annotation(image_name, image, format, categories)[source]

From the image, image name, categories and target (as BaseFormat), return a writeable coco dict.

Parameters:

image_name (str)
image (Tensor)
format (BaseFormat)
categories (Dict[int, str]) – Dict of label / category name correspondence

Returns:

image_name, coco dict

Return type:

Tuple[str, Dict[Any, Any]]

get_img_anns(index)[source]

Return, for a given image index, the image name, the spatial size as Tuple[int, int] (h, w) and all annotations for that image.

Parameters:: index (int)
Returns:: img_name, spatial_size, list of coco anns
Return type:: Tuple[str, Tuple[int, int], List[dict]]

rleToMask(rle, shape)[source]

convert rle encoding to binary mask

Parameters:

rle (str)
shape (Tuple[int, int])

Returns:

mask

Return type:

np.ndarray

segment2mask(ann, spatial_size)[source]

Convert segment to object mask.

Parameters:

ann (Dict[Any, Any]) – coco format annotation dict
spatial_size (Tuple[int, int]) – size of image as (h, w)

Returns:

mask

Return type:

np.ndarray

deepvisiontools.formats

class BaseData[source]

Abstract class for base data.

abstract apply_augmentation(image, transform)[source]

Needs to be defined in the concrete class: applies the augmentation to the data.

Parameters:

image (Tensor) – image to augment
transform (Transform) – torchvision transform v2 augmentation

Returns:

transformed BaseData, present tensor, transformed image

Return type:

Tuple[BaseData, Tensor, Tensor]

class BaseFormat[source]

Base class that wraps BaseData (masks, boxes and other elements) of targets / predictions together with labels and scores in deepvisiontools.

Parameters:

data (BaseData)
labels (Tensor)
scores (Union[Tensor, None], optional) – Defaults to None.

Properties

device (Literal["cpu", "cuda"]): When changed, move data, labels and scores stored into same device.
data (BaseData): value of data like InstanceMaskData, BboxData etc.
scores (Union[Tensor, None]): scores as a 1d tensor.
labels (Tensor): labels as a 1d tensor.
nb_object (int): number of objects
canvas_size (Tuple[int, int]): Size of associated image (h, w)

Methods:

apply_augmentation(image, transform)[source]

Apply augmentation. Handles labels as well and image.

Parameters:

form (BaseFormat)
image (Tensor)
transform (Transform)

Returns:

augmented format, present Tensor, augmented image

Return type:

Tuple[BaseFormat, Tensor, Tensor]

sanitize()[source]

Sanitize the format.

Returns:: sanitized Format, indices of present objects
Return type:: Tuple[BaseFormat, Tensor]

class BatchedFormat[source]

A class that handles a list of Formats

Parameters:: formats (List[BaseFormat])

Properties

device (Literal["cpu", "cuda"]): When changed, move all formats into same device.
formats (List[BaseFormat]): contains all stored formats.
size (int): number of formats

Methods

classmethod cat(batches)[source]

Concatenate batches. The batches must be a list of BatchedFormat of the same type.

Parameters:: batches (List[BatchedFormat])

sanitize()[source]: Apply sanitize to all formats

class BboxData[source]

Bounding box data class (child of BaseData)

Parameters:

bbox (Union[BoundingBoxes, Tensor]) – tensor value of bounding box. Shape must be [N, 4]
format (Literal["XYXY", "XYWH", "CXCYWH"], optional) – format of created BoundingBox. Defaults to “XYXY”.
canvas_size (Tuple[int, int], optional) – Size of associated image [h, w]. Defaults to None.

Properties

device (Literal["cpu", "cuda"])
value (BoundingBoxes): Tensor value of bounding box
format (Literal["XYXY", "XYWH", "CXCYWH"]): if changed directly, value is automatically re-derived
nb_object (int): number of objects contained.
canvas_size (Tuple[int, int])
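
The three box formats differ only in how the same rectangle is written down. The sketch below shows the conversions on plain tuples, using the usual meaning of the format names (XYXY: two corners; XYWH: top-left corner plus size; CXCYWH: centre plus size). The helper names are hypothetical and the code does not use the library.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def to_xyxy(box: Box, fmt: str) -> Box:
    """Convert a box written in `fmt` to corner coordinates (x1, y1, x2, y2)."""
    a, b, c, d = box
    if fmt == "XYXY":
        return (a, b, c, d)
    if fmt == "XYWH":
        return (a, b, a + c, b + d)
    if fmt == "CXCYWH":
        return (a - c / 2, b - d / 2, a + c / 2, b + d / 2)
    raise ValueError(f"unknown format: {fmt}")


def from_xyxy(box: Box, fmt: str) -> Box:
    """Convert corner coordinates (x1, y1, x2, y2) to the requested format."""
    x1, y1, x2, y2 = box
    if fmt == "XYXY":
        return (x1, y1, x2, y2)
    if fmt == "XYWH":
        return (x1, y1, x2 - x1, y2 - y1)
    if fmt == "CXCYWH":
        return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)
    raise ValueError(f"unknown format: {fmt}")


def convert(box: Box, src: str, dst: str) -> Box:
    # Going through XYXY keeps the number of conversion rules small.
    return from_xyxy(to_xyxy(box, src), dst)


corners = convert((10, 20, 30, 40), "XYWH", "XYXY")
centred = convert((10, 20, 40, 60), "XYXY", "CXCYWH")
```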

Methods:

apply_augmentation(image, transform)[source]

Apply transform on self and associated image

Parameters:

image (Tensor)
transform (Transform)

Returns:

augmented data, present Tensor, image

Return type:

Tuple[BboxData, Tensor, Tensor]

crop(t, l, h, w)[source]

Crop the BboxData object and update values, canvas etc. Note : forcing XYXY format to be compatible with torchvision func but restore format after.

Parameters:

t (int) – top coordinate of crop
l (int) – left coordinate of crop
h (int) – height value of crop
w (int) – width value of crop

Return type:

Tuple[BboxData, Tensor]
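
What a crop does to boxes, and why it returns a tensor of present objects, can be sketched on plain tuples. This is an illustration under assumptions: boxes are in XYXY, `crop_boxes` is a hypothetical helper, and boxes that end up with zero width or height are dropped at once (in the library this removal is described under sanitize).

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def crop_boxes(
    boxes: List[Box], t: int, l: int, h: int, w: int
) -> Tuple[List[Box], List[int]]:
    """Return the boxes expressed in the crop's coordinates and the kept indices."""
    kept, present = [], []
    for idx, (x1, y1, x2, y2) in enumerate(boxes):
        # Shift into the crop's coordinate system, then clamp to its bounds.
        nx1 = min(max(x1 - l, 0), w)
        ny1 = min(max(y1 - t, 0), h)
        nx2 = min(max(x2 - l, 0), w)
        ny2 = min(max(y2 - t, 0), h)
        if nx2 > nx1 and ny2 > ny1:  # discard boxes that fall outside the crop
            kept.append((nx1, ny1, nx2, ny2))
            present.append(idx)
    return kept, present


boxes = [(10, 10, 50, 50), (200, 200, 220, 220)]
cropped, present = crop_boxes(boxes, t=20, l=30, h=100, w=100)
```

The list of present indices is what allows labels and scores to be filtered consistently with the boxes.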

classmethod empty(canvas_size)[source]

Return an empty BboxData with value = Tensor of shape [0, 4]

Parameters:: canvas_size (Tuple[int, int]) – size of associated image.
Returns:: empty BboxData
Return type:: BboxData

classmethod from_mask(mask)[source]

Generate BboxData object from mask

Parameters:: mask (Union[InstanceMaskData, Tensor])
Returns:: BboxData
Return type:: BboxData

pad(t, l, r, b)[source]

Pad the BboxData object and update values, canvas etc. Note : forcing XYXY format to be compatible with torchvision func but restore format after.

Parameters:

t (int) – top padding
l (int) – left padding
r (int) – right padding
b (int) – bottom padding

Return type:

Tuple[BboxData, Tensor]

sanitize()[source]

Remove boxes with width or height = 0. Also returns a tensor of kept objects (useful for label handling in BaseFormat).

Return type:: Tuple[BboxData | Tensor]

class BboxFormat[source]

Class for Bounding box format (child class of BaseFormat). Contains a BboxData value, labels and scores.

Parameters:

data (BboxData)
labels (Tensor)
scores (Tensor | None, optional)

Properties & attributes : cf BaseFormat

Methods

classmethod empty(canvas_size)[source]

Create an empty BboxFormat of dimension canvas_size

Parameters:: canvas_size (Tuple[int, int])
Returns:: BboxFormat
Return type:: BboxFormat

classmethod from_instance_mask(mask)[source]

Create a BboxFormat from InstanceMaskFormat

Parameters:: mask (InstanceMaskFormat)
Returns:: BboxFormat
Return type:: BboxFormat

class FormatOperatorHandler[source]

Class that handles operations on format such as crop, pad, sanitize etc.

Methods:

apply_augmentation(form, image, transform)[source]

Apply augmentation on BaseData through its method. Handles labels as well and image.

Parameters:

form (BaseFormat)
image (Tensor)
transform (Transform)

Returns:

augmented format, present Tensor, augmented image

Return type:

Tuple[BaseFormat, Tensor, Tensor]

apply_base_method(form, func, **kwargs)[source]

Apply a base method (crop, pad, sanitize etc.) from BaseData and handles labels modifications. The method must return as well a present objects Tensor : [BaseData, Tensor]

Parameters:

func (str) – func to be called (ex: crop, pad …)
form (BaseFormat)

Returns:

New format and tensor of present objects after operation

Return type:

Tuple[Format, Tensor]

class InstanceMaskData[source]

Instance segmentation data class (Child class of BaseData)

Parameters:: mask (Union[Mask, Tensor]) – Stacked mask (Tensor) of shape [H, W]. Each object is indexed in [1…N] range.

Properties

device (Literal["cpu", "cuda"])
value (Mask): Tensor value of stacked instance mask
nb_object (int): number of objects contained.
canvas_size (Tuple[int, int]): dim of mask (h, w)

Methods:

apply_augmentation(image, transform)[source]

Apply transform on self and associated image

Parameters:

image (Tensor)
transform (Transform)

Returns:

augmented data, present Tensor, image

Return type:

Tuple[InstanceMaskData, Tensor, Tensor]

crop(t, l, h, w)[source]

Crop InstanceMaskData to desired coordinates.

Parameters:

t (int) – top crop coord
l (int) – left crop coord
h (int) – height of crop
w (int) – width of crop

Returns:

cropped InstanceMaskData, indices of present objects

Return type:

Tuple[InstanceMaskData, Tensor]

classmethod empty(canvas_size)[source]: generate empty instance mask (full of 0) with given canvas_size

classmethod from_binary_masks(mask)[source]

Generate InstanceMaskData from one_hot (binary) mask of shape [N, H, W] where N = number of objects. Note that background must not be included.

Parameters:: mask (Tensor) – one hot mask of shape [N, H, W]
Returns:: Stacked InstanceMaskData
Return type:: InstanceMaskData

pad(t, l, r, b)[source]

Pad InstanceMaskData to desired coordinates. Note : the order t, l, r, b is different between deepvisiontools and torchvision.

Parameters:

t (int) – top padding
l (int) – left padding
r (int) – right padding
b (int) – bottom padding

Returns:

padded InstanceMaskData, indices of present objects

Return type:

Tuple[InstanceMaskData, Tensor]

sanitize()[source]

Remove all small objects from the mask and reindex the remaining ones.

Returns:

new InstanceMaskData without small objects, ids of the objects that were removed because they were too small (careful: object 0 is written as 1 in the stacked mask; here we index per object, not per mask value)

Return type:

Tuple[InstanceMaskData, Tensor]
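
The effect of removing small objects and reindexing a stacked mask can be sketched on nested lists. This is an illustration, not the library's code: `sanitize_instance_mask` is a hypothetical helper, the size test (`>= min_size` is kept) is an assumption, and the second return value lists the objects that are kept, following the "present objects" convention used by the other sanitize methods.

```python
from collections import Counter
from typing import List, Tuple

Mask = List[List[int]]  # 0 is background, objects are numbered 1..N


def sanitize_instance_mask(mask: Mask, min_size: int) -> Tuple[Mask, List[int]]:
    """Drop objects smaller than min_size pixels and renumber the rest 1..K."""
    counts = Counter(value for row in mask for value in row if value != 0)
    kept_ids = sorted(obj_id for obj_id, n in counts.items() if n >= min_size)
    # Map old mask values to consecutive new values; dropped objects become 0.
    remap = {old: new for new, old in enumerate(kept_ids, start=1)}
    new_mask = [[remap.get(value, 0) for value in row] for row in mask]
    # Object index = mask value - 1 (object 0 is written as 1 in the mask).
    present = [obj_id - 1 for obj_id in kept_ids]
    return new_mask, present


mask = [
    [1, 1, 0, 2],
    [1, 1, 0, 0],
    [3, 3, 3, 0],
]
clean, present_objects = sanitize_instance_mask(mask, min_size=3)
```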

class InstanceMaskFormat[source]

Class for Instance Segmentation Format (child class of BaseFormat). Contains an InstanceMaskData value, labels and scores.

Parameters:

data (InstanceMaskData)
labels (Tensor)
scores (Tensor | None, optional)

Properties & attributes : cf BaseFormat

Methods

classmethod empty(canvas_size)[source]

Create an empty InstanceMaskFormat of dimension canvas_size

Parameters:: canvas_size (Tuple[int, int])
Returns:: InstanceMaskFormat
Return type:: InstanceMaskFormat

export_semantic_mask()[source]

From self (data.value and labels), generate a semantic mask by replacing each object index with its corresponding label. Note that labels are shifted by 1, as 0 is reserved for the background.

Returns:

Semantic mask

Return type:

Tensor
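
The label shift can be illustrated on nested lists. The sketch assumes that object i (written as i in the stacked mask, starting at 1) has its class at position i - 1 of the labels; `instance_to_semantic` is a hypothetical name, not a library function.

```python
from typing import List

Mask = List[List[int]]


def instance_to_semantic(instance_mask: Mask, labels: List[int]) -> Mask:
    """Replace each object id by its class label + 1; background stays 0."""
    return [
        [0 if value == 0 else labels[value - 1] + 1 for value in row]
        for row in instance_mask
    ]


instance_mask = [
    [1, 1, 0],
    [0, 2, 2],
]
labels = [0, 3]  # object 1 has class 0, object 2 has class 3
semantic = instance_to_semantic(instance_mask, labels)
```

Class 0 becomes 1 in the semantic mask, so it can no longer be confused with the background value 0.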

class SemanticMaskData[source]

Semantic segmentation data class (Child class of BaseData)

Parameters:: mask (Union[Mask, Tensor]) – Semantic mask with value in [0, …, N_cls] range.

Properties

device (Literal["cpu", "cuda"])
value (Mask): Tensor value of semantic mask
nb_object (int): number of objects contained. In this case, the number of objects is the number of different classes present in the mask.
canvas_size (Tuple[int, int]): dim of mask (h, w)

Methods:

apply_augmentation(image, transform, scores=None)[source]

Apply transform on self and associated image

Parameters:

image (Tensor)
transform (Transform)
scores (Tensor | None) – associated logits score if present

Returns:

augmented data, present Tensor, image; or, if scores is given: augmented data, present Tensor, image, augmented scores

Return type:

Tuple[InstanceMaskData, Tensor, Tensor]

crop(t, l, h, w)[source]

Crop SemanticMaskData to desired coordinates.

Parameters:

t (int) – top crop coord
l (int) – left crop coord
h (int) – height of crop
w (int) – width of crop

Returns:

cropped SemanticMaskData, indices of present objects

Return type:

Tuple[SemanticMaskData, Tensor]

classmethod empty(canvas_size)[source]: generate empty semantic mask data (full of 0) with given canvas_size

pad(t, l, r, b)[source]

Pad SemanticMaskData to desired coordinates. Note : the order t, l, r, b is different between deepvisiontools and torchvision.

Parameters:

t (int) – top padding
l (int) – left padding
r (int) – right padding
b (int) – bottom padding

Returns:

padded SemanticMaskData, indices of present objects

Return type:

Tuple[SemanticMaskData, Tensor]

sanitize()[source]

Clean SemanticMaskData (simply checks whether each class has a number of pixels > Configuration().mask_min_size)

Returns:: cleaned data, present objects
Return type:: Tuple[BaseData | Tensor]

class SemanticMaskFormat[source]

Class for Semantic Mask format (Child class of BaseSemanticFormat).

Parameters:

data (SemanticMaskData)
scores (Tensor | None, optional)

Properties & attributes : cf BaseSemanticFormat

Methods

classmethod empty(canvas_size, scores=None)[source]

Create an empty SemanticMaskFormat of dimension canvas_size

Parameters:

canvas_size (Tuple[int, int])
scores (Tensor | None)

Returns:

SemanticMaskFormat

Return type:

SemanticMaskFormat

classmethod from_instance_mask(mask, scores=None)[source]

Create a SemanticMaskFormat from InstanceMaskFormat

Parameters:

mask (InstanceMaskFormat)
scores (Tensor | None)

Returns:

SemanticMaskFormat

Return type:

SemanticMaskFormat

combine_logits(logit1, logit2)[source]

Used for semantic mask data, particularly in patchification. Takes 2 logit masks and combines them according to Configuration().semantic_mask_logits_combination. Where one mask is strictly zero, the value of the other mask is taken. Where both are zero, the result is zero. Where both are non-zero, they are combined.

Parameters:

logit1 (Tensor)
logit2 (Tensor)

Returns:

Tuple[Tensor]

Return type:

Tuple[Tensor]
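
The element-wise rule described for combine_logits can be sketched on nested lists. This is an illustration of the rule as stated, not the library's tensor implementation; `combine_logits_sketch` is a hypothetical name and the `mode` argument stands in for Configuration().semantic_mask_logits_combination.

```python
from typing import Callable, Dict, List

Logits = List[List[float]]


def combine_logits_sketch(logit1: Logits, logit2: Logits, mode: str = "avg") -> Logits:
    """Combine two logit maps value by value following the zero-handling rule."""
    operations: Dict[str, Callable[[float, float], float]] = {
        "avg": lambda a, b: (a + b) / 2,
        "min": min,
        "max": max,
    }
    combine = operations[mode]

    def merge(a: float, b: float) -> float:
        if a == 0:
            return b  # also covers the case where both are zero
        if b == 0:
            return a
        return combine(a, b)

    return [
        [merge(a, b) for a, b in zip(row1, row2)]
        for row1, row2 in zip(logit1, logit2)
    ]


left = [[0.0, 2.0, 4.0, 0.0]]
right = [[1.0, 0.0, 2.0, 0.0]]
averaged = combine_logits_sketch(left, right, "avg")
smallest = combine_logits_sketch(left, right, "min")
largest = combine_logits_sketch(left, right, "max")
```

Treating strict zeros as "no information" is what lets overlapping patches be merged without the empty regions of one patch diluting the values of the other.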

get_preds_and_logits(logit1, logit2)[source]

Combine 2 logits (used in __add__ of SemanticMaskFormat) and return the semantic mask and the logits.

Parameters:

logit1 (Tensor)
logit2 (Tensor)

Return type:

Tuple[Tensor]

logit2pred(logit)[source]: transform logits into pred

mask2boxes(mask)[source]

From a stacked mask (H, W) with object ids 1 … N, return a tensor of shape (N, 4).

Parameters:: mask (Tensor)
Return type:: Tensor
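
Deriving one box per object from a stacked mask amounts to tracking the extreme coordinates of each id. The sketch below works on nested lists; `mask_to_boxes` is a hypothetical name, and returning XYXY boxes with inclusive pixel indices is an assumption, since the documentation does not state the box convention.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]


def mask_to_boxes(mask: Mask) -> List[Box]:
    """Return one (x1, y1, x2, y2) box per object id, ordered by id."""
    extents: Dict[int, Box] = {}
    for y, row in enumerate(mask):
        for x, obj_id in enumerate(row):
            if obj_id == 0:
                continue  # background
            x1, y1, x2, y2 = extents.get(obj_id, (x, y, x, y))
            extents[obj_id] = (min(x1, x), min(y1, y), max(x2, x), max(y2, y))
    return [extents[obj_id] for obj_id in sorted(extents)]


stacked = [
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 0, 2],
]
derived_boxes = mask_to_boxes(stacked)
```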

reindex_mask_with_splitted_objects(mask)[source]

Function that reindexes mask objects, creating new objects when an object is made of disconnected parts.

Parameters:

mask (Tensor) – Input mask tensor containing disconnected parts of given objects.

Returns:

New mask with one index per object after separating the disconnected parts, and the indices of the original objects each new object belonged to.

Return type:

Tuple[Tensor, Tensor]

Detailed explanation: imagine a mask with 2 objects, 0 and 1, where the first is separated into two disconnected parts. The new mask will contain 3 objects and the returned indices will be [0, 0, 1].
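
The example above can be reproduced with a small connected-component pass. This is a sketch on nested lists, not the library's implementation: `reindex_split_objects` is a hypothetical name and 4-connectivity is an assumption.

```python
from collections import deque
from typing import List, Tuple

Mask = List[List[int]]


def reindex_split_objects(mask: Mask) -> Tuple[Mask, List[int]]:
    """Give each connected part its own id and report its original object index."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []  # (original id, list of (y, x) pixels)
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            obj_id, pixels, queue = mask[y][x], [], deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill over same-id neighbours
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and mask[ny][nx] == obj_id):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append((obj_id, pixels))
    components.sort(key=lambda item: item[0])  # keep parts grouped by source object
    new_mask = [[0] * w for _ in range(h)]
    origins = []
    for new_id, (obj_id, pixels) in enumerate(components, start=1):
        for py, px in pixels:
            new_mask[py][px] = new_id
        origins.append(obj_id - 1)  # object index = mask value - 1
    return new_mask, origins


split_mask = [
    [1, 0, 1],
    [0, 0, 0],
    [2, 2, 0],
]
reindexed, origins = reindex_split_objects(split_mask)
```

The list of origins is what makes it possible to copy the label of the original object onto each of its new parts.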

deepvisiontools.inference

class BasePatchifier[source]

Abstract class for patchifier. If you want to implement a custom one you need to implement unpatchify method

pad_to(image, new_size)[source]

Pad image to given size.

Returns:

padded image, (t, l, r, b)

Return type:

Tuple[Tensor, Tuple[int, int, int, int]]

Parameters:

image (Tensor)
new_size (Tuple[int, int])

patchify(image)[source]

Create patches for image prediction : 1) Pad image to fit all patches, 2) create patches

Parameters:

image (Tensor)

Returns:

patches stacked (N_patch, c, h, w), List of (top, left) pad coordinates, padded image, image pad coordinates

Return type:

Tuple[ Tensor, List[Tuple[int, int]], Tuple[int, int], Tensor, Tuple[int, int]]
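
How patch origins follow from the patch size and the overlap can be sketched as below. The stride rule (patch size times 1 - overlap) and the choice to pad up to the last full patch are assumptions made for illustration; `patch_origins` is a hypothetical helper and the library's exact padding strategy is not documented here.

```python
import math
from typing import List, Tuple


def patch_origins(
    image_size: Tuple[int, int], patch_size: Tuple[int, int], overlap: float
) -> Tuple[List[Tuple[int, int]], Tuple[int, int]]:
    """Return the (top, left) origin of every patch and the padded image size."""
    (img_h, img_w), (p_h, p_w) = image_size, patch_size
    # Consecutive patches share `overlap` of their extent along each axis.
    stride_h = max(1, int(p_h * (1 - overlap)))
    stride_w = max(1, int(p_w * (1 - overlap)))
    n_h = max(1, math.ceil((img_h - p_h) / stride_h) + 1)
    n_w = max(1, math.ceil((img_w - p_w) / stride_w) + 1)
    # The image is padded so that the last patch fits entirely.
    padded = ((n_h - 1) * stride_h + p_h, (n_w - 1) * stride_w + p_w)
    origins = [(i * stride_h, j * stride_w) for i in range(n_h) for j in range(n_w)]
    return origins, padded


origins_grid, padded_size = patch_origins((100, 100), (64, 64), overlap=0.5)
```

The origins are what unpatchify needs later to place each patch prediction back at the right position in the padded image.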

class DetectPatchifier[source]

Handle patchification and unpatchification

Parameters:

patch_size (Tuple[int, int]) – size of patches to create
overlap (float) – overlap between patches
border_penalty (float, optional) – penalty to apply on patch border objects before nms. Defaults to 0.5.
nms_iou_threshold (float, optional) – nms threshold. Defaults to 0.45.
final_score_threshold (float, optional) – final score (after penalty) threshold. Defaults to 0.4.

Attributes: patch_size (Tuple[int, int]) overlap (float): overlap between patches border_penalty (float) postprocess (`PostProcesser`)

Methods

unpatchify(pred_patches, origins, image_padded_size, padded_image_coords, original_image_size)[source]

Merge patch predictions together while applying the penalty, postprocessing, etc. Note that since InstanceMasks do not handle overlapping objects, the patches need to be treated directly in the postprocess: boxes are derived first, then used to filter the masks.

Parameters:

pred_patches (BatchedFormat)
origins (List[Tuple[int, int]])
image_padded_size (Tuple[int, int])
padded_image_coords (Tuple[int, int, int, int])
original_image_size (Tuple[int, int])
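
Duplicate predictions coming from overlapping patches are typically removed with non-maximum suppression followed by a score threshold. The sketch below is a generic greedy NMS on XYXY tuples using the documented default thresholds; it is not the library's postprocess, it ignores the border penalty, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two XYXY boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: List[Box],
    scores: List[float],
    iou_threshold: float = 0.45,
    score_threshold: float = 0.4,
) -> List[int]:
    """Return the indices kept after NMS and a final score threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    # The score threshold is applied after NMS.
    return [i for i in kept if scores[i] >= score_threshold]


pred_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (80, 80, 90, 90)]
pred_scores = [0.9, 0.8, 0.7, 0.2]
kept_indices = greedy_nms(pred_boxes, pred_scores)
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the last box is removed by the score threshold.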

class Evaluator[source]

Evaluator class: evaluates a given Predictor (model + patch_size + additional configuration) on a dataset with the given metrics. The results are saved in a generated csv file (metrics at dataset level) and in an xlsx file (metrics at dataset and sample levels). Samples that deviate from the mean or median (depending on deviation_method) by nb_sigma sigma are highlighted in the xlsx file. Visualizations of predictions are created according to number_visu. Returns the dictionary of metrics at dataset level and the dictionary of metrics at sample level. Prints a dataframe with the metrics at dataset level (same content as the generated csv file).

Parameters:

predictor (Predictor) – Predictor class to evaluate
metrics (list) – List of metrics to evaluate the Predictor on
deviation_method (Literal["mean", "median"], optional) – method used to compute outliers. Defaults to “mean”.
nb_sigma (Union[int, float], optional) – number of standard deviations used to flag outliers. Defaults to 2.

Example:

>>> from deepvisiontools import Evaluator, Predictor
>>> from deepvisiontools.metrics import DetectF1Score
>>> predictor = Predictor(model="path/to/model.pth")
>>> evaluator = Evaluator(predictor, metrics=[DetectF1Score()])
>>> evaluator.evaluate(mydataset, "results")

Attributes: predictor (Predictor) metrics (List[BaseMetric]) nb_sigma (int) data_type (Literal["instance_mask", "bbox", "keypoint", "semantic_mask"]) deviation_method (`Literal["mean", "median"])

Methods

evaluate(dataset, result_folder, number_visu='all')[source]

Run evaluation on dataset. Compute metrics for dataset and for each sample of the dataset.

Parameters:

dataset (DeepVisionDataset)
result_folder (str | Path)
number_visu (Literal['all'] | int)
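
The highlighting of samples that deviate from the mean or median by nb_sigma standard deviations can be sketched with the standard library. This is an illustration of the idea only: `flag_outliers` is a hypothetical helper, and the use of the population standard deviation is an assumption.

```python
import statistics
from typing import List


def flag_outliers(
    values: List[float], deviation_method: str = "mean", nb_sigma: float = 2
) -> List[int]:
    """Return the indices of samples further than nb_sigma sigma from the centre."""
    if deviation_method == "mean":
        centre = statistics.mean(values)
    else:
        centre = statistics.median(values)
    sigma = statistics.pstdev(values)
    return [i for i, value in enumerate(values) if abs(value - centre) > nb_sigma * sigma]


per_sample_f1 = [0.80, 0.82, 0.79, 0.81, 0.20, 0.83, 0.80, 0.82]
from_mean = flag_outliers(per_sample_f1, "mean", nb_sigma=2)
from_median = flag_outliers(per_sample_f1, "median", nb_sigma=2)
```

The median is less affected by the outliers themselves, which is a common reason to prefer it when a few samples are far from the rest.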

class PostProcesser[source]

Handles postprocessing

Parameters:

nms_iou_th (float) – nms threshold
final_score_threshold (float) – final score thresholding

handle_box_duplicates(pred)[source]

Main function that call further function depending on handling mode

Parameters:

pred (BaseFormat) – data to handle duplicates

Returns:

handled duplicate new data

Return type:

BaseFormat

class Predictor[source]

Predictor class for deepvisiontools. Loads a model, applies it to an image and returns the prediction. Can handle patchification for large image prediction.

Parameters:

model (Union[BaseModel, str, Path]) – model path / instance of BaseModel to be used.
preprocessing (Callable, optional) – preprocessing to use. Defaults to build_preprocessing().
patch_size (Union[Tuple[int, int], None], optional) – size of the patches to be used for large image inference. If None, the model runs on the full image. Defaults to None.
overlap (float, optional) – Overlap between patches used in case of patchification. Defaults to 0.4.
border_padding (int, optional) – default image padding when using patchification. Defaults to 100.
batch_size (int, optional) – batch size for patchification. Defaults to 1.
border_penalty (float, optional) – penalty applied to predictions on patch borders, which makes nms more effective. Higher is more stringent. Must be between 0 and 1. Defaults to 0.5.
nms_iou_threshold (float, optional) – nms threshold to be used when unpatchifying. Defaults to 0.45.
final_score_threshold (float, optional) – Apply a score thresholding after penalty and after nms. Defaults to 0.4.
categories (Dict[int, str], optional) – To rename your categories in the visualization.
patchifier (Union[BasePatchifier, None], optional) – If None, use the default SemanticPatchifier or DetectPatchifier according to Configuration().data_type. Defaults to None.
verbose (bool, optional) – if set to True, display progress during patch predictions. Defaults to True.
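
To give an intuition of how patch_size and overlap interact, here is a minimal sketch that computes patch origins along one image axis. patch_origins is a hypothetical helper. The actual tiling in deepvisiontools (which also uses border_padding) may differ, so treat this as an assumption-laden illustration only.

```python
def patch_origins(length, patch, overlap):
    """Start coordinates of patches covering `length` pixels along one axis."""
    if patch >= length:
        # The image is smaller than one patch: a single patch is enough.
        return [0]
    # Consecutive patches share `overlap` (fraction) of their size.
    stride = max(1, int(patch * (1 - overlap)))
    origins = list(range(0, length - patch, stride))
    # Assumption: the last patch is placed flush with the image border.
    origins.append(length - patch)
    return origins


origins_1000 = patch_origins(1000, 512, 0.4)
```

With a 512 pixel patch and an overlap of 0.4 the stride is 307 pixels, so a 1000 pixel axis is covered by three patches.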

Example:

>>> from deepvisiontools import Predictor
>>> img = "path/to/img"
>>> predictor = Predictor(model="path/to/model.pth")
>>> results = predictor.predict(img)

Attributes: model (BaseModel) preprocessing (Callable) patch_size (Union[Tuple[int, int], None]) padder (Transform) batch_size (int) cropper (Transform) patchifier (BasePatchifier) categories (Dict[int, str]) verbose (bool)

Methods

filter_empty_patches(preds_batch_patch, pad_origins)[source]

remove empty patches for unpatchification

Parameters:

preds_batch_patch (BatchedFormat)
pad_origins (List[Tuple[int, int]])

forward_pass(batch_patchs)[source]

Run predictions on image / batch of patches

Parameters:: batch_patchs (Tensor)
Return type:: BatchedFormat

predict(image, visu_path='')[source]

Main function of `Predictor` : call everything needed for prediction.

Parameters:

image (Union[str, Path, Tensor]) – image path or image tensor to run the prediction on.
visu_path (Union[str, Path], optional) – path to visualization to be saved. Defaults to “”.

Returns:

prediction as deepvisiontools format.

Return type:

BaseFormat

class PredictorDataLoader[source]

Wrap predictor patchification output as loader with given batch_size for forward

Parameters:

patches (Tensor)
batch_size (int)
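
The idea of serving patches in groups of batch_size can be sketched with a small generator. iter_batches is a hypothetical stand-in that works on any sliceable sequence; the real class operates on a Tensor of patches.

```python
def iter_batches(patches, batch_size=1):
    """Yield consecutive slices of `patches` holding at most `batch_size` items."""
    for start in range(0, len(patches), batch_size):
        yield patches[start:start + batch_size]


batches = list(iter_batches([0, 1, 2, 3, 4], batch_size=2))
```

The last batch may be smaller than batch_size when the number of patches is not a multiple of it.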

class SemanticPatchifier[source]

Semantic segmentation patchifier/unpatchifier

Parameters:

patch_size (Tuple[int, int])
overlap (float)

deepvisiontools.metrics

class ClassWiseDetectAccuracy[source]: Similar to DetectAccuracy but with per-class detail. Samplewise is not provided in that case. Multiclass is handled by removing, in target and prediction, all objects of classes other than the considered one before computing tp, fp, tn, fn

class ClassWiseDetectF1score[source]: Similar to DetectF1score but with per-class detail. Samplewise is not provided in that case. Multiclass is handled by removing, in target and prediction, all objects of classes other than the considered one before computing tp, fp, tn, fn

class ClassWiseDetectMetric[source]

Base class that aggregates n_classes DetectMetric(s) to obtain class-dependent performances. Note that samplewise scores are not computed here.

Parameters:

func (Callable) – function to apply to tp, fp, tn, fn
name (str, optional) – metric’s name (useful for tensorboard monitoring). Defaults to “ClassWiseDetectionMetric”.

Attributes

classmetrics (List[DetectMetric]): list of DetectMetric instances, one specialized for each class.

compute()[source]

Return metrics values.

Returns:: dictionary with all “global” DetectMetric in self.classmetrics
Return type:: Dict[str, Tensor]

compute_last_sample()[source]

Return metric values of the last sample in self.stats. Used in combination with self.update.

Returns:: dictionary with metric value for all classes combined and for each class
Return type:: Dict[str, Float]

global_macro_compute()[source]

Compute metric with global/macro averaging. Also returns the per-class metric tensor.

Return type:: Tuple[Tensor, Tensor]
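
The difference between macro and micro averaging can be illustrated with plain counts. In this sketch each class contributes a (tp, fp, tn, fn) tuple. precision, global_macro and global_micro are hypothetical helpers written for the illustration; they do not reproduce the library's tensor-based code.

```python
def precision(tp, fp, tn, fn):
    """Example of a function applied to tp, fp, tn, fn."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def global_macro(per_class_stats, func):
    """Apply func per class, then average. Also return the per-class values."""
    per_class = [func(*stats) for stats in per_class_stats]
    return sum(per_class) / len(per_class), per_class


def global_micro(per_class_stats, func):
    """Sum the counts over classes first, then apply func once."""
    totals = [sum(column) for column in zip(*per_class_stats)]
    return func(*totals)


class_stats = [(9, 1, 0, 0), (1, 3, 0, 2)]  # (tp, fp, tn, fn) for two classes
macro_value, per_class_values = global_macro(class_stats, precision)
micro_value = global_micro(class_stats, precision)
```

Macro averaging gives each class the same weight (0.575 here), whereas micro averaging is dominated by the class with the most objects (10 / 14 here).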

global_micro_compute()[source]

Compute metric with global/micro averaging.

Return type:: Tensor

reset()[source]: Reset all metrics in self.classmetrics. Override from torchmetrics Metric

samplewise_macro()[source]

Compute metric with samplewise/macro averaging.

Return type:: Tensor

samplewise_micro()[source]

Compute metric with samplewise/micro averaging.

Return type:: Tensor

to(device)[source]

Move all metrics in self.classmetrics to device. Override from torchmetrics Metric

Parameters:: device (Any)

update(prediction, target)[source]

Update all DetectMetrics in self.classmetrics according to prediction / target.

Parameters:

prediction (Union[BaseFormat, BatchedFormat])
target (Union[BaseFormat, BatchedFormat])

class ClassWiseDetectPrecision[source]: Similar to DetectPrecision but with per-class detail. Samplewise is not provided in that case. Multiclass is handled by removing, in target and prediction, all objects of classes other than the considered one before computing tp, fp, tn, fn

class ClassWiseDetectRecall[source]: Similar to DetectRecall but with per-class detail. Samplewise is not provided in that case. Multiclass is handled by removing, in target and prediction, all objects of classes other than the considered one before computing tp, fp, tn, fn

class ClassifAccuracy[source]: Classification accuracy score

class ClassifF1score[source]: Classification F1 score

class ClassifMetric[source]

Child class of torchmetrics metrics for classification. Allow to take Format as inputs and return dict of metric.

Parameters:

func (Callable)
name (str)
kwargs (Any)

compute()[source]: Comput metric with all averag strategy and return a dict with all values.

compute_last_sample()[source]

Return metric values of the last sample in self.stats. Used in combination with self.update.

Returns:: dictionary with metric value for all classes combined and for each class
Return type:: Dict[str, Float]

global_macro_compute()[source]

Compute metric with global/macro averaging. Also returns the per-class metric tensor.

Return type:: Tuple[Tensor, Tensor]

global_micro_compute()[source]

Compute metric with global/micro averaging.

Return type:: Tensor

samplewise_macro()[source]

Compute metric with samplewise/macro averaging.

Return type:: Tensor

samplewise_micro()[source]

Compute metric with samplewise/micro averaging.

Return type:: Tensor

class ClassifRecall[source]: Classification recall score

class DetectAccuracy[source]: Accuracy for detection task. In detection there are no true negatives, so tn is set to 0 for the computation

class DetectF1score[source]: F1 score for detection task.

class DetectMetric[source]

Base class for custom detection metric with torchmetrics engine

Parameters:

func (Callable) – function to apply to tp, fp, tn, fn
name (str, optional) – metric’s name (useful for tensorboard monitoring). Defaults to “DetectionMetric”.
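
As an illustration of a func applied to tp, fp, tn, fn, and of the difference between global and samplewise aggregation, here is a minimal pure-Python sketch. f1_from_counts, global_micro and samplewise_micro are hypothetical helpers defined only for this example; the library itself works with tensors and the torchmetrics engine.

```python
def f1_from_counts(tp, fp, tn, fn):
    """F1 score from detection counts (tn is unused, it is 0 in detection)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def global_micro(sample_stats, func):
    """Sum the counts over all samples, then apply func once."""
    totals = [sum(column) for column in zip(*sample_stats)]
    return func(*totals)


def samplewise_micro(sample_stats, func):
    """Apply func to each sample, then average the per-sample values."""
    values = [func(*stats) for stats in sample_stats]
    return sum(values) / len(values)


stats = [(8, 2, 0, 0), (1, 0, 0, 3)]  # (tp, fp, tn, fn) for two samples
global_f1 = global_micro(stats, f1_from_counts)
samplewise_f1 = samplewise_micro(stats, f1_from_counts)
```

The global value (18 / 23) weights each object equally, while the samplewise value weights each image equally, so a poorly predicted image with few objects pulls it down more.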

compute()[source]

Return metric computed with internal state.

Returns:: dictionary with aggregation_method: value
Return type:: Dict[str, Tensor]

compute_last_sample()[source]

Return metric values of the last sample in self.stats. Used in combination with self.update.

Returns:: dictionary with metric value for all classes combined and for each class
Return type:: Dict[str, Float]

global_micro_compute()[source]

Compute metric with global/micro averaging.

Return type:: Tensor

samplewise_micro()[source]

Compute metric with samplewise/micro averaging.

Return type:: Tensor

update(prediction, target)[source]

Update metric’s internal state with prediction target comparison (tp, fp, tn, fn)

Parameters:

prediction (Union[BaseFormat, BatchedFormat])
target (Union[BaseFormat, BatchedFormat])

class DetectPrecision[source]: Precision for detection task.

class DetectRecall[source]: Recall for detection task.

class Matcher[source]

Class that handles the matching of prediction and targets to get tp, fp, fn

match_boxes(pred, targ)[source]

compute box cross ious for matching

Parameters:

pred (BaseFormat)
targ (BaseFormat)

Return type:

Tuple[int, int, int, Tuple[Tensor, Tensor]]

match_instance_masks(pred, targ)[source]

compute instance_mask cross ious for matching

Parameters:

pred (InstanceMaskFormat)
targ (InstanceMaskFormat)

match_pred_target(pred, targ)[source]

Matches predictions and targets

Parameters:

pred (Format)
targ (Format)

Returns:

tp, fp, fn, (matched_predictions indices, matched_targets_indices)

Return type:

Tuple[int, int, int, Tuple[Tensor, Tensor]]
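
A minimal sketch of IoU-based matching that produces the same kind of output (tp, fp, fn and the matched indices) is given below. It assumes XYXY boxes as tuples and a greedy one-to-one assignment by decreasing IoU. The strategy actually used by Matcher is not described here, so box_iou and greedy_match are hypothetical and only illustrate the principle.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(preds, targs, iou_threshold=0.45):
    """Return tp, fp, fn, (matched prediction indices, matched target indices)."""
    candidates = sorted(
        ((box_iou(p, t), i, j) for i, p in enumerate(preds) for j, t in enumerate(targs)),
        reverse=True,
    )
    matched_preds, matched_targs = [], []
    for value, i, j in candidates:
        if value < iou_threshold:
            break  # remaining pairs overlap even less
        if i in matched_preds or j in matched_targs:
            continue  # each prediction / target is matched at most once
        matched_preds.append(i)
        matched_targs.append(j)
    tp = len(matched_preds)
    return tp, len(preds) - tp, len(targs) - tp, (matched_preds, matched_targs)


predictions = [(0, 0, 10, 10), (100, 100, 110, 110)]
targets = [(1, 1, 11, 11), (50, 50, 60, 60)]
match_result = greedy_match(predictions, targets)
```

Unmatched predictions count as false positives and unmatched targets as false negatives.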

class SemanticAccuracy[source]: Semantic accuracy score

class SemanticF1score[source]: Semantic F1 score

class SemanticIoU[source]: Semantic iou score

class SemanticPrecision[source]: Semantic precision score

class SemanticRecall[source]: Semantic recall score

class SemanticSegmentationMetric[source]

Child class of ClassifMetric. Move from instance to semantic segmentation paradigm to provide stats based on classes masks (instead of objects).

Parameters:

func (Callable)
name (str)
kwargs (Any)

update(prediction, target)[source]

Convert target & prediction to semantic mask to compute stats in semantic segmentation paradigm. Update internal state.

Parameters:

prediction (BaseFormat | BatchedFormat)
target (BaseFormat | BatchedFormat)

deepvisiontools.models

class BaseModel[source]

Base Class for deepvisiontools models.

Attributes

confidence_thr (float): Confidence score threshold to consider object as true prediction.
model_max_detection (int): Maximum number of object to predict on one image.
model_nms_threshold (float): IoU threshold to consider 2 boxes as overlapping for Non Max Suppression algorithm.
num_classes (int): Number of classes.

Methods:

abstract build_results(raw_outputs)[source]

Transform model outputs into BaseFormat for results. This function also applies instance selection on results according to the following attributes:

confidence_thr
model_max_detection
model_nms_threshold

Parameters:

raw_outputs (Any) – Model outputs.

Returns:

Model output for batch.

Return type:

BatchedFormats

property device

Send model to device.

Parameters:: device (Literal['cpu', 'cuda']) – Device to send model on.

abstract get_predictions(images)[source]

Prepare images, Apply model forward pass and build results.

Parameters:

images (Tensor) – RGB images Tensor.

Returns:

Predictions for images as BatchedFormats.

Return type:

BatchedFormats

abstract prepare(images, targets=None)[source]

Transform images and targets into model specific format for prediction & loss computation.

Parameters:

images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.

Returns:

Images data prepared for model.
If targets: images + targets prepared for model.

Return type:

Union[Any, Tuple[Any]]

abstract run_forward(images, targets)[source]

Compute loss from images and if target passed, compute loss & return both loss dict and results.

Parameters:

images (Tensor) – Batch RGB images.
targets (BatchedFormats) – Batch targets.
predict (bool, optional) – To return predictions or not. Defaults to False.

Returns:

Loss dict.
If predict: Predictions.

Return type:

Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormats]]

class Mask2Former[source]

Mask2Former class, child class of Mask2FormerForUniversalSegmentation from hugging face. To use, data_type must be set to instance_mask.

Parameters:

pretrain (Literal["large", "medium", "small", "tiny", ""], optional) – Pretrained architecture. Defaults to “tiny”.
overlap_mask_thr (float, optional) – Defaults to 0.8.

Attributes: processor (Mask2FormerImageProcessor) overlap_mask_thr (float)

Properties: queries (torch.nn.Embedding) : number of queries / dim for embedding. To use setter please provide int or Tuple[int, int]. In case only a int is provided dimensional embedding is 256, otherwise Tuple is query number, dim.

Notes

When used for large image inference, Mask2Former performs worse if it was trained on smaller patches. One workaround is to increase the number of queries (see the queries property description). Improvements on this matter are under development.

Methods

build_results(raw_outputs, spatial_size)[source]

Transform model outputs into BatchedFormat for results.

Parameters:

raw_outputs (Mask2FormerForUniversalSegmentationOutput) – Mask2Former output.
spatial_size (Tuple[int, int]) – Size of original image (H, W).

Returns:

Model output as BatchedFormat.

Return type:

BatchedFormats

get_predictions(images)[source]

Prepare images, Apply model forward pass and build results.

Parameters:

images (Tensor) – RGB images Tensor.

Returns:

Predictions for images as BatchedFormat.

Return type:

BatchedFormat

inputs_to_device(input, device)[source]

Send Mask2Former inputs to device.

Parameters:

input (Any)
device (Literal['cpu', 'cuda'])

prepare(images, targets=None)[source]

Transform images and targets into Mask2Former specific format for prediction & loss computation.

Parameters:

images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.

Returns:

Images data prepared for Mask2Former.
If targets: images + targets prepared for Mask2Former.

Return type:

Union[Any, Tuple[Any]]

prepare_target(target)[source]

Prepare target in Mask2Former format

Parameters:: target (InstanceMaskFormat)
Return type:: Tuple[Tensor, Dict[int, int]]

run_forward(images, targets)[source]

Compute loss from images and if target passed, compute loss & return both loss dict and results.

Parameters:

images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.

Returns:

Loss dict and prediction if model in eval mode.

Return type:

Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]

class SMP[source]

Factory class that wraps segmentation-models-pytorch (smp) models into deepvisiontools. These models are used for semantic segmentation tasks. With this class you can use all available models, encoders and any additional arguments from segmentation-models-pytorch. Please provide further parameters as keyword arguments (ex: arg=myadditionalarg). Note that you can also use any smp loss by simply providing an instance of an smp loss: loss=smp.losses.WantedLoss()

smp : https://github.com/qubvel-org/segmentation_models.pytorch

Parameters:: architecture (SegmentationModel, optional) – SMP model architecture : need to provide a smp class (type). Defaults to smp.Unet.

Example:

>>> from deepvisiontools.models import SMP
>>> import segmentation_models_pytorch as smp
>>> my_model = SMP(smp.Unet, encoder_name="vgg11", loss=smp.losses.FocalLoss(mode="binary"))

class TimmYolo[source]

This class combines any timm library encoder compatible with features_only=True with a Yolo detection head. This leverages complex encoders, potentially with attention layers, while remaining flexible on the input image size. The idea is to patchify all images that run through the model, perform feature prediction on each patch, combine the features and run the fully convolutional yolo detection head. Note: this model does not have a forward method! Use run() or get_predictions instead.

Parameters:

backbone_name (str, optional) – timm backbone. Defaults to “swin_small_patch4_window7_224”. Has been tested with “vit_large_patch14_dinov2” and “resnet50.a1_in1k” as well
num_classes (int, optional) – Defaults to 1.
pretrained (bool, optional) – Defaults to True.
overlap (float | int | Tuple[int, int] | None, optional) – If not None, use the given pixel value for overlap (careful: it must be compatible with the reduction level). If None, the maximum reduction x 2 is used. Defaults to None.
internal_batch_size (int, optional) – Number of patch to run simultaneously. Defaults to 1.

build_results(raw_outputs, prebuild_outputs, original_img_size)[source]

Transform model outputs into Batch BboxFormat for results.

Parameters:

raw_outputs (List[Tensor]) – Model outputs.
prebuild_outputs (Tensor) – Extracted boxes from outputs in eval mode.
original_img_size (Size)

Returns:

Batched predictions.

Return type:

BatchedFormats

compute_loss(raw_outputs, targets)[source]

Compute loss with predictions & targets.

Parameters:

raw_outputs (Any) – Raw output of model.
targets (DetectionFormat) – Targets in YOLO format.

Returns:

Loss dict with total loss (key: “loss”) & sublosses.

Return type:

Dict[str, Tensor]

prepare(images, targets=None)[source]

Pad images and targets so that the final patch exactly matches the image border.

Parameters:

images (Tensor) – images to be prepared
targets (BatchedFormat | None, optional) – targets to be prepared. Defaults to None.

Returns:

prepared images, prepared targets, original image size

Return type:

Tuple[Tensor, BatchedFormat | None, torch.Size]

prepare_target(targets, img_size)[source]

Return target from BatchedFormat to ultralytics yolo format.

Parameters:

targets (BatchedFormat)
img_size (Tuple[int, int])

Returns:

target as per ultralytics Yolo format.

Return type:

Dict[str, Tensor]

retrieve_spatial_size(raw_outputs)[source]

Retrieve image shape from raw_outputs and stride values.

Parameters:

raw_outputs (List[Tensor]) – Raw outputs from YOLO model.

Returns:

Size of input image (H, W).

Return type:

Tuple[int]

class Yolo[source]

Yolo detection model. data_type must be either bbox or instance_mask to use this model.

Parameters:

architecture (Literal["yolon", "yolom", "yolol", "yolox"], optional) – Yolo model size. You can add “-p2” or “-p6” to load the p2 or p6 variants. Defaults to “yolon”.
pretrained (bool, optional) – Use pretrained weights. Defaults to True.
reg_max (int, optional) – reg_max argument of yolo models (impacts object size detection). See ultralytics for more information. Defaults to 16.
loss_factor (float, optional) – divide yolo loss value (important for mixed precision to keep it below a certain range). Defaults to 1.

Attributes

criterion (v8DetectionLoss): Yolo loss from ultralytics.
args (Any) : ultralytics Yolo’s configuration params.
pad_requirements (int) : padding requirement as per yolo (by default the image shape must be a multiple of 32, but this differs for the p2 or p6 variants). Note that it is set automatically.

Properties

device (Literal["cuda", "cpu"]): model’s device

Methods

build_results(raw_outputs, prebuild_outputs)[source]

Transform model outputs into Batch BboxFormat for results.

Parameters:

raw_outputs (List[Tensor]) – Model outputs.
prebuild_outputs (Tensor) – Extracted boxes from outputs in eval mode.

Returns:

Batched predictions.

Return type:

BatchedFormats

compute_loss(raw_outputs, targets)[source]

Compute loss with predictions & targets.

Parameters:

raw_outputs (Any) – Raw output of model.
targets (DetectionFormat) – Targets in YOLO format.

Returns:

Loss dict with total loss (key: “loss”) & sublosses.

Return type:

Dict[str, Tensor]

get_predictions(images)[source]

Prepare images, Apply YOLO forward pass and build results.

Parameters:

images (Tensor) – RGB images Tensor.

Returns:

Predictions for images as BatchedFormats.

Return type:

BatchedFormats

prepare(images, targets=None)[source]

Pad image / targets to satisfy the yolo divisibility-by-32 criterion and convert targets to yolo format. If no targets are passed, simply returns images

Parameters:

images (Tensor) – batched images [N, 3, H, W]
targets (Union[BatchedFormat, None])

Returns:

Either : images_padded, yolo_targets OR images_padded

Return type:

Union[Tuple[Tensor, Dict], Tensor]

prepare_target(targets, img_size)[source]

Return target from BatchedFormat to ultralytics yolo format.

Parameters:

targets (BatchedFormat)
img_size (Tuple[int, int])

Returns:

target as per ultralytics Yolo format.

Return type:

Dict[str, Tensor]

retrieve_spatial_size(raw_outputs)[source]

Retrieve image shape from raw_outputs and stride values.

Parameters:

raw_outputs (List[Tensor]) – Raw outputs from YOLO model.

Returns:

Size of input image (H, W).

Return type:

Tuple[int]

run_forward(images, targets)[source]

Compute loss from images and if target passed, compute loss & return both loss dict and results.

Parameters:

images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.

Returns:

Loss dict.
If predict: predictions.

Return type:

Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]

class YoloSeg[source]

Yolo detection model. data_type must be either bbox or instance_mask to use this model.

Parameters:

architecture (Literal["yolon", "yolom", "yolol", "yolox"], optional) – Yolo model size. You can add “-p2” or “-p6” to load the p2 or p6 variants. Defaults to “yolon”.
pretrained (bool, optional) – Use pretrained weights. Defaults to True.
reg_max (int, optional) – reg_max argument of yolo models (impacts object size detection). See ultralytics for more information. Defaults to 16.
loss_factor (float, optional) – divide yolo loss value (important for mixed precision to keep it below a certain range). Defaults to 1.

Attributes

criterion (v8DetectionLoss): Yolo loss from ultralytics.
args (Any) : ultralytics Yolo’s configuration params.
pad_requirements (int) : padding requirement for yoloseg (by default the image shape must be a multiple of 32)
mask_logit_threshold (int) : mask logit threshold to consider if pixel is class or background. Default is 0.5 but can be changed.

Properties

device (Literal["cuda", "cpu"]): model’s device

Methods

build_results(raw_outputs, get_logit=False)[source]

Transform model outputs into Batch InstanceMaskFormat for results.

Parameters:

raw_outputs (List[Tensor]) – Model outputs.
get_logit (bool)

Returns:

Batched predictions.

Return type:

BatchedFormats

compute_loss(predictions, target)[source]

Compute loss with predictions & targets.

Parameters:

predictions (Any) – Raw output of model.
target (Dict[Any, Any]) – Targets in YOLO format.

Returns:

Loss dict with total loss (key: “loss”) & sublosses.

Return type:

Dict[str, Tensor]

get_predictions(images)[source]

Prepare images, Apply YOLO forward pass and build results.

Parameters:

images (Tensor) – RGB images Tensor.

Returns:

Predictions for images as BatchedFormats.

Return type:

BatchedFormats

prebuild_output(raw_outputs)[source]

Unpack Yolo-seg (eval mode) raw results.

Parameters:

raw_outputs (Tuple[Tensor, ...]) – Yolo raw eval mode results.

Returns:

boxes (N_batch, N_obj, cxcywh).
cls_scores (N_batch, N_cls).
mask_weights (N_batch, N_obj, 32).
protos (N_batch, protos).

Return type:

Tuple[Tensor, ...]

prepare(images, targets=None)[source]

Pad image / targets to satisfy the yolo divisibility-by-32 criterion and convert targets to yolo format. If no targets are passed, simply returns images

Parameters:

images (Tensor) – batched images [N, 3, H, W]
targets (Union[BatchedFormat, None])

Returns:

Either : images_padded, yolo_targets OR images_padded

Return type:

Union[Tuple[Tensor, Dict], Tensor]

prepare_target(targets)[source]

Transform SegmentationFormat targets into yolo-seg targets format.

Parameters:

targets (BatchedFormats) – Batch targets.

Returns:

Targets in YOLO format.

Return type:

Dict[str, Tensor]

retrieve_spatial_size(raw_outputs)[source]

Retrieve image shape from raw_outputs and stride values.

Parameters:

raw_outputs (List[Tensor]) – Raw outputs from YOLO model.

Returns:

Size of input image (H, W).

Return type:

Tuple[int]

run_forward(images, targets)[source]

Compute loss from images and if target passed, compute loss & return both loss dict and results.

Parameters:

images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.

Returns:

Loss dict.
If predict: predictions.

Return type:

Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]

deepvisiontools.models.mask2former

deepvisiontools.models.yolo

class Yolo[source]

Yolo detection model. data_type must be either bbox or instance_mask to use this model.

Parameters:

architecture (Literal["yolon", "yolom", "yolol", "yolox"], optional) – Yolo model size. You can add “-p2” or “-p6” to load the p2 or p6 variants. Defaults to “yolon”.
pretrained (bool, optional) – Use pretrained weights. Defaults to True.
reg_max (int, optional) – reg_max argument of yolo models (impacts object size detection). See ultralytics for more information. Defaults to 16.
loss_factor (float, optional) – divide yolo loss value (important for mixed precision to keep it below a certain range). Defaults to 1.

Attributes

criterion (v8DetectionLoss): Yolo loss from ultralytics.
args (Any) : ultralytics Yolo’s configuration params.
pad_requirements (int) : padding requirement as per yolo (by default the image shape must be a multiple of 32, but this differs for the p2 or p6 variants). Note that it is set automatically.

Properties

device (Literal["cuda", "cpu"]): model’s device

Methods

build_results(raw_outputs, prebuild_outputs)[source]

Transform model outputs into Batch BboxFormat for results.

Parameters:

raw_outputs (List[Tensor]) – Model outputs.
prebuild_outputs (Tensor) – Extracted boxes from outputs in eval mode.

Returns:

Batched predictions.

Return type:

BatchedFormats

compute_loss(raw_outputs, targets)[source]

Compute loss with predictions & targets.

Parameters:

raw_outputs (Any) – Raw output of model.
targets (DetectionFormat) – Targets in YOLO format.

Returns:

Loss dict with total loss (key: “loss”) & sublosses.

Return type:

Dict[str, Tensor]

get_predictions(images)[source]

Prepare images, Apply YOLO forward pass and build results.

Parameters:

images (Tensor) – RGB images Tensor.

Returns:

Predictions for images as BatchedFormats.

Return type:

BatchedFormats

prepare(images, targets=None)[source]

Pad image / targets to satisfy the yolo divisibility-by-32 criterion and convert targets to yolo format. If no targets are passed, simply returns images

Parameters:

images (Tensor) – batched images [N, 3, H, W]
targets (Union[BatchedFormat, None])

Returns:

Either : images_padded, yolo_targets OR images_padded

Return type:

Union[Tuple[Tensor, Dict], Tensor]

prepare_target(targets, img_size)[source]

Return target from BatchedFormat to ultralytics yolo format.

Parameters:

targets (BatchedFormat)
img_size (Tuple[int, int])

Returns:

target as per ultralytics Yolo format.

Return type:

Dict[str, Tensor]

retrieve_spatial_size(raw_outputs)[source]

Retrieve image shape from raw_outputs and stride values.

Parameters:

raw_outputs (List[Tensor]) – Raw outputs from YOLO model.

Returns:

Size of input image (H, W).

Return type:

Tuple[int]

run_forward(images, targets)[source]

Compute loss from images and if target passed, compute loss & return both loss dict and results.

Parameters:

images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.

Returns:

Loss dict.
If predict: predictions.

Return type:

Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]

box_nms_filter(format)[source]

Filter Format according to nms threshold from Configuration()

Parameters:: format (Format)
Returns:: Format
Return type:: BaseFormat

confidence_filter(format)[source]

Filter Format according to confidence threshold from Configuration()

Parameters:: format (Format)
Returns:: Format
Return type:: BaseFormat

normalize_boxes(boxes, img_size)[source]

Normalize boxes to [0, 1] according to the image size (h, w)

Parameters:

boxes (Tensor) – boxes Tensor
img_size (Tuple[int, int]) – image size

Returns:

normalized boxes

Return type:

Tensor
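
A minimal sketch of this normalization on plain lists is shown below. It assumes boxes in XYXY pixel coordinates and img_size given as (H, W). Both are assumptions, and normalize_boxes_sketch is a hypothetical stand-in for the tensor-based function.

```python
def normalize_boxes_sketch(boxes, img_size):
    """Divide x coordinates by the width and y coordinates by the height."""
    height, width = img_size
    return [
        [x1 / width, y1 / height, x2 / width, y2 / height]
        for x1, y1, x2, y2 in boxes
    ]


normalized = normalize_boxes_sketch([[10, 20, 30, 40]], img_size=(100, 200))
```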

yolo_pad_requirements(h, w, required=32)[source]

Compute pad coordinates to ensure the /32 or /64 yolo criterion on height and width

Parameters:

h (int) – height
w (int) – width

Returns:

top, left, right, bottom

Return type:

Tuple[int, int, int, int]
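
The arithmetic behind such a padding computation can be sketched as follows. The return order (top, left, right, bottom) follows the description above, but splitting the padding evenly between both sides is an assumption, and pad_to_multiple is a hypothetical name.

```python
def pad_to_multiple(h, w, required=32):
    """Padding that makes h and w multiples of `required`."""
    pad_h = (-h) % required  # missing rows to reach the next multiple
    pad_w = (-w) % required  # missing columns to reach the next multiple
    # Assumption: padding is shared between both sides of each axis.
    top = pad_h // 2
    bottom = pad_h - top
    left = pad_w // 2
    right = pad_w - left
    return top, left, right, bottom


pads = pad_to_multiple(100, 70, required=32)
```

For a 100 x 70 image, 28 rows and 26 columns are added so that the padded shape is 128 x 96.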

deepvisiontools.models.yoloseg

class YoloSeg[source]

Yolo detection model. data_type must be either bbox or instance_mask to use this model.

Parameters:

architecture (Literal["yolon", "yolom", "yolol", "yolox"], optional) – Yolo model size. You can add “-p2” or “-p6” to load the p2 or p6 variants. Defaults to “yolon”.
pretrained (bool, optional) – Use pretrained weights. Defaults to True.
reg_max (int, optional) – reg_max argument of yolo models (impacts object size detection). See ultralytics for more information. Defaults to 16.
loss_factor (float, optional) – divide yolo loss value (important for mixed precision to keep it below a certain range). Defaults to 1.

Attributes

criterion (v8DetectionLoss): Yolo loss from ultralytics.
args (Any) : ultralytics Yolo’s configuration params.
pad_requirements (int) : padding requirement for yoloseg (by default the image shape must be a multiple of 32)
mask_logit_threshold (int) : mask logit threshold to consider if pixel is class or background. Default is 0.5 but can be changed.

Properties

device (Literal["cuda", "cpu"]): model’s device

Methods

build_results(raw_outputs, get_logit=False)[source]

Transform model outputs into Batch InstanceMaskFormat for results.

Parameters:

raw_outputs (List[Tensor]) – Model outputs.
get_logit (bool)

Returns:

Batched predictions.

Return type:

BatchedFormats

compute_loss(predictions, target)[source]

Compute loss with predictions & targets.

Parameters:

predictions (Any) – Raw output of model.
target (Dict[Any, Any]) – Targets in YOLO format.

Returns:

Loss dict with total loss (key: “loss”) & sublosses.

Return type:

Dict[str, Tensor]

get_predictions(images)[source]

Prepare images, Apply YOLO forward pass and build results.

Parameters:

images (Tensor) – RGB images Tensor.

Returns:

Predictions for images as BatchedFormats.

Return type:

BatchedFormats

prebuild_output(raw_outputs)[source]

Unpack Yolo-seg (eval mode) raw results.

Parameters:

raw_outputs (Tuple[Tensor, ...]) – Yolo raw eval mode results.

Returns:

boxes (N_batch, N_obj, cxcywh).
cls_scores (N_batch, N_cls).
mask_weights (N_batch, N_obj, 32).
protos (N_batch, protos).

Return type:

Tuple[Tensor, ...]

prepare(images, targets=None)[source]

Pad image / targets to satisfy the yolo divisibility-by-32 criterion and convert targets to yolo format. If no targets are passed, simply returns images

Parameters:

images (Tensor) – batched images [N, 3, H, W]
targets (Union[BatchedFormat, None])

Returns:

Either : images_padded, yolo_targets OR images_padded

Return type:

Union[Tuple[Tensor, Dict], Tensor]

prepare_target(targets)[source]

Transform SegmentationFormat targets into yolo-seg targets format.

Parameters:

targets (BatchedFormats) – Batch targets.

Returns:

Targets in YOLO format.

Return type:

Dict[str, Tensor]

retrieve_spatial_size(raw_outputs)[source]

Retrieve image shape from raw_outputs and stride values.

Parameters:

raw_outputs (List[Tensor]) – Raw outputs from YOLO model.

Returns:

Size of input image (H, W).

Return type:

Tuple[int]

run_forward(images, targets)[source]

Compute loss from images and if target passed, compute loss & return both loss dict and results.

Parameters:

images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.

Returns:

Loss dict.
If predict: predictions.

Return type:

Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]

proto2mask(protos, weights, boxes, shape)[source]

Combine protos and weights to get masks, then crop instances from boxes (Useful in predictions).

Parameters:

protos (Tensor) – Sub masks (32, …).
weights (Tensor) – YOLO mask weights (32, …).
boxes (Tensor) – Boxes (N, 4) in XYXY format.
shape (Tuple[int]) – Original image size (H, W).

Returns:

YOLO segmentation mask.

Return type:

Tensor
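The "combine, then crop" step described for proto2mask can be illustrated on nested lists. This is a simplified sketch, not the library's code. It makes several assumptions: a sigmoid is applied to the weighted sum, the box is treated as half-open in pixel coordinates, there are two prototypes instead of 32, and the final resize to the original image size is omitted. The function name combine_and_crop is hypothetical.

```python
import math
from typing import List, Sequence

Grid = List[List[float]]


def combine_and_crop(
    protos: Sequence[Grid],
    weights: Sequence[float],
    box_xyxy: Sequence[int],
) -> Grid:
    """Weighted sum of prototype maps, sigmoid, then zero everything outside the box."""
    height, width = len(protos[0]), len(protos[0][0])
    x1, y1, x2, y2 = box_xyxy
    mask = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not (x1 <= x < x2 and y1 <= y < y2):
                continue  # pixels outside the box stay at 0
            logit = sum(w * p[y][x] for w, p in zip(weights, protos))
            mask[y][x] = 1.0 / (1.0 + math.exp(-logit))
    return mask


# Two 2 x 3 prototypes and one instance whose box covers the left two columns.
protos = [
    [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
    [[0.0, 2.0, 0.0], [2.0, 0.0, 0.0]],
]
mask = combine_and_crop(protos, weights=[2.0, -1.0], box_xyxy=[0, 0, 2, 2])
```

Thresholding the result (for example at 0.5) would then give a binary instance mask.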

deepvisiontools.preprocessing

build_preprocessing(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])[source]

Default values are from ImageNet.

Parameters:

mean (List[float], optional) – Mean values for each channel. Defaults to [0.485, 0.456, 0.406].
std (List[float], optional) – Std values for each channel. Defaults to [0.229, 0.224, 0.225].

Return type:

T.Compose
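The mean and std arguments are the usual per-channel normalization statistics. Assuming the returned pipeline applies the standard (value - mean) / std normalization to images scaled to [0, 1], the effect on a single RGB pixel can be sketched as follows. The helper normalize_pixel is hypothetical and only illustrates the arithmetic.

```python
from typing import List, Sequence

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(
    rgb: Sequence[float],
    mean: Sequence[float] = IMAGENET_MEAN,
    std: Sequence[float] = IMAGENET_STD,
) -> List[float]:
    """Apply (value - mean) / std independently to each channel of one pixel."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
centered = normalize_pixel([0.485, 0.456, 0.406])
white = normalize_pixel([1.0, 1.0, 1.0])
```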

get_channels_statistics(image_folder)[source]

Iterate over an image folder and output the mean and std of each channel for the dataset of images.

Parameters:

image_folder (str) – Path to folder of images.

Returns:

Values for mean and std.

Return type:

Tuple[List[float]]
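Per-channel statistics of this kind can be accumulated in a single pass with running sums. The sketch below is one way to do it, not necessarily how get_channels_statistics works: it assumes statistics are pooled over every pixel of every image, uses the population standard deviation, and represents images as flat lists of RGB tuples instead of files on disk.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Pixel = Sequence[float]


def channel_statistics(images: Iterable[Iterable[Pixel]]) -> Tuple[List[float], List[float]]:
    """Return per-channel mean and population std over every pixel of every image."""
    count = 0
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    for image in images:
        for pixel in image:
            count += 1
            for c, value in enumerate(pixel):
                sums[c] += value
                sq_sums[c] += value * value
    means = [s / count for s in sums]
    # Var[x] = E[x^2] - E[x]^2, clamped at 0 to absorb rounding error.
    stds = [math.sqrt(max(sq / count - m * m, 0.0)) for sq, m in zip(sq_sums, means)]
    return means, stds


# Two tiny "images", each a flat list of RGB pixels in [0, 1].
images = [
    [(0.0, 0.5, 1.0), (1.0, 0.5, 0.0)],
    [(0.0, 0.5, 1.0), (1.0, 0.5, 0.0)],
]
means, stds = channel_statistics(images)
```

The resulting lists have the same layout as the mean and std arguments of build_preprocessing.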

load_image(image_path)[source]

Load image using torchvision. Handles png, tiff, jpg, jpeg extensions.

Parameters:

image_path (str) – Path to image.

Returns:

Image in torch Tensor [3, H, W].

Return type:

Tensor

load_mask(mask_path)[source]

Load mask using torchvision. Handles png, tiff, jpg, jpeg extensions.

Parameters:

mask_path (str | Path) – Path to mask.

Returns:

image in torch Tensor [3, H, W].

Return type:

Tensor

save_image(image, path)[source]

Transform image in PIL format and save to given path.

Parameters:

image (Tensor | Image)
path (str | Path)

save_mask(mask, path)[source]

Transform mask in PIL format and save to given path.

Parameters:

mask (Tensor | Image)
path (str | Path)

deepvisiontools.train

class Aggregator[source]

Aggregates losses across batches.

Attributes:

iterations

Number of iterations.

Type: int

losses

Dictionary of epoch losses (over iterations).

Type: Dict[str, Tensor]

Methods:

compute()[source]

Return loss dict with values divided by iterations (mean across samples).

Returns:

Losses over iterations.

Return type:

Dict[str, Tensor]

update(batch_losses)[source]

Update internal loss dict with new losses.

Parameters:

batch_losses (Dict[str, Tensor]) – Dict of losses.
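The update / compute contract described above (sum losses per key, then divide by the number of iterations) can be sketched in a few lines. This stand-in uses plain floats instead of Tensors so that it runs without torch. The class name LossAggregator and the key "loss_box" are hypothetical, and this is not the library's implementation.

```python
from typing import Dict


class LossAggregator:
    """Illustrative stand-in for Aggregator, using floats instead of Tensors."""

    def __init__(self) -> None:
        self.iterations = 0
        self.losses: Dict[str, float] = {}

    def update(self, batch_losses: Dict[str, float]) -> None:
        """Add one batch of losses to the running sums."""
        self.iterations += 1
        for name, value in batch_losses.items():
            self.losses[name] = self.losses.get(name, 0.0) + value

    def compute(self) -> Dict[str, float]:
        """Return each summed loss divided by the number of iterations."""
        return {name: total / self.iterations for name, total in self.losses.items()}


agg = LossAggregator()
agg.update({"loss": 1.0, "loss_box": 0.5})
agg.update({"loss": 3.0, "loss_box": 1.5})
epoch_losses = agg.compute()
```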

class Trainer[source]

Class that handles training in deepvisiontools. Handles train / valid epochs, monitoring (via tensorboard) and metrics computation.

Parameters:

model (BaseModel) – deepvisiontools model.
optimizer (Optimizer) – torch optimizer (Ex: Adam())
metrics (List[Union[DetectMetric, ClassWiseDetectMetric, SemanticSegmentationMetric, ClassifMetric]], optional) – List of deepvisiontools metrics. Check available metrics in deepvisiontools.metrics.available_metrics. Defaults to [].
log_dir (str, optional) – tensorboard output directory. If “” no monitoring is provided. Defaults to “”.

Example:

Attributes

model (BaseModel): deepvisiontools model.
optimizer (Optimizer): torch optimizer (Ex: Adam())
metrics (List[Union[DetectMetric, ClassWiseDetectMetric, SemanticSegmentationMetric, ClassifMetric]], optional): List of deepvisiontools metrics. Check available metrics in deepvisiontools.metrics.available_metrics. Defaults to [].
board (SummaryWriter): tensorboard writer for the output directory.

Properties

device (Literal["cpu", "cuda"]): the setter moves everything that’s needed to the desired device.

Methods

epoch(loader, ep_number, tag='')[source]

Run training epoch.

Parameters:

loader (DeepVisionLoader) – DeepVisionLoader.
ep_number (int) – Epoch number.
tag (str, optional) – Tag to link to epoch. Defaults to “”.

Returns:

Epoch values (Losses & metrics).

Return type:

Dict[str, Tensor]

log_string(epoch_dict)[source]

Transform epoch dict into a string.

Parameters:

epoch_dict (Dict[str, Tensor]) – Dict of epoch values to display.

Returns:

String to print with epoch values.

Return type:

str
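The exact string layout produced by log_string is not documented here. As an illustration of the idea only, the sketch below joins each key with its value formatted to four decimals. The function name format_epoch, the separator and the precision are assumptions, and floats stand in for Tensors.

```python
from typing import Dict


def format_epoch(epoch_dict: Dict[str, float]) -> str:
    """Render epoch values as 'name: value' pairs separated by ' | '.

    Hypothetical format for illustration; not the library's actual output.
    """
    return " | ".join(f"{name}: {float(value):.4f}" for name, value in epoch_dict.items())


line = format_epoch({"loss": 0.5, "f1": 0.75})
```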

train_epoch(loader, ep_number, tag='Train')[source]

Run train epoch.

Parameters:

loader (DetectionLoader) – DetectionLoader.
ep_number (int) – Epoch number.
tag (str, optional) – Tag to link to epoch. Defaults to “Train”.

Returns:

Epoch values (Losses).

Return type:

Dict[str, Tensor]

train_step(images, targets, scaler)[source]

Run forward pass, loss computation and backward pass.

Parameters:

images (Tensor) – Batch images
targets (BatchedFormat) – Batch targets.
scaler (GradScaler)

Returns:

Dict of losses (total loss at key ‘loss’).

Return type:

Dict[str, Tensor]

valid_epoch(loader, ep_number, tag='Valid')[source]

Run validation epoch.

Parameters:

loader (DetectionLoader) – DetectionLoader.
ep_number (int) – Epoch number.
tag (str, optional) – Tag to link to epoch. Defaults to “Valid”.

Returns:

Epoch values (Losses & metrics).

Return type:

Dict[str, Tensor]

valid_step(images, targets, scaler)[source]

Run forward, compute metrics, return loss dict and metrics.

Parameters:

images (Tensor) – Batch images.
targets (BatchedFormat) – Targets.
scaler (GradScaler)

Returns:

Losses and metrics values.

Return type:

Tuple[Dict[str, Tensor], Dict[str, Dict[str, Tensor]]]
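A typical way to drive the methods above is to alternate train_epoch and valid_epoch inside an outer loop. The sketch below relies only on the documented method names and argument order. The run_training helper, the FakeTrainer stand-in (used so the loop runs without the library) and the choice to number epochs from 0 are assumptions.

```python
from typing import Any, Dict, List


def run_training(
    trainer: Any, train_loader: Any, valid_loader: Any, n_epochs: int
) -> List[Dict[str, Any]]:
    """Alternate train and validation epochs, collecting the returned dicts."""
    history = []
    for ep_number in range(n_epochs):
        train_values = trainer.train_epoch(train_loader, ep_number)
        valid_values = trainer.valid_epoch(valid_loader, ep_number)
        history.append({"train": train_values, "valid": valid_values})
    return history


class FakeTrainer:
    """Stand-in exposing the documented method names, used only to exercise the loop."""

    def train_epoch(self, loader, ep_number, tag="Train"):
        return {"loss": 1.0 / (ep_number + 1)}

    def valid_epoch(self, loader, ep_number, tag="Valid"):
        return {"loss": 2.0 / (ep_number + 1)}


history = run_training(FakeTrainer(), train_loader=[], valid_loader=[], n_epochs=2)
```

With a real Trainer, the loaders would be the library's loader objects and the returned dicts would hold Tensors.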

deepvisiontools.utils

class Visualizer[source]

From image and target generates a visualization image as Tensor.

Parameters:

image (Tensor) – Original image
target (BaseFormat) – Target to visualize
categories (Dict[int, str], optional) – Categories to be used as labels as Dict[int, str]. If set to None will use label indexes. Defaults to None.
save_path (Union[str, Path], optional) – Path to save visualization, if set to “” will not save the visualization. Defaults to “”.
class_colors (List[Sequence[float]], optional) – Colors to be used for classes. Needs to be RGB normalized (divided by 255). Defaults to cc.glasbey_bw.
instance_colors (List[Sequence[float]], optional) – Colors to be used for instances (see class colors constraints). Defaults to cc.glasbey_hv.
desired_min_size (int, optional) – Resize to this specific min size value (preserving shape). Defaults to 1200.
show (bool, optional) – Whether to display it on the fly. Defaults to False.
window_mode (Literal["dual", "single"], optional) – If “dual”, provide a combination of the image and the visualization; otherwise provide only the visualization. Defaults to “dual”.
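Custom palettes passed as class_colors or instance_colors need to be RGB values divided by 255. A minimal sketch of that conversion from 8-bit colors follows. The helper name normalize_rgb and the rounding to six decimals are illustrative choices, not library behavior.

```python
from typing import List, Sequence


def normalize_rgb(color: Sequence[int]) -> List[float]:
    """Scale an 8-bit (R, G, B) color to floats in [0, 1]."""
    return [round(channel / 255, 6) for channel in color]


# Build a small custom palette from 8-bit colors.
palette = [normalize_rgb(c) for c in [(215, 0, 0), (0, 255, 0), (0, 0, 255)]]
```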

visualization(image, target, categories=None, save_path='', class_colors=cc.glasbey_bw, instance_colors=cc.glasbey_hv, desired_min_size=1200, show=False, window_mode='dual')[source]

From image and target generates a visualization image as Tensor.

Parameters:

image (Tensor) – Original image
target (BaseFormat) – Target to visualize
categories (Dict[int, str], optional) – Categories to be used as labels as Dict[int, str]. If set to None will use label indexes. Defaults to None.
save_path (Union[str, Path], optional) – Path to save visualization, if set to “” will not save the visualization. Defaults to “”.
class_colors (List[Sequence[float]], optional) – Colors to be used for classes. Needs to be RGB normalized (divided by 255). Defaults to cc.glasbey_bw.
instance_colors (List[Sequence[float]], optional) – Colors to be used for instances. Defaults to cc.glasbey_hv.
desired_min_size (int, optional) – Resize to this specific min size value (preserving shape). Defaults to 1200.
show (bool, optional) – Whether to display it on the fly. Defaults to False.
window_mode (Literal["dual", "single"], optional) – If “dual”, provide a combination of the image and the visualization; otherwise provide only the visualization. Defaults to “dual”.

Returns:

visualization Tensor

Return type:

Tensor
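The desired_min_size parameter is described as a resize to a given minimum size that preserves shape. Assuming this means scaling the image so that its smaller side equals desired_min_size while keeping the aspect ratio, the target size can be computed as below. The helper name resized_shape is hypothetical, and the library's exact rounding may differ.

```python
from typing import Tuple


def resized_shape(height: int, width: int, desired_min_size: int = 1200) -> Tuple[int, int]:
    """Scale (H, W) so the smaller side equals `desired_min_size`, keeping aspect ratio."""
    scale = desired_min_size / min(height, width)
    return round(height * scale), round(width * scale)


upscaled = resized_shape(600, 800)
downscaled = resized_shape(2400, 3600)
```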