deepvisiontools.config
- class Configuration[source]
Configuration class for deepvisiontools. It is a singleton: every instantiation points to the same object. It stores all configuration settings of the library. If you wish to change parameters later on, simply modify the corresponding attributes.
- Parameters:
device (Literal["cpu", "cuda"], optional) – Device to be used by default when creating objects, running models etc. Defaults to “cpu”.
data_type (Literal["instance_mask", "bbox", "keypoint", "semantic_mask"], optional) – Default format to use in datasets, models, predictions etc. Defaults to “bbox”.
num_classes (int, optional) – Number of classes in the model. Defaults to 1.
mask_min_size (int, optional) – Minimal size of a mask to be considered: annotations below this threshold are ignored. Defaults to 15.
semantic_mask_logits_combination (Literal["avg", "min", "max"]) – How to combine logits in patchification or when adding semantic masks: avg takes the mean, min takes the minimum and max takes the maximum.
splitted_mask_handling (bool, optional) – If set to True, redefine masks that are split (after cropping for example) so that they belong to independent objects. Defaults to False.
model_nms_threshold (float, optional) – Default NMS IoU threshold used in models. Defaults to 0.45.
model_confidence_threshold (float, optional) – Default model confidence threshold above which a prediction is considered a valid object. Defaults to 0.5.
model_max_detection (int, optional) – Maximum number of objects output by a model (useful for some models such as YOLO-type models). Defaults to 300.
metrics_matcher_type (Literal["bbox", "instance_mask"], optional) – Object matcher data type used in metrics (note that instance_mask is slower because it needs a transfer to cpu to save gpu memory). If data_type is instance_mask and the matcher type is bbox, masks are converted to bboxes for matching in metrics. Defaults to “bbox”.
metrics_match_iou_threshold (float, optional) – Metrics matcher IoU threshold. Defaults to 0.45.
patchifier_mode (Literal["bbox", "instance_mask"], optional) – Patchifier data type used for NMS and duplicate suppression. If data_type is a mask and patchifier_mode is bbox, the data are converted for the corresponding operations. Defaults to “bbox”.
seed (Union[False, int], optional) – Use a manual seed to enforce reproducibility (you probably want to also switch deterministic to True in that case). If False, reproducibility is not enforced. Defaults to False.
deterministic (bool, optional) – Use deterministic algorithms. Further helps reproducibility (see also seed). Be careful: some models cannot be deterministic, so you sometimes need to switch it to False even if you are manually seeding. Defaults to False.
Example:
>>> from deepvisiontools import Configuration
>>> config = Configuration(data_type="instance_mask")  # can instantiate with a given parameter
>>> config.device = "cuda"  # parameters can be changed by modifying attributes / properties
- Attributes
data_type (Literal["instance_mask", "bbox", "keypoint", "semantic_mask"], optional): Default format to use in datasets, models, predictions etc. Defaults to “bbox”.
num_classes (int): Number of classes in the model. Defaults to 1.
mask_min_size (int): Minimal size of a mask to be considered: annotations below this threshold are ignored. Defaults to 15.
semantic_mask_logits_combination (Literal["avg", "min", "max"]): How to combine logits in patchification or when adding semantic masks: avg takes the mean, min takes the minimum and max takes the maximum.
splitted_mask_handling (bool): If set to True, redefine masks that are split (after cropping for example) so that they belong to independent objects. Defaults to False.
model_nms_threshold (float): Default NMS IoU threshold used in models. Defaults to 0.45.
model_confidence_threshold (float): Default model confidence threshold above which a prediction is considered a valid object. Defaults to 0.5.
model_max_detection (int): Maximum number of objects output by a model (useful for some models such as YOLO-type models). Defaults to 300.
metrics_matcher_type (Literal["bbox", "instance_mask"]): Object matcher data type used in metrics. If data_type is instance_mask and the matcher type is bbox, masks are converted to bboxes for matching in metrics. Defaults to “bbox”.
metrics_match_iou_threshold (float): Metrics matcher IoU threshold. Defaults to 0.45.
patchifier_mode (Literal["bbox", "instance_mask"]): Patchifier data type used for NMS and duplicate suppression. If data_type is a mask and patchifier_mode is bbox, the data are converted for the corresponding operations. Defaults to “bbox”.
- Properties
device (Literal["cpu", "cuda"]): Device to be used by default when creating objects, running models etc. Defaults to “cpu”.
seed (Union[False, int], optional): Use a manual seed to enforce reproducibility (you probably want to also switch deterministic to True in that case). If False, reproducibility is not enforced. Defaults to False.
deterministic (bool, optional): Use deterministic algorithms. Further helps reproducibility (see also seed). Be careful: some models cannot be deterministic, so you sometimes need to switch it to False even if you are manually seeding. Defaults to False.
Notes
1) If you use instance_mask, bboxes are derived from the masks when needed.
2) You can change model_nms_threshold and model_confidence_threshold for the entire library by modifying the attributes.
3) In instance mode, by default the target Format removes small objects whose masks contain fewer pixels than mask_min_size (default 15). Change the attribute to modify this behaviour.
4) The option splitted_mask_handling is False by default. If you set it to True, when a transformation splits an object mask into disconnected sub-masks, the library creates a new object for every sub-mask. Otherwise the sub-masks still describe the same unique object.
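The singleton behaviour described for Configuration can be illustrated with a minimal sketch. `SketchConfig` and its three options are hypothetical stand-ins, not the library's implementation; the sketch only shows how every instantiation can return the same object while still accepting overrides.

```python
class SketchConfig:
    """Hypothetical stand-in for a singleton configuration object."""

    _instance = None
    _defaults = {"device": "cpu", "data_type": "bbox", "num_classes": 1}

    def __new__(cls, **overrides):
        # Create the shared instance once, then always return it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            for name, value in cls._defaults.items():
                setattr(cls._instance, name, value)
        return cls._instance

    def __init__(self, **overrides):
        # Only the options passed explicitly are updated; others are kept.
        for name, value in overrides.items():
            if name not in self._defaults:
                raise AttributeError(f"unknown option: {name}")
            setattr(self, name, value)


first = SketchConfig(data_type="instance_mask")
second = SketchConfig()   # same object as `first`
second.device = "cuda"    # visible through every reference
```

Because both names refer to the same object, a change made through one reference is seen everywhere the configuration is read.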
deepvisiontools.data
- class AbstractBatchAugmenter[source]
Abstract class for augmentations applied within the DataLoader (augmentations that combine elements of a batch, such as mosaic-type augmentation). Note: these augmentations always come after the normal augmentations, which are implemented in the Dataset rather than in the DataLoader.
- class Augmentation[source]
Class that handles augmentation in the dataset. Calls the methods specific to each Format (data_type).
- Parameters:
augmentations (List[Transform]) – List of torchvision.transforms.v2 Transform classes (or from deepvisiontools.data.additional_augmentations).
- class DeepVisionDataset[source]
Detection dataset class for deepvisiontools : load and return image, annotation, image name.
- Parameters:
dataset_path (Union[str, Path]) – path to dataset folder.
reader (BaseReader, optional) – Class to read data from dataset folder. Defaults to CocoReader.
preprocessing (Callable, optional) – Preprocessing images (normalization). Defaults to build_preprocessing().
augmentation (List[Transform], optional) – Augmentation to apply to images / annotations. Must be from torchvision.transforms.v2.Transform. Defaults to None.
label_converter (Dict[int, int], optional) – Convert labels to another value, e.g. {0: 2, 1: 5}. Defaults to None.
category_ids (Union[Dict[int, str], None])
Example:
>>> from deepvisiontools import DeepVisionDataset
>>> data_path = "path/to/data"
>>> dataset = DeepVisionDataset(data_path)
>>> image, target, image_name = dataset[1]
>>> print(type(image), type(target), type(image_name))
<class 'torch.Tensor'>, <class 'BboxFormat'>, <class 'str'>
>>> print(image.shape, target.size, image_name)
torch.Size([3,512,512]), 5, 'img_01.png'
- Attributes
dataset_path (Path): path to dataset folder.
reader (BaseReader): Class to read data from dataset folder. Defaults to CocoReader.
preprocessing (Callable): Preprocessing images (normalization). Defaults to build_preprocessing().
augmentation (List[Transform]): Augmentation to apply to images / annotations. Must be from torchvision.transforms.v2.Transform. Defaults to None.
category_ids (Dict[int, str]): Dict that associates a name to a category label index. Defaults to self.reader.category_ids.
label_converter (Dict[int, int]): Convert labels to another value, e.g. {0: 2, 1: 5}. Defaults to None.
Methods:
- export_dataset(destination_folder, number_visu='all', file_extension='')[source]
Export the dataset according to the BaseReader class. For example, CocoReader will export in the following structure: Dataset Name -> Image_dir, coco_annotations.json
- Parameters:
destination_folder (Union[str, Path]) – Path to new dataset folder.
number_visu (Union[Literal["all"], int], optional) – number of visualizations to create. If “all”, all of them are created. Defaults to “all”.
file_extension (str, optional) – specific file extension to use, if required. If “”, the BaseReader’s extension is used. Defaults to “”.
- keep_indexes(indexes)[source]
Filter dataset by keeping only indices given in arg.
- Parameters:
indexes (Union[list, slice, Tensor]) – can be a slice, Tensor or list. To use a slice, pass slice(i, j) with i, j the desired slice bounds.
- Return type:
- split(sequence)[source]
split dataset in 3 new datasets according to proportions
- Parameters:
sequence (Sequence[float, float, float]) – proportions to split the dataset into. Sum must be 1.
- Return type:
Tuple[DeepVisionDataset, DeepVisionDataset, DeepVisionDataset]
Example:
>>> dataset = DeepVisionDataset("path/to/dataset")
>>> train_dataset, valid_dataset, test_dataset = dataset.split((0.6, 0.2, 0.2))
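As a rough illustration of what splitting by proportions involves, the sketch below turns three proportions into three subset sizes. `split_sizes` is a hypothetical helper; how the library rounds the sizes or shuffles the samples is not stated here, so the rounding rule is an assumption.

```python
import math


def split_sizes(n_items, proportions):
    """Return (train, valid, test) sizes for proportions that sum to 1.

    Assumption: the first two sizes are rounded and the last subset
    receives whatever is left, so the sizes always add up to n_items.
    """
    if len(proportions) != 3:
        raise ValueError("exactly three proportions are expected")
    if not math.isclose(sum(proportions), 1.0, abs_tol=1e-6):
        raise ValueError("proportions must sum to 1")
    n_train = round(n_items * proportions[0])
    n_valid = round(n_items * proportions[1])
    n_test = n_items - n_train - n_valid
    return n_train, n_valid, n_test


sizes = split_sizes(10, (0.6, 0.2, 0.2))
```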
- class DeepVisionLoader[source]
Child class of DataLoader that batchifies images and BaseFormats. DeepVisionLoader supports any feature of torch DataLoaders (Sampler, etc.).
- Parameters:
*args
*kwargs
Example:
>>> from deepvisiontools import DeepVisionLoader
>>> loader = DeepVisionLoader(dataset, batch_size=2)
>>> for batch in loader:
...     img, target, img_name = batch
Methods:
- collate_fn(batch)[source]
- Parameters:
batch (List[Tuple[Tensor, BaseFormat]]) – List of image/target pairs.
- Returns:
Batch images (N, 3, H, W).
BaseFormats wrapped into BatchedFormats class.
- Return type:
Tuple[Tensor, BatchedFormats]
- pad_to_larger(images, targets)[source]
Pad images and targets to larger image size.
- Parameters:
images (List[Tensor]) – Images.
targets (List[BaseFormat]) – Targets.
- Return type:
Tuple[List[Tensor], List[BaseFormat]]
- class MosaicBatchAugmenter[source]
This batch augmentation generates a mosaic containing n images from a batch (patches of images / targets are mixed into one image). If the requested number of images is larger than the batch size, it falls back to a smaller possibility (e.g. n = 4 with batch_size = 3 -> n becomes 2). If the number of images to be mixed is smaller than batch_size, new mosaics are created when possible: e.g. with batch_size = 5 and n = 2, 2 mosaics are generated from the first 2 images, then 2 more from the next 2 images, and 1 image remains. The remaining images are left untouched.
- Args:
mixed_img_numb (Literal[1, 2, 4, 6, 8, 9, 12], optional): Number of images per mosaic. Defaults to 2.
probability (float, optional): Probability of applying the augmentation. Defaults to 0.5.
- Parameters:
mixed_img_numb (Literal[1, 2, 4, 6, 8, 9, 12])
probability (float)
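The grouping rule described above can be sketched in plain Python. `plan_mosaics` is a hypothetical helper that only decides which image indices are mixed together. It assumes that consecutive images are grouped and that the fallback picks the largest allowed value that fits. It says nothing about how the library builds the mosaic pixels.

```python
# Values allowed for mixed_img_numb, as listed in the class signature.
ALLOWED = (1, 2, 4, 6, 8, 9, 12)


def plan_mosaics(batch_size, mixed_img_numb):
    """Return (n, groups, untouched) for a batch of `batch_size` images."""
    # Fall back to the largest allowed value that fits in the batch.
    n = max(k for k in ALLOWED if k <= min(mixed_img_numb, batch_size))
    n_groups = batch_size // n
    groups = [list(range(i * n, (i + 1) * n)) for i in range(n_groups)]
    # Images that do not fill a complete group are left as they are.
    untouched = list(range(n_groups * n, batch_size))
    return n, groups, untouched


fallback = plan_mosaics(batch_size=3, mixed_img_numb=4)
several = plan_mosaics(batch_size=5, mixed_img_numb=2)
```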
- class RandomCenterCropAndResize[source]
With a given probability, apply CenterCrop and Resize from torchvision.transforms.v2. NB: the resize is applied only, and always, when the crop is applied.
- Args:
crop (Union[int, Sequence[int]]): Size to crop
resize (Union[int, Sequence[int]]): Size to resize
p (float, optional): probability. Defaults to 0.5.
- Parameters:
crop (Sequence[int])
resize (Sequence[int])
- class RandomChangeBackground[source]
With a given probability p, swap the image background. The new background is taken from an image folder whose path is provided.
Note 1: it is implemented only for the instance_mask, semantic_mask and bbox data types.
Note 2: the new background image type must be one of .jpg, .jpeg, .png, .tif, .tiff, .PNG, .JPG, .JPEG, .TIF, .TIFF.
- Parameters:
background_dir_path (Union[str, Path]) – Path to background folder.
p (float, optional) – Probability. Defaults to 0.5.
- class RandomCropAndResize[source]
With a given probability, apply RandomCrop and Resize from torchvision.transforms.v2. NB: the resize is applied only, and always, when the crop is applied.
- Parameters:
crop (Union[int, Sequence[int]]) – Size to crop
resize (Union[int, Sequence[int]]) – Size to resize
p (float, optional) – probability. Defaults to 0.5.
Methods:
- class RandomPadAndResize[source]
With a given probability, apply Pad and Resize from torchvision.transforms.v2. This produces a zoom-out effect by decreasing the spatial resolution. NB: the resize is applied only, and always, when the padding is applied.
- Args:
MaxPad (Union[int, Sequence[int]]): maximum padding bounds. Can be an int (common padding bound for all borders) or a sequence of 4 ints for (t, l, b, r).
resize (Tuple[int, int]): Size to resize
p (float, optional): probability to apply transformation. Defaults to 0.5.
- Parameters:
maxpad (Sequence[int])
resize (Tuple[int, int])
deepvisiontools.data.data_reader
- class BaseReader[source]
Base class for readers. The __len__ and __getitem__ methods must be implemented in the concrete class. The concrete class must implement a category_id property that returns a Dict[int, str], where int is the label and str the category name. It must have a class attribute describing the annotation file type (“json” for a json file, “png” for an image etc.). It must also implement the export_annotation and group_export methods. See the CocoReader class for a concrete implementation.
- class CocoReader[source]
Child class of BaseReader. Coco format reader class. Handles dataset with structure:
Dataset Name -> Image_dir, coco_annotations.json
Note : bboxes must be in XYWH format
- Parameters:
annotation_path (Union[str, Path]) – path to json file or to dataset directory.
- Attributes
annot_dict (Dict[Any, Any]): coco dict loaded.
- Properties
category_ids (Dict[int, str]): label / category correspondence
Methods
- export_annotation(image_name, image, format, categories)[source]
from image, image name, categories and target (as BaseFormat) returns a writeable coco dict.
- Parameters:
image_name (str)
image (Tensor)
format (BaseFormat)
categories (Dict[int, str]) – Dict of label / category name correspondence
- Returns:
image_name, coco dict
- Return type:
Tuple[str, Dict[Any, Any]]
- get_img_anns(index)[source]
return from index image as img name, spatial size as Tuple[int, int] (h, w) and all annotations for given image index
- Parameters:
index (int)
- Returns:
img_name, spatial_size, list of coco anns
- Return type:
Tuple[str, Tuple[int, int], List[dict]]
deepvisiontools.formats
- class BaseData[source]
Abstract class for base data.
- abstract apply_augmentation(image, transform)[source]
Needs to be defined in the concrete class: applies an augmentation to the data.
- Parameters:
image (Tensor) – image to augment
transform (Transform) – torchvision transforms v2 augmentation
- Returns:
transformed BaseData, present tensor, transformed image
- Return type:
Tuple[BaseData, Tensor, Tensor]
- class BaseFormat[source]
Base class to wrap BaseData (masks, boxes and others elements) of targets / predictions with labels and scores in deepvisiontools.
- Parameters:
data (BaseData)
labels (Tensor)
scores (Union[Tensor, None], optional) – Defaults to None.
- Properties
device (Literal["cpu", "cuda"]): When changed, moves data, labels and scores to the same device.
data (BaseData): value of data like InstanceMaskData, BboxData etc.
scores (Union[Tensor, None]): scores as a 1d tensor.
labels (Tensor): labels as a 1d tensor.
nb_object (int): number of objects
canvas_size (Tuple[int, int]): Size of associated image (h, w)
Methods:
- apply_augmentation(image, transform)[source]
Apply augmentation. Handles labels as well and image.
- Parameters:
form (BaseFormat)
image (Tensor)
transform (Transform)
- Returns:
augmented format, present Tensor, augmented image
- Return type:
Tuple[BaseFormat, Tensor, Tensor]
- sanitize()[source]
Sanitize the format.
- Returns:
sanitized Format, indices of present objects
- Return type:
Tuple[BaseFormat, Tensor]
- class BatchedFormat[source]
A class that handles a list of Formats
- Parameters:
formats (List[BaseFormat])
- Properties
device (Literal["cpu", "cuda"]): When changed, moves all formats to the same device.
formats (List[BaseFormat]): contains all stored formats.
size (int): number of formats
Methods
- classmethod cat(batches)[source]
batches must be a list of BatchedFormat of the same type.
- Parameters:
batches (List[BatchedFormat])
- class BboxData[source]
Bounding box data class (child of BaseData)
- Parameters:
bbox (Union[BoundingBoxes, Tensor]) – tensor value of bounding box. Shape must be [N, 4]
format (Literal["XYXY", "XYWH", "CXCYWH"], optional) – format of created BoundingBox. Defaults to “XYXY”.
canvas_size (Tuple[int, int], optional) – Size of associated image [h, w]. Defaults to None.
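For reference, the three box formats named above differ only in how the same rectangle is encoded. The helpers below are hypothetical plain-Python conversions on single boxes, not the library's code; they follow the usual meaning of XYXY (two corners), XYWH (top-left corner plus size) and CXCYWH (centre plus size).

```python
def xywh_to_xyxy(box):
    """(x, y, w, h) with a top-left corner -> (x1, y1, x2, y2)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xyxy_to_cxcywh(box):
    """(x1, y1, x2, y2) -> (centre x, centre y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


corners = xywh_to_xyxy((10, 20, 30, 40))
centred = xyxy_to_cxcywh(corners)
```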
- Properties
device (Literal["cpu", "cuda"])
value (BoundingBoxes): Tensor value of bounding box
format (Literal[“XYXY”, “XYWH”, “CXCYWH”]): if changed directly, value is automatically re-derived
nb_object (int): number of objects contained.
canvas_size (Tuple[int, int])
Methods:
- apply_augmentation(image, transform)[source]
Apply transform on self and associated image
- Parameters:
image (Tensor)
transform (Transform)
- Returns:
augmented data, present Tensor, image
- Return type:
Tuple[BboxData, Tensor, Tensor]
- crop(t, l, h, w)[source]
Crop the BboxData object and update values, canvas etc. Note: the XYXY format is forced for compatibility with torchvision functions, and the original format is restored afterwards.
- Parameters:
t (int) – top coordinate of crop
l (int) – left coordinate of crop
h (int) – height value of crop
w (int) – width value of crop
- Return type:
Tuple[BboxData, Tensor]
- classmethod empty(canvas_size)[source]
Return an empty BboxData with value = Tensor of shape [0, 4]
- Parameters:
canvas_size (Tuple[int, int]) – size of associated image.
- Returns:
empty BboxData
- Return type:
- classmethod from_mask(mask)[source]
Generate BboxData object from mask
- Parameters:
mask (Union[InstanceMaskData, Tensor])
- Returns:
BboxData
- Return type:
BboxData
- pad(t, l, r, b)[source]
Pad the BboxData object and update values, canvas etc. Note: the XYXY format is forced for compatibility with torchvision functions, and the original format is restored afterwards.
- Parameters:
t (int) – top padding
l (int) – left padding
r (int) – right padding
b (int) – bottom padding
- Return type:
Tuple[BboxData, Tensor]
- class BboxFormat[source]
Class for Bounding box format (Child class of BaseFormat). Contains BboxData value, labels and scores.
- Parameters:
data (BboxData)
labels (Tensor)
scores (Tensor | None, optional)
Properties & attributes : cf BaseFormat
Methods
- classmethod empty(canvas_size)[source]
Create an empty BboxFormat of dimension canvas_size
- Parameters:
canvas_size (Tuple[int, int])
- Returns:
BboxFormat
- Return type:
BboxFormat
- classmethod from_instance_mask(mask)[source]
Create a BboxFormat from InstanceMaskFormat
- Parameters:
mask (InstanceMaskFormat)
- Returns:
BboxFormat
- Return type:
BboxFormat
- class FormatOperatorHandler[source]
Class that handles operations on format such as crop, pad, sanitize etc.
Methods:
- apply_augmentation(form, image, transform)[source]
Apply augmentation on BaseData through its method. Handles labels as well and image.
- Parameters:
form (BaseFormat)
image (Tensor)
transform (Transform)
- Returns:
augmented format, present Tensor, augmented image
- Return type:
Tuple[BaseFormat, Tensor, Tensor]
- apply_base_method(form, func, **kwargs)[source]
Apply a base method (crop, pad, sanitize etc.) from BaseData and handle the corresponding label modifications. The method must also return a tensor of present objects: [BaseData, Tensor].
- Parameters:
func (str) – name of the method to be called (e.g. crop, pad …)
form (BaseFormat)
- Returns:
New format and tensor of present objects after operation
- Return type:
Tuple[Format, Tensor]
- class InstanceMaskData[source]
Instance segmentation data class (Child class of BaseData)
- Parameters:
mask (Union[Mask, Tensor]) – Stacked mask (Tensor) of shape [H, W]. Each object is indexed in [1…N] range.
- Properties
device (
Literal["cpu", "cuda"])value (
BoundingBoxes): Tensor value of stacked instance masknb_object (
int): number of objects contained.canvas_size (Tuple[int, int]): dim of mask (h, w)
Methods:
- apply_augmentation(image, transform)[source]
Apply transform on self and associated image
- Parameters:
image (Tensor)
transform (Transform)
- Returns:
augmented data, present Tensor, image
- Return type:
Tuple[InstanceMaskData, Tensor, Tensor]
- crop(t, l, h, w)[source]
Crop InstanceMaskData to desired coordinates.
- Parameters:
t (int) – top crop coord
l (int) – left crop coord
h (int) – height of crop
w (int) – width of crop
- Returns:
cropped InstanceMaskData, indices of present objects
- Return type:
Tuple[InstanceMaskData, Tensor]
- classmethod empty(canvas_size)[source]
generate empty instance mask (full of 0) with given canvas_size
- classmethod from_binary_masks(mask)[source]
Generate InstanceMaskData from one_hot (binary) mask of shape [N, H, W] where N = number of objects. Note that background must not be included.
- Parameters:
mask (Tensor) – one hot mask of shape [N, H, W]
- Returns:
Stacked InstanceMaskData
- Return type:
InstanceMaskData
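The conversion from N binary masks to one stacked mask indexed 1…N can be sketched with nested lists standing in for tensors. `stack_binary_masks` is a hypothetical helper. Letting later objects overwrite earlier ones where masks overlap is an assumption, since the handling of overlaps is not documented here.

```python
def stack_binary_masks(binary_masks):
    """binary_masks: N masks, each a list of H rows of W values in {0, 1}.

    Returns one H x W mask where object i (1-based) is stored as value i
    and the background stays 0.
    """
    height = len(binary_masks[0])
    width = len(binary_masks[0][0])
    stacked = [[0] * width for _ in range(height)]
    for object_id, mask in enumerate(binary_masks, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    stacked[y][x] = object_id  # later objects win on overlap
    return stacked


stacked = stack_binary_masks([
    [[1, 0], [0, 0]],  # object 1 covers the top-left pixel
    [[0, 0], [0, 1]],  # object 2 covers the bottom-right pixel
])
```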
- pad(t, l, r, b)[source]
Pad InstanceMaskData to desired coordinates. Note : the order t, l, r, b is different between deepvisiontools and torchvision.
- Parameters:
t (int) – top padding
l (int) – left padding
r (int) – right padding
b (int) – bottom padding
- Returns:
padded InstanceMaskData, indices of present objects
- Return type:
Tuple[InstanceMaskData, Tensor]
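The note on argument order matters when mixing both libraries: torchvision's pad functions take a 4-element padding as (left, top, right, bottom), whereas the method above takes (t, l, r, b). A hypothetical one-line adapter makes the reordering explicit.

```python
def to_torchvision_padding(t, l, r, b):
    """Reorder (top, left, right, bottom) into torchvision's
    (left, top, right, bottom) padding convention."""
    return [l, t, r, b]


padding = to_torchvision_padding(t=1, l=2, r=3, b=4)
```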
- class InstanceMaskFormat[source]
Class for Instance Segmentation Format (Child class of BaseFormat). contains InstanceMaskData value, labels and scores.
- Parameters:
data (InstanceMaskData)
labels (Tensor)
scores (Tensor | None, optional)
Properties & attributes : cf BaseFormat
Methods
- class SemanticMaskData[source]
Semantic segmentation data class (Child class of BaseData)
- Parameters:
mask (Union[Mask, Tensor]) – Semantic mask with value in [0, …, N_cls] range.
- Properties
device (
Literal["cpu", "cuda"])value (
Mask): Tensor value of semantic masknb_object (
int): number of objects contained. In this case, nb of objects is the number of differents classes present in mask.canvas_size (Tuple[int, int]): dim of mask (h, w)
Methods:
- apply_augmentation(image, transform, scores=None)[source]
Apply transform on self and associated image
- Parameters:
image (Tensor)
transform (Transform)
scores (Tensor | None) – associated logits scores if present
- Returns:
augmented data, present Tensor, image; or augmented data, present Tensor, image, augmented scores
- Return type:
Tuple[InstanceMaskData, Tensor, Tensor]
- crop(t, l, h, w)[source]
Crop SemanticMaskData to desired coordinates.
- Parameters:
t (int) – top crop coord
l (int) – left crop coord
h (int) – height of crop
w (int) – width of crop
- Returns:
cropped SemanticMaskData, indices of present objects
- Return type:
Tuple[SemanticMaskData, Tensor]
- classmethod empty(canvas_size)[source]
generate empty semantic mask data (full of 0) with given canvas_size
- pad(t, l, r, b)[source]
Pad SemanticMaskData to desired coordinates. Note : the order t, l, r, b is different between deepvisiontools and torchvision.
- Parameters:
t (int) – top padding
l (int) – left padding
r (int) – right padding
b (int) – bottom padding
- Returns:
padded SemanticMaskData, indices of present objects
- Return type:
Tuple[SemanticMaskData, Tensor]
- class SemanticMaskFormat[source]
Class for Semantic Mask format (Child class of BaseSemanticFormat).
- Parameters:
data (SemanticMaskData)
scores (Tensor | None, optional)
Properties & attributes : cf BaseSemanticFormat
Methods
- classmethod empty(canvas_size, scores=None)[source]
Create an empty SemanticMaskFormat of dimension canvas_size
- Parameters:
canvas_size (Tuple[int, int])
scores (Tensor | None)
- Returns:
SemanticMaskFormat
- Return type:
SemanticMaskFormat
- classmethod from_instance_mask(mask, scores=None)[source]
Create a SemanticMaskFormat from InstanceMaskFormat
- Parameters:
mask (InstanceMaskFormat)
scores (Tensor | None)
- Returns:
SemanticMaskFormat
- Return type:
SemanticMaskFormat
- combine_logits(logit1, logit2)[source]
Used for semantic mask data, particularly in patchification. Takes 2 logit masks and combines them according to Configuration().semantic_mask_logits_combination. Where one mask is strictly zero, the value of the other mask is taken. Where both are zero, the result is zero. Where both are non-zero, they are combined.
- Parameters:
logit1 (Tensor)
logit2 (Tensor)
- Returns:
Tuple[Tensor]
- Return type:
Tuple[Tensor]
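The combination rule described for combine_logits can be written out element by element. The sketch below uses nested lists instead of tensors and hypothetical helper names; it only mirrors the stated rule (zero means "take the other value", otherwise avg / min / max) and is not the library's tensor implementation.

```python
def combine_logit_values(a, b, mode="avg"):
    """Combine two logit values following the zero-aware rule."""
    if a == 0:
        return b      # also covers the case where both are zero
    if b == 0:
        return a
    if mode == "avg":
        return (a + b) / 2
    if mode == "min":
        return min(a, b)
    if mode == "max":
        return max(a, b)
    raise ValueError(f"unknown combination mode: {mode}")


def combine_logit_maps(logit1, logit2, mode="avg"):
    """Apply the rule to two H x W maps given as nested lists."""
    return [
        [combine_logit_values(a, b, mode) for a, b in zip(row1, row2)]
        for row1, row2 in zip(logit1, logit2)
    ]


left = [[0.0, 2.0], [0.0, 1.0]]
right = [[4.0, 0.0], [0.0, 3.0]]
averaged = combine_logit_maps(left, right, "avg")
```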
- get_preds_and_logits(logit1, logit2)[source]
Combine 2 logits (used in __add__ of SemanticMaskFormat) and return the semantic mask and the logits
- Parameters:
logit1 (Tensor)
logit2 (Tensor)
- Return type:
Tuple[Tensor]
- mask2boxes(mask)[source]
from stacked (id object = 1 … N) mask (H, W) returns tensor of shape (N, 4)
- Parameters:
mask (Tensor)
- Return type:
Tensor
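A plain-Python sketch of the mask-to-boxes idea follows. `mask_to_boxes` is a hypothetical helper working on nested lists. The box layout (x_min, y_min, x_max, y_max) with inclusive pixel coordinates is an assumption; the convention used by the library is not specified above.

```python
def mask_to_boxes(mask):
    """mask: H x W nested list with 0 for background and 1..N for objects.

    Returns N boxes as (x_min, y_min, x_max, y_max), inclusive coordinates.
    """
    n_objects = max(max(row) for row in mask)
    boxes = []
    for object_id in range(1, n_objects + 1):
        coords = [
            (x, y)
            for y, row in enumerate(mask)
            for x, value in enumerate(row)
            if value == object_id
        ]
        if not coords:
            boxes.append((0, 0, 0, 0))  # placeholder for a missing id
            continue
        xs = [x for x, _ in coords]
        ys = [y for _, y in coords]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


boxes = mask_to_boxes([
    [0, 1, 1],
    [0, 1, 0],
    [2, 0, 0],
])
```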
- reindex_mask_with_splitted_objects(mask)[source]
Function that reindexes mask objects by creating new objects when they are disconnected.
- Parameters:
mask (Tensor) – Input mask tensor containing disconnected parts of given objects.
- Returns:
New mask with one index per object after separating disconnected objects, and the indices of the original objects they belonged to.
- Return type:
Tuple[Tensor, Tensor]
Detailed explanation: imagine a mask with 2 objects, 0 and 1, where the first is split into two disconnected parts. The new mask will contain 3 objects and the returned indices will be [0, 0, 1].
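The explanation above corresponds to a connected-component relabelling. The sketch below is a hypothetical pure-Python version using a breadth-first flood fill. The 4-connectivity and the row-major numbering of new objects are assumptions, not documented behaviour.

```python
from collections import deque


def reindex_split_objects(mask):
    """mask: H x W nested list, 0 = background, 1..N = object ids.

    Returns (new_mask, original_indices): every connected part gets its own
    id (1-based), and original_indices[k] is the 0-based index of the object
    that new object k came from.
    """
    height, width = len(mask), len(mask[0])
    new_mask = [[0] * width for _ in range(height)]
    original_indices = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or new_mask[y][x] != 0:
                continue  # background, or pixel already relabelled
            original_indices.append(mask[y][x] - 1)
            new_id = len(original_indices)
            new_mask[y][x] = new_id
            queue = deque([(y, x)])
            while queue:  # flood fill over 4-connected neighbours
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and new_mask[ny][nx] == 0
                            and mask[ny][nx] == mask[y][x]):
                        new_mask[ny][nx] = new_id
                        queue.append((ny, nx))
    return new_mask, original_indices


new_mask, original_indices = reindex_split_objects([
    [1, 0, 1],
    [0, 0, 0],
    [2, 2, 0],
])
```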
deepvisiontools.inference
- class BasePatchifier[source]
Abstract class for patchifier. If you want to implement a custom one you need to implement unpatchify method
- pad_to(image, new_size)[source]
Pad image to given size.
- Returns:
padded image, (t, l, r, b)
- Return type:
Tuple[Tensor, Tuple[int, int, int, int]]
- Parameters:
image (Tensor)
new_size (Tuple[int, int])
- patchify(image)[source]
Create patches for image prediction : 1) Pad image to fit all patches, 2) create patches
- Parameters:
image (Tensor)
- Returns:
patches stacked (N_patch, c, h, w), List of (top, left) pad coordinates, padded image, image pad coordinates
- Return type:
Tuple[ Tensor, List[Tuple[int, int]], Tuple[int, int], Tensor, Tuple[int, int]]
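To make the two steps (pad so that all patches fit, then cut patches) concrete, here is a hypothetical calculation of patch origins. It assumes a stride of patch_size × (1 − overlap) and a padded size just large enough to hold the last patch; the library's exact stride and padding rules are not documented above.

```python
import math


def patch_origins(image_size, patch_size, overlap):
    """Return ([(top, left), ...], padded_size) for a regular patch grid."""
    (h, w), (ph, pw) = image_size, patch_size
    # Distance between two consecutive patch origins (at least 1 pixel).
    stride_h = max(1, int(ph * (1 - overlap)))
    stride_w = max(1, int(pw * (1 - overlap)))
    # Number of patches needed to cover each dimension.
    n_rows = max(1, math.ceil((h - ph) / stride_h) + 1)
    n_cols = max(1, math.ceil((w - pw) / stride_w) + 1)
    # Size the image must be padded to so that every patch is complete.
    padded_size = ((n_rows - 1) * stride_h + ph, (n_cols - 1) * stride_w + pw)
    origins = [
        (row * stride_h, col * stride_w)
        for row in range(n_rows)
        for col in range(n_cols)
    ]
    return origins, padded_size


origins, padded_size = patch_origins((100, 100), (64, 64), overlap=0.5)
```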
- class DetectPatchifier[source]
Handle patchification and unpatchification
- Parameters:
patch_size (
Tuple[int, int]) – size of patches to createoverlap (
float) – overlap between patchesborder_penalty (
float, optional) – penalty to apply on patch border objects before nms. Defaults to 0.5.nms_iou_threshold (
float, optional) – nms threshold. Defaults to 0.45.final_score_threshold (
float, optional) – final score (after penalty) threshold. Defaults to 0.4.
- Attributes
patch_size (Tuple[int, int])
overlap (float): overlap between patches
border_penalty (float)
postprocess (PostProcesser)
Methods
- unpatchify(pred_patches, origins, image_padded_size, padded_image_coords, original_image_size)[source]
Merge patch predictions together while applying penalty, postprocessing etc. Note that since InstanceMasks do not handle overlapping objects, the patches must be handled directly in the postprocessing: boxes are derived first, then used to filter the masks.
- Parameters:
pred_patches (BatchedFormat)
origins (List[Tuple[int, int]])
image_padded_size (Tuple[int, int])
padded_image_coords (Tuple[int, int, int, int])
original_image_size (Tuple[int, int])
- class Evaluator[source]
Evaluator class: evaluates a given Predictor (model + patch_size + additional configuration) on a dataset with given metrics. The results are saved in a generated csv file (metrics at dataset level) and in an xlsx file (metrics at dataset and sample levels). In the xlsx file, samples that deviate from the mean or median (depending on deviation_method) by nb_sigma standard deviations are highlighted. Visualizations of predictions are created according to number_visu. Returns the dictionary of metrics at dataset level and the dictionary of metrics at sample level. Prints a dataframe with the metrics at dataset level (same content as in the generated csv file).
- Parameters:
predictor (Predictor) – Predictor class to evaluate
metrics (list) – List of metrics to evaluate the Predictor on
deviation_method (Literal["mean", "median"], optional) – method used to compute outliers. Defaults to “mean”.
nb_sigma (Union[int, float], optional) – number of standard deviations for outliers. Defaults to 2.
Example:
>>> from deepvisiontools import Evaluator, Predictor
>>> from deepvisiontools.metrics import DetectF1Score
>>> predictor = Predictor(model="path/to/model.pth")
>>> evaluator = Evaluator(predictor, metrics=[DetectF1Score()])
>>> evaluator.evaluate(mydataset, "results")
- Attributes
predictor (Predictor)
metrics (List[BaseMetric])
nb_sigma (int)
data_type (Literal["instance_mask", "bbox", "keypoint", "semantic_mask"])
deviation_method (Literal["mean", "median"])
Methods
- evaluate(dataset, result_folder, number_visu='all')[source]
Run evaluation on dataset. Compute metrics for dataset and for each sample of the dataset.
- Parameters:
dataset (DeepVisionDataset)
result_folder (str | Path)
number_visu (Literal['all'] | int)
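The highlighting of deviating samples can be pictured with a small sketch. `flag_outliers` is a hypothetical helper. It assumes that a sample is flagged when its distance to the centre (mean or median) exceeds nb_sigma population standard deviations, which may differ from the Evaluator's exact rule.

```python
import statistics


def flag_outliers(values, deviation_method="mean", nb_sigma=2):
    """Return one boolean per sample: True if the sample deviates too much."""
    if deviation_method == "mean":
        centre = statistics.mean(values)
    elif deviation_method == "median":
        centre = statistics.median(values)
    else:
        raise ValueError(f"unknown deviation method: {deviation_method}")
    sigma = statistics.pstdev(values)  # assumption: population std
    return [abs(value - centre) > nb_sigma * sigma for value in values]


# Nine samples with a good score and one clearly worse sample.
sample_scores = [0.9] * 9 + [0.1]
flags = flag_outliers(sample_scores, deviation_method="mean", nb_sigma=2)
```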
- class PostProcesser[source]
Handles postprocessing
- Parameters:
nms_iou_th (float) – NMS threshold
final_score_threshold (float) – final score threshold
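The two parameters suggest a non-maximum suppression step followed by a score filter. The sketch below is a generic greedy NMS on XYXY boxes in plain Python, with hypothetical helper names. It illustrates the technique and the role of both thresholds, not the PostProcesser's actual code.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(boxes, scores, nms_iou_th=0.45, final_score_threshold=0.4):
    """Greedy NMS, then score thresholding. Returns kept box indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= nms_iou_th for j in kept):
            kept.append(i)
    return [i for i in kept if scores[i] >= final_score_threshold]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.3]
kept = postprocess(boxes, scores)
```

Here the second box is suppressed because it overlaps the first one, and the third box is dropped by the score threshold.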
- class Predictor[source]
Predictor class for deepvisiontools. Load a model and apply on image, get prediction. Can handle patchification for large image prediction.
- Parameters:
model (Union[BaseModel, str, Path]) – model path / instance of BaseModel to be used.
preprocessing (Callable, optional) – preprocessing to use. Defaults to build_preprocessing().
patch_size (Union[Tuple[int, int], None], optional) – size of the patches to be used for large image inference. If None, the model runs on the full image. Defaults to None.
overlap (float, optional) – Overlap between patches used in case of patchification. Defaults to 0.4.
border_padding (int, optional) – default image padding when using patchification. Defaults to 100.
batch_size (int, optional) – batch size for patchification. Defaults to 1.
border_penalty (float, optional) – penalty applied to predictions on patch borders, which makes NMS more efficient. Higher is more stringent. Must be between 0 and 1. Defaults to 0.5.
nms_iou_threshold (float, optional) – NMS threshold to be used when unpatchifying. Defaults to 0.45.
final_score_threshold (float, optional) – score threshold applied after penalty and after NMS. Defaults to 0.4.
categories (Dict[int, str], optional) – To rename your categories in the visualization.
patchifier (Union[BasePatchifier, None], optional) – If None, use the default SemanticPatchifier or DetectPatchifier according to Configuration().data_type. Defaults to None.
verbose (bool, optional) – if set to True, display progress while predicting on patches. Defaults to True.
Example:
>>> from deepvisiontools import Predictor
>>> img = "path/to/img"
>>> predictor = Predictor(model="path/to/model.pth")
>>> results = predictor.predict(img)
- Attributes
model (BaseModel)
preprocessing (Callable)
patch_size (Union[Tuple[int, int], None])
padder (Transform)
batch_size (int)
cropper (Transform)
patchifier (BasePatchifier)
categories (Dict[int, str])
verbose (bool)
Methods
- filter_empty_patches(preds_batch_patch, pad_origins)[source]
remove empty patches for unpatchification
- Parameters:
preds_batch_patch (BatchedFormat)
pad_origins (List[Tuple[int, int]])
- forward_pass(batch_patchs)[source]
Run predictions on image / batch of patches
- Parameters:
batch_patchs (Tensor)
- Return type:
- predict(image, visu_path='')[source]
Main function of `Predictor`: calls everything needed for prediction.
- Parameters:
image (Union[str, Path, Tensor]) – Image to predict on, given as a path or a Tensor.
visu_path (Union[str, Path], optional) – Path where the visualization is saved. Defaults to “”.
- Returns:
prediction as deepvisiontools format.
- Return type:
BaseFormat
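The patch_size and overlap parameters of Predictor describe a sliding-window layout. The sketch below shows one plausible way to derive patch origins from them; it is an illustration only, and the helper names (`patch_origins`, `patch_grid`) and the choice to shift the last patch back onto the border are assumptions, not the library's patchifier.

```python
from typing import List, Tuple


def patch_origins(length: int, patch: int, overlap: float) -> List[int]:
    """Start offsets of patches of size `patch` covering `length` pixels.

    Hypothetical helper: consecutive patches share about `overlap * patch`
    pixels and the last patch is shifted back to end on the image border.
    """
    if patch >= length:
        return [0]
    stride = max(1, int(patch * (1 - overlap)))
    origins = list(range(0, length - patch, stride))
    origins.append(length - patch)  # final patch flush with the border
    return origins


def patch_grid(
    height: int, width: int, patch_size: Tuple[int, int], overlap: float = 0.4
) -> List[Tuple[int, int]]:
    """(top, left) origins of every patch for an image of size (height, width)."""
    patch_h, patch_w = patch_size
    return [
        (top, left)
        for top in patch_origins(height, patch_h, overlap)
        for left in patch_origins(width, patch_w, overlap)
    ]


if __name__ == "__main__":
    print(patch_grid(1000, 600, (512, 512), overlap=0.4))
```

With a larger overlap the stride shrinks, so more patches are produced and more duplicate detections have to be removed by nms when unpatchifying.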
deepvisiontools.metrics
- class ClassWiseDetectAccuracy[source]
Similar to DetectAccuracy but with per-class detail. Samplewise scores are not provided in that case. Multiclass is handled by removing, from both target and prediction, all objects of classes other than the one considered before computing tp, fp, tn, fn.
- class ClassWiseDetectF1score[source]
Similar to DetectF1score but with per-class detail. Samplewise scores are not provided in that case. Multiclass is handled by removing, from both target and prediction, all objects of classes other than the one considered before computing tp, fp, tn, fn.
- class ClassWiseDetectMetric[source]
Base class that aggregates n_classes DetectMetric(s) to obtain class-dependent performances. Note that samplewise scores are not computed here.
- Parameters:
func (Callable) – function to apply to tp, fp, tn, fn
name (str, optional) – metric’s name (useful for tensorboard monitoring). Defaults to “ClassWiseDetectionMetric”.
- Attributes
classmetrics (List[DetectMetric]): list of DetectMetric objects, one specialized for each class.
- compute()[source]
Return metrics values.
- Returns:
dictionary with all “global” DetectMetric in self.classmetrics
- Return type:
Dict[str, Tensor]
- compute_last_sample()[source]
Return metrics values of the last sample in self.stats. Used in combination with self.update.
- Returns:
dictionary with metric value for all classes combined and for each class
- Return type:
Dict[str, Float]
- global_macro_compute()[source]
Compute metric with global/macro averaging. Also returns the per-class metric tensor.
- Return type:
Tuple[Tensor, Tensor]
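The difference between global and macro averaging can be sketched in plain Python. This assumes "global" means pooling tp, fp, fn over all classes before applying the metric function, and "macro" means averaging the per-class values; the function names here are hypothetical and the library's actual tensors and return layout may differ.

```python
from typing import Callable, List, Sequence, Tuple

Counts = Tuple[int, int, int]  # (tp, fp, fn) for one class


def recall(tp: int, fp: int, fn: int) -> float:
    """Recall from detection counts (0.0 when there is no target)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def global_and_macro(
    per_class_counts: Sequence[Counts], func: Callable[[int, int, int], float]
) -> Tuple[float, float, List[float]]:
    """Return (global score, macro score, per-class scores)."""
    per_class = [func(tp, fp, fn) for tp, fp, fn in per_class_counts]
    # Global: pool the counts of every class, then apply the metric once.
    total_tp = sum(c[0] for c in per_class_counts)
    total_fp = sum(c[1] for c in per_class_counts)
    total_fn = sum(c[2] for c in per_class_counts)
    global_score = func(total_tp, total_fp, total_fn)
    # Macro: plain mean of the per-class scores.
    macro_score = sum(per_class) / len(per_class)
    return global_score, macro_score, per_class


if __name__ == "__main__":
    print(global_and_macro([(9, 1, 1), (1, 0, 4)], recall))
```

A rare class with poor recall pulls the macro score down more than the global one, which is why the two values are worth monitoring together.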
- to(device)[source]
Move all metrics in self.classmetrics to device. Override from torchmetrics Metric
- Parameters:
device (Any)
- update(prediction, target)[source]
Update all DetectMetrics in self.classmetrics according to prediction / target.
- Parameters:
prediction (Union[BaseFormat, BatchedFormat])
target (Union[BaseFormat, BatchedFormat])
- class ClassWiseDetectPrecision[source]
Similar to DetectPrecision but with per-class detail. Samplewise scores are not provided in that case. Multiclass is handled by removing, from both target and prediction, all objects of classes other than the one considered before computing tp, fp, tn, fn.
- class ClassWiseDetectRecall[source]
Similar to DetectRecall but with per-class detail. Samplewise scores are not provided in that case. Multiclass is handled by removing, from both target and prediction, all objects of classes other than the one considered before computing tp, fp, tn, fn.
- class ClassifMetric[source]
Child class of torchmetrics metrics for classification. Allow to take Format as inputs and return dict of metric.
- Parameters:
func (Callable)
name (str)
kwargs (Any)
- compute_last_sample()[source]
Return metrics values of the last sample in self.stats. Used in combination with self.update.
- Returns:
dictionary with metric value for all classes combined and for each class
- Return type:
Dict[str, Float]
- class DetectAccuracy[source]
Accuracy for detection task. In detection, tn is undefined and is set to 0 for the computation.
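As an illustration of the kind of function applied to tp, fp, tn, fn by the detection metrics, here is a minimal sketch using the standard definitions. The function names and the zero-division convention are assumptions; they are not taken from the library.

```python
def safe_div(num: float, den: float) -> float:
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return num / den if den else 0.0


def precision(tp: int, fp: int, tn: int, fn: int) -> float:
    return safe_div(tp, tp + fp)


def recall(tp: int, fp: int, tn: int, fn: int) -> float:
    return safe_div(tp, tp + fn)


def f1score(tp: int, fp: int, tn: int, fn: int) -> float:
    return safe_div(2 * tp, 2 * tp + fp + fn)


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    # In detection there is no true negative, so tn is passed as 0
    # and accuracy reduces to tp / (tp + fp + fn).
    return safe_div(tp + tn, tp + fp + tn + fn)


if __name__ == "__main__":
    counts = dict(tp=8, fp=2, tn=0, fn=2)
    for metric in (precision, recall, f1score, accuracy):
        print(metric.__name__, round(metric(**counts), 3))
```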
- class DetectMetric[source]
Base class for custom detection metric with torchmetrics engine
- Parameters:
func (Callable) – function to apply to tp, fp, tn, fn
name (str, optional) – metric’s name (useful for tensorboard monitoring). Defaults to “DetectionMetric”.
- compute()[source]
Return metric computed with internal state.
- Returns:
dictionary with aggregation_method: value
- Return type:
Dict[str, Tensor]
- compute_last_sample()[source]
Return metrics values of the last sample in self.stats. Used in combination with self.update.
- Returns:
dictionary with metric value for all classes combined and for each class
- Return type:
Dict[str, Float]
- update(prediction, target)[source]
Update metric’s internal state with prediction target comparison (tp, fp, tn, fn)
- Parameters:
prediction (Union[BaseFormat, BatchedFormat])
target (Union[BaseFormat, BatchedFormat])
- class Matcher[source]
Class that handles the matching of prediction and targets to get tp, fp, fn
- match_boxes(pred, targ)[source]
compute box cross ious for matching
- Parameters:
pred (BaseFormat)
targ (BaseFormat)
- Return type:
Tuple[int, int, int, Tuple[Tensor, Tensor]]
- match_instance_masks(pred, targ)[source]
compute instance_mask cross ious for matching
- Parameters:
pred (InstanceMaskFormat)
targ (InstanceMaskFormat)
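The idea of matching predictions to targets by IoU to obtain tp, fp, fn can be sketched as follows. This is a greedy one-to-one matching written for illustration; the Matcher class may use a different assignment strategy, and the function names and the XYXY tuple representation are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two XYXY boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(
    preds: Sequence[Box], targets: Sequence[Box], iou_threshold: float = 0.45
) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (tp, fp, fn)."""
    pairs: List[Tuple[float, int, int]] = sorted(
        ((box_iou(p, t), i, j) for i, p in enumerate(preds) for j, t in enumerate(targets)),
        reverse=True,
    )
    matched_preds, matched_targets = set(), set()
    for iou, i, j in pairs:
        if iou < iou_threshold:
            break  # remaining pairs overlap even less
        if i in matched_preds or j in matched_targets:
            continue
        matched_preds.add(i)
        matched_targets.add(j)
    tp = len(matched_preds)
    return tp, len(preds) - tp, len(targets) - tp


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    targets = [(1, 1, 10, 10), (21, 20, 30, 30), (80, 80, 90, 90)]
    print(match_boxes(preds, targets))
```

The default threshold of 0.45 mirrors the documented default of metrics_match_iou_threshold in Configuration.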
- class SemanticSegmentationMetric[source]
Child class of ClassifMetric. Move from instance to semantic segmentation paradigm to provide stats based on classes masks (instead of objects).
- Parameters:
func (Callable)
name (str)
kwargs (Any)
- update(prediction, target)[source]
Convert target & prediction to semantic mask to compute stats in semantic segmentation paradigm. Update internal state.
- Parameters:
prediction (BaseFormat | BatchedFormat)
target (BaseFormat | BatchedFormat)
deepvisiontools.models
- class BaseModel[source]
Base Class for deepvisiontools models.
- Attributes
confidence_thr (float): Confidence score threshold to consider an object as a true prediction.
model_max_detection (int): Maximum number of objects to predict on one image.
model_nms_threshold (float): IoU threshold to consider 2 boxes as overlapping for the Non Max Suppression algorithm.
num_classes (int): Number of classes.
Methods:
- abstract build_results(raw_outputs)[source]
Transform model outputs into BaseFormat for results. This function also applies instance selection on results according to the attributes:
confidence_thr
model_max_detection
model_nms_threshold
- Parameters:
raw_outputs (Any) – Model outputs.
- Returns:
Model output for batch.
- Return type:
BatchedFormats
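The three selection criteria listed above (confidence_thr, model_nms_threshold, model_max_detection) can be combined as in the sketch below. It is a class-agnostic, pure-Python illustration with a hypothetical function name; actual models presumably operate on tensors and may apply the steps differently.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two XYXY boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    confidence_thr: float = 0.5,
    nms_threshold: float = 0.45,
    max_detection: int = 300,
) -> List[int]:
    """Indices of the detections kept, ordered by decreasing score."""
    # 1. Confidence filter, then sort the survivors by score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= confidence_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # 3. Stop once the maximum number of detections is reached.
        if len(kept) >= max_detection:
            break
        # 2. Non max suppression: drop boxes overlapping a better one.
        if all(box_iou(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (40, 40, 50, 50)]
    scores = [0.9, 0.8, 0.7, 0.3]
    print(select_detections(boxes, scores))
```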
- property device
Send model to device.
- Parameters:
device (Literal['cpu', 'cuda']) – Device to send model on.
- abstract get_predictions(images)[source]
Prepare images, Apply model forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormats.
- Return type:
BatchedFormats
- abstract prepare(images, targets=None)[source]
Transform images and targets into model specific format for prediction & loss computation.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.
- Returns:
Images data prepared for model.
If targets: images + targets prepared for model.
- Return type:
Union[Any, Tuple[Any]]
- abstract run_forward(images, targets)[source]
Compute loss from images and if target passed, compute loss & return both loss dict and results.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormats) – Batch targets.
predict (bool, optional) – To return predictions or not. Defaults to False.
- Returns:
Loss dict.
If predict: Predictions.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormats]]
- class Mask2Former[source]
Mask2Former class, child class of Mask2FormerForUniversalSegmentation from hugging face. To use, data_type must be set to instance_mask.
- Parameters:
pretrain (Literal["large", "medium", "small", "tiny", ""], optional) – Pretrained architecture. Defaults to “tiny”.
overlap_mask_thr (float, optional) – Defaults to 0.8.
- Attributes
processor (Mask2FormerImageProcessor)
overlap_mask_thr (float)
- Properties
queries (torch.nn.Embedding): number of queries / dim for embedding. To use the setter, provide an int or a Tuple[int, int]. If only an int is provided, the embedding dimension is 256; otherwise the Tuple is (query number, dim).
- Notes
When used for large image inference, Mask2Former performs worse if trained on smaller patches. One workaround is to increase the number of queries; see the queries property description. Improvements on this matter are under development.
Methods
- build_results(raw_outputs, spatial_size)[source]
Transform model outputs into BatchedFormat for results.
- Parameters:
raw_outputs (Mask2FormerForUniversalSegmentationOutput) – Mask2Former output.
spatial_size (Tuple[int, int]) – Size of original image (H, W).
- Returns:
Model output as BatchedFormat.
- Return type:
BatchedFormats
- get_predictions(images)[source]
Prepare images, Apply model forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormat.
- Return type:
BatchedFormat
- inputs_to_device(input, device)[source]
Send Mask2Former inputs to device.
- Parameters:
input (Any)
device (Literal['cpu', 'cuda'])
- prepare(images, targets=None)[source]
Transform images and targets into Mask2Former specific format for prediction & loss computation.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.
- Returns:
Images data prepared for Mask2Former.
If targets: images + targets prepared for Mask2Former.
- Return type:
Union[Any, Tuple[Any]]
- prepare_target(target)[source]
Prepare target in Mask2Former format
- Parameters:
target (InstanceMaskFormat)
- Return type:
Tuple[Tensor, Dict[int, int]]
- run_forward(images, targets)[source]
Compute loss from images and if target passed, compute loss & return both loss dict and results.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.
- Returns:
Loss dict and prediction if model in eval mode.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]
- class SMP[source]
Factory class that wraps segmentation-models-pytorch (smp) models into deepvisiontools. These models are used for semantic segmentation tasks. With this class you can use all available models and encoders, along with any additional arguments from segmentation models pytorch. Provide further parameters as keyword arguments (ex: arg=myadditionalarg). Note that you can also use any smp loss by providing an instance of an smp loss: loss=smp.losses.WantedLoss()
smp : https://github.com/qubvel-org/segmentation_models.pytorch
- Parameters:
architecture (SegmentationModel, optional) – SMP model architecture: an smp class (type) must be provided. Defaults to smp.Unet.
Example:
>>> from deepvisiontools.models import SMP
>>> import segmentation_models_pytorch as smp
>>> my_model = SMP(smp.Unet, encoder_name="vgg11", loss=smp.losses.FocalLoss(mode="binary"))
- class TimmYolo[source]
This class combines any timm library encoder compatible with features_only=True with a Yolo detection head. This leverages complex encoders, potentially with attention layers, while remaining flexible on the input image size. The idea is to patchify all images that run through the model, perform feature prediction, combine the features and run the fully convolutional yolo detection head. Note: this model does not have a forward method! Use run() or get_predictions instead.
- Parameters:
backbone_name (str, optional) – timm backbone. Defaults to “swin_small_patch4_window7_224”. Has been tested with “vit_large_patch14_dinov2” and “resnet50.a1_in1k” as well
num_classes (int, optional) – Defaults to 1.
pretrained (bool, optional) – Defaults to True.
overlap (float | int | Tuple[int, int] | None, optional) – If not None, uses the given pixel value for overlap (careful, it must be compatible with the reduction level). If None, uses the maximum reduction x 2. Defaults to None.
internal_batch_size (int, optional) – Number of patch to run simultaneously. Defaults to 1.
- build_results(raw_outputs, prebuild_outputs, original_img_size)[source]
Transform model outputs into Batch BboxFormat for results.
- Parameters:
raw_outputs (List[Tensor]) – Model outputs.
prebuild_outputs (Tensor) – Extracted boxes from outputs in eval mode.
original_img_size (Size)
- Returns:
Batched predictions.
- Return type:
BatchedFormats
- compute_loss(raw_outputs, targets)[source]
Compute loss with predictions & targets.
- Parameters:
raw_outputs (Any) – Raw output of model.
targets (DetectionFormat) – Targets in YOLO format.
- Returns:
Loss dict with total loss (key: “loss”) & sublosses.
- Return type:
Dict[str, Tensor]
- prepare(images, targets=None)[source]
Pad images and target so final patch match exactly image border.
- Parameters:
images (Tensor) – images to be prepared
targets (BatchedFormat | None, optional) – targets to be prepared. Defaults to None.
- Returns:
prepared images, prepared targets, original image size
- Return type:
Tuple[Tensor, BatchedFormat | None, torch.Size]
- prepare_target(targets, img_size)[source]
Return target from BatchedFormat to ultralytics yolo format.
- Parameters:
targets (BatchedFormat)
img_size (Tuple[int, int])
- Returns:
target as per ultralytics Yolo format.
- Return type:
Dict[str, Tensor]
- class Yolo[source]
Yolo detection model. data_type must be either bbox or instance_mask to use this model.
- Parameters:
architecture (Literal["yolon", "yolom", "yolol", "yolox"], optional) – Yolo model size. You can add “-p2” or “-p6” to load the p2 or p6 variants. Defaults to “yolon”.
pretrained (bool, optional) – Use pretrained weights. Defaults to True.
reg_max (int, optional) – reg_max argument of yolo models (impacts object size detection). See ultralytics for more information. Defaults to 16.
loss_factor (float, optional) – Divides the yolo loss value (important for mixed precision to keep it below a certain range). Defaults to 1.
- Attributes
criterion (v8DetectionLoss): Yolo loss from ultralytics.
args (Any): ultralytics Yolo’s configuration params.
pad_requirements (int): pad requirements as per yolo (the basic requirement is an image shape multiple of 32, but it differs for p2 or p6). Note that it is set automatically.
- Properties
device (Literal["cuda", "cpu"]): model’s device
Methods
- build_results(raw_outputs, prebuild_outputs)[source]
Transform model outputs into Batch BboxFormat for results.
- Parameters:
raw_outputs (List[Tensor]) – Model outputs.
prebuild_outputs (Tensor) – Extracted boxes from outputs in eval mode.
- Returns:
Batched predictions.
- Return type:
BatchedFormats
- compute_loss(raw_outputs, targets)[source]
Compute loss with predictions & targets.
- Parameters:
raw_outputs (Any) – Raw output of model.
targets (DetectionFormat) – Targets in YOLO format.
- Returns:
Loss dict with total loss (key: “loss”) & sublosses.
- Return type:
Dict[str, Tensor]
- get_predictions(images)[source]
Prepare images, Apply YOLO forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormats.
- Return type:
BatchedFormats
- prepare(images, targets=None)[source]
Pad images / targets to satisfy the yolo divisibility-by-32 criterion and move targets to yolo format. If no targets are passed, simply returns the images.
- Parameters:
images (Tensor) – batched images [N, 3, H, W]
targets (Union[BatchedFormat, None])
- Returns:
Either : images_padded, yolo_targets OR images_padded
- Return type:
Union[Tuple[Tensor, Dict], Tensor]
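The divisibility requirement amounts to rounding each spatial dimension up to the next multiple of the pad requirement. A minimal sketch, with a hypothetical helper name and no claim about which image sides the library actually pads:

```python
from typing import Tuple


def padded_size(height: int, width: int, multiple: int = 32) -> Tuple[int, int]:
    """Smallest (H, W) >= (height, width) with both sides divisible by `multiple`."""
    pad_h = (-height) % multiple  # pixels to add along the height
    pad_w = (-width) % multiple   # pixels to add along the width
    return height + pad_h, width + pad_w


if __name__ == "__main__":
    print(padded_size(500, 640))      # image not divisible by 32 along H
    print(padded_size(500, 640, 64))  # stricter requirement, as an example
```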
- prepare_target(targets, img_size)[source]
Return target from BatchedFormat to ultralytics yolo format.
- Parameters:
targets (BatchedFormat)
img_size (Tuple[int, int])
- Returns:
target as per ultralytics Yolo format.
- Return type:
Dict[str, Tensor]
- retrieve_spatial_size(raw_outputs)[source]
Retrieve image shape from raw_outputs and stride values.
- Parameters:
raw_outputs (List[Tensor]) – Raw outputs from YOLO model.
- Returns:
Size of input image (H, W).
- Return type:
Tuple[int]
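Recovering the input size from an output feature map and its stride is a simple multiplication, assuming the input was already padded to a multiple of that stride. The helper below is a hypothetical illustration of that relationship, not the method's implementation.

```python
from typing import Tuple


def spatial_size_from_feature_map(feature_hw: Tuple[int, int], stride: int) -> Tuple[int, int]:
    """Input image size (H, W) implied by a feature map of size `feature_hw`
    produced with the given downsampling stride."""
    feat_h, feat_w = feature_hw
    return feat_h * stride, feat_w * stride


if __name__ == "__main__":
    # A 64 x 80 map at stride 8 and a 16 x 20 map at stride 32
    # both correspond to the same 512 x 640 input.
    print(spatial_size_from_feature_map((64, 80), 8))
    print(spatial_size_from_feature_map((16, 20), 32))
```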
- run_forward(images, targets)[source]
Compute loss from images and if target passed, compute loss & return both loss dict and results.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.
- Returns:
Loss dict.
If predict: predictions.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]
- class YoloSeg[source]
Yolo detection model. data_type must be either bbox or instance_mask to use this model.
- Parameters:
architecture (Literal["yolon", "yolom", "yolol", "yolox"], optional) – Yolo model size. You can add “-p2” or “-p6” to load the p2 or p6 variants. Defaults to “yolon”.
pretrained (bool, optional) – Use pretrained weights. Defaults to True.
reg_max (int, optional) – reg_max argument of yolo models (impacts object size detection). See ultralytics for more information. Defaults to 16.
loss_factor (float, optional) – Divides the yolo loss value (important for mixed precision to keep it below a certain range). Defaults to 1.
- Attributes
criterion (v8DetectionLoss): Yolo loss from ultralytics.
args (Any): ultralytics Yolo’s configuration params.
pad_requirements (int): pad requirements for yoloseg (the basic requirement is an image shape multiple of 32)
mask_logit_threshold (int): mask logit threshold used to decide whether a pixel is class or background. Default is 0.5 but can be changed.
- Properties
device (Literal["cuda", "cpu"]): model’s device
Methods
- build_results(raw_outputs, get_logit=False)[source]
Transform model outputs into Batch InstanceMaskFormat for results.
- Parameters:
raw_outputs (List[Tensor]) – Model outputs.
get_logit (bool)
- Returns:
Batched predictions.
- Return type:
BatchedFormats
- compute_loss(predictions, target)[source]
Compute loss with predictions & targets.
- Parameters:
predictions (Any) – Raw output of model.
target (Dict[Any, Any]) – Targets in YOLO format.
- Returns:
Loss dict with total loss (key: “loss”) & sublosses.
- Return type:
Dict[str, Tensor]
- get_predictions(images)[source]
Prepare images, Apply YOLO forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormats.
- Return type:
BatchedFormats
- prebuild_output(raw_outputs)[source]
Unpack Yolo-seg (eval mode) raw results.
- Parameters:
raw_outputs (Tuple[Tensor, ...]) – Yolo raw eval mode results.
- Returns:
boxes (N_batch, N_obj, cxcywh).
cls_scores (N_batch, N_cls).
mask_weights (N_batch, N_obj, 32).
protos (N_batch, protos).
- Return type:
Tuple[Tensor, ...]
- prepare(images, targets=None)[source]
Pad images / targets to satisfy the yolo divisibility-by-32 criterion and move targets to yolo format. If no targets are passed, simply returns the images.
- Parameters:
images (Tensor) – batched images [N, 3, H, W]
targets (Union[BatchedFormat, None])
- Returns:
Either : images_padded, yolo_targets OR images_padded
- Return type:
Union[Tuple[Tensor, Dict], Tensor]
- prepare_target(targets)[source]
Transform SegmentationFormat targets into yolo-seg targets format.
- Parameters:
targets (BatchedFormats) – Batch targets.
- Returns:
Targets in YOLO format.
- Return type:
Dict[str, Tensor]
- retrieve_spatial_size(raw_outputs)[source]
Retrieve image shape from raw_outputs and stride values.
- Parameters:
raw_outputs (List[Tensor]) – Raw outputs from YOLO model.
- Returns:
Size of input image (H, W).
- Return type:
Tuple[int]
- run_forward(images, targets)[source]
Compute loss from images and if target passed, compute loss & return both loss dict and results.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormat) – Batch targets.
- Returns:
Loss dict.
If predict: predictions.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormat]]
deepvisiontools.models.mask2former
deepvisiontools.models.yolo
- box_nms_filter(format)[source]
Filter Format according to nms threshold from Configuration()
- Parameters:
format (Format)
- Returns:
Format
- Return type:
- confidence_filter(format)[source]
Filter Format according to confidence threshold from Configuration()
- Parameters:
format (Format)
- Returns:
Format
- Return type:
deepvisiontools.models.yoloseg
- proto2mask(protos, weights, boxes, shape)[source]
Combine protos and weights to get masks, then crop instances from boxes (Useful in predictions).
- Parameters:
protos (Tensor) – Sub masks (32, …).
weights (Tensor) – YOLO mask weights (32, …).
boxes (Tensor) – Boxes (N, 4) in XYXY format.
shape (Tuple[int]) – Original image size (H, W).
- Returns:
YOLO segmentation mask.
- Return type:
Tensor
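The combination of prototype masks and per-instance weights can be illustrated on nested lists. The sketch follows the common YOLO-seg recipe (weighted sum of prototypes, sigmoid, crop to the box, threshold); the function name, the single-instance signature, and the use of a sigmoid with a 0.5 threshold are assumptions, and the real function works on tensors and also handles resizing to the original image size.

```python
import math
from typing import List, Sequence, Tuple

Map2D = List[List[float]]


def combine_protos(
    protos: Sequence[Map2D],
    weights: Sequence[float],
    box: Tuple[int, int, int, int],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Binary mask for one instance.

    protos: K prototype maps of identical size H x W.
    weights: K mask weights predicted for the instance.
    box: (x1, y1, x2, y2) in prototype-map pixel coordinates.
    """
    height, width = len(protos[0]), len(protos[0][0])
    x1, y1, x2, y2 = box
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not (x1 <= x < x2 and y1 <= y < y2):
                continue  # crop: pixels outside the box stay background
            logit = sum(w * p[y][x] for w, p in zip(weights, protos))
            prob = 1.0 / (1.0 + math.exp(-logit))
            mask[y][x] = int(prob > threshold)
    return mask


if __name__ == "__main__":
    protos = [
        [[2.0, 2.0, 2.0], [0.0, 0.0, 0.0]],
        [[0.0, 0.0, 3.0], [0.0, 1.0, 0.0]],
    ]
    print(combine_protos(protos, [1.0, -1.0], box=(0, 0, 2, 2)))
```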
deepvisiontools.preprocessing
- build_preprocessing(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])[source]
Default values are from ImageNet.
- Parameters:
mean (List[float], optional) – mean values for each channel. Defaults to [0.485, 0.456, 0.406].
std (List[float], optional) – std values for each channel. Defaults to [0.229, 0.224, 0.225].
- Return type:
T.Compose
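Assuming the returned pipeline applies the usual per-channel normalization (x - mean) / std, which the mean and std parameters suggest but the text does not state outright, the effect on a single RGB pixel in [0, 1] can be sketched as below. The helper name is hypothetical.

```python
from typing import List, Sequence

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(
    rgb: Sequence[float],
    mean: Sequence[float] = IMAGENET_MEAN,
    std: Sequence[float] = IMAGENET_STD,
) -> List[float]:
    """Per-channel normalization of one RGB pixel with values in [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


if __name__ == "__main__":
    print(normalize_pixel([0.485, 0.456, 0.406]))  # the mean pixel maps to zeros
    print(normalize_pixel([1.0, 1.0, 1.0]))
```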
- get_channels_statistics(image_folder)[source]
Iterate over image folder and output mean and std for each channels for the dataset of images.
- Parameters:
image_folder (str) – path to folder of images
- Returns:
values for mean and std
- Return type:
Tuple[List[float]]
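Per-channel statistics over a dataset can be accumulated in a single pass with running sums. The sketch below works on nested lists rather than image files, so the input layout and the function name are assumptions made for illustration.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # 3 channels, each a flat list of pixel values


def channel_statistics(images: Iterable[Image]) -> Tuple[List[float], List[float]]:
    """Mean and standard deviation of each channel over all pixels of all images."""
    count = [0, 0, 0]
    total = [0.0, 0.0, 0.0]
    total_sq = [0.0, 0.0, 0.0]
    for image in images:
        for c, channel in enumerate(image):
            count[c] += len(channel)
            total[c] += sum(channel)
            total_sq[c] += sum(v * v for v in channel)
    mean = [total[c] / count[c] for c in range(3)]
    # Var[x] = E[x^2] - E[x]^2, clamped at 0 against rounding errors.
    std = [math.sqrt(max(total_sq[c] / count[c] - mean[c] ** 2, 0.0)) for c in range(3)]
    return mean, std


if __name__ == "__main__":
    images = [
        [[0.0, 1.0], [0.5, 0.5], [0.2, 0.4]],
        [[1.0, 0.0], [0.5, 0.5], [0.6, 0.8]],
    ]
    print(channel_statistics(images))
```

The resulting lists can be passed as mean and std to build_preprocessing in place of the ImageNet defaults.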
- load_image(image_path)[source]
Load image using torchvision. Handles png, tiff, jpg, jpeg extensions.
- Parameters:
image_path (str) – Path to image.
- Returns:
image in torch Tensor [3, H, W].
- Return type:
Tensor
- load_mask(mask_path)[source]
Load a mask using torchvision. Handles png, tiff, jpg, jpeg extensions.
- Parameters:
mask_path (str | Path) – Path to mask.
- Returns:
image in torch Tensor [3, H, W].
- Return type:
Tensor
deepvisiontools.train
- class Aggregator[source]
Aggregator aggregates losses across batches.
Attributes:
- iterations
Number of iterations.
- Type:
int
- losses
Dictionary of epoch losses (over iterations).
- Type:
Dict[str, Tensor]
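The two attributes above suggest a running-sum pattern: add each batch's loss dict, count iterations, and divide at the end of the epoch. The class below is a hypothetical stand-in written with plain floats; its name and its `update` / `compute` methods are assumptions, not the Aggregator API.

```python
from typing import Dict


class LossAggregator:
    """Illustrative loss accumulator (not the library class)."""

    def __init__(self) -> None:
        self.iterations = 0
        self.losses: Dict[str, float] = {}

    def update(self, loss_dict: Dict[str, float]) -> None:
        """Add the losses of one batch to the running sums."""
        self.iterations += 1
        for name, value in loss_dict.items():
            self.losses[name] = self.losses.get(name, 0.0) + value

    def compute(self) -> Dict[str, float]:
        """Mean of each loss over the iterations seen so far."""
        if self.iterations == 0:
            return {}
        return {name: total / self.iterations for name, total in self.losses.items()}


if __name__ == "__main__":
    aggregator = LossAggregator()
    aggregator.update({"loss": 1.0, "box": 0.5})
    aggregator.update({"loss": 3.0, "box": 1.5})
    print(aggregator.compute())
```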
Methods:
- class Trainer[source]
Class that handles training in deepvisiontools. Handles train / valid epochs, monitoring (via tensorboard) and metrics computation.
- Parameters:
model (BaseModel) – deepvisiontools model.
optimizer (Optimizer) – torch optimizer (Ex: Adam())
metrics (List[Union[DetectMetric, ClassWiseDetectMetric, SemanticSegmentationMetric, ClassifMetric]], optional) – List of deepvisiontools metrics. Check available metrics in deepvisiontools.metrics.available_metrics. Defaults to [].
log_dir (str, optional) – tensorboard output directory. If “”, no monitoring is provided. Defaults to “”.
Example:
- Attributes
model (BaseModel): deepvisiontools model.
optimizer (Optimizer): torch optimizer (Ex: Adam())
metrics (List[Union[DetectMetric, ClassWiseDetectMetric, SemanticSegmentationMetric, ClassifMetric]], optional): List of deepvisiontools metrics. Check available metrics in deepvisiontools.metrics.available_metrics. Defaults to [].
board (SummaryWriter): tensorboard output directory.
- Properties
device (Literal["cpu", "cuda"]): the setter moves everything that’s needed to the desired device.
Methods
- epoch(loader, ep_number, tag='')[source]
Run training epoch.
- Parameters:
loader (DeepVisionLoader) – DeepVisionLoader.
ep_number (int) – Epoch number.
tag (str, optional) – Tag to link to epoch. Defaults to “”.
- Returns:
Epochs values (Losses & metrics).
- Return type:
Dict[str, Tensor]
- log_string(epoch_dict)[source]
Transform epoch dict in string.
- Parameters:
epoch_dict (Dict[str, Tensor]) – Dict of epoch values to display.
- Returns:
String to print with epoch values.
- Return type:
str
- train_epoch(loader, ep_number, tag='Train')[source]
Run train epoch.
- Parameters:
loader (DetectionLoader) – DetectionLoader.
ep_number (int) – Epoch number.
tag (str, optional) – Tag to link to epoch. Defaults to “Train”.
- Returns:
Epochs values (Losses).
- Return type:
Dict[str, Tensor]
- train_step(images, targets, scaler)[source]
Run forward pass, loss computation and backward pass.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormat) – Batch targets.
scaler (GradScaler)
- Returns:
Dict of losses (total loss at key ‘loss’).
- Return type:
Dict[str, Tensor]
- valid_epoch(loader, ep_number, tag='Valid')[source]
Run validation epoch.
- Parameters:
loader (DetectionLoader) – DetectionLoader.
ep_number (int) – Epoch number.
tag (str, optional) – Tag to link to epoch. Defaults to “Valid”.
- Returns:
Epochs values (Losses & metrics).
- Return type:
Dict[str, Tensor]
- valid_step(images, targets, scaler)[source]
Run forward, compute metrics, return loss dict and metrics.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormat) – Targets.
scaler (GradScaler)
- Returns:
Losses and metrics values.
- Return type:
Tuple[Dict[str, Tensor], Dict[str, Dict[str, Tensor]]]
deepvisiontools.utils
- class Visualizer[source]
From image and target generates a visualization image as Tensor.
- Parameters:
image (Tensor) – Original image
target (BaseFormat) – Target to visualize
categories (Dict[int, str], optional) – Categories to be used as labels as Dict[int, str]. If set to None will use label indexes. Defaults to None.
save_path (Union[str, Path], optional) – Path to save visualization; if set to “”, the visualization is not saved. Defaults to “”.
class_colors (List[Sequence[float]], optional) – Colors to be used for classes. Needs to be RGB normalized (divided by 255). Defaults to cc.glasbey_bw.
instance_colors (List[Sequence[float]], optional) – Colors to be used for instances (see class colors constraints). Defaults to cc.glasbey_hv.
desired_min_size (int, optional) – Resize to this specific min size value (preserving shape). Defaults to 1200.
show (bool, optional) – Whether to display it on the fly. Defaults to False.
window_mode (Literal["dual", "single"], optional) – If dual, provides a combination of image + visu; otherwise provides only visu. Defaults to “dual”.
- visualization(image, target, categories=None, save_path='', class_colors=cc.glasbey_bw, instance_colors=cc.glasbey_hv, desired_min_size=1200, show=False, window_mode='dual')[source]
Generates a visualization image, returned as a Tensor, from an image and its target.
- Parameters:
image (Tensor) – Original image.
target (BaseFormat) – Target to visualize.
categories (Dict[int, str], optional) – Mapping from label index to label name. If set to None, label indexes are used. Defaults to None.
save_path (Union[str, Path], optional) – Path where the visualization is saved. If set to “”, the visualization is not saved. Defaults to “”.
class_colors (List[Sequence[float]], optional) – Colors to be used for classes. Must be normalized RGB (values divided by 255). Defaults to cc.glasbey_bw.
instance_colors (List[Sequence[float]], optional) – Colors to be used for instances. Defaults to cc.glasbey_hv.
desired_min_size (int, optional) – Minimum size to which the visualization is resized (preserving shape). Defaults to 1200.
show (bool, optional) – Whether to display the visualization on the fly. Defaults to False.
window_mode (Literal["dual", "single"], optional) – If “dual”, returns a combination of image + visualization; otherwise returns only the visualization. Defaults to “dual”.
- Returns:
visualization Tensor
- Return type:
Tensor
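The class_colors parameter expects RGB values normalized to the 0-1 range. The sketch below shows one way to prepare a custom palette and a categories mapping before calling visualization. The helper normalize_rgb, the label names, and the output file name are hypothetical and not part of deepvisiontools. The call itself is left as a comment because this section does not show the import path of visualization.

```python
from typing import Dict, List, Sequence


def normalize_rgb(colors_255: Sequence[Sequence[int]]) -> List[List[float]]:
    """Convert 0-255 RGB triplets to normalized 0-1 floats (hypothetical helper)."""
    return [[channel / 255 for channel in color] for color in colors_255]


# Hypothetical label map and palette, for illustration only.
categories: Dict[int, str] = {0: "cat", 1: "dog"}
class_colors = normalize_rgb([(215, 0, 0), (140, 60, 255)])

# Assumed call shape, based only on the signature documented above:
# visu = visualization(
#     image,
#     target,
#     categories=categories,
#     class_colors=class_colors,
#     save_path="visu.png",
#     window_mode="single",
# )
```

With window_mode set to “single”, the returned Tensor should contain only the visualization, per the parameter description above.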