Semantic Segmentation with Deep Lake and MMSegmentation¶
Integration Interface¶
MMSegmentation works with configs. Deep Lake adopts the same approach: to train MMSeg models, you create or specify your model and training/validation config. The Deep Lake integration's logic is almost identical to MMSegmentation's, with some minor modifications. The integration with MMSeg lives in the deeplake.integrations.mmseg module. At a high level, Deep Lake is responsible for the PyTorch dataloader that streams data to the training framework, while MMSeg provides the training, transformation, and evaluation logic. Let's take a look at a config with the Deep Lake changes.
Learn more about MMSegmentation in the open-mmlab/mmsegmentation repository.
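Before writing the config, it can help to inspect the Deep Lake dataset that will feed the dataloader. A minimal sketch; the tensor names printed here (images, seg_masks) are the ones mapped in the config later in this tutorial:

import deeplake

ds = deeplake.load("hub://activeloop/coco-train-seg-mask")
ds.summary()                    # prints tensor names, dtypes, and shapes
print(list(ds.tensors.keys()))  # expect 'images' and 'seg_masks' among them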
Example Configuration with Deep Lake¶
This tutorial shows how to train a semantic segmentation model using MMSegmentation with data stored in Deep Lake. We'll use a PSPNet model with a ResNet-101 backbone, trained on COCO data.
Prerequisites¶
Install the required packages:
python -m pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 -f https://download.pytorch.org/whl/torch_stable.html
python -m pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu116/torch1.12.0/index.html
git clone https://github.com/open-mmlab/mmsegmentation.git
cd mmsegmentation
git checkout v0.30.0
python -m pip install -e .
# Older PyTorch versions do not work with newer NumPy releases
python -m pip install numpy==1.24.4 --force-reinstall
Note: We use MMSegmentation versions compatible with Deep Lake's integration.
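To confirm the pinned versions are in place, a quick sanity check (exact version strings may differ slightly depending on the wheels available):

import torch, mmcv, mmseg
print(torch.__version__)  # expect 1.12.0+cu116
print(mmcv.__version__)   # the mmcv-full build matching torch 1.12/cu116
print(mmseg.__version__)  # expect 0.30.0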
Setup¶
import os
import deeplake
from mmcv import Config
import mmcv
from deeplake.integrations import mmseg as mmseg_deeplake
# Set your Deep Lake token
token = os.environ["ACTIVELOOP_TOKEN"]
Configuration¶
Here's our PSPNet configuration with Deep Lake integration:
_base_ = '<mmsegmentation_path>/configs/pspnet/pspnet_r101-d8_512x512_4x4_160k_coco-stuff164k.py'
# Normalize configuration
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (256, 256)
reduce_zero_label = False
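# The pipelines below follow standard MMSeg conventions; the Deep Lake
# dataloader streams the samples, and these transforms prepare them for
# training and evaluation.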
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(320, 240), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(320, 240),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(320, 240), keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
evaluation = dict(metric=["mIoU"], interval=10000)
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=8,
    train=dict(
        pipeline=train_pipeline,
        deeplake_path="hub://activeloop/coco-train-seg-mask",
        deeplake_tensors={"img": "images", "gt_semantic_seg": "seg_masks"},
        deeplake_dataloader={"shuffle": False, "num_workers": 0, "drop_last": True},
    ),
    val=dict(
        pipeline=test_pipeline,
        deeplake_path="hub://activeloop/coco-val-seg-mask",
        deeplake_tensors={"img": "images", "gt_semantic_seg": "seg_masks"},
        deeplake_dataloader={"shuffle": False, "batch_size": 1, "num_workers": 0, "drop_last": True},
    ),
)
work_dir = "./deeplake_logs"
optimizer = dict(lr=0.02 / 8)  # base lr scaled down for this smaller setup
lr_config = dict(warmup=None)
log_config = dict(interval=50)
checkpoint_config = dict(interval=5000)  # save a checkpoint every 5000 iterations
runner = dict(type="IterBasedRunner", max_iters=100000, max_epochs=None)
device = "cuda"
Training¶
Save the configuration above as seg_mask_config.py in your working directory (the path the script below expects), then start the training:
if __name__ == "__main__":
    current_loc = os.getcwd()
    cfg_file = f"{current_loc}/seg_mask_config.py"

    # Read the config file
    cfg = Config.fromfile(cfg_file)
    cfg.model.decode_head.num_classes = 81
    cfg.model.auxiliary_head.num_classes = 81

    # Build the segmentor
    model = mmseg_deeplake.build_segmentor(cfg.model)

    # Create the work directory
    mmcv.mkdir_or_exist(os.path.abspath(cfg.work_dir))

    # Train the segmentor
    mmseg_deeplake.train_segmentor(
        model,
        cfg,
        distributed=False,  # set to True for multi-GPU training
        validate=True,      # set to True if you have validation data
    )
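If a run is interrupted, it can typically be resumed from the most recent checkpoint saved in work_dir; resume_from is a standard MMCV runner option. A sketch, assuming latest.pth exists:

# Resume training from the last saved checkpoint
cfg.resume_from = os.path.join(cfg.work_dir, "latest.pth")
mmseg_deeplake.train_segmentor(model, cfg, distributed=False, validate=True)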
Deep Lake Integration Benefits¶
- Efficient Mask Handling: Deep Lake efficiently stores and loads segmentation masks, which can be large and memory-intensive.
- Automatic Format Conversion: Deep Lake handles conversion between different mask formats (binary, RLE, polygon) automatically.
- Smart Batching: Deep Lake's dataloader handles variable-sized images and masks efficiently.
- Memory Management: Data is loaded on demand, preventing out-of-memory issues with large datasets (see the example after this list).
- Distributed Training Support: Seamless integration with MMSegmentation's distributed training.
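For example, loading a Deep Lake dataset does not download it up front; each sample is fetched only when accessed. A minimal sketch using the validation set from this tutorial:

import deeplake

ds = deeplake.load("hub://activeloop/coco-val-seg-mask")  # no bulk download
mask = ds.seg_masks[0].numpy()  # this single mask is fetched on demand
print(mask.shape, mask.dtype)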
Monitoring Training¶
Monitor training progress:
# Check the latest log file
log_file = os.path.join(cfg.work_dir, 'latest.log')
if os.path.exists(log_file):
    with open(log_file, 'r') as f:
        print(f.read())
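For richer monitoring, you can also enable MMCV's TensorBoard logger hook in the config's log_config (a standard MMCV option, shown here as a sketch):

log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook'),  # writes scalars under work_dir/tf_logs
    ])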
Inference¶
After training, use the model for inference:
from mmseg.apis import inference_segmentor, init_segmentor

# Load the trained model; cfg_file is the config path used during training
checkpoint = os.path.join(cfg.work_dir, 'latest.pth')
model = init_segmentor(cfg_file, checkpoint, device='cuda:0')

# Load an image
img = 'path/to/test/image.jpg'

# Run inference
result = inference_segmentor(model, img)
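To visualize the prediction, mmseg.apis also provides show_result_pyplot in MMSegmentation 0.x:

from mmseg.apis import show_result_pyplot

# Overlay the predicted mask on the input image
show_result_pyplot(model, img, result, opacity=0.5)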
Key Integration Parameters¶
data
: Central to the MMSegmentation configuration file, it specifies the training and validation datasets, transformations, and paths.

train
: Contains dataset path, credentials, and transformations for training data.

val
: Contains dataset path, credentials, and transformations for validation data.

pipeline
: A list of transformations applied to the dataset.

deeplake_path
: Path to the Deep Lake dataset for training and validation.

deeplake_credentials
: (Optional) Required for private, nonlocal datasets.

deeplake_tag_id
: (Optional) Specifies a dataset commit for reproducibility.

deeplake_query
: (Optional) Used to load datasets based on a query.

deeplake_tensors
: Maps MMSegmentation tensors to Deep Lake tensors:

"img"
: Image tensor.

"gt_semantic_seg"
: Semantic segmentation mask tensor.
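A sketch of how the optional fields might look on the train entry; the credentials key, tag id, and query string below are placeholders rather than values from this tutorial:

train=dict(
    pipeline=train_pipeline,
    deeplake_path="hub://activeloop/coco-train-seg-mask",
    deeplake_credentials={"token": token},  # assumption: only needed for private datasets
    deeplake_tag_id="<commit-or-tag-id>",   # placeholder: pin a specific dataset version
    deeplake_query="select * where ...",    # placeholder: load a filtered subset
    deeplake_tensors={"img": "images", "gt_semantic_seg": "seg_masks"},
)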
Common Issues and Solutions¶
- Memory Issues:
  - Reduce samples_per_gpu in the config
  - Decrease the image size in the pipeline
  - Use smaller batch sizes (see the override sketch after this list)
- Performance Issues:
  - Increase num_workers in deeplake_dataloader
  - Enable distributed training
  - Use proper GPU settings
- Mask Format Issues:
  - Verify the mask format in the dataset
  - Check normalization settings
  - Ensure proper padding configuration
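For the memory issues above, the batch and worker settings can be overridden programmatically after loading the config (the smaller values are illustrative):

# Illustrative memory-saving overrides, applied after Config.fromfile
cfg.data.samples_per_gpu = 2  # smaller batch per GPU
cfg.data.workers_per_gpu = 2  # fewer dataloader workers
# crop_size is baked into the pipelines at definition time, so change it
# in the config file itself rather than here.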
Custom Loss Functions¶
# Add a custom loss function to the decode head (applied after Config.fromfile)
cfg.model.decode_head.loss_decode = dict(
    type='CrossEntropyLoss',
    use_sigmoid=False,
    loss_weight=1.0,
    class_weight=[1.0] * 81  # one weight per class (81 here); useful for imbalanced datasets
)
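If uniform weights are not enough, one common recipe is inverse class frequency estimated from a subsample of the masks. A sketch; the tensor name and class count follow this tutorial's config:

import numpy as np
import deeplake

ds = deeplake.load("hub://activeloop/coco-train-seg-mask")
counts = np.zeros(81, dtype=np.int64)
for i in range(200):  # subsample for speed
    labels, freq = np.unique(ds.seg_masks[i].numpy(), return_counts=True)
    valid = labels < len(counts)  # guard against ignore labels (e.g. 255)
    counts[labels[valid]] += freq[valid]
weights = counts.sum() / (len(counts) * np.maximum(counts, 1))
cfg.model.decode_head.loss_decode.class_weight = weights.tolist()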