Implementing PyTorch-based DeepLabv3 Semantic Image Segmentation in LabVIEW

LabVIEW Deep Learning in Practice, 2023-03-22


Preface

Today we will look at how to implement semantic segmentation with LabVIEW.

I. What is semantic segmentation

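Image semantic segmentation, taken literally, means having the computer split an image into regions according to its semantics: given the left image below as input, the computer should produce the right image as output. In speech recognition, semantics refers to the meaning of the speech; in the image domain, it refers to the content of the image, that is, an understanding of what the picture shows. For the image below, the semantics is a person leading four sheep. Segmentation means separating the different objects in the picture at the pixel level by labeling every pixel of the original image; in the image below, light yellow stands for the person and blue-green for the sheep. The semantic segmentation task therefore marks the different classes in a picture with different colors, one color per class. It is commonly used in medical imaging, satellite imagery, autonomous driving, robotics, and similar fields.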

[Figure: an input photo (left) and its semantic segmentation output (right)]

So how are the pixels colored?

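The output of semantic segmentation resembles that of an image classification network. For image classification, the output is a one-dimensional one-hot vector over the classes, for example [0,1,0] for three classes. For semantic segmentation, the final output feature map is three-dimensional: its height and width match the original image, and its number of channels equals the number of classes. Each channel marks where its class appears in the image. Finally, argmax is taken across the channels at every pixel to merge them into a single image, with different colors showing where each class lies. Semantic segmentation is really just a kind of classification task: it classifies every individual pixel, finding the class each pixel belongs to. That is all there is to the semantic segmentation task.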
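The argmax step described above can be sketched with NumPy. This is only a minimal illustration on a made-up 3-class, 2×3 score map; the values in `scores` and the colors in `palette` are assumptions chosen for the example, not output from any real model.

```python
import numpy as np

# Hypothetical per-class score maps, shape (classes, height, width).
scores = np.array([
    [[0.90, 0.20, 0.10],
     [0.80, 0.10, 0.30]],   # channel for class 0
    [[0.05, 0.70, 0.20],
     [0.10, 0.60, 0.10]],   # channel for class 1
    [[0.05, 0.10, 0.70],
     [0.10, 0.30, 0.60]],   # channel for class 2
])

# For every pixel, pick the channel (class) with the highest score.
class_map = scores.argmax(axis=0)      # shape (height, width)

# Hypothetical palette: one RGB color per class.
palette = np.array([
    [0, 0, 0],        # class 0
    [255, 255, 0],    # class 1
    [0, 128, 128],    # class 2
], dtype=np.uint8)

# Indexing the palette with the class map yields the colored image.
color_image = palette[class_map]       # shape (height, width, 3)

print(class_map)
```

Indexing the palette with the class map avoids an explicit loop over pixels, which matters once the image is hundreds of pixels on a side.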


II. What is DeepLabv3

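DeepLabv3 is a semantic segmentation architecture that makes several modifications to DeepLabv2. To handle the problem of segmenting objects at multiple scales, it uses modules that apply atrous (dilated) convolution in cascade or in parallel, capturing multi-scale context through multiple atrous rates. In addition, the Atrous Spatial Pyramid Pooling (ASPP) module from DeepLabv2 is augmented with image-level features that encode global context, which further improves performance.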
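To make the idea of atrous rates more concrete, here is a toy 1-D sketch in plain Python. It is not the DeepLabv3 implementation; the function name `atrous_conv1d`, the input signal, and the kernel are all made up for illustration. The same 3-tap kernel is applied at rates 1, 2, and 4, so each branch sees a progressively wider context without adding any weights.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous convolution without padding (cross-correlation form).

    Kernel taps are spaced `rate` samples apart, so one output value
    covers (len(kernel) - 1) * rate + 1 input samples.
    """
    span = (len(kernel) - 1) * rate + 1
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * rate]
                       for i, k in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 1, 1]

# Parallel branches with different rates, loosely mirroring how
# several atrous rates are used side by side.
multi_scale = {rate: atrous_conv1d(signal, kernel, rate) for rate in (1, 2, 4)}

for rate, values in multi_scale.items():
    print(rate, values)
```

With rate 4 the three taps already span all nine input samples, which is why that branch produces a single output value here.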


III. Calling DeepLabv3 from LabVIEW for semantic image segmentation

1. Obtaining and converting the model

Install pytorch and torchvision
Get the model from torchvision: deeplabv3_resnet101 (we use the pretrained model):

    original_model = models.segmentation.deeplabv3_resnet101(pretrained=True)

Convert to ONNX

    def get_pytorch_onnx_model(original_model):
        # directory where the converted model will be saved
        onnx_model_path = dirname
        # file name of the converted model
        onnx_model_name = "deeplabv3_resnet101.onnx"
        # create the directory for the converted model
        os.makedirs(onnx_model_path, exist_ok=True)
        # full path to the converted model
        full_model_path = os.path.join(onnx_model_path, onnx_model_name)
        # generate a dummy model input
        generated_input = Variable(torch.randn(1, 3, 448, 448))
        # export the model to ONNX format
        torch.onnx.export(
            original_model,
            generated_input,
            full_model_path,
            verbose=True,
            input_names=["input"],
            output_names=["output", "aux"],
            opset_version=11
        )
        return full_model_path

The complete Python code for obtaining and converting the model is as follows:

    import os

    import torch
    import torch.onnx
    from torch.autograd import Variable
    from torchvision import models

    # the converted model is saved next to this script
    dirname, filename = os.path.split(os.path.abspath(__file__))
    print(dirname)


    def get_pytorch_onnx_model(original_model):
        # directory where the converted model will be saved
        onnx_model_path = dirname
        # file name of the converted model
        onnx_model_name = "deeplabv3_resnet101.onnx"
        # create the directory for the converted model
        os.makedirs(onnx_model_path, exist_ok=True)
        # full path to the converted model
        full_model_path = os.path.join(onnx_model_path, onnx_model_name)
        # generate a dummy model input
        generated_input = Variable(torch.randn(1, 3, 448, 448))
        # export the model to ONNX format
        torch.onnx.export(
            original_model,
            generated_input,
            full_model_path,
            verbose=True,
            input_names=["input"],
            output_names=["output", "aux"],
            opset_version=11
        )
        return full_model_path


    def main():
        # initialize the pretrained PyTorch DeepLabv3 model (ResNet-101 backbone)
        original_model = models.segmentation.deeplabv3_resnet101(pretrained=True)
        # get the path to the PyTorch model converted to ONNX
        full_model_path = get_pytorch_onnx_model(original_model)
        print("PyTorch ResNet-101 model was successfully converted: ", full_model_path)


    if __name__ == "__main__":
        main()

As you can see, obtaining the PyTorch-based DeepLabv3 model is much the same as for the earlier Mask R-CNN model.

2. About deeplabv3_resnet101

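The model we use is deeplabv3_resnet101. It returns two tensors whose height and width match the input tensor but which have 21 channels, one per class. output["out"] contains the semantic masks, while output["aux"] contains the per-pixel auxiliary loss values. In inference mode, output["aux"] is not useful. The shape of output["out"] is therefore (N, 21, H, W). When converting the model we set H and W to 448, and N is normally 1. In the exported ONNX model, these two tensors carry the names "output" and "aux" given in the conversion code.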
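The shape (N, 21, H, W) can be turned into a per-pixel class map as sketched below with NumPy. This is a hedged illustration, not the article's LabVIEW pipeline: `class_map_from_out` is a hypothetical helper, and `fake_out` is a zero-filled stand-in for the real model output in which one channel (index 15, picked arbitrarily) is boosted inside a 100×100 patch.

```python
import numpy as np

NUM_CLASSES = 21  # 20 object classes plus background


def class_map_from_out(out):
    """Convert raw scores of shape (N, 21, H, W) to class indices of shape (N, H, W)."""
    if out.ndim != 4 or out.shape[1] != NUM_CLASSES:
        raise ValueError(f"expected (N, {NUM_CLASSES}, H, W), got {out.shape}")
    # argmax over the channel axis picks the best-scoring class per pixel
    return out.argmax(axis=1).astype(np.uint8)


# Stand-in for the "out" tensor (assumption: real scores would come from the model).
fake_out = np.zeros((1, NUM_CLASSES, 448, 448), dtype=np.float32)
fake_out[0, 15, :100, :100] = 5.0

class_map = class_map_from_out(fake_out)

# Count how many pixels were assigned to each class that appears.
labels, counts = np.unique(class_map[0], return_counts=True)
pixel_counts = dict(zip(labels.tolist(), counts.tolist()))
print(class_map.shape, pixel_counts)
```

Where all scores tie, as in the zero-filled region here, `argmax` returns the first index, so those pixels fall into channel 0.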

Our model is based on the classes of the VOC2012 dataset.

The VOC2012 dataset has 20 object classes, or 21 including the background, as follows: