jayboxyz / deeplearning-image-segmentation

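:bread: Image segmentation based on deep learning methods (including semantic segmentation, instance segmentation, and panoptic segmentation).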


deeplearning-image-segmentation's Issues

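How do you apply CRF post-processing?

Code references:

1. https://github.com/Gurupradeep/FCN-for-Semantic-Segmentation/blob/master/CRF.ipynb
2. Refining segmentation results with CRF: https://blog.csdn.net/heavenpeien/article/details/79890993#commentsedit


1. After FCN, use CRF to refine the edges of the segmentation result: https://blog.csdn.net/jiachen0212/article/details/78474913

In essence, CRF takes two inputs and classifies every pixel once more: the original image (5 channels of information, namely RGB plus the x and y pixel coordinates) and the softmax output (a per-pixel probability distribution) obtained by running the image through the FCN model.

For example, suppose the image is m x n and the task is binary segmentation (the simplest case being object versus background). Each pixel then has two possible labels, object or background (0/1), so the softmax has shape 2 x m x n. Using the 5-channel information from the image, CRF readjusts the existing softmax along two dimensions, RGB value and spatial position, and reassigns each pixel's 0/1 label.

CRF assumes that pixels close together in space should belong to the same class, and likewise for pixels with similar RGB values. It therefore introduces penalty values and energy terms. For the details you need to study the CRF literature; Li Hang's "Statistical Learning Methods" (《统计学习方法》) covers it in depth.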
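The idea described above can be sketched in pure NumPy. This is a minimal illustration, not the code from the linked notebook or blog posts: the function name `dense_crf_mean_field`, the Potts-style compatibility weight `compat`, and the bandwidths `theta_xy` and `theta_rgb` are all assumptions chosen for the example. It builds a pairwise kernel from the 5 features (x, y, R, G, B) and runs a few mean-field updates on the softmax. The kernel is N x N for N pixels, so this only suits tiny images; practical implementations (for example the pydensecrf package) use fast approximate filtering instead.

```python
import numpy as np


def dense_crf_mean_field(image, softmax, n_iters=5, compat=3.0,
                         theta_xy=3.0, theta_rgb=20.0):
    """Refine a softmax map with a naive fully connected CRF (illustrative only).

    image:   (H, W, 3) uint8 RGB image
    softmax: (C, H, W) per-pixel class probabilities from the network
    returns: (H, W) integer label map
    """
    h, w = image.shape[:2]
    n_classes = softmax.shape[0]

    # 5-channel features: pixel position (x, y) and colour (R, G, B)
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    rgb = image.reshape(-1, 3).astype(np.float64)

    # Pairwise kernel: large when two pixels are close AND similar in colour
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    d_rgb = ((rgb[:, None, :] - rgb[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-d_pos / (2 * theta_xy ** 2) - d_rgb / (2 * theta_rgb ** 2))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    probs = softmax.reshape(n_classes, -1).astype(np.float64)
    unary = -np.log(np.clip(probs, 1e-8, 1.0))  # unary energy per class
    q = probs.copy()

    for _ in range(n_iters):
        # message[l, i] = sum_j kernel[i, j] * q[l, j]
        message = q @ kernel
        # Agreement with similar neighbours lowers the energy of label l
        logits = -unary + compat * message
        logits -= logits.max(axis=0, keepdims=True)  # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=0, keepdims=True)

    return q.argmax(axis=0).reshape(h, w)


# Toy binary example: dark left half (class 0), bright right half (class 1)
img = np.zeros((6, 6, 3), dtype=np.uint8)
img[:, 3:] = 255
probs = np.empty((2, 6, 6))
probs[0, :, :3], probs[1, :, :3] = 0.9, 0.1
probs[0, :, 3:], probs[1, :, 3:] = 0.1, 0.9
probs[:, 2, 1] = [0.4, 0.6]  # one pixel the network got wrong

labels = dense_crf_mean_field(img, probs)
```

In this toy case the misclassified pixel sits among same-coloured neighbours that are confidently class 0, so the pairwise term pulls it back to class 0 while the sharp colour edge keeps the two halves separate.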
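----------------
Copyright notice: this is an original article by CSDN blogger "jiachen0212", licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/jiachen0212/article/details/78474913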

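Semantic segmentation model code

First, look at some of the steps someone else used when training on remote sensing image data:

[Tianchi Live] Application of image semantic segmentation to remote sensing imagery (video on bilibili)

Innovations:
image

Random cropping:
image

Dilated (expanded) prediction:
image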

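I. Image cropping

Before starting, get familiar with the following functions: https://www.cnblogs.com/lemonbit/p/6864179.html

  • x0 = random.randrange(0, w - crop_W + 1, 16) # randrange() returns a random number from the range with the given step; the step defaults to 1
  • if np.random.randint(0, 2) == 0: # randint() draws integers uniformly from the half-open interval [low, high), here 0 or 1
  • numpy.random.random(size=None) generates floats in [0, 1)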
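A short sketch of how these three calls are typically combined when picking a crop position and deciding whether to augment. The helper name `random_crop_origin` and the image and crop sizes are assumptions made up for this example.

```python
import random

import numpy as np


def random_crop_origin(w, h, crop_w, crop_h, step=16):
    """Pick a top-left corner aligned to `step` so the crop stays inside the image."""
    # randrange excludes the stop value, hence the "+ 1"
    x0 = random.randrange(0, w - crop_w + 1, step)
    y0 = random.randrange(0, h - crop_h + 1, step)
    return x0, y0


random.seed(0)
np.random.seed(0)

x0, y0 = random_crop_origin(w=512, h=512, crop_w=256, crop_h=256)
coin = np.random.randint(0, 2)  # 0 or 1; the upper bound 2 is excluded
u = np.random.random()          # float in [0, 1)
```

Aligning the origin to a multiple of 16 is one way to keep crops on a coarse grid; whether that matters depends on the model and is not stated in the source.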

1. Sliding-window cropping
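The source gives no code for sliding-window cropping, so the following is only a minimal sketch of one common way to do it. The function name `sliding_window_crops`, the default `crop_size` and `stride`, and the choice to shift the last window back inside the image are all assumptions; it also assumes the image is at least `crop_size` in both dimensions.

```python
import numpy as np


def window_starts(length, crop_size, stride):
    """Start offsets along one axis; the last window is shifted to end at the border."""
    last = max(length - crop_size, 0)
    starts = list(range(0, last + 1, stride))
    if starts[-1] != last:
        starts.append(last)
    return starts


def sliding_window_crops(image, crop_size=256, stride=128):
    """Yield (y, x, patch) for a window sliding over `image` with overlap."""
    h, w = image.shape[:2]
    for y in window_starts(h, crop_size, stride):
        for x in window_starts(w, crop_size, stride):
            yield y, x, image[y:y + crop_size, x:x + crop_size]


# Example on a dummy 600 x 700 RGB image
dummy = np.zeros((600, 700, 3), dtype=np.uint8)
patches = list(sliding_window_crops(dummy))
```

The same offsets should be used for the label image so that each image patch and its label patch stay aligned.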

2. Random cropping (random crop)

import random

import cv2
import pandas as pd
from tqdm import tqdm

size = 256  # side length of each square crop; assumed value, not given in the original
path_list = 'path_list.csv'  # name of the CSV index written under dataset/


# Random window sampling
def generate_train_dataset(image_num=1000,
                           train_image_path='dataset/train/images/',
                           train_label_path='dataset/train/labels/'):
    '''
    Generate the training set by cropping randomly positioned windows
    :param image_num: number of samples to generate
    :param train_image_path: directory where cropped images are saved
    :param train_label_path: directory where cropped labels are saved
    :return: None
    '''

    # Running count of all generated crops
    g_count = 1

    images_path = ['dataset/origin/1.png', 'dataset/origin/2.png',
                   'dataset/origin/3.png', 'dataset/origin/4.png']
    labels_path = ['dataset/origin/1_class.png', 'dataset/origin/2_class.png',
                   'dataset/origin/3_class.png', 'dataset/origin/4_class.png']

    # Number of crops generated from each source image
    image_each = image_num // len(images_path)
    image_path, label_path = [], []
    for i in tqdm(range(len(images_path))):
        count = 0
        image = cv2.imread(images_path[i])
        # Labels are single-channel class maps, so read them as grayscale
        label = cv2.imread(labels_path[i], cv2.IMREAD_GRAYSCALE)
        X_height, X_width = image.shape[0], image.shape[1]
        while count < image_each:
            # random.randint includes both endpoints, so the largest valid offset is X_width - size
            random_width = random.randint(0, X_width - size)
            random_height = random.randint(0, X_height - size)
            image_ogi = image[random_height: random_height + size, random_width: random_width + size, :]
            label_ogi = label[random_height: random_height + size, random_width: random_width + size]

            image_d, label_d = data_augment(image_ogi, label_ogi)

            image_path.append(train_image_path + '%05d.png' % g_count)
            label_path.append(train_label_path + '%05d.png' % g_count)
            cv2.imwrite((train_image_path + '%05d.png' % g_count), image_d)
            cv2.imwrite((train_label_path + '%05d.png' % g_count), label_d)

            count += 1
            g_count += 1
    df = pd.DataFrame({'image': image_path, 'label': label_path})
    # df.to_csv('dataset/path_list.csv', index=False)
    df.to_csv('dataset/' + path_list, index=False)

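II. Data augmentation

Use Python image processing libraries for data augmentation:

1. PIL library: the PIL image processing library

Here is the first version: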

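import cv2
import numpy as np


# The functions below implement the data augmentations
def gamma_transform(img, gamma):
    # cv2.LUT needs a 256-entry lookup table for 8-bit images
    gamma_table = [np.power(x / 255.0, gamma) * 255.0 for x in range(256)]
    gamma_table = np.round(np.array(gamma_table)).astype(np.uint8)
    return cv2.LUT(img, gamma_table)


def random_gamma_transform(img, gamma_vari):
    # Sample gamma log-uniformly from [1 / gamma_vari, gamma_vari]
    log_gamma_vari = np.log(gamma_vari)
    alpha = np.random.uniform(-log_gamma_vari, log_gamma_vari)
    gamma = np.exp(alpha)
    return gamma_transform(img, gamma)


def rotate(xb, yb, angle):
    # Rotate image and label together around the crop center
    h, w = xb.shape[:2]
    M_rotate = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1)
    xb = cv2.warpAffine(xb, M_rotate, (w, h))
    # Nearest-neighbor interpolation keeps label values valid class ids
    yb = cv2.warpAffine(yb, M_rotate, (w, h), flags=cv2.INTER_NEAREST)
    return xb, yb


def blur(img):
    img = cv2.blur(img, (3, 3))
    return img


def add_noise(img):
    # Add point noise: one white pixel per image row
    # (the original looped over the global crop size, which is the same for square crops)
    for _ in range(img.shape[0]):
        temp_x = np.random.randint(0, img.shape[0])
        temp_y = np.random.randint(0, img.shape[1])
        img[temp_x][temp_y] = 255
    return img


def data_augment(xb, yb):
    # Geometric transforms are applied to both image (xb) and label (yb)
    if np.random.random() < 0.25:
        xb, yb = rotate(xb, yb, 90)

    if np.random.random() < 0.25:
        xb, yb = rotate(xb, yb, 180)

    if np.random.random() < 0.25:
        xb, yb = rotate(xb, yb, 270)

    if np.random.random() < 0.25:
        xb = cv2.flip(xb, 1)  # flipCode > 0: flip around the y-axis
        yb = cv2.flip(yb, 1)

    # Photometric transforms are applied to the image only
    if np.random.random() < 0.25:
        # Note: gamma_vari = 1.0 always yields gamma = 1 (no change); pass a value > 1 to vary gamma
        xb = random_gamma_transform(xb, 1.0)

    if np.random.random() < 0.25:
        xb = blur(xb)

    # Bilateral filter
    if np.random.random() < 0.25:
        xb = cv2.bilateralFilter(xb, 9, 75, 75)

    # Gaussian blur
    if np.random.random() < 0.25:
        xb = cv2.GaussianBlur(xb, (5, 5), 1.5)

    if np.random.random() < 0.2:
        xb = add_noise(xb)

    return xb, yb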

The following comes from "Data augmentation and its implementation in deep learning": https://www.jianshu.com/p/3e9f4812abbc

# -*- coding:utf-8 -*-
"""Data augmentation
   1. Flip
   2. Random crop
   3. Color jittering
   4. Shift
   5. Scale
   6. Contrast
   7. Noise
   8. Rotation/reflection
   author: XiJun.Gong
   date:2016-11-29
"""

from PIL import Image, ImageEnhance, ImageOps, ImageFile
import numpy as np
import random
import threading, os, time
import logging

logger = logging.getLogger(__name__)
ImageFile.LOAD_TRUNCATED_IMAGES = True


class DataAugmentation:
    """
    Covers eight kinds of data augmentation
    """


    def __init__(self):
        pass

    @staticmethod
    def openImage(image):
        return Image.open(image, mode="r")

    @staticmethod
    def randomRotation(image, mode=Image.BICUBIC):
        """
        Rotate the image by a random angle (drawn from 1 to 359 degrees)
        :param mode: nearest-neighbor, bilinear, or bicubic interpolation (default)
        :param image: PIL image
        :return: the rotated image
        """
        random_angle = np.random.randint(1, 360)
        return image.rotate(random_angle, mode)

    @staticmethod
    def randomCrop(image):
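        """
        Crop the image with a window of random size. Given an image size of about (68, 68),
        the window is larger than (36*36); it is centered, and only its size is random.
        :param image: PIL image
        :return: the cropped image
        """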
        image_width = image.size[0]
        image_height = image.size[1]
        crop_win_size = np.random.randint(40, 68)
        random_region = (
            (image_width - crop_win_size) >> 1, (image_height - crop_win_size) >> 1, (image_width + crop_win_size) >> 1,
            (image_height + crop_win_size) >> 1)
        return image.crop(random_region)

    @staticmethod
    def randomColor(image):
        """
        Apply color jittering to the image
        :param image: PIL image
        :return: the color-jittered image
        """
        random_factor = np.random.randint(0, 31) / 10.  # random factor
        color_image = ImageEnhance.Color(image).enhance(random_factor)  # adjust saturation
        random_factor = np.random.randint(10, 21) / 10.  # random factor
        brightness_image = ImageEnhance.Brightness(color_image).enhance(random_factor)  # adjust brightness
        random_factor = np.random.randint(10, 21) / 10.  # random factor
        contrast_image = ImageEnhance.Contrast(brightness_image).enhance(random_factor)  # adjust contrast
        random_factor = np.random.randint(0, 31) / 10.  # random factor
        return ImageEnhance.Sharpness(contrast_image).enhance(random_factor)  # adjust sharpness

    @staticmethod
    def randomGaussian(image, mean=0.2, sigma=0.3):
        """
        Add Gaussian noise to an image
        :param image: PIL image with at least three channels (RGB)
        :param mean: mean (offset) of the noise
        :param sigma: standard deviation of the noise
        :return: the noisy PIL image
        """
        # Convert to a float array; np.array makes a writable copy, and float
        # avoids the casting error raised when adding noise to uint8 data
        img = np.array(image, dtype=np.float64)
        # Add independent Gaussian noise to the R, G and B channels
        img[:, :, :3] += np.random.normal(mean, sigma, img[:, :, :3].shape)
        # Clip to the valid range before converting back to uint8
        return Image.fromarray(np.uint8(np.clip(img, 0, 255)))

    @staticmethod
    def saveImage(image, path):
        image.save(path)


def makeDir(path):
    """Create `path` if needed; return 0 if created, 1 if it already existed, -1 on failure."""
    try:
        if not os.path.exists(path):
            os.makedirs(path)
            return 0
        else:
            return 1
    except Exception as e:
        print(str(e))
        return -1


def imageOps(func_name, image, des_path, file_name, times=5):
    funcMap = {"randomRotation": DataAugmentation.randomRotation,
               "randomCrop": DataAugmentation.randomCrop,
               "randomColor": DataAugmentation.randomColor,
               "randomGaussian": DataAugmentation.randomGaussian
               }
    if funcMap.get(func_name) is None:
        logger.error("%s does not exist", func_name)
        return -1

    for _i in range(0, times, 1):
        new_image = funcMap[func_name](image)
        DataAugmentation.saveImage(new_image, os.path.join(des_path, func_name + str(_i) + file_name))


opsList = {"randomRotation", "randomCrop", "randomColor", "randomGaussian"}


def threadOPS(path, new_path):
    """
    Process images with multiple threads
    :param path: source file or directory
    :param new_path: destination directory
    :return:
    """
    if os.path.isdir(path):
        img_names = os.listdir(path)
    else:
        img_names = [path]
    for img_name in img_names:
        print(img_name)
        tmp_img_name = os.path.join(path, img_name)
        if os.path.isdir(tmp_img_name):
            # makeDir returns a negative value on failure
            if makeDir(os.path.join(new_path, img_name)) >= 0:
                threadOPS(tmp_img_name, os.path.join(new_path, img_name))
            else:
                print('create new dir failure')
                return -1
                # os.removedirs(tmp_img_name)
        elif os.path.basename(tmp_img_name) != ".DS_Store":
            # Read the file and start one thread per augmentation
            image = DataAugmentation.openImage(tmp_img_name)
            threadImage = [0] * 5
            _index = 0
            for ops_name in opsList:
                threadImage[_index] = threading.Thread(target=imageOps,
                                                       args=(ops_name, image, new_path, img_name,))
                threadImage[_index].start()
                _index += 1
                time.sleep(0.2)


if __name__ == '__main__':
    threadOPS("/home/pic-image/train/12306train",
              "/home/pic-image/train/12306train3")

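Remote sensing image semantic segmentation datasets

ISPRS datasets

《SCAttNet Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images》

image

Potsdam dataset:

"Remote Sensing Image Segmentation Methods Based on Deep Learning Models" by Xu Yue (《基于深度学习模型的遥感图像分割方法_许玥》):

image

image

Vaihingen dataset:

"High-Resolution Remote Sensing Image Classification Based on Multi-Class Feature Deep Learning", master's thesis by Liu Wei (《基于多类特征深度学习的高分辨率遥感影像分类_刘威》):

Because the Vaihingen high-resolution remote sensing dataset provides ground-truth class labels for only 16 of the original orthophoto tiles, this thesis uses those 16 tiles as the experimental dataset. Their IDs are 1, 3, 5, 7, 11, 13, 15, 17, 21, 23, 26, 28, 30, 32, 34, and 37. Tiles 1, 3, 5, 7, 11, 13, 15, 17, 21, 23, 26, 28, and 32 form the training set, and tiles 30, 34, and 37 form the test set. The training set is used only to train the network model; once training is finished, images the model has not seen are needed to test its classification performance, and the dataset used for that purpose is the test set.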
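The split quoted above can be written down as a small configuration, which makes it easy to check that the two sets are disjoint and cover all 16 labeled tiles. The variable names are assumptions for illustration; only the tile IDs come from the thesis excerpt.

```python
# Tile IDs of the 16 Vaihingen images that have ground-truth labels
LABELED_TILES = [1, 3, 5, 7, 11, 13, 15, 17, 21, 23, 26, 28, 30, 32, 34, 37]

# Split used in the thesis: 3 tiles held out for testing, the rest for training
TEST_TILES = [30, 34, 37]
TRAIN_TILES = [t for t in LABELED_TILES if t not in TEST_TILES]

# Sanity checks: no overlap, nothing lost
assert set(TRAIN_TILES).isdisjoint(TEST_TILES)
assert sorted(TRAIN_TILES + TEST_TILES) == LABELED_TILES
```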

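"Research and Application of Semantic Segmentation of High-Resolution Remote Sensing Imagery Based on Deep Learning" by Wang Zhiwen (《基于深度学习的高分辨率遥感影像语义分割的研究与应用_汪志文.caj》)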

image

image

image
