Alibaba-Cloud-German-AI-Challenge-2018

This is an entry for a Tianchi Big Data competition:
https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.100067.5678.1.3e7731f5WP7NmY&raceId=231683

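1. Dataset description:
The dataset is split into a training set and a validation set. Each image has two parts, sen1 and sen2. sen1 is a SAR image stored as real and imaginary parts, and two of its channels have been filtered. sen2 is a multispectral image.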

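Characteristics:
a. Each sample stacks layers from two satellites and can be treated as a 32 x 32 x 18 image.
b. Both the training and validation sets have imbalanced class distributions, so this can be treated as an imbalanced multi-class classification problem.
c. sen1 pixel values vary widely (from negative values to over a thousand), while sen2 pixel values are mostly floats between 0 and 1.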
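
Because the sen1 and sen2 value ranges differ so much, one common way to put all 18 channels on a comparable scale is per-channel standardization. The following is a minimal sketch of that idea, not the preprocessing used in this project. The function names and the toy data are hypothetical, and pixels are stored as plain lists of channel values to keep the example dependency-free.

```python
from statistics import fmean, pstdev

def channel_stats(pixels):
    """Return per-channel (means, stds) for pixels given as rows of channel values."""
    channels = list(zip(*pixels))  # transpose: one tuple of values per channel
    means = [fmean(c) for c in channels]
    # Fall back to 1.0 for constant channels so the division below never hits zero.
    stds = [pstdev(c) or 1.0 for c in channels]
    return means, stds

def standardize(pixels, means, stds):
    """Shift and scale every channel to zero mean and unit standard deviation."""
    return [[(v - m) / s for v, m, s in zip(px, means, stds)] for px in pixels]

# Toy data (made up): 4 pixels with 2 channels, one sen1-like with a wide
# range and one sen2-like with values between 0 and 1.
pixels = [[-120.0, 0.1], [850.0, 0.3], [30.0, 0.2], [1500.0, 0.4]]
means, stds = channel_stats(pixels)
scaled = standardize(pixels, means, stds)
```

In practice the statistics would be computed once on the training data and reused for validation and test data.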

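2. Model training (split into four modules: preprocessing, feature extraction, model training, and post-processing. The modules differ in difficulty, so they are tackled from easiest to hardest. [] marks a step still to be done.)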

Model training progress:
The network is L_Resnet_E_IR, and the loss function is softmax cross-entropy.
[2]

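Preprocessing module:
1. To handle the imbalanced class distribution, samples are oversampled by repetition so that the training data is balanced.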
[4]
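
Oversampling by repetition can be sketched as follows. This is an illustrative version under assumptions, not the project's actual code: the function name `oversample`, the fixed seed, and the toy labels are hypothetical. Every class is padded up to the size of the largest class by redrawing its own samples with replacement.

```python
import random
from collections import defaultdict

def oversample(labels, seed=0):
    """Return sample indices in which every class appears as often as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    target = max(len(idxs) for idxs in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)  # keep every original sample once
        # Draw the missing samples with replacement from the same class.
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    rng.shuffle(balanced)
    return balanced

# Toy labels (made up): class 0 has 6 samples, class 1 has 2, class 2 has 1.
labels = [0] * 6 + [1] * 2 + [2] * 1
indices = oversample(labels)
```

The returned indices can then be used to read samples from the training arrays, so the image data itself is never duplicated in memory.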

Feature extraction module:
Apply feature processing to sen1 and sen2; see the blog at http://blog.sina.com.cn/s/articlelist_1984634525_4_1.html

Post-processing module:
[5]

todo list:
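[2] Try different activation functions and loss functions.
[4] Image normalization, mirror flipping, random cropping, and extracting the center and four corner sub-images (x5).
[5] First train the neural network to learn features, then take the vector from its last layer and classify it with a traditional classifier such as GBDT or LightGBM.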
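
As an illustration of todo item [4], the sketch below extracts the four corner crops and the center crop of an image and adds their mirrored copies. It is a hypothetical, dependency-free version: the image is a list of rows, and the helper names and the toy 4 x 4 image are assumptions. A real implementation would more likely work on array data.

```python
def crop(image, top, left, size):
    """Cut a size x size window out of an image stored as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]

def mirror(image):
    """Flip an image horizontally."""
    return [row[::-1] for row in image]

def five_crops(image, size):
    """Return the four corner crops followed by the center crop."""
    height, width = len(image), len(image[0])
    bottom, right = height - size, width - size
    offsets = [(0, 0), (0, right), (bottom, 0), (bottom, right),
               (bottom // 2, right // 2)]
    return [crop(image, top, left, size) for top, left in offsets]

# Toy 4x4 single-channel image; a real patch would be 32x32 with 18 values per pixel.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
views = five_crops(image, size=2)
views += [mirror(v) for v in views]  # 5 crops plus their 5 mirrored copies
```

At prediction time, one option is to average the model's outputs over all views of the same sample.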

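1. Trained on the training set for 70,000 steps until the model overfit (100% training accuracy); validation accuracy was around 60%.
2. Implemented early stopping during training; the best point was at 34,000 steps, with 90% training accuracy and 61% validation accuracy.
3. Taken together, 1 and 2 show that whether or not the model overfits the training set has little effect on validation accuracy.
4. Submitted the model that performed best on the validation set (61% accuracy); accuracy on the online test set was below 60%.
5. Because the test and validation accuracies are close, we hypothesized that the test and validation sets have similar distributions, and decided to use the validation set to help train the model.
6. Trained on the merged training and validation sets until overfitting (100%) and submitted the result; online accuracy reached 75%.
7. With train+val as the training set, handled the class imbalance with weighted sampling so that every batch has a balanced class distribution, and used early stopping; online accuracy reached 77.9%.
---------------------------------------------------------------------------------------Progress as of Dec 6: online ranking in the top 5%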
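
The early stopping used in steps 2 and 7 can be sketched as picking the evaluation with the best validation accuracy and giving up once it has not improved for a fixed number of evaluations. The function name, the `patience` parameter, and the accuracy history below are hypothetical; the source does not state which stopping rule was used.

```python
def best_checkpoint(val_accuracies, patience):
    """Return (index, accuracy) of the best evaluation seen before stopping.

    Stops once `patience` evaluations in a row fail to beat the best accuracy.
    """
    best_idx, best_acc = -1, float("-inf")
    for idx, acc in enumerate(val_accuracies):
        if acc > best_acc:
            best_idx, best_acc = idx, acc
        elif idx - best_idx >= patience:
            break  # no improvement for `patience` evaluations in a row
    return best_idx, best_acc

# Made-up validation accuracies, one per evaluation.
history = [0.52, 0.58, 0.61, 0.60, 0.59, 0.60, 0.62]
stopped = best_checkpoint(history, patience=3)
```

With `patience=3` the loop stops before reaching the final 0.62 and keeps the checkpoint at index 2. A larger patience trades training time for the chance of finding a later, better checkpoint.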
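8. Built new splits: separated 3,000 samples from the val set to form val21000 and val3000, keeping the class distribution the same in both.
9. With train+val24000 as the training set and val3000 as the validation set, trained one neural network per channel for all 18 channels and ensembled the 18 models by vote counting; online 72%.
10. With train+val24000 as the training set and val3000 as the test set, trained one neural network per channel for all 18 channels, built an 18 x 17 weight matrix from each model's performance on each class, and used it to ensemble the 18 models; online 74%.
11. 5-layer neural network, val24000 as the training set, sen2 only: 92% on val3000, 78.1% online.
12. L-Resnet-E-IR network, val24000 as the training set, sen2 only: 94% on val3000, 79.4% online.
13. Weighted voting over the submissions that scored 77%, 78%, and 79% online; online 81.7%.
---------------------------------------------------------------------------------------Progress as of Dec 28: online ranking 113/1300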
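
The two ensembling schemes in steps 9 and 10 can be sketched with one function: each model votes for a class, and the vote is scaled by a per-model, per-class weight. With all weights equal this reduces to plain vote counting. The weights below are made up, and using something like per-class validation accuracy as the weight is an assumption, since the source does not say how the 18 x 17 matrix was filled.

```python
def weighted_vote(predictions, weights, num_classes):
    """Combine one predicted class per model using per-model, per-class weights.

    predictions[m] is the class predicted by model m;
    weights[m][c] is how much model m is trusted when it predicts class c.
    """
    scores = [0.0] * num_classes
    for model, cls in enumerate(predictions):
        scores[cls] += weights[model][cls]
    return max(range(num_classes), key=scores.__getitem__)

# Toy setup: 3 models and 3 classes instead of 18 models and 17 classes.
weights = [
    [0.9, 0.2, 0.5],
    [0.3, 0.4, 0.5],
    [0.3, 0.4, 0.5],
]
predictions = [0, 1, 1]
uniform = [[1.0] * 3 for _ in range(3)]
majority = weighted_vote(predictions, uniform, 3)  # plain vote counting
weighted = weighted_vote(predictions, weights, 3)  # per-class weighted vote
```

Here two of three models predict class 1, so plain counting picks class 1, while the weighted vote picks class 0 because the first model is trusted far more on that class.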

Contributors: colabin, zhaochuanyun
