
A MNIST-like fashion product database. Benchmark :point_down:

Home Page: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/

License: MIT License



Fashion-MNIST



Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

Here's an example of how the data looks (each class takes three rows):

Why we made Fashion-MNIST

The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."

To Serious Machine Learning Researchers

Seriously, we are talking about replacing MNIST. Here are some good reasons:

  • MNIST is too easy.
  • MNIST is overused.
  • MNIST can not represent modern CV tasks.

Get the Data

Many ML libraries already include the Fashion-MNIST data/API; give it a try!

You can use direct links to download the dataset. The data is stored in the same format as the original MNIST data.

| Name | Content | Examples | Size | Link | MD5 Checksum |
| --- | --- | --- | --- | --- | --- |
| train-images-idx3-ubyte.gz | training set images | 60,000 | 26 MBytes | Download | 8d4fb7e6c68d591d4c3dfef9ec88bf0d |
| train-labels-idx1-ubyte.gz | training set labels | 60,000 | 29 KBytes | Download | 25c81989df183df01b3e8a0aad5dffbe |
| t10k-images-idx3-ubyte.gz | test set images | 10,000 | 4.3 MBytes | Download | bef4ecab320f06d8554ea6380940ec79 |
| t10k-labels-idx1-ubyte.gz | test set labels | 10,000 | 5.1 KBytes | Download | bb300cfdad3c16e7a12a480ee83cd310 |
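
If you prefer to script the download, here is a minimal Python sketch; it assumes the four files are served under the home page URL above with exactly the filenames from the table (which is how the TensorFlow example below uses source_url):

import os
import urllib.request

# Base URL and filenames are taken from the table above; adjust if the hosting changes.
BASE_URL = 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/'
FILES = [
    'train-images-idx3-ubyte.gz',
    'train-labels-idx1-ubyte.gz',
    't10k-images-idx3-ubyte.gz',
    't10k-labels-idx1-ubyte.gz',
]

os.makedirs('data/fashion', exist_ok=True)
for name in FILES:
    dest = os.path.join('data/fashion', name)
    if not os.path.exists(dest):
        urllib.request.urlretrieve(BASE_URL + name, dest)
        print('downloaded', dest)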

Alternatively, you can clone this GitHub repository; the dataset appears under data/fashion. This repo also contains some scripts for benchmarking and visualization.

git clone git@github.com:zalandoresearch/fashion-mnist.git

Labels

Each training and test example is assigned to one of the following labels:

Label Description
0 T-shirt/top
1 Trouser
2 Pullover
3 Dress
4 Coat
5 Sandal
6 Shirt
7 Sneaker
8 Bag
9 Ankle boot

Usage

Loading data with Python (requires NumPy)

Use utils/mnist_reader in this repo:

import mnist_reader
X_train, y_train = mnist_reader.load_mnist('data/fashion', kind='train')
X_test, y_test = mnist_reader.load_mnist('data/fashion', kind='t10k')
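
The loader returns flat uint8 arrays, one row of 784 pixel values per image (an assumption based on how the repo's reader flattens the images; adjust if your copy differs). A quick sanity check and preview:

import matplotlib.pyplot as plt  # only needed for the preview

print(X_train.shape, y_train.shape)              # expected: (60000, 784) (60000,)
plt.imshow(X_train[0].reshape(28, 28), cmap='gray')
plt.title('label: %d' % y_train[0])
plt.show()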

Loading data with TensorFlow

Make sure you have downloaded the data and placed it in data/fashion. Otherwise, TensorFlow will download and use the original MNIST.

from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets('data/fashion')

data.train.next_batch(BATCH_SIZE)

Note that TensorFlow also supports passing a source URL to read_data_sets. You may use:

data = input_data.read_data_sets('data/fashion', source_url='http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/')

Also, an official TensorFlow tutorial on using tf.keras, a high-level API, to train on Fashion-MNIST can be found here.
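
With a recent TensorFlow you can also load the dataset directly through the built-in tf.keras datasets module (it downloads the files on first use); a minimal sketch:

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
print(x_train.shape, x_test.shape)  # (60000, 28, 28) (10000, 28, 28)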

Loading data with other machine learning libraries

To date, the following libraries include Fashion-MNIST as a built-in dataset, so you don't need to download it yourself. Just follow their API and you are ready to go.

You are welcome to make pull requests to other open-source machine learning packages, improving their support for the Fashion-MNIST dataset.

Loading data with other languages

As one of the Machine Learning community's most popular datasets, MNIST has inspired people to implement loaders in many different languages. You can use these loaders with the Fashion-MNIST dataset as well (note: they may require decompressing the files first). To date, we haven't tested all of these loaders with Fashion-MNIST.
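
Because the files keep the original MNIST/IDX layout (a big-endian integer header followed by raw unsigned bytes), a loader is also easy to write from scratch. A minimal Python sketch, assuming the gzipped files sit in data/fashion:

import gzip
import struct
import numpy as np

def load_idx_images(path):
    with gzip.open(path, 'rb') as f:
        magic, n, rows, cols = struct.unpack('>IIII', f.read(16))  # big-endian header
        return np.frombuffer(f.read(), dtype=np.uint8).reshape(n, rows, cols)

def load_idx_labels(path):
    with gzip.open(path, 'rb') as f:
        magic, n = struct.unpack('>II', f.read(8))
        return np.frombuffer(f.read(), dtype=np.uint8)

X_train = load_idx_images('data/fashion/train-images-idx3-ubyte.gz')
y_train = load_idx_labels('data/fashion/train-labels-idx1-ubyte.gz')
print(X_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)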

Benchmark

We built an automatic benchmarking system based on scikit-learn that covers 129 classifiers (but no deep learning) with different parameters. Find the results here.

You can reproduce the results by running benchmark/runner.py. We recommend building and deploying this Dockerfile.
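
To get a feel for what the benchmark does, here is a minimal sketch that scores a single scikit-learn classifier on the official split. The data loading follows the Usage section below; the classifier and its parameters are only an illustration, not the benchmark's configuration:

import mnist_reader
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X_train, y_train = mnist_reader.load_mnist('data/fashion', kind='train')
X_test, y_test = mnist_reader.load_mnist('data/fashion', kind='t10k')

clf = LogisticRegression(max_iter=100)        # one of many classifier/parameter combinations
clf.fit(X_train / 255.0, y_train)             # scale pixel values to [0, 1]
print(accuracy_score(y_test, clf.predict(X_test / 255.0)))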

You are welcome to submit your benchmark; simply create a new issue and we'll list your results here. Before doing that, please make sure it does not already appear in this list. Visit our contributor guidelines for additional details.

The table below collects the submitted benchmarks. Note that we haven't yet tested these results. You are welcome to validate the results using the code provided by the submitter. Test accuracy may differ due to the number of epochs, batch size, etc. To correct this table, please create a new issue.

| Classifier | Preprocessing | Fashion test accuracy | MNIST test accuracy | Submitter | Code |
| --- | --- | --- | --- | --- | --- |
| 2 Conv+pooling | None | 0.876 | - | Kashif Rasul | 🔗 |
| 2 Conv+pooling | None | 0.916 | - | Tensorflow's doc | 🔗 |
| 2 Conv+pooling+ELU activation (PyTorch) | None | 0.903 | - | @AbhirajHinge | 🔗 |
| 2 Conv | Normalization, random horizontal flip, random vertical flip, random translation, random rotation. | 0.919 | 0.971 | Kyriakos Efthymiadis | 🔗 |
| 2 Conv <100K parameters | None | 0.925 | 0.992 | @hardmaru | 🔗 |
| 2 Conv ~113K parameters | Normalization | 0.922 | 0.993 | Abel G. | 🔗 |
| 2 Conv+3 FC ~1.8M parameters | Normalization | 0.932 | 0.994 | @Xfan1025 | 🔗 |
| 2 Conv+3 FC ~500K parameters | Augmentation, batch normalization | 0.934 | 0.994 | @cmasch | 🔗 |
| 2 Conv+pooling+BN | None | 0.934 | - | @khanguyen1207 | 🔗 |
| 2 Conv+2 FC | Random horizontal flips | 0.939 | - | @ashmeet13 | 🔗 |
| 3 Conv+2 FC | None | 0.907 | - | @Cenk Bircanoğlu | 🔗 |
| 3 Conv+pooling+BN | None | 0.903 | 0.994 | @meghanabhange | 🔗 |
| 3 Conv+pooling+2 FC+dropout | None | 0.926 | - | @Umberto Griffo | 🔗 |
| 3 Conv+BN+pooling | None | 0.921 | 0.992 | @gchhablani | 🔗 |
| 5 Conv+BN+pooling | None | 0.931 | - | @Noumanmufc1 | 🔗 |
| CNN with optional shortcuts, dense-like connectivity | standardization+augmentation+random erasing | 0.947 | - | @kennivich | 🔗 |
| GRU+SVM | None | 0.888 | 0.965 | @AFAgarap | 🔗 |
| GRU+SVM with dropout | None | 0.897 | 0.988 | @AFAgarap | 🔗 |
| WRN40-4 8.9M params | standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips) | 0.967 | - | @ajbrock | 🔗 🔗 |
| DenseNet-BC 768K params | standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips) | 0.954 | - | @ajbrock | 🔗 🔗 |
| MobileNet | augmentation (horizontal flips) | 0.950 | - | @苏剑林 | 🔗 |
| ResNet18 | Normalization, random horizontal flip, random vertical flip, random translation, random rotation. | 0.949 | 0.979 | Kyriakos Efthymiadis | 🔗 |
| GoogleNet with cross-entropy loss | None | 0.937 | - | @Cenk Bircanoğlu | 🔗 |
| AlexNet with Triplet loss | None | 0.899 | - | @Cenk Bircanoğlu | 🔗 |
| SqueezeNet with cyclical learning rate, 200 epochs | None | 0.900 | - | @snakers4 | 🔗 |
| Dual path network with wide resnet 28-10 | standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips) | 0.957 | - | @Queequeg | 🔗 |
| MLP 256-128-100 | None | 0.8833 | - | @heitorrapela | 🔗 |
| VGG16 26M parameters | None | 0.935 | - | @QuantumLiu | 🔗 🔗 |
| WRN-28-10 | standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips) | 0.959 | - | @zhunzhong07 | 🔗 |
| WRN-28-10 + Random Erasing | standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips) | 0.963 | - | @zhunzhong07 | 🔗 |
| Human Performance | Crowd-sourced evaluation of human (with no fashion expertise) performance. 1000 randomly sampled test images, 3 labels per image, majority labelling. | 0.835 | - | Leo | - |
| Capsule Network 8M parameters | Normalization, shift of at most 2 pixels and horizontal flip | 0.936 | - | @XifengGuo | 🔗 |
| HOG+SVM | HOG | 0.926 | - | @subalde | 🔗 |
| XgBoost | scaling the pixel values to mean=0.0 and var=1.0 | 0.898 | 0.958 | @anktplwl91 | 🔗 |
| DENSER | - | 0.953 | 0.997 | @fillassuncao | 🔗 🔗 |
| Dyra-Net | Rescale to unit interval | 0.906 | - | @Dirk Schäfer | 🔗 🔗 |
| Google AutoML | 24 compute hours (higher quality) | 0.939 | - | @Sebastian Heinz | 🔗 |
| Fastai | Resnet50+Fine-tuning+Softmax on last layer's activations | 0.9312 | - | @Sayak | 🔗 |

Other Explorations of Fashion-MNIST

Generative adversarial networks (GANs)

Clustering

Video Tutorial

Machine Learning Meets Fashion by Yufeng G @ Google Cloud


Introduction to Kaggle Kernels by Yufeng G @ Google Cloud


动手学深度学习 (Dive into Deep Learning) by Mu Li @ Amazon AI

MXNet/Gluon中文频道 (MXNet/Gluon Chinese channel)

Learning Deep Learning with Apache MXNet (in Korean) by Muhyun Kim, AWS Solutions Architect


Visualization

t-SNE on Fashion-MNIST (left) and original MNIST (right)

PCA on Fashion-MNIST (left) and original MNIST (right)

UMAP on Fashion-MNIST (left) and original MNIST (right)

PyMDE on Fashion-MNIST (left) and original MNIST (right)

Contributing

Thanks for your interest in contributing! There are many ways to get involved; start with our contributor guidelines and then check these open issues for specific tasks.

Contact

To discuss the dataset, please use Gitter.

Citing Fashion-MNIST

If you use Fashion-MNIST in a scientific publication, we would appreciate references to the following paper:

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Han Xiao, Kashif Rasul, Roland Vollgraf. arXiv:1708.07747

Biblatex entry:

@online{xiao2017/online,
  author       = {Han Xiao and Kashif Rasul and Roland Vollgraf},
  title        = {Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms},
  date         = {2017-08-28},
  year         = {2017},
  eprintclass  = {cs.LG},
  eprinttype   = {arXiv},
  eprint       = {cs.LG/1708.07747},
}

Who is citing Fashion-MNIST?

License

The MIT License (MIT) Copyright © [2017] Zalando SE, https://tech.zalando.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


fashion-mnist's Issues

Benchmark: Conv Net - Accuracy: 92.56%

Tried this network topology that can be summarized as follows:

  • Convolutional layer with 32 feature maps of size 5×5.
  • Pooling layer taking the max over 2×2 patches.
  • Convolutional layer with 64 feature maps of size 5×5.
  • Pooling layer taking the max over 2×2 patches.
  • Convolutional layer with 128 feature maps of size 1×1.
  • Pooling layer taking the max over 2×2 patches.
  • Flatten layer.
  • Fully connected layer with 1024 neurons and rectifier activation.
  • Dropout layer with a probability of 50%.
  • Fully connected layer with 510 neurons and rectifier activation.
  • Dropout layer with a probability of 50%.
  • Output layer.

I used Normalization as Preprocessing and 5-fold cross-validation to evaluate the model.
Accuracy scores: [0.92433, 0.92133, 0.923581, 0.92391, 0.92466]
Mean Accuracy: 0.923567
Stdev Accuracy: 0.001175
Final Accuracy: 92.56%

You can find the code here.
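
For readers who want to try something similar, here is a minimal Keras sketch of the topology described above. The layer sizes follow the list; the padding, optimizer and loss are assumptions, not the submitter's exact settings:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (5, 5), activation='relu', padding='same', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (1, 1), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1024, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(510, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),   # output layer
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])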

Benchmark: Conv Net - Accuracy: 90.26%

Network is as follows:

  • No pre-processing.

  • Convolutional layer with 16 feature maps of size 5 x 5 with ELU activation.

  • Max Pooling layer of size 2 x 2.

  • Convolutional layer with 32 feature maps of size 5 x 5 with ELU activation.

  • Max Pooling layer of size 2 x 2.

Accuracy achieved on the test set is 90.26%.
I know that 2-layer CNN networks are already present in the benchmarks, but those were implemented in Keras and TensorFlow. This network is implemented in PyTorch, if that counts.

The code can be found here.
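
A rough PyTorch sketch of this architecture; the padding and the final linear classifier are assumptions, since the issue does not spell them out:

import torch.nn as nn

class SmallELUNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ELU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ELU(),
            nn.MaxPool2d(2),
        )
        # 28x28 -> 14x14 -> 7x7 feature maps with 32 channels
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))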

Loaders that crop always-black padding pixels break on Fashion-MNIST, because the always-black pixels differ from MNIST

Hi,

The two datasets differ in which pixels are always black across all 60,000 training images.

The original MNIST data has 67 of the 28×28 pixels that are zero across all 60,000 training images.

I wrote a MATLAB script to check the same; please use code.zip and run it on both data sets separately.

This means that parsers/loaders which remove padding do not work correctly with Fashion-MNIST data.
E.g. the MNIST image loader for MATLAB mentioned in README.md (under this) doesn't work for Fashion-MNIST data:
https://de.mathworks.com/matlabcentral/fileexchange/27675-read-digits-and-labels-from-mnist-database?focused=5154133&tab=function

Please remove from README.md all loaders that remove padding, since there exists at least one image in the Fashion-MNIST training data where such a pixel is non-black.
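
To verify the claim without MATLAB, here is a small Python sketch that counts pixels which are zero in every training image; it assumes X_train is a uint8 array of shape (60000, 28, 28), loaded as in the Usage section:

import numpy as np

always_black = (X_train == 0).all(axis=0)            # True where a pixel is 0 in every image
print('always-black pixels:', int(always_black.sum()))
# Per this issue, the same check on the original MNIST training images reports 67 such pixels.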

Benchmark: Wide ResNet and DenseNets

WRN40-4 lands at 3.93% error using standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips), Nesterov momentum with a half-wave cosine annealing schedule and an initial LR of 0.1, trained for 300 epochs. DenseNet-BC with k=12 and D=100 lands at 4.64% error with the same settings and 100 epochs of training; a 300-epoch run is currently in progress. The WRN has 8.9M params, the DenseNet-BC has 768K params.

benchmark: I used it for training my proposed model

I proposed a GRU+SVM model at my university as my undergraduate research; the proposal paper may be read here. Simply put, my proposal is to use an SVM as the classification function for the GRU instead of the 'conventional' softmax.

Here are the results of the training using your dataset:

Epoch : 0 completed out of 10, loss : 274.4726867675781, accuracy : 0.7890625
Epoch : 1 completed out of 10, loss : 201.513671875, accuracy : 0.87890625
Epoch : 2 completed out of 10, loss : 201.049072265625, accuracy : 0.859375
Epoch : 3 completed out of 10, loss : 155.28115844726562, accuracy : 0.890625
Epoch : 4 completed out of 10, loss : 145.94015502929688, accuracy : 0.9140625
Epoch : 5 completed out of 10, loss : 148.19613647460938, accuracy : 0.88671875
Epoch : 6 completed out of 10, loss : 155.27915954589844, accuracy : 0.87890625
Epoch : 7 completed out of 10, loss : 192.79263305664062, accuracy : 0.8671875
Epoch : 8 completed out of 10, loss : 151.52243041992188, accuracy : 0.90234375
Epoch : 9 completed out of 10, loss : 171.39292907714844, accuracy : 0.8984375
Accuracy : 0.8878000378608704

The source may be found in my GitHub Gist. The hyper-parameters used were as follows:

BATCH_SIZE = 256
CELL_SIZE = 256
EPOCHS = 10
LEARNING_RATE = 0.01
NUM_CLASSES = 10
SVM_C = 1

Trained using tf.train.AdamOptimizer() and tf.nn.static_rnn().
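
A rough Keras paraphrase of the idea, replacing the softmax head with linear class scores trained under a squared hinge loss (an L2-SVM-style objective); this is not the submitter's TensorFlow code, and the targets must be one-hot vectors with -1/+1 entries:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.GRU(256, input_shape=(28, 28)),   # each image row is one timestep; CELL_SIZE = 256
    layers.Dense(10, activation='linear'),   # linear class scores instead of softmax
])
model.compile(optimizer='adam', loss='squared_hinge', metrics=['categorical_accuracy'])
# targets: y = 2 * one_hot(labels, 10) - 1, so entries are in {-1, +1}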

Duplicate samples and overlap between train and test

I hope I have got this right, but it seems that there are 43 samples duplicated in the training set and 1 sample that is duplicated in the test set. There are also 10 samples in the training set that appear in the test set. This was done by comparing the samples at the byte level.

Here is a list of the duplicates:

Training set duplicates:
[601, 39865]
[831, 24228]
[1826, 23718]
[2024, 53883]
[4974, 6293]
[5520, 49165]
[5790, 11845]
[5822, 33399]
[6139, 37731]
[6280, 41036]
[8485, 31238]
[8841, 28184]
[12571, 56657]
[14096, 32343]
[14710, 22159]
[15587, 28635]
[19308, 20114]
[19668, 21571]
[19760, 39489]
[19888, 24443]
[21072, 32800]
[22852, 28789]
[23052, 57107]
[23413, 33731]
[24785, 46015]
[25297, 40077]
[25629, 49588]
[26314, 49351]
[27045, 40033]
[27421, 31627]
[32113, 38337]
[32300, 33730]
[32303, 56840]
[32888, 41918]
[32922, 54584]
[36634, 39841]
[38261, 41877]
[42756, 53842]
[46667, 57724]
[46782, 54829]
[47929, 54185]
[48480, 59607]
[48955, 51368]
Test set duplicates:
[6334, 8569]
Training set samples overlapping with test set:
Train samples [3763] overlap with test samples [7243]
Train samples [4944] overlap with test samples [7781]
Train samples [6168] overlap with test samples [9227]
Train samples [12404] overlap with test samples [4037]
Train samples [15943] overlap with test samples [6659]
Train samples [22403] overlap with test samples [7762]
Train samples [34617] overlap with test samples [4990]
Train samples [35772] overlap with test samples [7216]
Train samples [48228] overlap with test samples [5867]
Train samples [52205] overlap with test samples [9560]

The code required to generate the above output is as follows (assuming the input images are in the variables train_X and test_X):

def sample_bytes(x):
    result = []
    for i in range(len(x)):
        b = x[i].tobytes()
        result.append(b)
    return result

train_h = sample_bytes(train_X)
test_h = sample_bytes(test_X)

train_dict = {}
test_dict = {}
for i, h in enumerate(train_h):
    train_dict.setdefault(h, []).append(i)
for i, h in enumerate(test_h):
    test_dict.setdefault(h, []).append(i)

print('Training set duplicates:')
for k, v in train_dict.items():
    if len(v) > 1:
        for j in range(1, len(v)):
            assert (train_X[v[0]] == train_X[v[j]]).all()
        print(v)

print('Test set duplicates:')
for k, v in test_dict.items():
    if len(v) > 1:
        for j in range(1, len(v)):
            assert (test_X[v[0]] == test_X[v[j]]).all()
        print(v)

print('Training set samples overlapping with test set:')
for k, v in train_dict.items():
    if k in test_dict:
        assert (train_X[v[0]] == test_X[test_dict[k][0]]).all()
        print('Train samples {} overlap with test samples {}'.format(v, test_dict[k]))

overlap = set(train_h).intersection(set(test_h))
print(len(overlap))
assert overlap == set()  # fails here, confirming the train/test overlap

Three Layer CNN With 90.33% accuracy

I tried to implement a three-layered CNN with Batch Normalisation and MaxPooling. I got an accuracy of 90.33% after 5000 iterations. I used batch normalisation after every layer to accelerate the performance.

Also, I got an accuracy of 99.04% on MNIST with the same network.

The architecture is as follows:

  1. Convolutional layer 1 with 16 output feature maps, 5×5 kernel and ReLU activation
     • MaxPooling 1
     • BatchNormalisation 1
  2. Convolutional layer 2 with 32 output feature maps, 5×5 kernel and ReLU activation
     • MaxPooling 2
     • BatchNormalisation 2
  3. Convolutional layer 3 with 64 output feature maps, 5×5 kernel and ReLU activation
     • MaxPooling 3
     • BatchNormalisation 3
  4. Fully connected layer

Code

3 Conv 2 FC layer Benchmark

I used a network with 3 Conv and 2 FC layers for feature extraction and embedding creation with a cosine loss function, and a linear SVM as the classifier.
I got the following accuracies:
Train: 0.9838
Test: 0.9072

benchmark: update on GRU+SVM with Dropout

Hey @hanxiao, it's me again. I saw an update to the dataset regarding the duplicate samples. I did another training run using my GRU+SVM (with Dropout) model (from #8) on the updated dataset. Here's the result:

Epoch : 0 completed out of 100, loss : 316.9036560058594, accuracy : 0.734375
Epoch : 1 completed out of 100, loss : 201.2646026611328, accuracy : 0.83984375
Epoch : 2 completed out of 100, loss : 253.3709259033203, accuracy : 0.796875
Epoch : 3 completed out of 100, loss : 257.7744140625, accuracy : 0.8359375
Epoch : 4 completed out of 100, loss : 179.52682495117188, accuracy : 0.8671875
Epoch : 5 completed out of 100, loss : 224.97421264648438, accuracy : 0.83984375
Epoch : 6 completed out of 100, loss : 212.19381713867188, accuracy : 0.859375
Epoch : 7 completed out of 100, loss : 200.80978393554688, accuracy : 0.859375
Epoch : 8 completed out of 100, loss : 187.77052307128906, accuracy : 0.85546875
Epoch : 9 completed out of 100, loss : 190.96389770507812, accuracy : 0.86328125
Epoch : 10 completed out of 100, loss : 185.72314453125, accuracy : 0.85546875
Epoch : 11 completed out of 100, loss : 189.3765411376953, accuracy : 0.8515625
Epoch : 12 completed out of 100, loss : 130.086669921875, accuracy : 0.89453125
Epoch : 13 completed out of 100, loss : 151.38232421875, accuracy : 0.8828125
Epoch : 14 completed out of 100, loss : 159.71595764160156, accuracy : 0.88671875
Epoch : 15 completed out of 100, loss : 218.80592346191406, accuracy : 0.84375
Epoch : 16 completed out of 100, loss : 131.5895233154297, accuracy : 0.9140625
Epoch : 17 completed out of 100, loss : 162.96995544433594, accuracy : 0.8671875
Epoch : 18 completed out of 100, loss : 155.52630615234375, accuracy : 0.890625
Epoch : 19 completed out of 100, loss : 159.76901245117188, accuracy : 0.88671875
Epoch : 20 completed out of 100, loss : 137.74642944335938, accuracy : 0.890625
Epoch : 21 completed out of 100, loss : 162.48875427246094, accuracy : 0.890625
Epoch : 22 completed out of 100, loss : 179.6526336669922, accuracy : 0.8828125
Epoch : 23 completed out of 100, loss : 127.58981323242188, accuracy : 0.8984375
Epoch : 24 completed out of 100, loss : 185.6982421875, accuracy : 0.8671875
Epoch : 25 completed out of 100, loss : 159.8983612060547, accuracy : 0.8828125
Epoch : 26 completed out of 100, loss : 160.69525146484375, accuracy : 0.89453125
Epoch : 27 completed out of 100, loss : 173.42813110351562, accuracy : 0.859375
Epoch : 28 completed out of 100, loss : 166.0702667236328, accuracy : 0.87890625
Epoch : 29 completed out of 100, loss : 157.59085083007812, accuracy : 0.87109375
Epoch : 30 completed out of 100, loss : 127.72993469238281, accuracy : 0.9140625
Epoch : 31 completed out of 100, loss : 136.65415954589844, accuracy : 0.90234375
Epoch : 32 completed out of 100, loss : 172.4806365966797, accuracy : 0.8515625
Epoch : 33 completed out of 100, loss : 139.81488037109375, accuracy : 0.8984375
Epoch : 34 completed out of 100, loss : 144.55099487304688, accuracy : 0.85546875
Epoch : 35 completed out of 100, loss : 122.90949249267578, accuracy : 0.8984375
Epoch : 36 completed out of 100, loss : 150.0441131591797, accuracy : 0.890625
Epoch : 37 completed out of 100, loss : 153.2085723876953, accuracy : 0.88671875
Epoch : 38 completed out of 100, loss : 143.91455078125, accuracy : 0.8984375
Epoch : 39 completed out of 100, loss : 117.63712310791016, accuracy : 0.91796875
Epoch : 40 completed out of 100, loss : 93.80998229980469, accuracy : 0.92578125
Epoch : 41 completed out of 100, loss : 136.52537536621094, accuracy : 0.87109375
Epoch : 42 completed out of 100, loss : 137.24530029296875, accuracy : 0.90625
Epoch : 43 completed out of 100, loss : 108.73893737792969, accuracy : 0.921875
Epoch : 44 completed out of 100, loss : 106.48686218261719, accuracy : 0.9296875
Epoch : 45 completed out of 100, loss : 104.41219329833984, accuracy : 0.92578125
Epoch : 46 completed out of 100, loss : 101.19454956054688, accuracy : 0.94140625
Epoch : 47 completed out of 100, loss : 127.536376953125, accuracy : 0.91015625
Epoch : 48 completed out of 100, loss : 109.94172668457031, accuracy : 0.9296875
Epoch : 49 completed out of 100, loss : 85.25288391113281, accuracy : 0.94140625
Epoch : 50 completed out of 100, loss : 112.01800537109375, accuracy : 0.91796875
Epoch : 51 completed out of 100, loss : 107.6760482788086, accuracy : 0.91015625
Epoch : 52 completed out of 100, loss : 121.9848403930664, accuracy : 0.921875
Epoch : 53 completed out of 100, loss : 101.01953887939453, accuracy : 0.9375
Epoch : 54 completed out of 100, loss : 69.95838165283203, accuracy : 0.94921875
Epoch : 55 completed out of 100, loss : 119.3257827758789, accuracy : 0.91796875
Epoch : 56 completed out of 100, loss : 102.73481750488281, accuracy : 0.921875
Epoch : 57 completed out of 100, loss : 89.11821746826172, accuracy : 0.94921875
Epoch : 58 completed out of 100, loss : 110.71992492675781, accuracy : 0.9140625
Epoch : 59 completed out of 100, loss : 105.85194396972656, accuracy : 0.9375
Epoch : 60 completed out of 100, loss : 114.6805648803711, accuracy : 0.921875
Epoch : 61 completed out of 100, loss : 99.33323669433594, accuracy : 0.92578125
Epoch : 62 completed out of 100, loss : 128.26809692382812, accuracy : 0.90625
Epoch : 63 completed out of 100, loss : 117.59638214111328, accuracy : 0.9140625
Epoch : 64 completed out of 100, loss : 86.27313995361328, accuracy : 0.9453125
Epoch : 65 completed out of 100, loss : 114.16581726074219, accuracy : 0.92578125
Epoch : 66 completed out of 100, loss : 102.78227233886719, accuracy : 0.94921875
Epoch : 67 completed out of 100, loss : 88.23193359375, accuracy : 0.9375
Epoch : 68 completed out of 100, loss : 60.24769592285156, accuracy : 0.953125
Epoch : 69 completed out of 100, loss : 97.67103576660156, accuracy : 0.94140625
Epoch : 70 completed out of 100, loss : 86.58494567871094, accuracy : 0.91796875
Epoch : 71 completed out of 100, loss : 98.33272552490234, accuracy : 0.921875
Epoch : 72 completed out of 100, loss : 77.44849395751953, accuracy : 0.94921875
Epoch : 73 completed out of 100, loss : 114.52888488769531, accuracy : 0.9296875
Epoch : 74 completed out of 100, loss : 94.6647720336914, accuracy : 0.9453125
Epoch : 75 completed out of 100, loss : 106.62199401855469, accuracy : 0.921875
Epoch : 76 completed out of 100, loss : 116.0970230102539, accuracy : 0.91015625
Epoch : 77 completed out of 100, loss : 78.5435791015625, accuracy : 0.953125
Epoch : 78 completed out of 100, loss : 125.43787384033203, accuracy : 0.91796875
Epoch : 79 completed out of 100, loss : 112.84344482421875, accuracy : 0.9296875
Epoch : 80 completed out of 100, loss : 65.7440185546875, accuracy : 0.95703125
Epoch : 81 completed out of 100, loss : 115.66653442382812, accuracy : 0.91796875
Epoch : 82 completed out of 100, loss : 76.14566040039062, accuracy : 0.9375
Epoch : 83 completed out of 100, loss : 72.91943359375, accuracy : 0.95703125
Epoch : 84 completed out of 100, loss : 56.55884552001953, accuracy : 0.95703125
Epoch : 85 completed out of 100, loss : 87.09599304199219, accuracy : 0.93359375
Epoch : 86 completed out of 100, loss : 80.97771453857422, accuracy : 0.93359375
Epoch : 87 completed out of 100, loss : 94.14187622070312, accuracy : 0.9453125
Epoch : 88 completed out of 100, loss : 80.44708251953125, accuracy : 0.94140625
Epoch : 89 completed out of 100, loss : 52.18363952636719, accuracy : 0.96875
Epoch : 90 completed out of 100, loss : 93.15214538574219, accuracy : 0.9296875
Epoch : 91 completed out of 100, loss : 97.51387023925781, accuracy : 0.9296875
Epoch : 92 completed out of 100, loss : 82.44243621826172, accuracy : 0.9375
Epoch : 93 completed out of 100, loss : 60.52445983886719, accuracy : 0.96484375
Epoch : 94 completed out of 100, loss : 57.100406646728516, accuracy : 0.96484375
Epoch : 95 completed out of 100, loss : 89.62207794189453, accuracy : 0.94140625
Epoch : 96 completed out of 100, loss : 86.14447784423828, accuracy : 0.9375
Epoch : 97 completed out of 100, loss : 75.90823364257812, accuracy : 0.953125
Epoch : 98 completed out of 100, loss : 65.80587768554688, accuracy : 0.9609375
Epoch : 99 completed out of 100, loss : 114.98580169677734, accuracy : 0.92578125
Accuracy : 0.897300124168396

The hyper-parameters used were as follows:

BATCH_SIZE = 256
CELL_SIZE = 256
DROPOUT_P_KEEP = 0.85
EPOCHS = 100
LEARNING_RATE = 1e-3
SVM_C = 1

Trained using tf.train.AdamOptimizer(), with tf.nn.dynamic_rnn(). The source may still be found here.

The graph from TensorBoard, tracking the training (accuracy at the top, loss at the bottom):

screenshot from 2017-09-01 22-54-52

The improved accuracy may not be too much, but I suppose it's still a considerable difference, i.e. ~85.5% v. ~89.7%.

benchmark: Try HOG + SVM

Just out of curiosity, I tried an old-fashioned HOG+SVM approach, with almost no tuning of the HOG parameters.

Training time:
50 minutes

Test accuracy:
0.926

You can find the code here (Requires OpenCV!)
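
A minimal sketch of the same idea using scikit-image and scikit-learn rather than OpenCV; the HOG parameters below are guesses, not the submitter's settings, and X_train/X_test are assumed to be uint8 arrays of shape (n, 28, 28):

import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

def hog_features(images):
    # One HOG descriptor per 28x28 image.
    return np.array([hog(img, orientations=9, pixels_per_cell=(7, 7),
                         cells_per_block=(2, 2)) for img in images])

clf = LinearSVC(C=1.0)
clf.fit(hog_features(X_train), y_train)
print(accuracy_score(y_test, clf.predict(hog_features(X_test))))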

Benchmark: dual path network with wide resnet 28-10 as backbone

Classifier: dual path network with WideResNet28-10 as the backbone network (47.75M).
Preprocessing: standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips).
Fashion test accuracy: 95.73%
Code: https://github.com/Queequeg92/DualPathNet
References:
[1] Chen, Yunpeng, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, and Jiashi Feng. "Dual Path Networks." arXiv preprint arXiv:1707.01629 (2017).
[2] Zagoruyko, Sergey, and Nikos Komodakis. "Wide residual networks." arXiv preprint arXiv:1605.07146 (2016).

GoogleNet with Cross Entropy Benchmark

I used GoogleNet for feature extraction and embedding creation with a cross-entropy loss function, and a linear SVM as the classifier.
I got the following accuracies:
Train: 0.9980
Test: 0.9365

benchmark: GRU+SVM for MNIST dataset

I saw you updated the README to report a comparison between the accuracy on your dataset and on MNIST, so I benchmarked my proposed GRU+SVM model (issue #8) on MNIST as well.

Here are the results:

Epoch : 0 completed out of 10, loss : 35.062599182128906, accuracy : 0.9296875
Epoch : 1 completed out of 10, loss : 20.101139068603516, accuracy : 0.9609375
Epoch : 2 completed out of 10, loss : 11.310111999511719, accuracy : 0.984375
Epoch : 3 completed out of 10, loss : 14.316896438598633, accuracy : 0.96875
Epoch : 4 completed out of 10, loss : 13.816293716430664, accuracy : 0.9609375
Epoch : 5 completed out of 10, loss : 8.049131393432617, accuracy : 0.984375
Epoch : 6 completed out of 10, loss : 10.147947311401367, accuracy : 0.984375
Epoch : 7 completed out of 10, loss : 8.27488899230957, accuracy : 0.9921875
Epoch : 8 completed out of 10, loss : 21.153032302856445, accuracy : 0.9609375
Epoch : 9 completed out of 10, loss : 10.881654739379883, accuracy : 0.9765625
Accuracy : 0.9658001661300659

The source may be found here, in my GitHub Gist. The hyper-parameters used were as follows:

BATCH_SIZE = 128
CELL_SIZE = 32
HM_EPOCHS = 10
LEARNING_RATE = 0.01
NUM_CLASSES = 10
SVM_C = 0.5

Trained using tf.train.AdamOptimizer() and tf.nn.static_rnn().

Is there a colored dataset?

I'm not an expert -- but shouldn't a canonical fashion dataset, which seems to be the objective of this repo, be in color?

Not that I'm complaining much; this is a great contribution. But I think it would be even more valuable if it were in color and a bit bigger, letting users preprocess it to grayscale themselves.

benchmark: Update on GRU+SVM with Dropout on MNIST dataset

Since I did 100 epochs for GRU+SVM with Dropout on the Fashion dataset, I also did 100 epochs on MNIST (the old run was only 10 epochs). The following were the hyper-parameters used:

BATCH_SIZE = 256
CELL_SIZE = 256
DROPOUT_P_KEEP = 0.85
EPOCHS = 100
LEARNING_RATE = 1e-3
NUM_CLASSES = 10
SVM_C = 1

The following is the result:

Epoch : 0 completed out of 100, loss : 141.47535705566406, accuracy : 0.9140625
Epoch : 1 completed out of 100, loss : 67.05036926269531, accuracy : 0.96875
Epoch : 2 completed out of 100, loss : 51.171600341796875, accuracy : 0.9765625
Epoch : 3 completed out of 100, loss : 72.32965850830078, accuracy : 0.9609375
Epoch : 4 completed out of 100, loss : 37.7554817199707, accuracy : 0.98046875
Epoch : 5 completed out of 100, loss : 24.296039581298828, accuracy : 0.98828125
Epoch : 6 completed out of 100, loss : 37.4559211730957, accuracy : 0.984375
Epoch : 7 completed out of 100, loss : 38.10890197753906, accuracy : 0.97265625
Epoch : 8 completed out of 100, loss : 33.97040939331055, accuracy : 0.97265625
Epoch : 9 completed out of 100, loss : 25.034709930419922, accuracy : 0.99609375
Epoch : 10 completed out of 100, loss : 27.721952438354492, accuracy : 0.98046875
Epoch : 11 completed out of 100, loss : 8.290353775024414, accuracy : 1.0
Epoch : 12 completed out of 100, loss : 25.927515029907227, accuracy : 0.9921875
Epoch : 13 completed out of 100, loss : 11.549110412597656, accuracy : 0.99609375
Epoch : 14 completed out of 100, loss : 34.728797912597656, accuracy : 0.98046875
Epoch : 15 completed out of 100, loss : 21.197731018066406, accuracy : 0.98828125
Epoch : 16 completed out of 100, loss : 11.47766399383545, accuracy : 0.9921875
Epoch : 17 completed out of 100, loss : 13.01932144165039, accuracy : 0.98828125
Epoch : 18 completed out of 100, loss : 4.497049808502197, accuracy : 1.0
Epoch : 19 completed out of 100, loss : 12.586877822875977, accuracy : 0.9921875
Epoch : 20 completed out of 100, loss : 6.10440731048584, accuracy : 0.99609375
Epoch : 21 completed out of 100, loss : 8.886781692504883, accuracy : 0.99609375
Epoch : 22 completed out of 100, loss : 7.0670166015625, accuracy : 1.0
Epoch : 23 completed out of 100, loss : 16.550621032714844, accuracy : 0.98828125
Epoch : 24 completed out of 100, loss : 7.014737129211426, accuracy : 0.99609375
Epoch : 25 completed out of 100, loss : 29.812110900878906, accuracy : 0.98828125
Epoch : 26 completed out of 100, loss : 2.2193398475646973, accuracy : 1.0
Epoch : 27 completed out of 100, loss : 14.020920753479004, accuracy : 0.9921875
Epoch : 28 completed out of 100, loss : 8.520711898803711, accuracy : 0.9921875
Epoch : 29 completed out of 100, loss : 2.4392218589782715, accuracy : 1.0
Epoch : 30 completed out of 100, loss : 25.517803192138672, accuracy : 0.98828125
Epoch : 31 completed out of 100, loss : 11.551563262939453, accuracy : 0.9921875
Epoch : 32 completed out of 100, loss : 10.277920722961426, accuracy : 0.99609375
Epoch : 33 completed out of 100, loss : 10.18214225769043, accuracy : 0.99609375
Epoch : 34 completed out of 100, loss : 2.365241289138794, accuracy : 1.0
Epoch : 35 completed out of 100, loss : 8.35222053527832, accuracy : 0.99609375
Epoch : 36 completed out of 100, loss : 1.9403200149536133, accuracy : 1.0
Epoch : 37 completed out of 100, loss : 4.6265153884887695, accuracy : 1.0
Epoch : 38 completed out of 100, loss : 4.685805797576904, accuracy : 0.99609375
Epoch : 39 completed out of 100, loss : 5.235599040985107, accuracy : 0.99609375
Epoch : 40 completed out of 100, loss : 32.585182189941406, accuracy : 0.98046875
Epoch : 41 completed out of 100, loss : 13.55459213256836, accuracy : 0.98828125
Epoch : 42 completed out of 100, loss : 3.7068004608154297, accuracy : 1.0
Epoch : 43 completed out of 100, loss : 5.768912315368652, accuracy : 0.99609375
Epoch : 44 completed out of 100, loss : 5.215768814086914, accuracy : 0.99609375
Epoch : 45 completed out of 100, loss : 8.629631042480469, accuracy : 0.9921875
Epoch : 46 completed out of 100, loss : 7.393224716186523, accuracy : 0.9921875
Epoch : 47 completed out of 100, loss : 17.475631713867188, accuracy : 0.9921875
Epoch : 48 completed out of 100, loss : 4.962292194366455, accuracy : 0.99609375
Epoch : 49 completed out of 100, loss : 4.288407802581787, accuracy : 1.0
Epoch : 50 completed out of 100, loss : 3.06554913520813, accuracy : 1.0
Epoch : 51 completed out of 100, loss : 2.9363889694213867, accuracy : 1.0
Epoch : 52 completed out of 100, loss : 7.425971031188965, accuracy : 0.99609375
Epoch : 53 completed out of 100, loss : 7.003169536590576, accuracy : 0.99609375
Epoch : 54 completed out of 100, loss : 13.44936466217041, accuracy : 0.99609375
Epoch : 55 completed out of 100, loss : 5.4362664222717285, accuracy : 0.99609375
Epoch : 56 completed out of 100, loss : 10.022172927856445, accuracy : 0.98828125
Epoch : 57 completed out of 100, loss : 2.9892423152923584, accuracy : 1.0
Epoch : 58 completed out of 100, loss : 1.7155311107635498, accuracy : 1.0
Epoch : 59 completed out of 100, loss : 2.44166898727417, accuracy : 1.0
Epoch : 60 completed out of 100, loss : 4.870673656463623, accuracy : 0.99609375
Epoch : 61 completed out of 100, loss : 1.7088404893875122, accuracy : 1.0
Epoch : 62 completed out of 100, loss : 21.897991180419922, accuracy : 0.98828125
Epoch : 63 completed out of 100, loss : 2.563978672027588, accuracy : 1.0
Epoch : 64 completed out of 100, loss : 1.151407241821289, accuracy : 1.0
2017-09-14 12:21:34.450895: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 14379760 get requests, put_count=14379777 evicted_count=2000 eviction_rate=0.000139084 and unsatisfied allocation rate=0.000141657
Epoch : 65 completed out of 100, loss : 1.0514287948608398, accuracy : 1.0
Epoch : 66 completed out of 100, loss : 10.431646347045898, accuracy : 0.99609375
Epoch : 67 completed out of 100, loss : 10.04415512084961, accuracy : 0.99609375
Epoch : 68 completed out of 100, loss : 9.506088256835938, accuracy : 0.99609375
Epoch : 69 completed out of 100, loss : 8.011089324951172, accuracy : 0.99609375
Epoch : 70 completed out of 100, loss : 0.9643533229827881, accuracy : 1.0
Epoch : 71 completed out of 100, loss : 9.283774375915527, accuracy : 0.9921875
Epoch : 72 completed out of 100, loss : 2.125692129135132, accuracy : 1.0
Epoch : 73 completed out of 100, loss : 21.240196228027344, accuracy : 0.9921875
Epoch : 74 completed out of 100, loss : 2.5445051193237305, accuracy : 1.0
Epoch : 75 completed out of 100, loss : 9.342909812927246, accuracy : 0.99609375
Epoch : 76 completed out of 100, loss : 29.229848861694336, accuracy : 0.98828125
Epoch : 77 completed out of 100, loss : 1.9726190567016602, accuracy : 1.0
Epoch : 78 completed out of 100, loss : 8.080221176147461, accuracy : 0.99609375
Epoch : 79 completed out of 100, loss : 7.3532915115356445, accuracy : 0.99609375
Epoch : 80 completed out of 100, loss : 1.3384674787521362, accuracy : 1.0
Epoch : 81 completed out of 100, loss : 6.711606025695801, accuracy : 0.99609375
Epoch : 82 completed out of 100, loss : 0.9907960891723633, accuracy : 1.0
Epoch : 83 completed out of 100, loss : 1.1378357410430908, accuracy : 1.0
Epoch : 84 completed out of 100, loss : 7.504663467407227, accuracy : 0.9921875
Epoch : 85 completed out of 100, loss : 1.9658554792404175, accuracy : 1.0
Epoch : 86 completed out of 100, loss : 1.3581955432891846, accuracy : 1.0
Epoch : 87 completed out of 100, loss : 2.964240789413452, accuracy : 1.0
Epoch : 88 completed out of 100, loss : 3.54362154006958, accuracy : 1.0
Epoch : 89 completed out of 100, loss : 1.5963693857192993, accuracy : 1.0
Epoch : 90 completed out of 100, loss : 4.597883224487305, accuracy : 1.0
Epoch : 91 completed out of 100, loss : 1.353342890739441, accuracy : 1.0
Epoch : 92 completed out of 100, loss : 2.763561964035034, accuracy : 1.0
Epoch : 93 completed out of 100, loss : 4.88947057723999, accuracy : 0.99609375
Epoch : 94 completed out of 100, loss : 4.4988112449646, accuracy : 0.99609375
Epoch : 95 completed out of 100, loss : 11.898427963256836, accuracy : 0.99609375
Epoch : 96 completed out of 100, loss : 1.51198410987854, accuracy : 1.0
Epoch : 97 completed out of 100, loss : 0.946499764919281, accuracy : 1.0
Epoch : 98 completed out of 100, loss : 5.954292297363281, accuracy : 0.99609375
Epoch : 99 completed out of 100, loss : 1.6741831302642822, accuracy : 1.0

The test accuracy was 0.9884001016616821, and this is without learning rate decay.

Link in arxiv has extra period at the end

Hey guys,

Thank you for publishing this new dataset.
Not really an issue with the dataset, but the link that you have in the arXiv abstract has an extra period at the end.

Regards

Suggestion: Rename the repository from MNIST to something else

Someone commented about this issue on Reddit (pasted below) and I think you should seriously consider changing the name of the benchmark to something else while it's still early on.

MNIST stands for "Modified National Institute of Standards and Technology", and the "National Institute of Standards and Technology" might not be too happy with their name being used. Call it something else. Especially since it's an entirely new dataset and not a modification/extension of the original NIST dataset.

AlexNet with Triplet loss Benchmark

I used AlexNet for feature extraction and embedding creation with a triplet loss function, and a linear SVM as the classifier.
I got the following accuracies:
Train: 0.9946
Test: 0.8989

Shape of train images returns 55000 although it says 60000. Did I misunderstand something?

When I print the dataset it gives 55000 as the shape. Is there something wrong with what I did? I downloaded the data from GitHub and put it in data/fashion as instructed.

from tensorflow.examples.tutorials.mnist import input_data
fashion_mnist = input_data.read_data_sets('data/fashion', one_hot=True)

Extracting data/fashion\train-images-idx3-ubyte.gz
Extracting data/fashion\train-labels-idx1-ubyte.gz
Extracting data/fashion\t10k-images-idx3-ubyte.gz
Extracting data/fashion\t10k-labels-idx1-ubyte.gz

print('Features of Fashion MNIST dataset')
print('Shape of training set images: ', fashion_mnist.train.images.shape)
print('Shape of training set labels', fashion_mnist.train.labels.shape)

Features of Fashion MNIST dataset
Shape of training set images:  (55000, 784)
Shape of training set labels (55000, 10)
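
The likely explanation (an assumption about this helper, though it matches its documented default) is that read_data_sets holds out 5,000 of the 60,000 training examples as a validation split, so train plus validation still adds up to 60,000:

print('Shape of validation set images:', fashion_mnist.validation.images.shape)  # (5000, 784)

# Passing validation_size=0 keeps all 60,000 examples in the training split.
fashion_mnist = input_data.read_data_sets('data/fashion', one_hot=True, validation_size=0)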

Simple convolutional neural network with 93.43% accuracy on the test set

Hi Han, I evaluated some architectures and parameters. I ended up with an accuracy of 93.43% on the Fashion-MNIST test set. The same network reached an accuracy of 99.43% on MNIST.

https://github.com/cmasch/zalando-fashion-mnist

The architecture in code:

cnn = Sequential()

cnn.add(InputLayer(input_shape=(img_height,img_width,1)))

# Normalization
cnn.add(BatchNormalization())

# Conv + Maxpooling
cnn.add(Convolution2D(64, (4, 4), padding='same', input_shape=(img_height, img_width, channels), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))

# Dropout
cnn.add(Dropout(0.1))

# Conv + Maxpooling
cnn.add(Convolution2D(64, (4, 4), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))

# Dropout
cnn.add(Dropout(0.3))

# Converting 3D feature maps to a 1D feature vector
cnn.add(Flatten())

# Fully Connected Layer
cnn.add(Dense(256, activation='relu'))

# Dropout
cnn.add(Dropout(0.5))

# Fully Connected Layer
cnn.add(Dense(64, activation='relu'))

# Normalization
cnn.add(BatchNormalization())

cnn.add(Dense(num_classes, activation='softmax'))
cnn.compile(loss='categorical_crossentropy',
            optimizer=optimizers.Adam(),
            metrics=['accuracy'])

I tried to keep it simple. Additionally I used augmentation to increase the training data.
Thanks for the great dataset!

Best
Christopher

Benchmark: ResNet18 and Simple Conv Net

Tried a simple 2-layer conv net and resnet18 on MNIST and Fashion-MNIST. Accuracy is as follows:

| Model | MNIST | Fashion-MNIST |
| --- | --- | --- |
| ResNet18 | 0.979 | 0.949 |
| SimpleNet | 0.971 | 0.919 |

Preprocessing

Normalization, random horizontal flip, random vertical flip, random translation, random rotation.

You can find the code here.
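
A sketch of that preprocessing pipeline with torchvision transforms; the rotation/translation ranges and the normalization constants below are placeholders, not the submitter's values:

from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1)),  # random rotation + translation
    transforms.ToTensor(),
    transforms.Normalize((0.286,), (0.353,)),  # placeholder mean/std for Fashion-MNIST
])
# e.g. torchvision.datasets.FashionMNIST('data', train=True, transform=train_transform, download=True)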

Clustering performance

@hanxiao
I wonder what's the clustering performance of the state-of-the-art clustering algorithms on fashion-mnist.
I tested my algorithm and got an accuracy of 0.59 and NMI of 0.63.
Have you collected other clustering results?
Thanks.

docs: Additional measure for benchmarking

Perhaps it would be good if we include not only the architecture used, preprocessing, training accuracy, and test accuracy in the benchmarks, but also include the time it took to train on Fashion v. MNIST. What do you think?

S3 bucket is providing different files from repo

Noticed in this repo, but it happens directly in the browser for me as well.

MD5 hashes

manually downloaded from repo S3 links

7edbbf1fc824916c442268ac4dc845cd  - ./t10k-images-idx3-ubyte.gz.md5
b9859d5936603c782c6eb8dd14198360  - ./t10k-labels-idx1-ubyte.gz.md5
053aba987904a004d52cb333753041a3  - ./train-images-idx3-ubyte.gz.md5
7864864ad9592b0ffcc53c942eb67b24  - ./train-labels-idx1-ubyte.gz.md5

downloaded from repo directly

bef4ecab320f06d8554ea6380940ec79  - ./t10k-images-idx3-ubyte.gz.md5
bb300cfdad3c16e7a12a480ee83cd310  - ./t10k-labels-idx1-ubyte.gz.md5 
8d4fb7e6c68d591d4c3dfef9ec88bf0d  - ./train-images-idx3-ubyte.gz.md5
25c81989df183df01b3e8a0aad5dffbe  - ./train-labels-idx1-ubyte.gz.md5
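
A small Python sketch to reproduce these checks locally against the checksums listed in the Get the Data table:

import hashlib

EXPECTED = {
    'train-images-idx3-ubyte.gz': '8d4fb7e6c68d591d4c3dfef9ec88bf0d',
    'train-labels-idx1-ubyte.gz': '25c81989df183df01b3e8a0aad5dffbe',
    't10k-images-idx3-ubyte.gz': 'bef4ecab320f06d8554ea6380940ec79',
    't10k-labels-idx1-ubyte.gz': 'bb300cfdad3c16e7a12a480ee83cd310',
}

for name, expected in EXPECTED.items():
    with open('data/fashion/' + name, 'rb') as f:
        digest = hashlib.md5(f.read()).hexdigest()
    print(name, 'OK' if digest == expected else 'MISMATCH: ' + digest)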

Plain 9 layers CNN for Benchmark

I have several experimental results with different activation functions and learning rates, using a CNN architecture like this: C(3,32)-C(3,32)-P2-C(3,64)-C(3,64)-P2-FC64-FC64-S10

| Activation | Learning rate | MNIST | Fashion-MNIST |
| --- | --- | --- | --- |
| ReLU | 0.01 | 0.9874 | 0.9883 |
| ReLU | 0.001 | 0.9388 | 0.9368 |
| SELU | 0.01 | 0.9871 | 0.9819 |
| SELU | 0.001 | 0.9490 | 0.8202 |

For now, my best results are 98.74% and 98.83% for MNIST and Fashion-MNIST, respectively.
The train-val curves can be found in my repository: https://github.com/JMingKuo/fashion-mnist

Benchmark: ConvNet 0.932 on Fashion-MNIST, 0.994 on MNIST

Preprocessing
Normalization

Architecture in Keras

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 1, 28, 28)         0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 1, 28, 28)         112       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 64, 24, 24)        1664      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 64, 12, 12)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 512, 8, 8)         819712    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 512, 4, 4)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1048704   
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650       
=================================================================
Total params: 1,879,098
Trainable params: 1,879,042
Non-trainable params: 56

Training Time
12+ mins

Accuracy
Fashion: 0.932
MNIST: 0.994

Notebook
https://github.com/Xfan1025/Fashion-MNIST/blob/master/fashion-mnist.ipynb

Is it possible to get the colored version?

I wanted to ask if you could please also publish the colored version of the dataset, to be precise after processing step 5 (extending) and before step 6 (negating). This could, for example, be helpful for image translation tasks.
