
quickdraw-dataset's Introduction

The Quick, Draw! Dataset

preview

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on quickdraw.withgoogle.com/data.

We're sharing them here for developers, researchers, and artists to explore, study, and learn from. If you create something with this dataset, please let us know by e-mail or at A.I. Experiments.

We have also released a tutorial and model for training your own drawing classifier on tensorflow.org.

Please keep in mind that while this collection of drawings was individually moderated, it may still contain inappropriate content.

Content

The raw moderated dataset

The raw data is available as ndjson files separated by category, in the following format:

Key Type Description
key_id 64-bit unsigned integer A unique identifier across all drawings.
word string Category the player was prompted to draw.
recognized boolean Whether the word was recognized by the game.
timestamp datetime When the drawing was created.
countrycode string A two letter country code (ISO 3166-1 alpha-2) of where the player was located.
drawing string A JSON array representing the vector drawing.

Each line contains one drawing. Here's an example of a single drawing:

  { 
    "key_id":"5891796615823360",
    "word":"nose",
    "countrycode":"AE",
    "timestamp":"2017-03-01 20:41:36.70725 UTC",
    "recognized":true,
    "drawing":[[[129,128,129,129,130,130,131,132,132,133,133,133,133,...]]]
  }

The format of the drawing array is as follows:

[ 
  [  // First stroke 
    [x0, x1, x2, x3, ...],
    [y0, y1, y2, y3, ...],
    [t0, t1, t2, t3, ...]
  ],
  [  // Second stroke
    [x0, x1, x2, x3, ...],
    [y0, y1, y2, y3, ...],
    [t0, t1, t2, t3, ...]
  ],
  ... // Additional strokes
]

Where x and y are the pixel coordinates, and t is the time in milliseconds since the first point. x and y are real-valued while t is an integer. The raw drawings can have vastly different bounding boxes and number of points due to the different devices used for display and input.
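
For example, a minimal sketch of reading the raw format in Python (the file name cat.ndjson is illustrative, assuming a locally downloaded category file):

import json

def iter_drawings(path):
    # Each line of an ndjson file is one standalone JSON drawing record.
    with open(path) as f:
        for line in f:
            yield json.loads(line)

for drawing in iter_drawings("cat.ndjson"):
    for stroke in drawing["drawing"]:
        xs, ys, ts = stroke  # per-stroke x, y and timing arrays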

Preprocessed dataset

We've preprocessed and split the dataset into different files and formats to make it faster and easier to download and explore.

Simplified Drawing files (.ndjson)

We've simplified the vectors, removed the timing information, and positioned and scaled the data into a 256x256 region. The data is exported in ndjson format with the same metadata as the raw format. The simplification process was (a rough sketch in Python follows the list):

  1. Align the drawing to the top-left corner, to have minimum values of 0.
  2. Uniformly scale the drawing, to have a maximum value of 255.
  3. Resample all strokes with a 1 pixel spacing.
  4. Simplify all strokes using the Ramer–Douglas–Peucker algorithm with an epsilon value of 2.0.
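
A rough sketch of steps 1, 2 and 4 in Python, assuming the third-party rdp package for Ramer–Douglas–Peucker (illustrative only, not the exact code used to produce the dataset; the 1-pixel resampling of step 3 is omitted):

import numpy as np
from rdp import rdp

def simplify_drawing(strokes, epsilon=2.0):
    # strokes: list of [xs, ys] pairs, timing already removed
    points = np.concatenate([np.array(s, dtype=float).T for s in strokes])
    mins = points.min(axis=0)                      # step 1: align to (0, 0)
    scale = max((points - mins).max(), 1) / 255.0  # step 2: uniform scale to 0..255
    simplified = []
    for s in strokes:
        xy = (np.array(s, dtype=float).T - mins) / scale
        simplified.append(rdp(xy, epsilon=epsilon))  # step 4: RDP, epsilon 2.0
    return simplified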

There is an example in examples/nodejs/simplified-parser.js showing how to read ndjson files in NodeJS.
Additionally, the examples/nodejs/ndjson.md document details a set of command-line tools that can help explore subsets of these quite large files.

Binary files (.bin)

The simplified drawings and metadata are also available in a custom binary format for efficient compression and loading.

There is an example in examples/binary_file_parser.py showing how to load the binary files in Python.
There is also an example in examples/nodejs/binary-parser.js showing how to read the binary files in NodeJS.
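
For reference, the per-drawing layout that parser reads looks roughly like this (a condensed sketch along the lines of examples/binary_file_parser.py; see that file for the authoritative version):

from struct import unpack

def unpack_drawing(f):
    # Fixed-size header, then one length-prefixed stroke at a time.
    key_id, = unpack('Q', f.read(8))
    countrycode, = unpack('2s', f.read(2))
    recognized, = unpack('b', f.read(1))
    timestamp, = unpack('I', f.read(4))
    n_strokes, = unpack('H', f.read(2))
    strokes = []
    for _ in range(n_strokes):
        n_points, = unpack('H', f.read(2))
        fmt = str(n_points) + 'B'  # simplified coordinates fit in one byte
        x = unpack(fmt, f.read(n_points))
        y = unpack(fmt, f.read(n_points))
        strokes.append((x, y))
    return {'key_id': key_id, 'countrycode': countrycode,
            'recognized': recognized, 'timestamp': timestamp,
            'image': strokes}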

Numpy bitmaps (.npy)

All the simplified drawings have been rendered into a 28x28 grayscale bitmap in numpy .npy format. The files can be loaded with np.load(). These images were generated from the simplified data, but are aligned to the center of the drawing's bounding box rather than the top-left corner. See here for the code snippet used for generation.
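
For example (the file name cat.npy is illustrative, assuming a local download):

import numpy as np

bitmaps = np.load("cat.npy")        # shape (num_drawings, 784), dtype uint8
first = bitmaps[0].reshape(28, 28)  # each row is one flattened 28x28 image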

Get the data

The dataset is available on Google Cloud Storage as ndjson files separated by category. See the list of files in the Cloud Console, or read more about accessing public datasets using other methods. As an example, to download all simplified drawings, run the command gsutil -m cp 'gs://quickdraw_dataset/full/simplified/*.ndjson' .

Full dataset separated by categories

Sketch-RNN QuickDraw Dataset

This data is also used for training the Sketch-RNN model. An open-source TensorFlow implementation of this model is available in the Magenta Project (link to GitHub repo). You can also read more about this model in this Google Research blog post. The data is stored in compressed .npz files, in a format suitable for inputs into a recurrent neural network.

In this dataset, 75K samples (70K training, 2.5K validation, 2.5K test) have been randomly selected from each category and processed with RDP line simplification with an epsilon parameter of 2.0. Each category is stored in its own .npz file, for example, cat.npz.

We have also provided the full data for each category, if you want to use more than 70K training examples. These are stored with the .full.npz extension.

Note: In Python 3, load the .npz files using np.load(data_filepath, encoding='latin1', allow_pickle=True).
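
A minimal loading sketch (assuming a local cat.npz; the train/valid/test keys follow the Sketch-RNN convention):

import numpy as np

data = np.load("cat.npz", encoding='latin1', allow_pickle=True)
train, valid, test = data['train'], data['valid'], data['test']
# Each element is an (N, 3) array of (dx, dy, pen_lifted) rows -- the
# stroke-3 offset format consumed by the Sketch-RNN model.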

Instructions for converting raw ndjson files to this npz format are available in this notebook.

Projects using the dataset

Here are some projects and experiments that are using or featuring the dataset in interesting ways. Got something to add? Let us know!

Creative and artistic projects

Data analyses

Papers

Guides & Tutorials

Code and tools

Changes

May 25, 2017: Updated Sketch-RNN QuickDraw dataset, created .full.npz complementary sets.

License

This data is made available by Google, Inc. under the Creative Commons Attribution 4.0 International license.

Dataset Metadata

The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.

property value
name The Quick, Draw! Dataset
alternateName Quick Draw Dataset
alternateName quickdraw-dataset
url
sameAs https://github.com/googlecreativelab/quickdraw-dataset
description The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game "Quick, Draw!". The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.\n \n Example drawings: ![preview](https://raw.githubusercontent.com/googlecreativelab/quickdraw-dataset/master/preview.jpg)
provider
property value
name Google
sameAs https://en.wikipedia.org/wiki/Google
license
property value
name CC BY 4.0
url

quickdraw-dataset's People

Contributors

akshaybahadur21, chrisgorgo, dalessandroj, enjalot, halfdanj, hardmaru, keshavgbpecdelhi, lurch, mrayinteractive, ndri, nick-jonas, sanjaymahto, seong954t, sorryusernameisalreadytaken, talhakabakus, tianyk


quickdraw-dataset's Issues

How to transform .npz to photograph

Hi,
Our team wants to use the .npz dataset for other research, but despite many attempts we can't convert the .npz numpy arrays to an image format like .jpg or .png. We see that the shape of the array is (28, 3), so we can't get back to an RGB image.
We read the Sketch-RNN QuickDraw dataset paper, but still don't know how to convert them.
Can you help us figure it out?

Best wishes
Hans Yang
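
Each row of those (N, 3) arrays is in the stroke-3 format used by Sketch-RNN: (dx, dy, pen_lifted). A hedged sketch that converts one sample to absolute coordinates and plots it, under that assumption:

import numpy as np
import matplotlib.pyplot as plt

def plot_stroke3(sample):
    # Accumulate the (dx, dy) offsets into absolute points, then split
    # the polyline wherever the pen was lifted (third column == 1).
    xy = np.cumsum(sample[:, :2], axis=0)
    start = 0
    for i in np.where(sample[:, 2] == 1)[0]:
        plt.plot(xy[start:i + 1, 0], -xy[start:i + 1, 1])
        start = i + 1
    plt.axis('equal')
    plt.show()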

Convert saved drawing images/files (png/jpg) to same numpy (npy) bitmaps for prediction

I have been scratching my head for over 5 days now, trying various models and code repos, and still have not been able to make it work. The model trains well and evals well, but I am failing at actual predictions.
Instead of models based on drawing strokes, I have been playing with models that use actual drawing images for prediction (like an image classifier), and most of these models use the numpy bitmaps dataset (npy files).

Everything is well and good except the part where I feed the model a drawing from a saved image file (since most of these articles or code repos fed it via canvas, JS, or Android). I tried to replicate their prediction code (mainly the image processing) as much as I can in Python, but the predictions are still way off.

Here is my image processing and prediction code:

from os import walk  # needed for walk(mypath) below; was missing from this snippet
from PIL import Image
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
from random import randint
from quickdraw import QuickDrawData  # assumed source of `qd` used below
from scipy.misc.pilutil import imsave, imread, imresize  # removed in recent SciPy; imageio/Pillow are modern replacements
%matplotlib inline

qd = QuickDrawData()  # assumed setup for qd.get_drawing() below

clock = qd.get_drawing("circle")
apple = clock
apple.image.save("apple.png")


mypath = "data/"
txt_name_list = []
for (dirpath, dirnames, filenames) in walk(mypath):
    if filenames != '.DS_Store':  # note: compares a list to a string, so this is always True
        txt_name_list.extend(filenames)
        break
    

def adjust_gamma(image, gamma=1.5):
    invGamma = 1.0 / gamma
    table = np.array([((i / 255.0) ** invGamma) * 255
                      for i in np.arange(0, 256)]).astype("uint8")

    return cv.LUT(image, table)


def preprocess(img):
    # for sketch & not canvas drawings use the following:

    gray = cv.bilateralFilter(img, 9, 75, 75)
    #
    gray = cv.erode(gray, None, iterations=1)
    #
    gray = adjust_gamma(gray, 1.1)
    #return gray

    th3 = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,cv.THRESH_BINARY_INV, 11, 2)
    #th3 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,cv.THRESH_BINARY_INV, 11, 2)
    return th3
  
  
  
#img = apple.image.convert("L")

#imgData = request.get_data()
#convertImage(imgData)
print("debug")

x = imread('apple.png', mode='L')

x = preprocess(x)

#x = cv.bitwise_not(x)



x = imresize(x, (32, 32))

x = x.astype('float32')
x /= 255

x = x.reshape(1, 32, 32, 1)

print(txt_name_list)
#print(x)

out = model.predict(x)
#print(out)
print(np.argmax(out, axis=1))
index = np.array(np.argmax(out, axis=1))
index = index[0]

print(txt_name_list[index])

plt.imshow(x.squeeze()) 

There is quite a difference between how the image looks in the numpy dataset and how it looks after I process it.

[screenshots: the numpy dataset image vs. the image after my processing]

Here is my full model:

from __future__ import print_function
import  numpy  as  np
import matplotlib.pyplot as plt
from  sklearn.model_selection  import train_test_split
from os import walk, getcwd
import h5py
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
import cv2 as cv
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D, BatchNormalization, AveragePooling2D
from keras import backend as K
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from keras.optimizers import SGD
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping,ModelCheckpoint
from sklearn.metrics import confusion_matrix

#For Multi GPU
from keras.utils import multi_gpu_model
from keras import metrics

batch_size = 128

epochs = 40

img_rows, img_cols = 28, 28

mypath = "data/"
txt_name_list = []

#slice_train = 30500
slice_train = 10000

def top_3_acc(y_true, y_pred):
    return metrics.top_k_categorical_accuracy(y_true, y_pred, k=3)

def readData():
    x_train = []
    x_test = []
    y_train = []
    y_test = []
    xtotal = []
    ytotal = []
    x_val = []
    y_val = []

    for (dirpath, dirnames, filenames) in walk(mypath):
        if filenames != '.DS_Store':
            txt_name_list.extend(filenames)
            break

    #print(mypath)
    i=0
    classescount = 0

    for txt_name in txt_name_list:
        txt_path = mypath + txt_name
        x = np.load(txt_path)
        print(txt_name)
        print(i)
        classescount += 1
        x = x.astype('float32') / 255.  # scale images (note: lenet() divides by 255 again below)
        y = [i] * len(x)
        x = x[:slice_train]
        y = y[:slice_train]

        if i != 0:
            xtotal = np.concatenate((x, xtotal), axis=0)
            ytotal = np.concatenate((y, ytotal), axis=0)
        else:
            xtotal = x
            ytotal = y
        i += 1

    print(classescount)
    print("xshape = ", xtotal.shape)
    print("yshape = ", ytotal.shape)
    x_train, x_test, y_train, y_test = train_test_split(xtotal, ytotal, test_size=0.3, random_state=42)
    x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.2, random_state=1)

    return x_train, x_val, x_test, y_train, y_val, y_test, classescount


def lenet(x_train, x_val, x_test, y_train, y_val, y_test, num_classes):
    if K.image_data_format() == 'channels_first':
        x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
        x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
        x_val = x_val.reshape(x_val.shape[0], 1, img_rows, img_cols)
        input_shape = (1, img_rows, img_cols)
    else:
        x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
        x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
        x_val = x_val.reshape(x_val.shape[0], img_rows, img_cols, 1)
        input_shape = (img_rows, img_cols, 1)

    # more reshaping
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_val = x_val.astype('float32')
    x_train /= 255
    x_test /= 255
    x_val /= 255

    # convert class vectors
    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes)
    y_val = keras.utils.to_categorical(y_val, num_classes)

    x_train = np.pad(x_train, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')
    x_val = np.pad(x_val, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')
    x_test = np.pad(x_test, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_val.shape[0], 'validation samples')
    print(x_test.shape[0], 'test samples')

    print(y_train.shape)

    print(input_shape)

    model = Sequential()

    model.add(Conv2D(filters=6, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
    model.add(AveragePooling2D())

    model.add(Conv2D(filters=16, kernel_size=(3, 3), activation='relu'))
    model.add(AveragePooling2D())

    model.add(Flatten())

    model.add(Dense(units=120, activation='relu'))

    model.add(Dense(units=84, activation='relu'))

    model.add(Dense(units=num_classes, activation='softmax'))

    filepath = "saved/weightslenet.{epoch:02d}.h5"
    ES = EarlyStopping(patience=5)
    check = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=False, mode='max')

    #model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy', top_3_acc])
    #Trying Multi GPU
    #model = multi_gpu_model(model, gpus=2)
    #model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    
    model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1, validation_data=(x_val, y_val), callbacks=[ES, check])
    #model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1, validation_data=(x_val, y_val), callbacks=[ES, check])

    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

    model.save('cnnOld2.h5')
    print("Saved model to disk")
    #
    # cm = metrics.confusion_matrix(test_batch.classes, y_pred)
    # # or
    # # cm = np.array([[1401,    0],[1112, 0]])
    #
    # plt.imshow(cm, cmap=plt.cm.Blues)
    # plt.xlabel("Predicted labels")
    # plt.ylabel("True labels")
    # plt.xticks([], [])
    # plt.yticks([], [])
    # plt.title('Confusion matrix ')
    # plt.colorbar()
    # plt.show()
    print(y_test)

    loaded_model = keras.models.load_model('cnnOld2.h5', custom_objects={"top_3_acc": top_3_acc})
    print("test")
    #y_pred = loaded_model.predict_on_batch(x_test)
    #score = loaded_model.evaluate(x_test, y_test, verbose=0)

    y_pred = loaded_model.predict(x_test)
    print(y_pred)

    indexes = np.argmax(y_pred, axis=1)
    i=0
    for y in y_pred:
        y[y<1000]=0
        # print("allzero",y)
        y[indexes[i]] = 1
        i+=1

    cm = confusion_matrix(
        y_test.argmax(axis=1), y_pred.argmax(axis=1))
    acc = accuracy_score(y_test.argmax(axis=1), y_pred.argmax(axis=1), normalize=True, sample_weight=None)
    cr = classification_report(y_test.argmax(axis=1), y_pred.argmax(axis=1))
    print(cm)
    print(acc)
    print(cr)

def main():
    x_train, x_val, x_test, y_train, y_val, y_test, num_classes = readData()
    lenet(x_train, x_val, x_test, y_train, y_val, y_test, num_classes)

if __name__ == '__main__':
    main()

Looking for a resource to help me export the model for TensorFlow model serving

Based on the MNIST serving tutorial, I'm trying to export a model trained on the Quick, Draw! dataset to load it into the tensorflow-model-server and make a client request that returns the prediction for the input data.
But I'm struggling to get the export right in combination with the correct client request to the server.

Are there any resources available that might help me get the model export right for the model server?
The MNIST example and the TensorFlow documentation are not sufficient for me.

Thanks in advance!

Inappropriate content in Dog data

Hey, I don't know if this is the right place to report this, but I randomly spotted some, uh, inappropriate content in the moderated json dataset for "Dog":

Entry 3230 spells out "F*** YOU"
{"word":"dog","countrycode":"IN","timestamp":"2017-03-14 13:45:59.72155 UTC","recognized":false,"key_id":"5976946842271744","drawing":[[[0,35],[40,115]],[[0,7,57],[48,38,6]],[[24,45,59,61],[63,55,45,49]],[[71,86,98,102,108,107],[71,81,77,68,34,16]],[[129,106,101,104,111,139,157,162],[19,32,46,58,66,76,77,72]],[[180,215,207,205,212,225,236,255,249,223,220,245],[0,93,69,47,36,27,24,26,37,56,62,79]],[[107,100,51,45],[145,159,224,237]],[[37,70],[155,183]],[[125,116,115,119,127,144,152,161,162,162,154,132,125],[189,209,219,230,236,238,232,223,207,195,182,178,190]],[[190,197,201,215,225,237,237],[182,215,219,220,211,180,168]]]}

In portuguese clown expected crown images

Preconditions:
The game displayed "Draw Palhaço" (clown).
Step by step:
1 - Draw a clown
2 - Finish the 6 drawings
3 - Click on the clown drawing and check the examples drawn by other people.

Current behavior:
The examples are crown ("coroa" in Portuguese) images.

Expected behavior:
The examples are clown images.
[screenshots: the "Palhaço" (clown) drawing prompt, and the example drawings showing "coroas" (crowns)]

Pretrained model

Hello, are you also considering releasing the trained neural network used for recognition of the images? O:-) The Sketch-RNN is a generative model, from what I understand from a quick look.

Unable to recognize camera data?

I draw a T-shirt on paper and capture the corresponding camera data stream, but it can't be recognized.
Is there any way to convert camera data to the drawing array?
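
One possible direction (a rough sketch, not an official tool): binarize the photo with OpenCV and trace its contours as stroke arrays. Note that contours follow the outlines of the ink rather than the pen's centerline, so this is only an approximation:

import cv2

def photo_to_strokes(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu threshold, inverted so the pen lines become foreground
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    strokes = []
    for c in contours:
        pts = c.reshape(-1, 2)  # (N, 2) array of x, y points
        strokes.append([pts[:, 0].tolist(), pts[:, 1].tolist()])
    return strokes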

Symbol on coffee cup

The coffee cup ndjson entry with key_id 6723059434127360 has a symbol on it that may be the peaceful swastika, or could be the Nazi Hakenkreuz.

[screenshot of the drawing]

Screen should autoscroll

When clicking a drawing at the bottom of the data page to see the detail, the detail pop-up bubble is rendered below the screen. The page should either auto-scroll or render the bubble above the clicked drawing when it is near the bottom of the page.

Translation

In Portuguese there is a mix-up between the words "clown" and "crown".

Can a normal doodle image be used ?

I tried searching on this topic quite a lot but could not find any information. Most implementations of Quick Draw seem to use stroke data rather than the doodle images themselves, which means that for prediction/inference you need to provide the stroke data of your drawing rather than an image of it, at least from what I have concluded so far.

So I wanted to ask and confirm: is it correct that you cannot use an image of your drawing/doodle to make the same kind of categorical prediction as Quick Draw?

Any references or pointers on this would be much appreciated.

Let me finish my doodle

Users should be allowed to finish the doodle in the given time, not be cut off immediately after the drawing is recognized... this way you end up with unfinished doodles in the database, which is useless.

Hi there, can anyone help me figure out why I cannot open my csv file in Jupyter?

import pandas

df = pandas.read_csv('inventory_INVENTORY_TRANSACTION_.csv')

print(df)

ParserError Traceback (most recent call last)
in
1 import pandas
2
----> 3 df= pandas.read_csv('inventory_INVENTORY_TRANSACTION_.csv')
4
5 print(df)

~\anaconda3\lib\site-packages\pandas\io\parsers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
608 kwds.update(kwds_defaults)
609
--> 610 return _read(filepath_or_buffer, kwds)
611
612

~\anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
466
467 with parser:
--> 468 return parser.read(nrows)
469
470

~\anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
1055 def read(self, nrows=None):
1056 nrows = validate_integer("nrows", nrows)
-> 1057 index, columns, col_dict = self._engine.read(nrows)
1058
1059 if index is None:

~\anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
2059 def read(self, nrows=None):
2060 try:
-> 2061 data = self._reader.read(nrows)
2062 except StopIteration:
2063 if self._first_chunk:

pandas_libs\parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas_libs\parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas_libs\parsers.pyx in pandas._libs.parsers.raise_parser_error()

ParserError: Error tokenizing data. C error: Expected 20 fields in line 46, saw 33

import pandas

df = pandas.read_csv('C:\Users\Marcos\OneDrive\Desktop\DATA\inventory_INVENTORY_TRANSACTION_.csv')

print(df)

NameError Traceback (most recent call last)
in
3 # df= pandas.read_csv('C:\Users\Marcos\OneDrive\Desktop\DATA\inventory_INVENTORY_TRANSACTION_.csv')
4
----> 5 print(df)

NameError: name 'df' is not defined

import pandas

df = pandas.read_csv("inventory_INVENTORY_TRANSACTION_.csv")

print(df)

ParserError: Error tokenizing data. C error: Expected 20 fields in line 46, saw 33

import pandas

df = pandas.read_csv("inventory_INVENTORY_TRANSACTION_.csv")

df.head()

ParserError: Error tokenizing data. C error: Expected 20 fields in line 46, saw 33

Modern numpy.load does not accept these .npz files

What is a current, up-to-date code sample that can load the .npz files in this distribution?
This no longer works in Python 3:

import numpy as np
x = np.load(file_path, allow_pickle=True, encoding='latin1')
print(x.keys())

Here is the error for 'cat.npz':

x = np.load(file, allow_pickle=True)
Traceback (most recent call last):
File "/Users/l0n008k/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 454, in load
return pickle.load(fid, **pickle_kwargs)
_pickle.UnpicklingError: invalid load key, '\x0a'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "/Users/l0n008k/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 457, in load
"Failed to interpret file %s as a pickle" % repr(file))
OSError: Failed to interpret file '/Users/l0n008k/Downloads/cat.npz' as a pickle

Error with Placeholder_1

I tried to run the script https://github.com/tensorflow/magenta-demos/blob/master/jupyter-notebooks/Sketch_RNN.ipynb but with a different dataset (bread.npz) and its checkpoints, and I have the following problem:
[screenshot: https://user-images.githubusercontent.com/38189240/69523732-e2890c00-0f64-11ea-9e81-62b97e725735.png]

Here is the code (https://github.com/Pauladds/Rnn_sketch); I am running "blob_sketch_RNN-bread" but I don't know what my problem is...

Code to process raw dataset to simplified dataset

The simplification process is described as follows.

  1. Align the drawing to the top-left corner, to have minimum values of 0.
  2. Uniformly scale the drawing, to have a maximum value of 255.
  3. Resample all strokes with a 1 pixel spacing.
  4. Simplify all strokes using the Ramer–Douglas–Peucker algorithm with an epsilon value of 2.0.

Where can I find the source code that implements the above steps for simplifying the raw dataset? Thanks.

How are authors' rights handled?

Wouldn't Quick, Draw! either need to make it clear to its users that their drawings were going to be open-sourced, or get permission from all 50 million users before open-sourcing their hand-made drawings?

I am curious about the implications of Authors' Rights in such projects. No matter how small and simple a drawing, it is still a unique, hand-created artefact, and therefore such rights surely apply?

How is this handled by Google?

How can I get information like "countrycode" from the numpy_bitmap dataset?

Hello,

I wonder how I can get information like "word", "countrycode", and "recognized" from the numpy_bitmap data. I use np.load() to get the data, and the only information I get is the image itself. I wonder how I can get more information.

I tried the binary data. It contains the information I need, but I want the image in 28x28 format. I tried to use the vector_to_raster() function but failed, since my image data looks like:

[((0, 31, 70, 97, 121, 195, 230), (46, 38, 9, 0, 0, 29, 32)),
((3, 24, 98, 118, 157, 181, 197, 212, 255),
(46, 45, 72, 83, 88, 77, 54, 42, 30)),
((120, 109, 99, 92, 91, 105, 116, 129, 142, 150, 155, 156, 146, 109),
(1, 2, 11, 25, 41, 66, 75, 79, 77, 66, 54, 28, 15, 1)),
((109, 104, 103, 113, 122, 138, 146, 150), (8, 13, 32, 51, 57, 57, 52, 44))]

Thank you so much for your help! :)

Time until first point?

At the moment, the first point of the first stroke has t = 0. It would be super interesting to also know how many milliseconds after the player was asked to draw something they actually made their first stroke! Is there any way of incorporating this info in any future releases of the data set?

Broken links

It looks like all the data links in the readme are broken. Thank you for this amazing work. It would be great if this issue could be fixed. Thank you.

Heart symbol

What do you think about adding a heart symbol category to the app? I believe it would be a good candidate for QuickDraw as it is fairly language-agnostic, very recognizable, and can be easily drawn with a single stroke.

On the same topic, is this the best place to propose new symbols?

.numpy to bitmap format on Mac?

Okay, since the batch processing of .ndjson to SVG looks like it's going to be problematic, how about suggestions for batch converting the numpy format images into plain old vanilla bitmaps on Mac OS X?
Ideally through a nice simple app, since my Python familiarity is zilch.

Quick draw

How do I make a drawing to train the Google AI to recognize the drawings??????? 🤷‍♀️

  • your very confused user of the Quick Draw website

preprocessing new png/jpg image to predict on deep learning model

When I load the npy data provided by Google Quick, Draw!, the prediction works fine on my deep learning model.

import numpy as np  # np is used below but was not imported in this snippet

data_url = '/content/gdrive/My Drive/Colab Notebooks/img/numpy_bitmap/sun.npy'
example_cat = np.load(data_url)

cat_len = example_cat.shape[0] # number of total image

start_num = 11 

example = example_cat[start_num,:784+start_num]

plt.imshow(example.reshape(28, 28))
example = example.reshape(28,28,1).astype('float32')
example /=255.0
print(example)
import matplotlib.pyplot as plt
from random import randint
%matplotlib inline  

pred = model.predict(np.expand_dims(example, axis=0))[0]
ind = (-pred).argsort()[:5]
print(ind)
latex = [categories_dict[x] for x in ind]
plt.imshow(example.squeeze()) 
print(latex)

Somehow the image file won't upload here, so I attach the result of the above code via this link: https://s3.us-west-2.amazonaws.com/secure.notion-static.com/57460690-0ad2-42c1-9d5a-cc9d756534ea/Untitled.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20201104%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20201104T012109Z&X-Amz-Expires=86400&X-Amz-Signature=cc6febeea2aee67315b8c7d353e7e378677ac965707a8ac0eb7bb6ecfb8a5f0b&X-Amz-SignedHeaders=host&response-content-disposition=filename%20%3D%22Untitled.png%22

Then I captured the exact same image and saved it as a png file. I loaded the file again as a NumPy array and preprocessed it so that I could feed it into my model to predict which category it belongs to. Somehow it does not work and returns a completely different prediction. This is happening for every new png image I try to work with.

import cv2  # cv2 is used below but was not imported in this snippet

im = cv2.imread('/content/gdrive/My Drive/Colab Notebooks/sun2.PNG', cv2.IMREAD_GRAYSCALE)
resize_img = cv2.resize(im, (28,28), interpolation = cv2.INTER_AREA) 
img_vector = np.asarray(resize_img, dtype="uint8")
img = img_vector.reshape(28,28,1).astype('float32')
import matplotlib.pyplot as plt
from random import randint
%matplotlib inline  

img /= 255.0
pred = model.predict(np.expand_dims(img, axis=0))[0]
ind = (-pred).argsort()[:5]
print(ind)
latex = [categories_dict[x] for x in ind]
plt.imshow(img.squeeze()) 
print(latex)

again, I attach the result for this code as a link: https://s3.us-west-2.amazonaws.com/secure.notion-static.com/a64bd7ce-f72f-4c67-a9f3-0689421ef10e/Untitled.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20201104%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20201104T012838Z&X-Amz-Expires=86400&X-Amz-Signature=1cc52972a8a50202efa90618acd41c8be483f188f5b70d092ef63c6bc5ce8a18&X-Amz-SignedHeaders=host&response-content-disposition=filename%20%3D%22Untitled.png%22

Below is how I preprocessed the data and how I trained my model.

# Reshape and normalize
x_train = x_train.reshape(x_train.shape[0], image_size, image_size, 1).astype('float32')
x_test = x_test.reshape(x_test.shape[0], image_size, image_size, 1).astype('float32')
#image_size is 28

x_train /= 255.0
x_test /= 255.0

# Convert class vectors to class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
def cnn_model():
    # create model
    model = Sequential()
    model.add(Conv2D(30, (5, 5), input_shape=x_train.shape[1:], activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(15, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

The training process and evaluation results are below.

Epoch 1/100
22356/22356 [==============================] - 1323s 59ms/step - loss: 2.7714 - accuracy: 0.3795 - val_loss: 2.2759 - val_accuracy: 0.4751
Epoch 2/100
22356/22356 [==============================] - 1339s 60ms/step - loss: 2.3925 - accuracy: 0.4481 - val_loss: 2.1659 - val_accuracy: 0.4948
Epoch 3/100
22356/22356 [==============================] - 1323s 59ms/step - loss: 2.3365 - accuracy: 0.4588 - val_loss: 2.1333 - val_accuracy: 0.5015
Epoch 4/100
22356/22356 [==============================] - 1303s 58ms/step - loss: 2.3131 - accuracy: 0.4630 - val_loss: 2.1396 - val_accuracy: 0.4996
Epoch 5/100
22356/22356 [==============================] - 1262s 56ms/step - loss: 2.3013 - accuracy: 0.4655 - val_loss: 2.1199 - val_accuracy: 0.5026
Epoch 6/100
22356/22356 [==============================] - 1326s 59ms/step - loss: 2.2932 - accuracy: 0.4663 - val_loss: 2.1190 - val_accuracy: 0.5046
Epoch 7/100
22356/22356 [==============================] - 1269s 57ms/step - loss: 2.2870 - accuracy: 0.4676 - val_loss: 2.1067 - val_accuracy: 0.5053
Epoch 8/100
22356/22356 [==============================] - 1299s 58ms/step - loss: 2.2844 - accuracy: 0.4678 - val_loss: 2.1090 - val_accuracy: 0.5053
Epoch 9/100
22356/22356 [==============================] - 1288s 58ms/step - loss: 2.2828 - accuracy: 0.4683 - val_loss: 2.1147 - val_accuracy: 0.5045
Epoch 10/100
22356/22356 [==============================] - 1289s 58ms/step - loss: 2.2797 - accuracy: 0.4683 - val_loss: 2.0907 - val_accuracy: 0.5073
Epoch 11/100
22356/22356 [==============================] - 1280s 57ms/step - loss: 2.2784 - accuracy: 0.4690 - val_loss: 2.1087 - val_accuracy: 0.5058
Epoch 12/100
22356/22356 [==============================] - 1262s 56ms/step - loss: 2.2787 - accuracy: 0.4688 - val_loss: 2.1078 - val_accuracy: 0.5035
Epoch 13/100
22356/22356 [==============================] - 1335s 60ms/step - loss: 2.2773 - accuracy: 0.4690 - val_loss: 2.1078 - val_accuracy: 0.5049
Epoch 14/100
22356/22356 [==============================] - 1292s 58ms/step - loss: 2.2789 - accuracy: 0.4687 - val_loss: 2.1239 - val_accuracy: 0.5014
Epoch 15/100
22356/22356 [==============================] - 1277s 57ms/step - loss: 2.2824 - accuracy: 0.4676 - val_loss: 2.1220 - val_accuracy: 0.5016
Epoch 16/100
22356/22356 [==============================] - 1291s 58ms/step - loss: 2.2816 - accuracy: 0.4682 - val_loss: 2.1093 - val_accuracy: 0.5058
CPU times: user 18h 13min 31s, sys: 4h 19min 8s, total: 22h 32min 40s
Wall time: 5h 46min 14s
19407/19407 [==============================] - 101s 5ms/step - loss: 2.1135 - accuracy: 0.5047
Test accuracy: 50.47%

I am assuming that something is wrong with how I am preprocessing the data, but I cannot find why this is happening or what I am doing wrong. I would be glad if you could take a look at what needs to be done to my code or data. Thank you for open-sourcing this amazing project.

Cull scribbles

This is true for many images, but people get frustrated and then scribble out their drawing. (try "tiger" for example.) It would be nice to cull out the ones where the artist said "to heck with it" and scratched out their drawing.


How is the stroke data represented (square examples)?

I've been scouring this GitHub repo (reading its source code) and reading the following documentation: https://quickdraw.readthedocs.io/en/latest/api.html#

But I don't understand the structure of some_image.strokes.

I've filtered all squares to have no_of_strokes == 4. Looking at the images, this is indeed the case as the strokes are subtly separated from each other.

But then when I inspect the data, I see things such as:

Square 1 (a perfectly fine square upon image inspection)

[[(42, 7), (31, 63), (16, 124), (0, 229)], 
[(40, 0), (255, 11)], 
[(7, 229), (116, 229), (211, 223)], 
[(251, 18), (233, 211), (227, 218), (213, 219)]]

Square 2 (another perfectly fine square upon image inspection)

[[(1, 18), (0, 48), (8, 117), (19, 196), (21, 203), (25, 203)], 
[(14, 5), (55, 4), (105, 8), (253, 3)], 
[(33, 207), (86, 192), (117, 187), (255, 188)], 
[(246, 0), (240, 5), (239, 18), (245, 108), (247, 198)]]

My question: I always thought that lines were drawn in the format of:
(x1, y1) to (x2, y2).

So why are there more than 2 coordinates per array? I'd expect the data to be:

[ [(x1, y1), (x2, y2)],
 [(x1, y1), (x2, y2)],
 [(x1, y1), (x2, y2)],
 [(x1, y1), (x2, y2)] ]

I hope that someone knows, and is kind enough to clear the confusion.
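
Each inner array is a polyline rather than a single segment: consecutive points are joined, so a stroke of n points draws n-1 connected segments. A small sketch expanding one stroke into (x1, y1) -> (x2, y2) pairs:

def stroke_to_segments(stroke):
    # stroke: list of (x, y) points; consecutive points form the segments
    return list(zip(stroke, stroke[1:]))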

Reconstruct high quality 28x28 .npy files from binary files

First off, thank you so much for providing such a helpful dataset, it truly is a goldmine!

I am having trouble reconstructing the 28x28 images from the binary files provided in this dataset.
What are the actual steps in order to reconstruct the 28x28 images in the provided quality from the binary files?

My issue is quite similar to #15 but I managed to get intermediary results.

Here is my current progress:

  • Using the examples/binary_file_parser.py I am able to reconstruct the image in any given size by handling the stroke paths.

  • I am also able to use some blurring to smooth the image.

  • However the quality of the reconstructed image is nowhere near the 28x28 images dataset provided in this repository.

This is an original image from the 28x28 .npy dataset:
original_npy

This is my reconstruction using no blurring technique:
reconstruction_no_blur

And this is my reconstruction using a (2, 2) blur kernel in OpenCV:
reconstruction_blur_2

Any idea on how to reconstruct these images in the quality that is available when downloading the 28x28 .npy files?
Are there more advanced smoothing and filtering techniques that I have been missing?

Any way to export the image as .png/.jpg from the ndjson/bin file?

I don't understand how this block describes the image. Is there any way I can export the image as png/jpg?
example block:
{ 'key_id': 6545061917491200, 'image': [((54, 37, 7, 5, 12, 0, 7, 34, 52, 56, 52, 42, 38), (66, 49, 8, 11, 63, 105, 105, 89, 84, 88, 149, 205, 255)), ((59, 62, 88, 95, 97), (66, 13, 1, 1, 33))], 'recognized': 1, 'countrycode': 'SE', 'timestamp': 1485553794 }
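
A minimal sketch of rasterizing such a block with Pillow, assuming the 'image' field holds (xs, ys) stroke tuples as shown above:

from PIL import Image, ImageDraw

def strokes_to_png(strokes, path, size=256):
    img = Image.new('L', (size, size), 255)  # white canvas
    draw = ImageDraw.Draw(img)
    for xs, ys in strokes:
        draw.line(list(zip(xs, ys)), fill=0, width=2)  # one black polyline per stroke
    img.save(path)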

Exporting Drawings as Raw SVGs

I want to start off by giving tremendous props to the Quick Draw team for releasing this data set to the public. I can't wait to start digging in!

I'm wondering if anyone would be able to provide documentation on how to generate these drawings as individual SVGs. Or even, at the very least, where I should start looking to be able to do it myself.

Thanks!

One-to-one correspondence between datasets in different formats

Hi,

Thanks for sharing this awesome dataset. I want to use both the bitmap dataset and the meta information contained in the original raw dataset. So I am wondering if there is a one-to-one correspondence between entries with the same indices in these two datasets?

Thanks.

How to convert the preprocessed bin file to Numpy data?

I am wondering how the preprocessed bin data was converted to the Numpy (28x28) data.

In the Numpy data, I see that each location has a value between 0 and 255 (not just 1). How was this value arrived at? I thought the original stroke data contained only x and y coordinates, and that for each such x,y coordinate we set a 1, but this does not seem to be the case. Is there a pointer to any algorithm to convert the stroke data to the numpy array?
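
One plausible explanation (an assumption, not the official pipeline; the README links to the actual generation snippet): the strokes are drawn on a larger canvas and downsampled with anti-aliasing, which produces the intermediate 0..255 gray values along line edges. A sketch of that idea with Pillow:

from PIL import Image, ImageDraw

def strokes_to_bitmap(strokes, canvas=256, out=28):
    img = Image.new('L', (canvas, canvas), 0)
    draw = ImageDraw.Draw(img)
    for xs, ys in strokes:
        draw.line(list(zip(xs, ys)), fill=255, width=8)
    # Anti-aliased downsampling yields grayscale values between 0 and 255
    return img.resize((out, out), Image.LANCZOS)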

Flip flops!!!?

They are not flip flops; they were invented in NZ and we named them jandals. I was disgusted when I was told to draw flip flops!!!?

filtering images

I used this data to create a Kaggle challenge for my students last semester (https://www.kaggle.com/c/pictionary/leaderboard). There are some disturbing images that people have drawn: swastikas, penises, ... These emerged after examining the misclassified images from the classification model built to separate kangaroo, crab, banana, boomerang, cactus and flip flops. It might be useful to filter the dataset to remove these.
