
tensorflow-tfrecord's Introduction

TensorFlow's TFRecord with tf.data - Short Note

In this implementation, I convert images and their labels to a TFRecord file, and then read the file back using tf.data.

What is TFRecord?

  • TFRecord is TensorFlow's standard file format, which stores data as a sequence of binary records
  • It is recommended for machine learning projects, especially those that involve large datasets

Why is it recommended?

  • Since the data are stored in binary format, they take up less disk space and can be read more efficiently than many small raw files
  • For a dataset that is too large to fit in memory, records can be read from the file sequentially, so only part of the data has to be loaded at a time
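
To make the "read partial data" point concrete, here is a rough sketch of the on-disk record framing written with the Python standard library only. It assumes the layout described in TensorFlow's TFRecord documentation (a little-endian uint64 length, a masked CRC-32C of the length, the payload, and a masked CRC-32C of the payload). The function names are my own, and the reader skips checksum verification, so treat it as an illustration rather than a replacement for tf.data.

```python
import io
import struct

_MASK_DELTA = 0xA282EAD8  # constant used by the masked CRC in the record format


def crc32c(data):
    """Bitwise CRC-32C (Castagnoli). Slow, but needs no extra packages."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """Rotate the CRC right by 15 bits and add the mask constant."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + _MASK_DELTA) & 0xFFFFFFFF


def write_record(fh, payload):
    """Append one framed record to a binary file object."""
    header = struct.pack('<Q', len(payload))
    fh.write(header)
    fh.write(struct.pack('<I', masked_crc(header)))
    fh.write(payload)
    fh.write(struct.pack('<I', masked_crc(payload)))


def iter_records(fh):
    """Yield payloads one at a time, so only one record is held in memory."""
    while True:
        header = fh.read(8)
        if len(header) < 8:
            return
        (length,) = struct.unpack('<Q', header)
        fh.read(4)  # length checksum, not verified in this sketch
        payload = fh.read(length)
        fh.read(4)  # payload checksum, not verified in this sketch
        yield payload


# Usage: frame two payloads in memory and stream them back.
buffer = io.BytesIO()
for item in (b"first", b"second"):
    write_record(buffer, item)
buffer.seek(0)
records = list(iter_records(buffer))
```

Because each record carries its own length, a reader can walk the file record by record without loading the whole dataset.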

How to implement?

  • There are two main steps to implement TFRecord: convert the data to TFRecord, and read the TFRecord data

Convert data to TFRecord format

To convert data to the TFRecord file format, there are five main steps:

1. Create a writer

    writer = tf.python_io.TFRecordWriter(tfrecord_filename)

2. Prepare the data and convert it to binary (in this solution, the input is an image)

    img = load_image(os.path.join(img_directory, img_addr), img_width, img_height)
    img, shape = get_image_binary(img)
    # label cats as 1 and everything else as 0, based on the file name
    if 'cat' in img_addr:
        label = 1
    else:
        label = 0
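
The helper get_image_binary is not shown in this note. A minimal sketch of what such a helper could look like, assuming the image is already a NumPy array of 8-bit pixels, is below. The body is my assumption, not necessarily the version in tf_record.py. The dtypes are chosen to match the uint8 and int32 decoding used in the parse function later on.

```python
import numpy as np


def get_image_binary(img):
    """Return (pixel bytes, shape bytes) for an image array.

    Hypothetical helper: pixels are stored as uint8 and the shape as int32,
    so that both can be decoded again on the reading side.
    """
    img = np.asarray(img, dtype=np.uint8)
    shape = np.asarray(img.shape, dtype=np.int32)
    return img.tobytes(), shape.tobytes()
```

Storing the shape next to the pixels is what allows the flat byte string to be reshaped into an image when it is read back.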

3. Create features

    feature = {'label': _int64_feature(label),
               'image': _bytes_feature(img),
               'shape': _bytes_feature(shape)}
    features = tf.train.Features(feature=feature)

4. Create an example

    example = tf.train.Example(features=features)

5. Write the example to the TFRecord file, and close the writer once all examples are written

    writer.write(example.SerializeToString())
    writer.close()

Read TFRecord using tf.data

Reading data with tf.data is much easier than the traditional method. Follow the steps below:

1. Create a parse function that maps the serialized data back to a usable form

    def _parse(serialized_data):
        features = {'label': tf.FixedLenFeature([], tf.int64),
                   'image': tf.FixedLenFeature([], tf.string),
                   'shape': tf.FixedLenFeature([], tf.string)}
        features = tf.parse_single_example(serialized_data,
                                          features)
        img = tf.decode_raw(features['image'],tf.uint8)
        shape = tf.decode_raw(features['shape'],tf.int32)
        img = tf.reshape(img, shape)
        return img, features['label']

2. Create a dataset and map the parse function over it

    dataset = tf.data.TFRecordDataset(tfrecord_filename)
    dataset = dataset.map(_parse)

3. Create an iterator. That's it

    iterator = dataset.make_initializable_iterator()
    img, label = iterator.get_next()
    # run iterator.initializer in a session before fetching img and label
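
The snippets in this note use the TensorFlow 1.x names. For readers on TensorFlow 2.x, here is a rough sketch of the same write-then-read round trip, assuming eager execution and the tf.io equivalents of the calls above. The round_trip function and the synthetic 2x3x3 image are made up for illustration and are not part of tf_record.py.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def _parse(serialized_data):
    # Same schema as the parse function above, spelled with the tf.io names.
    features = {'label': tf.io.FixedLenFeature([], tf.int64),
                'image': tf.io.FixedLenFeature([], tf.string),
                'shape': tf.io.FixedLenFeature([], tf.string)}
    parsed = tf.io.parse_single_example(serialized_data, features)
    img = tf.io.decode_raw(parsed['image'], tf.uint8)
    shape = tf.io.decode_raw(parsed['shape'], tf.int32)
    return tf.reshape(img, shape), parsed['label']


def write_example(writer, img, label):
    # Steps 2 to 5 of the conversion for one (image, label) pair.
    shape = np.asarray(img.shape, dtype=np.int32)
    feature = {
        'label': tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
        'image': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[img.tobytes()])),
        'shape': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[shape.tobytes()])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    writer.write(example.SerializeToString())


def round_trip():
    # Write one synthetic image to a temporary TFRecord file and read it back.
    path = os.path.join(tempfile.mkdtemp(), 'demo.tfrecord')
    img = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)
    with tf.io.TFRecordWriter(path) as writer:
        write_example(writer, img, 1)
    dataset = tf.data.TFRecordDataset(path).map(_parse)
    # In eager mode the dataset is directly iterable, no iterator object needed.
    return [(image.numpy(), int(label.numpy())) for image, label in dataset]
```

Calling round_trip() should give back a single (image, label) pair with the original shape, which is a quick way to check that the writing and parsing sides agree on dtypes.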

Execution

To test the code, just run tf_record.py; the TFRecord file will be written to the working directory.
