Coder Social home page Coder Social logo

python-test's Introduction

python-test

We are having an issue with generating a stream of data so that we can use on several algorithms. We have two csv files. The first one is called dates.csv. The second one is called data.csv. We have a date range filter of start date and end date. What we need to do is to bring both csv files into memory. The dates.csv should be changed to an object that has the following structure:

key, start_date, end_date

in the dates.csv

key = line[0] + line[1] start_date = line[2] end_date = line[3]

This structure can be a dictionary, numpy array ect.

The data.csv should be changed to an object that has the following structure:

key, date, float1, float2

in the data.csv

key = line[0] + line[2] date = line[1] float1 = line[3] float2 = line[4]

The system will be write very little but read a lot. This means that we can cache the structure of this object in shared memory or file or some other way. The time that it takes to write the structure in shared memory or wherever it is stored is not really counted. Whatever method you choose to cache or store this data must be able to handle several processes trying to read from the data at once. What we are mainly concerned with is the time it takes to read the data from where it is cached or stored and then do a transformation so our algorithm can operate on the data. The transformation needs to be the following:

Using start and end date passed in go through the dates object and find all the keys that have a start date <= the input start date and an end date >= the input end date. In other wards the input start and end date range must be greater than or equal to the keys start and end date range. Once the dates have been parsed use the key values to go through the data object to find all of the date, float1, and float2 values that fall within the input date range. Another transformation object needs to be created that looks like the following:

key1, [[date[0], float1[0], float2[0], date[1], float1[1], float2[1], ect…]], key2, [[date[0], float1[0], float2[0], date[1], float1[1], float2[1], ect…]], key3, [[date[0], float1[0], float2[0], date[1], float1[1], float2[1], ect…]]

This will be an array structure.

We need to be able to create this transformation object in python in about 1 - 2 seconds.

python-test's People

Contributors

jcadruvi avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.