mobility's Introduction

ANET Mobility project

Reports so far

To play around with the data

1. fork and clone the repo

git clone https://github.com/sscu-budapest/mobility

(preferably fork it first)

2. install deps

pip install -r requirements.txt

3. get the sample data

if you are using the anet server:

dvc pull

otherwise, add the anet server to your ssh config under the host alias anetcloud and then run

dvc pull --remote anetcloud-ssh

4. load some samples and look around

from src.data_dumps import ParsedCols
from src.load_samples import covid_tuesday

def total_range(s):
    return s.max() - s.min()

samp_df = covid_tuesday.get_full_df()

samp_df.groupby(ParsedCols.user).agg(
    {
        ParsedCols.lon: ["std", total_range],
        ParsedCols.lat: ["std", total_range],
        ParsedCols.dtime: ["min", "max", "count"],
    }
).agg(["mean", "median"]).T
                                            mean               median
lon   std                               0.033919             0.001588
      total_range                       0.080638             0.000471
lat   std                                0.01889             0.001067
      total_range                       0.045475             0.000336
dtime min          2020-11-03 07:38:23.900066048  2020-11-03 06:33:42
      max          2020-11-03 18:43:00.657093888  2020-11-03 21:08:10
      count                            87.307432                 21.0
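
if you do not have access to the sample data yet, the same per-user aggregation can be tried on synthetic pings. This is only a minimal sketch: the column names "user", "lon", "lat" and "dtime" and the helper make_fake_pings are assumptions made for illustration (the real names come from ParsedCols), and the coordinates are random points near Budapest.

```python
import numpy as np
import pandas as pd

# hypothetical column names; in the repo these come from ParsedCols
USER, LON, LAT, DTIME = "user", "lon", "lat", "dtime"


def total_range(s):
    return s.max() - s.min()


def make_fake_pings(n_users=5, pings_per_user=20, seed=0):
    """Build a small synthetic ping table covering a single day."""
    rng = np.random.default_rng(seed)
    n = n_users * pings_per_user
    seconds = rng.integers(0, 24 * 3600, n)
    return pd.DataFrame(
        {
            USER: np.repeat(np.arange(n_users), pings_per_user),
            LON: 19.04 + rng.normal(0, 0.01, n),
            LAT: 47.50 + rng.normal(0, 0.01, n),
            DTIME: pd.Timestamp("2020-11-03") + pd.to_timedelta(seconds, unit="s"),
        }
    )


fake_df = make_fake_pings()

# one row per user: spread of coordinates, first/last ping, ping count
per_user = fake_df.groupby(USER).agg(
    {
        LON: ["std", total_range],
        LAT: ["std", total_range],
        DTIME: ["min", "max", "count"],
    }
)

# mean and median of the per-user statistics, one row per statistic
summary = per_user.agg(["mean", "median"]).T
print(summary)
```

the numbers will not resemble the real sample, but the shape of the result (one row per statistic, mean and median columns) is the same.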

+ if you want to run something that scales to the full data set, I suggest using dask

from src.data_dumps import ParsedCols
from src.load_samples import covid_tuesday
import matplotlib.pyplot as plt


samp_ddf = covid_tuesday.get_full_ddf()

# hourly spread of coordinates and ping counts; .compute() materializes the lazy dask result
ddf_aggs = (
    samp_ddf.assign(hour=lambda df: df[ParsedCols.dtime].dt.hour)
    .groupby("hour")
    .agg({ParsedCols.lon: ["std"], ParsedCols.lat: "std", ParsedCols.dtime: "count"})
    .compute()
)

fig, ax1 = plt.subplots()

# coordinate std on the left axis, ping count on a second y-axis on the right
ddf_aggs.iloc[:, :2].plot(figsize=(14, 7), ax=ax1, xlabel="hour in the day").legend(
    loc="center left"
)
ddf_aggs.loc[:, ParsedCols.dtime].plot(figsize=(14, 7), ax=ax1.twinx(), color="green").legend(
    loc="center right"
)

fig1

load a full week of data

from src.create_samples import covid_sample, non_covid_sample

# this takes about 3GB of memory; use get_full_ddf for a lazy dask dataframe instead
cov_df = covid_sample.get_full_df()

TODO

  • "reliable user" counts

    • number of pings
    • "do we know where they live"
    • every month at least once a week
    • 30 / day (?)
    • 3 in the morning, 3 in the evening
  • dump by month

  • dump by user
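
one possible reading of the "reliable user" thresholds above, sketched in pandas. Everything here is an assumption rather than the project's definition: the column names "user" and "dtime", the function reliable_user_days, the cut-offs for morning (before 12:00) and evening (from 18:00), and the choice to evaluate the thresholds per user and day. The "every month at least once a week" and "do we know where they live" items are not covered.

```python
import pandas as pd

USER, DTIME = "user", "dtime"  # hypothetical column names


def reliable_user_days(df, min_per_day=30, min_morning=3, min_evening=3,
                       morning_end=12, evening_start=18):
    """Boolean Series indexed by (user, day): True where the day passes all thresholds."""
    hour = df[DTIME].dt.hour
    flags = pd.DataFrame(
        {
            USER: df[USER],
            "day": df[DTIME].dt.floor("D"),
            "ping": 1,
            "morning": (hour < morning_end).astype(int),
            "evening": (hour >= evening_start).astype(int),
        }
    )
    counts = flags.groupby([USER, "day"])[["ping", "morning", "evening"]].sum()
    return (
        (counts["ping"] >= min_per_day)
        & (counts["morning"] >= min_morning)
        & (counts["evening"] >= min_evening)
    )


def _pings(user, hour, n):
    """n pings for `user`, one per minute starting at `hour` on 2020-11-03."""
    start = pd.Timestamp("2020-11-03") + pd.Timedelta(hours=hour)
    return pd.DataFrame({USER: user, DTIME: pd.date_range(start, periods=n, freq="min")})


# user "a" pings in the morning, at midday and in the evening; user "b" only at midday
demo = pd.concat(
    [_pings("a", 8, 10), _pings("a", 13, 10), _pings("a", 19, 10), _pings("b", 13, 30)],
    ignore_index=True,
)

reliable = reliable_user_days(demo)
# number of days on which each user passes all thresholds
reliable_day_counts = reliable.groupby(level=USER).sum()
print(reliable_day_counts)
```

both demo users have 30 pings on the day, but only "a" also has pings in the morning and evening windows, so only "a" gets a reliable day. The same grouping should carry over to a dask dataframe with little change, though that is untested here.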

mobility's People

Contributors

endremborza
