r-transit / tidytransit

R package for working with GTFS data

Home Page: https://r-transit.github.io/tidytransit/

R 99.34% Rez 0.66%
cran gtfs public public-transport tidyverse transit transit-data transport transportation

tidytransit's Issues

general discussion around routing packages in r

Sorry for a potentially naive question, but can this package route from A to B with GTFS data?

Context: @mem48 and I are currently using OpenTripPlanner for this. It has quite a lot of overhead but is very good for multi-modal routing (perhaps one day that will be possible in R).

update package for new dplyr release

from rstudio:

This is an automated email to let you know that:

  • A new version of dplyr is ready to go to CRAN. dplyr is
    currently at version 0.7.8 and will become 0.8.0 upon release.

  • tidytransit uses dplyr and has problems with the new version.

  • We plan to submit dplyr to CRAN on February 1.

This release represents about 9 months of development, detailed in
this blog post:
https://www.tidyverse.org/articles/2018/12/dplyr-0-8-0-release-candidate/

I need your help to keep tidytransit and dplyr working together
smoothly. In the next weeks, can you please:

  1. Read about the changes to dplyr at
    https://github.com/tidyverse/dplyr/blob/master/NEWS.md#dplyr-080.
    This page includes a list of breaking changes, the reasoning behind
    them, and how to update your code.

  2. Carefully inspect the failing checks listed at the bottom of this email.

  3. For each failing check, either update your package, or tell me
    that I have a bug. If you have made changes to your package, please
    submit an update to CRAN before February 1.

If you have discovered a bug in dplyr, please file an issue (ideally
with a small reprex that illustrates the problem) at
https://github.com/tidyverse/dplyr/issues. If you're not sure whether
or not you've found a bug, please file an issue and we'll help you figure
it out. Breaking changes that are not listed qualify as bugs.

Please respond to this message if you have any questions.

Thanks,

Romain Francois

== CHECK RESULTS ========================================

  • checking examples ... ERROR

```
Running examples in ‘tidytransit-Ex.R’ failed
The error most likely occurred in:

Name: get_route_frequency
Title: Get Route Frequency
Aliases: get_route_frequency

** Examples

data(gtfs_obj)
gtfs_obj <- get_route_frequency(gtfs_obj)
Calculating route and stop headways using defaults (6 am to 10 pm for weekday service).
Error in n() : could not find function "n"
Calls: get_route_frequency ... -> -> mutate.tbl_df -> mutate_impl
Execution halted
```

  • checking tests ... ERROR

```
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
  16: `_fseq`(`_lhs`)
  17: freduce(value, `_function_list`)
  18: function_list[[i]](value)
  19: dplyr::mutate(., service_trips = n())
  20: mutate.tbl_df(., service_trips = n()) at /Users/romain/git/tidyverse/dplyr/R/manip.r:416
  21: mutate_impl(.data, dots) at /Users/romain/git/tidyverse/dplyr/R/tbl-df.r:91

  ══ testthat results ═══════════════════════════════════════════════════════════
  OK: 3 SKIPPED: 7 FAILED: 3
  1. Error: Stop frequencies (headways) for included data are as expected (@test_headways.R#4)
  2. Error: Route frequencies (headways) for included data are as expected (@test_headways.R#11)
  3. Error: Route frequencies (headways) can be calculated for included data for a particular service id (@test_headways.R#17)

  Error: testthat unit tests failed
  Execution halted
```
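
The `could not find function "n"` error in the check output comes from a bare `n()` inside `dplyr::mutate()`. As far as I understand the dplyr 0.8.0 release notes, helpers such as `n()` have to be imported or called with a `dplyr::` prefix from package code. The following is a minimal sketch of the prefixed form, not the actual tidytransit patch; the `trips` table and its values are made up for illustration.

```r
# Made-up trips table; only the column names mirror GTFS trips.txt.
trips <- data.frame(
  trip_id = c("t1", "t2", "t3"),
  service_id = c("WKDY", "WKDY", "SAT"),
  stringsAsFactors = FALSE
)

# dplyr is deliberately not attached: every dplyr function, including n(),
# is called with an explicit namespace prefix, as package code would do.
by_service <- dplyr::group_by(trips, service_id)
counted <- dplyr::ungroup(
  dplyr::mutate(by_service, service_trips = dplyr::n())
)

print(counted)
```

In a package, the alternative to prefixing would be to import `n` from dplyr in the NAMESPACE file.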

Stops and routes frequency calculation fails when routes have their own unique service IDs (e.g. Barcelona)

Hello everyone, I am trying to plot the frequency of buses that pass through each stop.

The library is supposed to add this data when I set frequency = TRUE, as below:
gtfs <-read_gtfs("gtfs.zip", local=TRUE, geometry = TRUE, frequency=TRUE)

In theory I should get new data with the frequency of buses per stop. However, I get a warning in the console that says:

Warning message:
In get_route_frequency(gtfs_obj) : failed to calculate frequency--
try passing a service_id from calendar_df

Because of this I can't get that information (I get the data frame, but without that data). I have also tried to get it using "get_stop_frequency", without luck.

Can someone help me ?
Thank you all

general discussion of repository structure

Hey, I appreciate your effort to separate the different "modules" and I'm looking forward to seeing where this package might end up. As I don't know much about developing R packages, I don't yet understand how this repository and the others (trread, ...) are connected? Is it the same code duplicated and separate or is it automatically cloned/imported in some way?

read non-gtfs-spec column names in parse_gtfs

When using the google sample feed, the resulting stop_times_df looks like

$stop_times_df
# A tibble: 28 x 9
   trip_id arrival_time departure_time stop_id        stop_sequence stop_headsign pickup_type X8    shape_dist_traveled
   <chr>   <chr>        <chr>          <chr>                  <int> <chr>               <int> <chr>               <dbl>
 1 STBA    6:00:00      6:00:00        STAGECOACH                 1 NA                     NA NA                     NA
 2 STBA    6:20:00      6:20:00        BEATTY_AIRPORT             2 NA                     NA NA                     NA
 3 CITY1   6:00:00      6:00:00        STAGECOACH                 1 NA                     NA NA                     NA
 4 CITY1   6:05:00      6:07:00        NANAA                      2 NA                     NA NA                     NA
 5 CITY1   6:12:00      6:14:00        NADAV                      3 NA                     NA NA                     NA
 6 CITY1   6:19:00      6:21:00        DADAN                      4 NA                     NA NA                     NA
 7 CITY1   6:26:00      6:28:00        EMSI                       5 NA                     NA NA                     NA
 8 CITY2   6:28:00      6:30:00        EMSI                       1 NA                     NA NA                     NA
 9 CITY2   6:35:00      6:37:00        DADAN                      2 NA                     NA NA                     NA
10 CITY2   6:42:00      6:44:00        NADAV                      3 NA                     NA NA                     NA

The problem seems to be that the column "drop_off_time" is not a valid column name which is pretty strange for an example feed. Also there are commas missing from line 17 on but that's not the point.

My question is: Why are only required/expected columns read in import.R#348? Why don't we simply read the whole file as a simple csv and check validity afterwards? The column X8 we get isn't really helpful anyways.
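
The "read everything, validate afterwards" idea could look roughly like the sketch below. It uses base R only and is not how `import.R` works; the two rows are made up, and `spec_cols` is a hand-written subset of the GTFS `stop_times.txt` field names, used here as an assumption.

```r
# Made-up excerpt of a stop_times.txt file carrying the non-spec
# "drop_off_time" header described in the issue.
raw <- "trip_id,arrival_time,departure_time,stop_id,stop_sequence,stop_headsign,pickup_type,drop_off_time,shape_dist_traveled
STBA,6:00:00,6:00:00,STAGECOACH,1,,,,
STBA,6:20:00,6:20:00,BEATTY_AIRPORT,2,,,,"

# Step 1: read the whole file as plain CSV, keeping every column as text.
stop_times <- read.csv(text = raw, colClasses = "character", na.strings = "")

# Step 2: validate afterwards against the expected field names
# (assumed subset of the GTFS reference, not the package's own list).
spec_cols <- c("trip_id", "arrival_time", "departure_time", "stop_id",
               "stop_sequence", "stop_headsign", "pickup_type",
               "drop_off_type", "shape_dist_traveled")

unexpected <- setdiff(names(stop_times), spec_cols)  # columns not in the spec
absent <- setdiff(spec_cols, names(stop_times))      # spec columns not in the file

print(unexpected)
print(absent)
```

Read this way, the odd column keeps its original name instead of turning into `X8`, and the mismatch can be reported to the user.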

Incorrect route linestrings in `.$routes_sf`

Great idea for a package! However I am noticing some issues with the output on my first use, looking at L train routes in Chicago. A reproducible example:

library(tidytransit)
library(mapview)

chicago_gtfs <- read_gtfs("http://www.transitchicago.com/downloads/sch_data/google_transit.zip")

routes <- chicago_gtfs$routes_sf

mapview(routes[routes$route_id == "Pink", ])

image

This is the Yellow Line, not the Pink Line. I'm wondering if some of the rows are getting shuffled when they are converted to simple features? I've forked and will go through the code but let me know if you have any suggestions. Thanks!
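
To illustrate the kind of shuffling suspected here (this is a guess at the failure mode, not a diagnosis of the tidytransit code), the sketch below shows how attaching geometries by row position can mislabel routes, while a key-based lookup cannot. All ids and WKT strings are made up.

```r
# Hypothetical route attributes and per-route geometries, stored in
# different row orders.
routes <- data.frame(
  route_id = c("Pink", "Y", "Red"),
  stringsAsFactors = FALSE
)
shapes <- data.frame(
  route_id = c("Y", "Red", "Pink"),
  wkt = c("LINESTRING (0 0, 1 1)",
          "LINESTRING (2 2, 3 3)",
          "LINESTRING (4 4, 5 5)"),
  stringsAsFactors = FALSE
)

# Positional binding silently pairs "Pink" with the geometry stored for "Y".
by_position <- cbind(routes, wkt = shapes$wkt, stringsAsFactors = FALSE)

# Key-based lookup keeps each route with its own geometry.
by_key <- routes
by_key$wkt <- shapes$wkt[match(routes$route_id, shapes$route_id)]

print(by_position)
print(by_key)
```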

make transit service calculation defaults more flexible

default frequency calculation fails if the service schedule doesn't match the default weekday pattern (1,1,1,1,1,0,0). but services are potentially specified in broader ways.

it should default to something less restrictive and just calculate frequency for whatever it can.
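
A rough sketch of the difference between a strict and a more permissive default follows. It assumes, based only on the description above, that services are selected by comparing their day flags with the (1,1,1,1,1,0,0) pattern; it is not the package's implementation. The `calendar` table uses the GTFS `calendar.txt` column names with made-up services.

```r
# Hypothetical calendar table with three made-up services.
calendar <- data.frame(
  service_id = c("WKDY", "MON_THU", "SAT"),
  monday = c(1, 1, 0), tuesday = c(1, 1, 0), wednesday = c(1, 1, 0),
  thursday = c(1, 1, 0), friday = c(1, 0, 0), saturday = c(0, 0, 1),
  sunday = c(0, 0, 0),
  stringsAsFactors = FALSE
)
days <- c("monday", "tuesday", "wednesday", "thursday",
          "friday", "saturday", "sunday")
dow <- c(1, 1, 1, 1, 1, 0, 0)  # requested day-of-week pattern

day_matrix <- as.matrix(calendar[, days])

# Strict: a service must match the requested pattern on all seven days.
strict <- calendar$service_id[
  apply(day_matrix, 1, function(x) all(x == dow))
]

# Relaxed: a service must run on at least one of the requested days.
relaxed <- calendar$service_id[as.vector(day_matrix %*% dow) > 0]

print(strict)
print(relaxed)
```

With the strict rule only one service qualifies, so a feed whose services all look like `MON_THU` would yield no frequencies at all; the relaxed rule keeps such services in the calculation.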

specify how to make this package more useful to package developers

hey @mpadge is this the right issue title for your question about how to manage "merging" this package with other packages?

i think tidytransit ended up being more oriented toward users than developers, partially by the prompting of @angela-li to consider that a user would not want to have to import and think about multiple packages to just do basic mapping and frequency/schedule analysis.

i think that intuition was right, but as you work on gtfs-router i expect you'll develop better approaches across a number of problems.

another way to think about this issue is how to deprecate this package gracefully as you advance your work on gtfs-router. could just be managed by having a similar api.

Plot example not working

Based on this stackoverflow question.

local_gtfs_path <- system.file("extdata", 
                              "google_transit_nyc_subway.zip", 
                              package = "tidytransit")
nyc <- read_gtfs(local_gtfs_path, 
                local=TRUE)
plot(nyc)

with the plot function:

tidytransit:::plot.gtfs <- function (x, ...) {
    dots = list(...)
    routes_sf_frequencies <- x$routes_sf %>% dplyr::inner_join(x$routes_frequency_df, 
        by = "route_id") %>% dplyr::select(median_headways, mean_headways, 
        st_dev_headways, stop_count)
    plot(routes_sf_frequencies)
}

The problem seems to be twofold: routes_sf is missing by default and the headway calculations haven't been done.

Use data.table to read feeds!

library(tidytransit)
library (magrittr)
f <- list.files (getwd (), full.names = TRUE)
filename <- f [grep ("VBB", f)] # GTFS for Berlin-Brandenburg Transport - it's huge!
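get_df <- function (filename)
{
    # list the files inside the zip archive without extracting them
    flist <- utils::unzip (filename, list = TRUE)$Name
    res <- list ()
    for (i in seq_along (flist))
    {
        # stream each file from the archive straight into fread
        cmd <- paste0 ("unzip -p \"", filename, "\" \"", flist [i], "\"")
        res [[i]] <- data.table::fread (cmd = cmd, showProgress = FALSE) %>%
            as.data.frame ()
    }
    # name each table after its file, dropping the ".txt" extension
    names (res) <- sub ("\\.txt$", "", flist)
    return (res)
}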
rbenchmark::benchmark (
                       dat <- read_gtfs (filename, local = TRUE),
                       dat <- get_df (filename),
                       replications = 1)
#>                                       test replications elapsed relative user.self sys.self user.child sys.child
#> 2                  dat <- get_df(filename)            1   3.463     1.00     6.678    0.432      1.745     0.312
#> 1 dat <- read_gtfs(filename, local = TRUE)            1  31.411     9.07    30.701    0.645      0.000     0.000

Created on 2019-02-01 by the reprex package (v0.2.1)

GTFS feeds can be enormous, and data.table makes a pretty huge difference - it'll read a feed nearly ten times faster!


This is also by way of starting a separate conversation about the potential future merging of gtfs-router into this package. It seems like the obvious place for it, and if routing were available, surely it would be the primary use of tidytransit? You could then check out your transit options from within the comfort of your R session!

Package description on github

@tbuckl Should we change the package description on github? I don't think the package being sf compatible is its main focus. I'd suggest something like on tidytransit.r-transit.org:
"tidytransit reads the General Transit Feed Specification (GTFS) into tidyverse and simple features dataframes. Use tidytransit to map transit stops and routes, calculate transit frequencies, and validate transit feeds."
