Coder Social home page Coder Social logo

Future Improvements- Returning a DataFrame of all source-destination nodes within a certain distance (and including the distance) about pandana HOT 24 OPEN

udst avatar udst commented on July 19, 2024
Future Improvements- Returning a DataFrame of all source-destination nodes within a certain distance (and including the distance)

from pandana.

Comments (24)

smmaurer avatar smmaurer commented on July 19, 2024 3

Hi @knaaptime, thanks for following up about this!

Looking back at my work in progress, I think it's close to being mergeable. I'll find some time soon to finish it up, and some testing/input from your end would definitely be helpful too!

You can check things out in the range-queries branch, which includes a runnable example of the new functionality in examples/range_example.py. Does this seem workable for the use case you had in mind?

One to-do was that it returns results as a list of lists of node id–distance tuples, which is not a very convenient format if you have a lot of origin nodes. We could either leave this as-is for the initial release, or take a pass at conversion to something else -- probably multi-indexed DataFrame would be the easiest but I'm not sure.

from pandana.

bstabler avatar bstabler commented on July 19, 2024 2

I would like to see this feature added as well. Are there any updates on this? Is it likely to be added anytime soon?

from pandana.

ljwolf avatar ljwolf commented on July 19, 2024 2

I'd love to see this get merged! Like @knaaptime, I'd love to have simple range-queries on networks with Pandana. I've been blown away by its efficiency!

I've tackled the return value thing plus docstrings in my fork, and I've sent a draft PR (#167). I opted there to work in the same manner as nearest_pois, using columns rather than indices in the output data frame. Happy to change that, but I was hoping to get the branch cleaned up & prepared to merge.

The main issue is that the branch currently returns distances greater than the limiting radius. I've implemented a fix in c++ (that's buggy) and a fix using pandas (that's wasteful).

I hope this is useful, and I'm game to do the legwork to push this through.

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024 1

hey folks, is there still potential to expose this from the C++ side? I'm working on a new spatial weights constructor for libpysal that uses pandana to find the "neighbors" withn some network distance of each observation. It accomplishes the spirit of this issue, but does so really inefficiently by iterating over all o-d pairs of an input geodataframe and calling shortest_path_lengths before filtering out those beyond the threshold. If instead I could just call get_nodes_in_range (and it returned the distance in addition to the node id) it would be killer and presumably much faster

from pandana.

fscottfoti avatar fscottfoti commented on July 19, 2024 1

Nice! So cython knows to turn a vector of pairs into a list of tuples? That's pretty nice.

A list of namedtuples could be nice for the single node. Maybe a sparse matrix for the whole graph.

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024 1

hey folks, just wanted to check in and see if anyone is game to merge this and cut a new release. Would love to use this new functionality, and would be more than happy to help tie any loose ends I can :)

from pandana.

fscottfoti avatar fscottfoti commented on July 19, 2024

FYI - this has existed in Pandana in C++ for a while, but did not make it into the version that was converted to use Pandas DataFrames.

from pandana.

sablanchard avatar sablanchard commented on July 19, 2024

Thanks Stefan,
We will place this on our planned list of improvements and can keep you posted on our progress once we commit to digging into the C++ code. We have some other items on the list that require this so likely will try to address all of them at once.

from pandana.

semcogli avatar semcogli commented on July 19, 2024

Hi, Stefan

I think you could go the other way by returning your set of coordinates ( I assume those are transit stops) within predefined distance to parcel centroids. Then do a reversed query to get your isochrones. We have done similar analysis for transit access and we used 200+ transit stations. And pandana returned a table with 400+ columns successfully!

from pandana.

fscottfoti avatar fscottfoti commented on July 19, 2024

Ahh yes, that's a pretty great idea @semcogli very clever

from pandana.

stefancoe avatar stefancoe commented on July 19, 2024

@semcogli-that is a great idea! Funny because we are doing that already in our accessibility script to get the distance from each parcel to nearest transit stop. Thanks also to @fscottfoti and @sablanchard for replying!

from pandana.

fscottfoti avatar fscottfoti commented on July 19, 2024

It looks like the get_nodes_in_range function is gone, but it's not really needed. Basically this returns a DistanceVec which is a node_id to distance map (so has what you want).

Also once you call precomputeRangeQueries, the info you want is just sitting there in memory here. Must be tantalizing ;)

It just needs to be added to the cython file (cyaccess.pyx) to get it back into python-land. Maybe you should just add it?

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

sweet, thanks @fscottfoti! i was hoping there would be a relatively simple path like this (and i was pretty sure precompute essentially calculated everything i needed...)

Tantalizing indeed! I'll read up on how these C/++ extensions work. Once complete, I'm guessing this would mean reducing to a workflow something like

net = pdna.Network(...)
net.precompute(3000)
neighbor_dist = net.get_distance_vec()

which would grab the data sitting in memory and return a scipy sparse matrix or a dataframe or something?

from pandana.

fscottfoti avatar fscottfoti commented on July 19, 2024

Yeah sounds right - might just return a dict of dicts of weights? The first key could be the origin node_id, the second key would be the destination node_id and the second value would be the weight?

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

perfect

from pandana.

smmaurer avatar smmaurer commented on July 19, 2024

I agree that this would be super useful! Thanks @fscottfoti for pointing out the relevant code. I can take a stab at revealing this on the Python side -- @knaaptime let's coordinate.

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

🎉🎉

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

(fwiw, when this is implemented it would also mean you could use all the same tooling and workflows, but estimate spatial models in place of OLS for the hedonics back upstream in UDST)

from pandana.

smmaurer avatar smmaurer commented on July 19, 2024

Ok, a first pass of this is working in the range-queries branch!

The initial Python function is network.nodes_in_range(node, radius), which returns results for a single source node. I'm thinking the Python API should support a single source node, a list of source nodes, and the entire graph.

Precomputing is optional -- if it's been done, we can return cached results, and otherwise they'll be computed when the function is called.

In C++, the data is stored as a vector of NodeID–distance pairs. The direct translation into Python is a list of tuples, for example the following if you ask for nodes within 4 impedance units of node 1:

[(1, 0.0), (0, 1.0), (908, 1.0), (21, 1.0), (907, 2.0), (787, 2.0), (423, 2.0), 
(22, 2.0), (20, 2.0), (513, 3.0), (19, 3.0), (786, 3.0), (462, 3.0), (214, 3.0), 
(679, 3.0), (420, 3.0), (123, 4.0), (124, 4.0), (17, 4.0), (1436, 4.0), 
(141, 4.0), (1313, 4.0), (212, 4.0), (903, 4.0), (511, 4.0)]

What output formats seem most useful, though? Probably a dataframe or sparse matrix? I'll try some preprocessing in C++ to optimize performance for generating those.

cc @janowicz

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

awesome!! thanks a ton for the quick work on this! I agree with @fscottfoti on the formats

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

i know you guys are busy with other parts of the package at the moment, but just wanted to drop by and see if there's anything i can do to test/otherwise help move this along :)

from pandana.

bstabler avatar bstabler commented on July 19, 2024

I'm excited for this feature, thanks @smmaurer!

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

awesome @smmaurer, thank you! I'll play with the range-queries branch, but i agree this is pretty much complete for what i'm after. A dataframe would be manageable, but i think a scipy.sparse matrix might be the ideal output for my use case (generating a spatial weights matrix)

from pandana.

knaaptime avatar knaaptime commented on July 19, 2024

sorry to bother folks, but just wanted to check back in and see if there's any hope to get this merged. If there's any work left to do, I'm happy to do the lifting

from pandana.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.