Comments (24)
Hi @knaaptime, thanks for following up about this!
Looking back at my work in progress, I think it's close to being mergeable. I'll find some time soon to finish it up, and some testing/input from your end would definitely be helpful too!
You can check things out in the range-queries branch, which includes a runnable example of the new functionality in examples/range_example.py
. Does this seem workable for the use case you had in mind?
One to-do was that it returns results as a list of lists of node id–distance tuples, which is not a very convenient format if you have a lot of origin nodes. We could either leave this as-is for the initial release, or take a pass at conversion to something else -- probably multi-indexed DataFrame would be the easiest but I'm not sure.
from pandana.
I would like to see this feature added as well. Are there any updates on this? Is it likely to be added anytime soon?
from pandana.
I'd love to see this get merged! Like @knaaptime, I'd love to have simple range-queries on networks with Pandana. I've been blown away by its efficiency!
I've tackled the return value thing plus docstrings in my fork, and I've sent a draft PR (#167). I opted there to work in the same manner as nearest_pois
, using columns rather than indices in the output data frame. Happy to change that, but I was hoping to get the branch cleaned up & prepared to merge.
The main issue is that the branch currently returns distances greater than the limiting radius. I've implemented a fix in c++ (that's buggy) and a fix using pandas (that's wasteful).
I hope this is useful, and I'm game to do the legwork to push this through.
from pandana.
hey folks, is there still potential to expose this from the C++ side? I'm working on a new spatial weights constructor for libpysal that uses pandana to find the "neighbors" withn some network distance of each observation. It accomplishes the spirit of this issue, but does so really inefficiently by iterating over all o-d pairs of an input geodataframe and calling shortest_path_lengths
before filtering out those beyond the threshold. If instead I could just call get_nodes_in_range
(and it returned the distance in addition to the node id) it would be killer and presumably much faster
from pandana.
Nice! So cython knows to turn a vector of pairs into a list of tuples? That's pretty nice.
A list of namedtuples could be nice for the single node. Maybe a sparse matrix for the whole graph.
from pandana.
hey folks, just wanted to check in and see if anyone is game to merge this and cut a new release. Would love to use this new functionality, and would be more than happy to help tie any loose ends I can :)
from pandana.
FYI - this has existed in Pandana in C++ for a while, but did not make it into the version that was converted to use Pandas DataFrames.
from pandana.
Thanks Stefan,
We will place this on our planned list of improvements and can keep you posted on our progress once we commit to digging into the C++ code. We have some other items on the list that require this so likely will try to address all of them at once.
from pandana.
Hi, Stefan
I think you could go the other way by returning your set of coordinates ( I assume those are transit stops) within predefined distance to parcel centroids. Then do a reversed query to get your isochrones. We have done similar analysis for transit access and we used 200+ transit stations. And pandana returned a table with 400+ columns successfully!
from pandana.
Ahh yes, that's a pretty great idea @semcogli very clever
from pandana.
@semcogli-that is a great idea! Funny because we are doing that already in our accessibility script to get the distance from each parcel to nearest transit stop. Thanks also to @fscottfoti and @sablanchard for replying!
from pandana.
It looks like the get_nodes_in_range function is gone, but it's not really needed. Basically this returns a DistanceVec which is a node_id to distance map (so has what you want).
Also once you call precomputeRangeQueries, the info you want is just sitting there in memory here. Must be tantalizing ;)
It just needs to be added to the cython file (cyaccess.pyx) to get it back into python-land. Maybe you should just add it?
from pandana.
sweet, thanks @fscottfoti! i was hoping there would be a relatively simple path like this (and i was pretty sure precompute
essentially calculated everything i needed...)
Tantalizing indeed! I'll read up on how these C/++ extensions work. Once complete, I'm guessing this would mean reducing to a workflow something like
net = pdna.Network(...)
net.precompute(3000)
neighbor_dist = net.get_distance_vec()
which would grab the data sitting in memory and return a scipy sparse matrix or a dataframe or something?
from pandana.
Yeah sounds right - might just return a dict of dicts of weights? The first key could be the origin node_id, the second key would be the destination node_id and the second value would be the weight?
from pandana.
perfect
from pandana.
I agree that this would be super useful! Thanks @fscottfoti for pointing out the relevant code. I can take a stab at revealing this on the Python side -- @knaaptime let's coordinate.
from pandana.
🎉🎉
from pandana.
(fwiw, when this is implemented it would also mean you could use all the same tooling and workflows, but estimate spatial models in place of OLS for the hedonics back upstream in UDST)
from pandana.
Ok, a first pass of this is working in the range-queries branch!
The initial Python function is network.nodes_in_range(node, radius)
, which returns results for a single source node. I'm thinking the Python API should support a single source node, a list of source nodes, and the entire graph.
Precomputing is optional -- if it's been done, we can return cached results, and otherwise they'll be computed when the function is called.
In C++, the data is stored as a vector of NodeID–distance pairs. The direct translation into Python is a list of tuples, for example the following if you ask for nodes within 4 impedance units of node 1:
[(1, 0.0), (0, 1.0), (908, 1.0), (21, 1.0), (907, 2.0), (787, 2.0), (423, 2.0),
(22, 2.0), (20, 2.0), (513, 3.0), (19, 3.0), (786, 3.0), (462, 3.0), (214, 3.0),
(679, 3.0), (420, 3.0), (123, 4.0), (124, 4.0), (17, 4.0), (1436, 4.0),
(141, 4.0), (1313, 4.0), (212, 4.0), (903, 4.0), (511, 4.0)]
What output formats seem most useful, though? Probably a dataframe or sparse matrix? I'll try some preprocessing in C++ to optimize performance for generating those.
cc @janowicz
from pandana.
awesome!! thanks a ton for the quick work on this! I agree with @fscottfoti on the formats
from pandana.
i know you guys are busy with other parts of the package at the moment, but just wanted to drop by and see if there's anything i can do to test/otherwise help move this along :)
from pandana.
I'm excited for this feature, thanks @smmaurer!
from pandana.
awesome @smmaurer, thank you! I'll play with the range-queries branch, but i agree this is pretty much complete for what i'm after. A dataframe would be manageable, but i think a scipy.sparse matrix might be the ideal output for my use case (generating a spatial weights matrix)
from pandana.
sorry to bother folks, but just wanted to check back in and see if there's any hope to get this merged. If there's any work left to do, I'm happy to do the lifting
from pandana.
Related Issues (20)
- [Question] How to get different Key|Value combinations in osm.node_query
- [Feature Request] Aggreggation queries to subset of nodes
- Binary incompatibility error with older versions of NumPy HOT 1
- Shortest path length exists but shortest path is empty HOT 5
- shortest path distance between non connected nodes
- Allow specifying mapping distance for POIs
- Issue: the POI Accessibility computation is taking forever
- OSM loader - POIs that are ways in OSM HOT 1
- Pandana network initiation slow for large network HOT 4
- how to increase the performance of the aggregate when impedance is changed to time?
- net.shortest_paths doesn't work with multiprocessing
- tables sub-dependency versions conflict with python 3.10 compatibility when installing with pip HOT 1
- incorrect shortest paths solutions when impendance values are particularly small or large HOT 1
- Plotting with Pandana does not render title HOT 1
- osm.pdna_network_from_bbox error HOT 6
- Pulling network object from OSM bugged HOT 2
- New release ?
- Google Colab install pandana error HOT 1
- query the OSM API for the street network within the specified bounding box error HOT 1
- OSM Network Loader Error HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pandana.