Comments (3)
Right now we don't handle documents with varying structure: each document must contain every field you list in your dtype. If some of your documents have a "geo" subdocument and some don't, pass this filter to "find" to select only the ones that do:
find({"geo": {"$exists": True}})
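The same idea extends to several fields at once. The following is a minimal sketch, not part of BSON-NumPy: `exists_filter` is a hypothetical helper, and the field names are placeholders for whatever your dtype lists.

```python
def exists_filter(field_names):
    """Build a MongoDB filter matching documents that contain every named field."""
    return {name: {"$exists": True} for name in field_names}


# Placeholder field names; use the ones from your own dtype.
query = exists_filter(["geo", "name"])
print(query)
```

The resulting dictionary can be passed to PyMongo as `collection.find(query)`, so only documents carrying every listed field reach the conversion step.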
But there might be other scenarios in which you want all the documents, but some documents are missing some fields. I'm curious, what behavior should BSON-NumPy have for this scenario? Should it return zeros for the missing values, or return a masked ndarray?
from bson-numpy.
Thanks for the tip. I will split up my queries for now.
Just read through the above link and I like the idea of returning masked ndarrays for missing fields.
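To make the two options concrete, here is a small sketch using plain NumPy rather than BSON-NumPy itself. The documents and the `lat` field are invented for illustration; the point is only how zero-filling and masking differ once you compute on the result.

```python
import numpy as np

# Invented sample documents: the second one has no "geo" subdocument.
docs = [
    {"_id": 1, "geo": {"lat": 40.7, "lon": -74.0}},
    {"_id": 2},
    {"_id": 3, "geo": {"lat": 51.5, "lon": -0.1}},
]

# Option 1: substitute zero for every missing value.
lat_values = [d.get("geo", {}).get("lat", 0.0) for d in docs]
zero_filled = np.array(lat_values, dtype=np.float64)

# Option 2: keep the same data but mark missing entries as masked.
missing = ["geo" not in d for d in docs]
masked = np.ma.masked_array(zero_filled, mask=missing)

print(zero_filled.mean())  # the zero counts as a real observation
print(masked.mean())       # masked entries are skipped
print(masked.count())      # number of unmasked values
```

With zeros, the missing document drags the mean down as if it were a real coordinate; with a masked array, reductions such as `mean` ignore it and the mask records which rows were incomplete.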
Hi @alysivji. I am closing this as we have discontinued development of BSON-NumPy. PyMongoArrow is now the recommended way to materialize MongoDB query results as NumPy ndarrays as well as tabular formats like pandas DataFrames and PyArrow Tables.
With PyMongoArrow you are no longer required to split your queries because of differences in document structure. Given a specified schema, PyMongoArrow will attempt to load all documents from the query result set and will replace missing fields, or fields containing a value of an unexpected type, with a Null/NaN that your application logic can handle. See the documentation for an example: https://mongo-arrow.readthedocs.io/en/pymongoarrow-0.1.1/quickstart.html.
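As a rough sketch of what that looks like in practice, the function below wraps PyMongoArrow's `Schema` and `find_numpy_all`. The field names `lat` and `lon` and the helper `load_as_numpy` are assumptions made for illustration, and the code needs a live PyMongo collection to actually run a query.

```python
# Hypothetical top-level field names; adjust them to match your collection.
SCHEMA_FIELDS = {"lat": float, "lon": float}


def load_as_numpy(collection, query=None):
    """Load matching documents into NumPy arrays, one array per schema field."""
    # Imported lazily so this sketch can be loaded without PyMongoArrow installed.
    from pymongoarrow.api import Schema, find_numpy_all

    schema = Schema(SCHEMA_FIELDS)
    # An empty filter selects every document, including those missing some fields.
    return find_numpy_all(collection, query or {}, schema=schema)
```

Called as `load_as_numpy(client.mydb.mycoll)` with your own client and collection names, this should return the requested fields without first filtering on `$exists`; documents lacking a field are expected to show up as NaN in the corresponding array, as described in the comment above.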