Comments (3)
Please note that the error is raised just at import:
import pyarrow.parquet as pq
Therefore it must be caused by some problem with your pyarrow installation. I would recommend you uninstall and install pyarrow again.
It also seems that you installed pyarrow with conda. Please note that pyarrow offers three different packages on conda-forge: https://arrow.apache.org/docs/python/install.html#using-conda
conda install -c conda-forge pyarrow
While the pyarrow conda-forge package is the right choice for most users, both a minimal and maximal variant of the package exist, either of which may be better for your use case. See Differences between conda-forge packages.
Please make sure you install the right one: I guess it is either pyarrow or pyarrow-all.
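Before reinstalling, it may help to confirm which pyarrow files Python actually picks up. The following is only a minimal sketch: `describe_module` is a hypothetical helper (not part of pyarrow or datasets) that tries to import a module and reports its version and file path, or the error it raised.

```python
import importlib


def describe_module(name):
    """Try to import `name`; report its version and file path, or the captured error."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # a broken install may raise more than ImportError
        return {"name": name, "ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return {
        "name": name,
        "ok": True,
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    # Module names taken from the import chain in the reported traceback.
    for name in ("pyarrow", "pyarrow.lib", "pyarrow.parquet"):
        print(describe_module(name))
```

If `pyarrow` imports fine but `pyarrow.parquet` fails, or the reported paths point to different environments, that would be consistent with a partial or mixed installation.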
I have the same issue. Please downgrade to pyarrow==15.0.2; it seems the datasets library needs to be fixed.
It is not a problem with the datasets library: we support the latest version of pyarrow, and our Continuous Integration tests are using pyarrow 16.1.0 without any problem.
The error reported here is raised when importing pyarrow.parquet:
---> 29 import pyarrow.parquet as pq
File /opt/conda/lib/python3.10/site-packages/pyarrow/parquet/__init__.py:20
1 # Licensed to the Apache Software Foundation (ASF) under one
2 # or more contributor license agreements. See the NOTICE file
3 # distributed with this work for additional information
(...)
17
18 # flake8: noqa
---> 20 from .core import *
File /opt/conda/lib/python3.10/site-packages/pyarrow/parquet/core.py:33
30 import pyarrow as pa
32 try:
---> 33 import pyarrow._parquet as _parquet
34 except ImportError as exc:
35 raise ImportError(
36 "The pyarrow installation is not built with support "
37 f"for the Parquet file format ({str(exc)})"
38 ) from None
File /opt/conda/lib/python3.10/site-packages/pyarrow/_parquet.pyx:1, in init pyarrow._parquet()
AttributeError: module 'pyarrow.lib' has no attribute 'ListViewType'
This can only be explained if pyarrow was not properly installed.
If the user just installed pyarrow-core from conda-forge, then its parquet subpackage is not installed and cannot be imported. You can check the pyarrow docs:
- Differences between conda-forge packages: https://arrow.apache.org/docs/python/install.html#python-conda-differences
The pyarrow-core package includes the following functionality:
...
The pyarrow package adds the following:
...
Parquet (i.e., pyarrow.parquet)
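One way to test the pyarrow-core hypothesis is to ask Python whether the Parquet extension module can be located at all. This is a rough sketch, assuming the Parquet component shows up as the `pyarrow._parquet` module named in the traceback above; `has_submodule` is a hypothetical helper, not a pyarrow API.

```python
import importlib.util


def has_submodule(package, submodule):
    """Return True if `package.submodule` can be located.

    The parent package is imported, but the submodule itself is not executed.
    """
    try:
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except Exception:
        # The parent package is missing or fails to import.
        return False


if __name__ == "__main__":
    print("pyarrow._parquet found:", has_submodule("pyarrow", "_parquet"))
```

A `False` result would suggest that the installed variant does not ship Parquet support. A `True` result combined with the `AttributeError` above would instead point to mismatched files within the installation.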