Comments (6)
Hi, I got the same error and solved it by updating httplib2 to the latest version. Regarding the requirements, I also updated to tensorflow==1.15.0, since version 1.14.0 gives me the following error: "No module named deprecation_wrapper".
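A quick way to confirm which versions are actually installed before re-running the pipeline is sketched below. This is only an illustration: the helper names (version_tuple, meets_minimum, installed_version) are made up for this example, and it assumes both packages expose a __version__ attribute. The 1.15.0 minimum is the one mentioned above. No minimum is checked for httplib2 because the thread only says "latest".

```python
import re


def version_tuple(version, width=3):
    """Turn a version string such as '1.15.0' into a tuple such as (1, 15, 0).

    Only the first `width` numeric fields are kept; missing fields become 0.
    """
    numbers = [int(n) for n in re.findall(r"\d+", version)[:width]]
    return tuple(numbers + [0] * (width - len(numbers)))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = __import__(module_name)
    except ImportError:
        # Also covers a broken install that fails during import.
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    tf_version = installed_version("tensorflow")
    if tf_version is None:
        print("tensorflow could not be imported")
    else:
        # 1.15.0 is the version reported to work in this thread.
        print("tensorflow", tf_version, "ok:", meets_minimum(tf_version, "1.15.0"))
    print("httplib2", installed_version("httplib2"))
```

If tensorflow reports a version below 1.15.0, or fails to import at all, reinstalling from the updated requirements is the first thing to try.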
Hmmm, did you change anything in the requirements.txt file other than updating httplib2 to the newest version and tensorflow to 1.15.0? I did both of those things but am now getting a "No module named module_wrapper" error :(
from conversational-datasets.
Fixed that by installing the necessary dependencies through requirements.txt.
This is the error I get now:
user@cloudshell:~/conversational-datasets (reddit-data-288210)$ python reddit/create_data.py \
> --output_dir ${DATADIR?} \
> --reddit_table ${PROJECT?}:${DATASET?}.${TABLE?} \
> --runner DataflowRunner \
> --temp_location ${DATADIR?}/temp \
> --staging_location ${DATADIR?}/staging \
> --project ${PROJECT?} \
> --dataset_format JSON
********************************************************************************
Python 2 is deprecated. Upgrade to Python 3 as soon as possible.
See https://cloud.google.com/python/docs/python2-sunset
To suppress this warning, create an empty ~/.cloudshell/no-python-warning file.
The command will automatically proceed in seconds or on any key.
********************************************************************************
WARNING: Logging before flag parsing goes to stderr.
I0902 11:45:25.874641 140704769283904 apiclient.py:464] Starting GCS upload to gs://reddit-data-bucket/reddit/20200902/staging/beamapp-user-0902114525-652460.1599047125.652742/pipeline.pb...
I0902 11:45:25.880270 140704769283904 transport.py:157] Attempting refresh to obtain initial access_token
Traceback (most recent call last):
  File "reddit/create_data.py", line 347, in <module>
    run()
  File "reddit/create_data.py", line 341, in run
    result = p.run()
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 390, in run
    self.to_runner_api(), self.runner, self._options).run(False)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 403, in run
    return self.runner.run_pipeline(self)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 364, in run_pipeline
    self.dataflow_client.create_job(self.job), self)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/utils/retry.py", line 180, in wrapper
    return fun(*args, **kwargs)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 485, in create_job
    self.create_job_description(job)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 511, in create_job_description
    StringIO(job.proto_pipeline.SerializeToString()))
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 467, in stage_file
    response = self._storage_client.objects.Insert(request, upload=upload)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 971, in Insert
    download=download)
  File "/home/user/.local/lib/python2.7/site-packages/apitools/base/py/base_api.py", line 720, in _RunMethod
    http, http_request, **opts)
  File "/home/user/.local/lib/python2.7/site-packages/apitools/base/py/http_wrapper.py", line 356, in MakeRequest
    max_retry_wait, total_wait_sec))
  File "/home/user/.local/lib/python2.7/site-packages/apitools/base/py/http_wrapper.py", line 304, in HandleExceptionsAndRebuildHttpConnections
    raise retry_args.exc
httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:727)
Regarding the region, you can add the --region flag to the command:
python reddit/create_data.py \
--output_dir ${DATADIR?} \
--reddit_table ${PROJECT?}:${DATASET?}.${TABLE?} \
--runner DataflowRunner \
--temp_location ${DATADIR?}/temp \
--staging_location ${DATADIR?}/staging \
--project ${PROJECT?} \
--dataset_format JSON \
--region us-east1
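For anyone launching the job from a script rather than typing it by hand, the same invocation can be assembled as an argument list, which avoids the line-continuation pitfalls of a multi-line shell command. This is a rough sketch only: build_create_data_command is a hypothetical helper, not part of the repository, and the values in the usage example are placeholders. The flags themselves are the ones from the command above.

```python
def build_create_data_command(datadir, project, dataset, table, region):
    """Build the argument list for reddit/create_data.py (hypothetical helper).

    Mirrors the shell command above, including the extra --region flag.
    """
    return [
        "python", "reddit/create_data.py",
        "--output_dir", datadir,
        "--reddit_table", "{}:{}.{}".format(project, dataset, table),
        "--runner", "DataflowRunner",
        "--temp_location", datadir + "/temp",
        "--staging_location", datadir + "/staging",
        "--project", project,
        "--dataset_format", "JSON",
        "--region", region,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own bucket, project, dataset and table.
    command = build_create_data_command(
        datadir="gs://my-bucket/reddit/20200902",
        project="my-project",
        dataset="my_dataset",
        table="my_table",
        region="us-east1",
    )
    print(" ".join(command))
```

The resulting list can be passed to subprocess.check_call(command) from the repository root to start the pipeline.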
I was wondering if there was an update to the "No module named module_wrapper" error. Thanks!
Regarding the region, you can add the flag 'region' in the command prompt.
python reddit/create_data.py \
--output_dir ${DATADIR?} \
--reddit_table ${PROJECT?}:${DATASET?}.${TABLE?} \
--runner DataflowRunner \
--temp_location ${DATADIR?}/temp \
--staging_location ${DATADIR?}/staging \
--project ${PROJECT?} \
--dataset_format JSON \
--region us-east1
Hi,
I tried your method, but it cannot sign in to Google, and apitools has been deprecated.
Is there another way to download the Reddit dataset? Thanks @AntoineSimoulin