Comments (6)

amorisot commented on May 20, 2024

Hi, I got the same error and solved it by updating httplib2 to the latest version. Regarding the requirements, I also updated to tensorflow==1.15.0, since version 1.14.0 gives me the following error: "No module named deprecation_wrapper".

Hmmm, did you change anything in the requirements.txt file other than updating httplib2 to the newest version and tensorflow to 1.15.0? I did both of those things but am now getting a "No module named module_wrapper" error :(

from conversational-datasets.

drunkinlove commented on May 20, 2024

Fixed that by installing the necessary dependencies through requirements.txt.
This is the error I get now:

user@cloudshell:~/conversational-datasets (reddit-data-288210)$ python reddit/create_data.py \
>   --output_dir ${DATADIR?} \
>   --reddit_table ${PROJECT?}:${DATASET?}.${TABLE?} \
>   --runner DataflowRunner \
>   --temp_location ${DATADIR?}/temp \
>   --staging_location ${DATADIR?}/staging \
>   --project ${PROJECT?} \
>   --dataset_format JSON
********************************************************************************
Python 2 is deprecated. Upgrade to Python 3 as soon as possible.
See https://cloud.google.com/python/docs/python2-sunset
To suppress this warning, create an empty ~/.cloudshell/no-python-warning file.
The command will automatically proceed in  seconds or on any key.
********************************************************************************
WARNING: Logging before flag parsing goes to stderr.
I0902 11:45:25.874641 140704769283904 apiclient.py:464] Starting GCS upload to gs://reddit-data-bucket/reddit/20200902/staging/beamapp-user-0902114525-652460.1599047125.652742/pipeline.pb...
I0902 11:45:25.880270 140704769283904 transport.py:157] Attempting refresh to obtain initial access_token
Traceback (most recent call last):
  File "reddit/create_data.py", line 347, in <module>
    run()
  File "reddit/create_data.py", line 341, in run
    result = p.run()
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 390, in run
    self.to_runner_api(), self.runner, self._options).run(False)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 403, in run
    return self.runner.run_pipeline(self)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 364, in run_pipeline
    self.dataflow_client.create_job(self.job), self)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/utils/retry.py", line 180, in wrapper
    return fun(*args, **kwargs)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 485, in create_job
    self.create_job_description(job)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 511, in create_job_description
    StringIO(job.proto_pipeline.SerializeToString()))
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 467, in stage_file
    response = self._storage_client.objects.Insert(request, upload=upload)
  File "/home/user/.local/lib/python2.7/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 971, in Insert
    download=download)
  File "/home/user/.local/lib/python2.7/site-packages/apitools/base/py/base_api.py", line 720, in _RunMethod
    http, http_request, **opts)
  File "/home/user/.local/lib/python2.7/site-packages/apitools/base/py/http_wrapper.py", line 356, in MakeRequest
    max_retry_wait, total_wait_sec))
  File "/home/user/.local/lib/python2.7/site-packages/apitools/base/py/http_wrapper.py", line 304, in HandleExceptionsAndRebuildHttpConnections
    raise retry_args.exc
httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:727)

AntoineSimoulin commented on May 20, 2024

Hi, I got the same error and solved it by updating httplib2 to the latest version. Regarding the requirements, I also updated to tensorflow==1.15.0, since version 1.14.0 gives me the following error: "No module named deprecation_wrapper".
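
To confirm that an environment actually picked up those two changes, a small check like the one below may help. It is only a sketch: the helper names (installed_version, version_tuple, matches_release) are made up for illustration, and the "1.15" target simply mirrors the tensorflow==1.15.0 pin mentioned above. No specific httplib2 version is checked because none is given here.

```python
import re


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Older interpreters (including Python 2.7) fall back to
        # setuptools' pkg_resources, if it is available.
        try:
            import pkg_resources
        except ImportError:
            return None
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def version_tuple(version_string):
    """Turn '1.15.0' into (1, 15, 0), ignoring suffixes such as 'rc1'."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def matches_release(installed, release):
    """True if the installed version belongs to the given release series."""
    if installed is None:
        return False
    wanted = version_tuple(release)
    return version_tuple(installed)[:len(wanted)] == wanted


if __name__ == "__main__":
    # The thread only says "latest" for httplib2, so just report what is installed.
    print("httplib2: {}".format(installed_version("httplib2")))

    tf_version = installed_version("tensorflow")
    print("tensorflow: {}".format(tf_version))
    if not matches_release(tf_version, "1.15"):
        print("tensorflow is not a 1.15.x release; the pin above was 1.15.0")
```

Running it with the same interpreter that launches reddit/create_data.py matters, since a user-level install under a different Python would not be seen.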

AntoineSimoulin commented on May 20, 2024

Regarding the region, you can add the --region flag to the command:

python reddit/create_data.py \
  --output_dir ${DATADIR?} \
  --reddit_table ${PROJECT?}:${DATASET?}.${TABLE?} \
  --runner DataflowRunner \
  --temp_location ${DATADIR?}/temp \
  --staging_location ${DATADIR?}/staging \
  --project ${PROJECT?} \
  --dataset_format JSON \
  --region us-east1
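
For anyone wondering why an extra flag can be appended like this: many Beam scripts parse only their own flags and hand everything else to the pipeline options. The sketch below shows that general pattern with argparse alone. It is an assumption that reddit/create_data.py works this way (this thread does not confirm it), and parse_args plus the bucket path are hypothetical.

```python
import argparse


def parse_args(argv):
    """Parse script-level flags and return the rest untouched."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_dir", required=True)
    parser.add_argument("--dataset_format", default="TF")
    # parse_known_args returns unrecognised flags instead of raising an error,
    # so runner options such as --region can be forwarded to the pipeline.
    return parser.parse_known_args(argv)


args, pipeline_args = parse_args([
    "--output_dir", "gs://example-bucket/reddit",  # hypothetical path
    "--dataset_format", "JSON",
    "--runner", "DataflowRunner",
    "--region", "us-east1",
])

print(args.dataset_format)
print(pipeline_args)
```

Here pipeline_args ends up holding the --runner and --region flags, which is the list a script would typically pass on to Beam's PipelineOptions.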

alu13 commented on May 20, 2024

I was wondering if there was an update to the "No module named module_wrapper" error. Thanks!

pygongnlp commented on May 20, 2024

Regarding the region, you can add the flag 'region' in the command prompt.

python reddit/create_data.py \
  --output_dir ${DATADIR?} \
  --reddit_table ${PROJECT?}:${DATASET?}.${TABLE?} \
  --runner DataflowRunner \
  --temp_location ${DATADIR?}/temp \
  --staging_location ${DATADIR?}/staging \
  --project ${PROJECT?} \
  --dataset_format JSON \
  --region us-east1

Hi,

I tried your method, but it cannot sign in to Google, and apitools has been deprecated.

Is there another way to download the Reddit dataset? Thanks @AntoineSimoulin
