nittolese / gquestions
Find "People Also Ask" questions
License: GNU General Public License v3.0
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//div[@class='kno-ftr']//div/following-sibling::a[text()='Enviar comentarios']"}
Using the script to fetch Spanish results gives me this error. On my results page the link's anchor text is "Comentarios" rather than "Enviar comentarios", so if I change line 103 of gquestions.py it works fine.
From:
el = browser.find_element_by_xpath("//div[@class='kno-ftr']//div/following-sibling::a[text()='Enviar comentarios']")
To:
el = browser.find_element_by_xpath("//div[@class='kno-ftr']//div/following-sibling::a[text()='Comentarios']")
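Since Google apparently shows either label depending on the page, one option is to accept both instead of swapping one for the other. The sketch below only builds the XPath string; the helper name feedback_xpath is hypothetical and not part of gquestions.py, and it assumes the labels contain no single quotes.

```python
def feedback_xpath(labels):
    """Build an XPath matching the footer link for any of the given anchor texts.

    Hypothetical helper, not part of gquestions.py. Labels must not contain
    single quotes, because they are embedded in a quoted XPath literal.
    """
    if not labels:
        raise ValueError("at least one label is required")
    # XPath 1.0 supports 'or' inside a predicate, so one query covers all variants.
    condition = " or ".join("text()='{}'".format(label) for label in labels)
    return "//div[@class='kno-ftr']//div/following-sibling::a[{}]".format(condition)


xpath = feedback_xpath(["Enviar comentarios", "Comentarios"])
print(xpath)
```

The resulting string can be passed to browser.find_element_by_xpath in place of the hard-coded selector on line 103.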
Bravo @nittolese, you read my mind! I was about to invest some time in doing exactly what you have done (and your version is certainly more Pythonic than what I could have written).
What would it take to add the URL behind each answer to the CSV? Would that go in crawlQuestions?
Congrats again!
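On the CSV side, an extra column is cheap to add once the crawler collects the URL. The thread does not show how crawlQuestions stores its results, so the sketch below assumes each result is a dict with hypothetical keys question, answer and url; the function name write_rows and the example URL are placeholders as well.

```python
import csv
import io


def write_rows(rows, fh):
    """Write question rows to a CSV file object, including a 'url' column.

    The row keys are assumptions for illustration; gquestions.py may store
    its results differently.
    """
    writer = csv.DictWriter(fh, fieldnames=["question", "answer", "url"])
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "question": row["question"],
            "answer": row.get("answer", ""),  # empty when nothing was scraped
            "url": row.get("url", ""),
        })


buffer = io.StringIO()
write_rows(
    [{"question": "How do you find the cheapest airfare?",
      "url": "https://example.com/fares"}],  # placeholder URL
    buffer,
)
print(buffer.getvalue())
```

Extracting the URL from the results page is the harder part and would still have to be done in the scraping code.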
I found this problem while using the module for language-agnostic searches in Spanish (I don't know if that is the right term for them): I got all the questions in English. This happens, for example, with names of things that are spelled the same in English and Spanish.
To select the language, the script uses the hl parameter, but that only sets the web interface language. The lr parameter should be used instead, since it sets the search language.
For reference: https://sites.google.com/site/tomihasa/google-language-codes
My problem now is that changing line 78 of gquestions.py
from:
browser.get("https://www.google.com?hl=es")
to:
browser.get("https://www.google.com?hl=es&lr=lang_es")
does give me search results in the correct language, but I see no questions at the end and the script fails.
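To keep the two parameters apart, the URL can be assembled from named arguments instead of a hard-coded string. This is a minimal sketch: google_url is a hypothetical helper, not something gquestions.py provides, and it does not address the missing questions described above.

```python
from urllib.parse import urlencode


def google_url(interface_lang, search_lang=None, base="https://www.google.com/"):
    """Build a Google URL with hl (interface language) and optional lr (search language).

    Hypothetical helper for illustration only.
    """
    params = {"hl": interface_lang}
    if search_lang:
        # lr values use the "lang_xx" form, as in the URL from the report.
        params["lr"] = "lang_{}".format(search_lang)
    return "{}?{}".format(base, urlencode(params))


print(google_url("es"))        # interface language only
print(google_url("es", "es"))  # interface and search language
```

The result would be passed to browser.get in place of the literal URL on line 78.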
Just a clarification on your terminology:
What do you mean by "headless"?
I tried to add Turkish language support but couldn't get it to work.
I added:
elif lang == "tr":
    el = browser.find_element_by_xpath(
        "//div[@class='kno-ftr']//div/following-sibling::a[text()='Geri bildirim']")
and:
if args['en']:
    lang = "en"
elif args['es']:
    lang = "es"
elif args['tr']:
    lang = "tr"
Both sections are where they should be and the XPath is correct, but it still doesn't work. Is there a guide on how to add new languages?
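One possible cause, not confirmed in this thread: the args dictionary comes from docopt, which derives its keys from the usage pattern. If the pattern still reads (en|es), there is no 'tr' key and args['tr'] raises a KeyError. A table-driven lookup makes that failure explicit. The names SUPPORTED_LANGS and pick_language below are hypothetical, and a plain dict stands in for the docopt result.

```python
SUPPORTED_LANGS = ("en", "es", "tr")  # must mirror the (en|es|tr) group in the usage string


def pick_language(args, supported=SUPPORTED_LANGS):
    """Return the first language code whose flag is set in the parsed arguments.

    'args' is expected to look like the dict docopt returns; .get() avoids a
    KeyError when the usage pattern does not define a language.
    """
    for code in supported:
        if args.get(code):
            return code
    raise ValueError("no supported language selected; check the usage pattern")


# Stand-in for a docopt result after adding 'tr' to the usage pattern.
args = {"query": True, "<keyword>": "flights", "en": False, "es": False, "tr": True}
print(pick_language(args))
```

With this approach, adding a language means extending the usage pattern, the tuple, and the matching XPath label together.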
I am getting the following error:
Daniels-MBP-4:gquestions daniel$ python3.6 gquestions.py query "flights" en
{'--csv': False,
'--headless': False,
'--help': False,
'<depth>': None,
'<keyword>': 'flights',
'depth': False,
'en': True,
'es': False,
'query': True}
Traceback (most recent call last):
  File "gquestions.py", line 320, in <module>
    browser = initBrowser()
  File "gquestions.py", line 69, in initBrowser
    return webdriver.Chrome(options=chrome_options,executable_path=chrome_path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
    self.service.start()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 76, in start
    stdin=PIPE)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: 'driver/chromedriver'
Any thoughts on what is going on?
Just for completeness:
Daniels-MBP-4:gquestions daniel$ chromedriver --version
ChromeDriver 75.0.3770.90 (a6dcaf7e3ec6f70a194cc25e8149475c6590e025-refs/branch-heads/3770@{#1003})
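"Exec format error" usually means the operating system cannot run the file as a native binary. The traceback points at the bundled driver/chromedriver rather than the chromedriver found on the PATH, so a plausible (unconfirmed) explanation is that the bundled file was built for another platform. A quick way to check is to read the file's magic bytes. The helper binary_format is hypothetical and only recognises a few common formats.

```python
# First bytes of common executable formats.
MAGIC = {
    b"\x7fELF": "ELF (Linux)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"\xca\xfe\xba\xbe": "Mach-O universal (macOS)",  # Java class files share this magic
    b"MZ": "PE (Windows)",
}


def binary_format(path):
    """Guess the executable format of a file from its leading bytes."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return "unknown"
```

Calling binary_format("driver/chromedriver") on a Mac should report a Mach-O format; if it reports ELF or PE, replacing the bundled file with the macOS build of ChromeDriver is the likely fix.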
Looks like the maximum depth one can set is 1:
https://github.com/nittolese/gquestions/blob/master/gquestions.py#L300-L305
What is the logic here?
I launched the script but got this error:
x535@x535-pc:~/Descargas/gquestions-master$ python gquestions.py
File "gquestions.py", line 2
SyntaxError: Non-ASCII character '\xe2' in file gquestions.py on line 3, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
So I decided to delete the non-ASCII characters at the start of the lines in the usage string:
usage='''
Gquestions CLI Usage
Usage:
gquestions.py query <keyword> (en|es) [depth <depth>] [--csv] [--headless]
gquestions.py (-h | --help)
Examples:
Options:
-h, --help
'''
Finally, when I tried to launch the script again, I got this error:
x535@x535-pc:~/Descargas/gquestions-master$ python gquestions.py
File "gquestions.py", line 106
logging.info(f"clicking on ... {el.text}")
^
SyntaxError: invalid syntax
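Both errors are what Python 2 produces: it rejects non-ASCII source without an encoding declaration, and it does not understand f-strings, which require Python 3.6 or later. On this machine python most likely points at Python 2. A version guard such as the sketch below turns the confusing SyntaxError into a clear message. The names MIN_VERSION and check_python are hypothetical, and the guard deliberately avoids f-strings so it stays parseable on old interpreters. It would have to live in a small launcher that imports the main module afterwards, since a SyntaxError in the same file is raised before any code runs.

```python
import sys

MIN_VERSION = (3, 6)  # f-strings were introduced in Python 3.6


def check_python(version_info=None, minimum=MIN_VERSION):
    """Raise a readable error when the interpreter is older than 'minimum'."""
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:2])
    if current < minimum:
        raise RuntimeError(
            "Python {}.{}+ is required, found {}.{}".format(
                minimum[0], minimum[1], current[0], current[1]
            )
        )
    return True


check_python()
```

Running the script with an explicit interpreter, for example python3 gquestions.py, avoids the problem without any code change.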
Example:
For the following scraped question:
How do you find the cheapest airfare?
I also need the answer paragraph:
To find the cheapest airfare, you can visit our site or download the app and enter your departure and arrival cities and find out the cheapest days to fly with our fare calendar. In compare to full service carriers, low cost airlines offer cheaper fares. By considering budget airlines, you can do great savings on airline tickets.