pkrumins / xgoogle

Python library to Google services (google search, google sets, google translate, sponsored links)

Home Page: http://www.catonmat.net/blog/python-library-for-google-search/

Python 100.00%

xgoogle's Introduction

This is a Google library called 'xgoogle'. The current version is 1.3.

It's written by Peteris Krumins ([email protected]).
His blog is at http://www.catonmat.net  --  good coders code, great reuse.

The code is licensed under the MIT license.

--------------------------------------------------------------------------

At the moment it contains:
 * Google Search module xgoogle/search.py.
   http://www.catonmat.net/blog/python-library-for-google-search/

 * Google Sponsored Links Search module xgoogle/sponsoredlinks.py
   http://www.catonmat.net/blog/python-library-for-google-sponsored-links-search/

 * Google Sets module xgoogle/googlesets.py
   http://www.catonmat.net/blog/python-library-for-google-sets/

 * Google Translate module xgoogle/translate.py
   http://www.catonmat.net/blog/python-library-for-google-translate/

--------------------------------------------------------------------------

Here is an example of using the Google Search module:

    >>> from xgoogle.search import GoogleSearch
    >>> gs = GoogleSearch("catonmat")
    >>> gs.results_per_page = 25
    >>> results = gs.get_results()
    >>> for res in results:
    ...   print res.title.encode('utf8')
    ... 

    output:

    good coders code, great reuse
    MIT's Introduction to Algorithms, Lectures 1 and 2: Analysis of ...
    catonmat - Google Code
    ...

The GoogleSearch object has several public methods and properties:

    method get_results() - gets a page of results, returning a list of SearchResult objects.
    property num_results - returns number of search results found.
    property results_per_page - sets/gets the number of results to get per page.
    property page - sets/gets the search page.

A SearchResult object has three attributes -- "title", "desc", and "url".
They are Unicode strings, so encode them properly before outputting them.
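The "page" and "get_results()" members described above can be combined to walk
through several result pages. The following is only a sketch and is not part of
xgoogle: "collect_results" is a hypothetical helper, it assumes that pages are
numbered from 0 and that an empty list means there are no more results, and
"FakeSearch" is a stand-in used so the snippet runs without network access.

```python
def collect_results(searcher, max_pages=3):
    """Gather results from up to max_pages pages of a search object.

    Assumes (not confirmed by the README) that pages are numbered from 0
    and that get_results() returns an empty list when nothing is left.
    """
    collected = []
    for page_number in range(max_pages):
        searcher.page = page_number      # documented property: sets the search page
        batch = searcher.get_results()   # documented method: one page of results
        if not batch:
            break
        collected.extend(batch)
    return collected


class FakeSearch(object):
    """Hypothetical stand-in for GoogleSearch; serves canned pages offline."""

    def __init__(self, pages):
        self._pages = pages
        self.page = 0

    def get_results(self):
        if self.page < len(self._pages):
            return self._pages[self.page]
        return []


fake = FakeSearch([[u"first", u"second"], [u"third"]])
all_results = collect_results(fake, max_pages=5)
```

With a real GoogleSearch object in place of "fake", the same helper should
apply as long as the two assumptions above hold.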

--------------------------------------------------------------------------

Here is an example of using the Google Sponsored Links Search module:

    >>> from xgoogle.sponsoredlinks import SponsoredLinks, SLError
    >>> sl = SponsoredLinks("video software")
    >>> sl.results_per_page = 100
    >>> results = sl.get_results()
    >>> for result in results:
    ...   print result.title.encode('utf8')
    ...

    output:

    Photoshop Video Software
    Video Poker Software
    DVD/Video Rental Software
    ...

The SponsoredLinks object has several public methods and properties:

    method get_results() - gets a page of results, returning a list of SponsoredLink objects.
    property num_results - returns number of search results found.
    property results_per_page - sets/gets the number of results to get per page.

A SponsoredLink object has four attributes -- "title", "desc", "url", and "display_url".
They are Unicode strings, so don't forget to encode them properly before outputting them.
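As an illustration of the encoding step, the sketch below joins the four
attributes into one line and encodes it as UTF-8. "FakeLink" and "format_link"
are hypothetical names introduced here, not part of xgoogle; real objects would
come from SponsoredLinks.get_results().

```python
from collections import namedtuple

# Hypothetical stand-in with the four attributes the README lists for
# a SponsoredLink; it exists only so the example runs offline.
FakeLink = namedtuple("FakeLink", ["title", "desc", "url", "display_url"])


def format_link(link, encoding="utf8"):
    """Join the four attributes into one line and encode it to bytes."""
    fields = [link.title, link.desc, link.url, link.display_url]
    return u" | ".join(fields).encode(encoding)


link = FakeLink(u"Vid\u00e9o Software", u"Edit video",
                u"http://example.com/", u"example.com")
line = format_link(link)
```

Encoding once, at the point of output, keeps the rest of the code working with
Unicode strings.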

--------------------------------------------------------------------------

Here is an example of using the Google Sets module:

    >>> from xgoogle.googlesets import GoogleSets
    >>> gs = GoogleSets(['red', 'yellow'])
    >>> results = gs.get_results()
    >>> print len(results)
    >>> for r in results:
    ...   print r.encode('utf8')
    ... 

    output:

    red
    yellow
    blue
    white
    ...

The GoogleSets object has only one public method, get_results(set_type). The default
value for set_type is SMALL_SET, which makes it return 15 related items or fewer.
Use LARGE_SET to get more than 15 items. The get_results() method returns a list of
related items represented as Unicode strings.
Don't forget to encode these strings properly when outputting them!

Here is an example showing differences between SMALL_SET and LARGE_SET:

    >>> from xgoogle.googlesets import GoogleSets, LARGE_SET, SMALL_SET
    >>> gs = GoogleSets(['python', 'perl'])
    >>> results_small = gs.get_results() # SMALL_SET by default
    >>> len(results_small)
    11
    >>> results_small
    [u'python', u'perl', u'php', u'ruby', u'java', u'javascript', u'c++', u'c',
     u'cgi', u'tcl', u'c#']
    >>>
    >>> results_large = gs.get_results(LARGE_SET)
    >>> len(results_large)
    46
    >>> results_large
    [u'perl', u'python', u'java', u'c++', u'php', u'c', u'c#', u'javascript',
     u'howto', u'wiki', u'raid', u'dd', u'linux', u'ruby', u'language', u'xml',
     u'sgml', u'svn', u'kernel', ...]
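
To see what LARGE_SET adds, the two lists can be compared while keeping their
order. This is a small sketch: "items_only_in" is a hypothetical helper, and
"results_large_shown" holds only the part of the large set printed above (the
rest is elided there).

```python
def items_only_in(larger, smaller):
    """Return items of `larger` that are absent from `smaller`, keeping order."""
    seen = set(smaller)
    return [item for item in larger if item not in seen]


results_small = [u'python', u'perl', u'php', u'ruby', u'java', u'javascript',
                 u'c++', u'c', u'cgi', u'tcl', u'c#']
# Only the items visible in the README output; the full list has 46 entries.
results_large_shown = [u'perl', u'python', u'java', u'c++', u'php', u'c', u'c#',
                       u'javascript', u'howto', u'wiki', u'raid', u'dd', u'linux',
                       u'ruby', u'language', u'xml', u'sgml', u'svn', u'kernel']

extra = items_only_in(results_large_shown, results_small)
```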


--------------------------------------------------------------------------

Here is an example of using the Google Translate module:

    >>> from xgoogle.translate import Translator
    >>>
    >>> translate = Translator().translate
    >>> print translate("Mani sauc Pēteris", lang_to="ru").encode('utf-8')
    Меня зовут Петр
    >>> print translate("Mani sauc Pēteris", lang_to="en")
    My name is Peter
    >>> print translate("Меня зовут Петр")
    My name is Peter

The "translate" function takes three arguments -- "message", "lang_from" and "lang_to".
If "lang_from" is not given, Google's translation service auto-detects the language.
If "lang_to" is not given, it defaults to "en" (English).

If an error occurs, the "translate" function raises a "TranslationError" exception.
Make sure to wrap your code in a try/except block to catch it:

    >>> from xgoogle.translate import Translator, TranslationError
    >>>
    >>> try:
    ...   translate = Translator().translate
    ...   print translate("")
    ... except TranslationError, e:
    ...   print e
    ...

    Failed translating: invalid text 
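
When a failed translation should not stop the program, the try/except can be
wrapped in a helper that returns a fallback value. This is a sketch under
assumptions: "translate_or_default" is a hypothetical helper, and
"fake_translate" and "FakeTranslationError" are stand-ins for
Translator().translate and TranslationError so the snippet runs without xgoogle.

```python
class FakeTranslationError(Exception):
    """Hypothetical stand-in for xgoogle.translate.TranslationError."""


def fake_translate(message, lang_from="", lang_to="en"):
    """Hypothetical stand-in for Translator().translate; rejects empty text."""
    if not message:
        raise FakeTranslationError("Failed translating: invalid text")
    return u"[%s] %s" % (lang_to, message)


def translate_or_default(translate, error_type, message, default=None, **kwargs):
    """Call translate(message, **kwargs); return default if error_type is raised."""
    try:
        return translate(message, **kwargs)
    except error_type:
        return default


ok = translate_or_default(fake_translate, FakeTranslationError,
                          u"hello", lang_to="ru")
failed = translate_or_default(fake_translate, FakeTranslationError,
                              u"", default=u"")
```

With the real library, the first two arguments would be Translator().translate
and TranslationError.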


The Google Translate module also provides a "LanguageDetector" class that can be used
to detect the language of a text.

Here is an example of using LanguageDetector:

    >>> from xgoogle.translate import LanguageDetector, DetectionError
    >>>
    >>> detect = LanguageDetector().detect
    >>> english = detect("This is a wonderful library.")
    >>> english.lang_code
    'en'
    >>> english.lang
    'English'
    >>> english.confidence
    0.28078437000000001
    >>> english.is_reliable
    True

A "DetectionError" is raised if the detection fails.
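
The attributes shown above can be combined into a simple acceptance check. The
sketch below is not part of xgoogle: "Detection" is a hypothetical stand-in
carrying the four attributes from the example, and "reliable_lang_code" with its
"min_confidence" parameter is an illustrative helper.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by LanguageDetector().detect.
Detection = namedtuple("Detection",
                       ["lang_code", "lang", "confidence", "is_reliable"])


def reliable_lang_code(detection, min_confidence=0.0):
    """Return the language code only if the detection is flagged reliable
    and its confidence reaches min_confidence; otherwise return None."""
    if detection.is_reliable and detection.confidence >= min_confidence:
        return detection.lang_code
    return None


# Values copied from the README example above.
english = Detection("en", "English", 0.28078437, True)
code = reliable_lang_code(english)
strict = reliable_lang_code(english, min_confidence=0.5)
```

Note that the example detection is flagged reliable with a confidence of only
about 0.28, so a high threshold would reject it.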


--------------------------------------------------------------------------


Version history:

v1.0:  * initial release, xgoogle library contains just the Google Search.
v1.1:  * added Google Sponsored Links Search.
       * fixed a bug in browser.py that might have thrown an unexpected exception.
v1.2:  * added Google Sets module
v1.3:  * added Google Translate module
       * fixed a bug in browser.py when KeyboardInterrupt did not get propagated.

--------------------------------------------------------------------------

That's it. Have fun! :)


Sincerely,
Peteris Krumins
http://www.catonmat.net

xgoogle's People

Contributors

pkrumins, ssteinerx

xgoogle's Issues

GoogleSearch does not work with a site: filter

This code returns 0 results. Why?

    from xgoogle.search import GoogleSearch

    textToSearch = "android site:twitter.com"
    gs = GoogleSearch(textToSearch)
    gs.results_per_page = 100
    gs.page = 0
    results = gs.get_results()
    for res in results:
        url = res.url.encode('utf8')
        print("url:", url)

Patch to support multiple google sites

Found in a comment in the blog: http://www.catonmat.net/c/3011


Simple hack to use more than one google...

In the tool I am developing, I needed to be able to search different Google sites, so I came up with this simple hack:

    class GoogleSearch(object):
        ....
        def __init__(self, query, tld, random_agent=False, debug=False, lang="en", re_search_strings=None):
            self.query = query
            self._tld = tld

You can then specify a different Google site by typing

    GoogleSearch("keywords", tld="co.uk")
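
How a "tld" value could end up in the request URL is sketched below. This is
not xgoogle's actual code: "build_search_url" and its "scheme" parameter are
hypothetical, and the query parameters simply mirror the URL that appears in the
traceback of the "Doesn't support https" issue further down.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2


def build_search_url(query, tld="com", lang="en", scheme="https"):
    """Build a search URL for www.google.<tld> (illustrative only)."""
    # A list of pairs keeps the parameter order fixed.
    params = [("hl", lang), ("q", query), ("btnG", "Google Search")]
    return "%s://www.google.%s/search?%s" % (scheme, tld, urlencode(params))


url = build_search_url("about", tld="co.uk")
```

Keeping the scheme as a parameter would also make it possible to switch between
HTTP and HTTPS in one place.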


Does it work with Python 3.x?

I was just wondering if it works on Py 3.x (as of 22-07-14). I'm new to Github and I'm sorry if this is the wrong place to ask this question. If this isn't the right place, please point me in the correct direction!

xgoogle is not installable from pypi

Currently it is impossible to install xgoogle using pip or easy_install because it does not have a proper release registered or any download links.

Div with number of results not found

Search no longer works (tried with the "quick and dirty" example search with the debug flag):
_extract_info() no longer finds a div 'ssb' element on the search page.
This may be due to changes or a redesign in Google's result pages.

Doesn't support https

When I tried the example in the package, an error occurred:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
        execfile(filename, namespace)
      File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
        exec(compile(scripttext, filename, 'exec'), glob, loc)
      File "D:/google_test.py", line 19, in <module>
        results = gs.get_results()
      File "C:\Anaconda\lib\site-packages\xgoogle-1.3-py2.7.egg\xgoogle\search.py", line 150, in get_results
        page = self._get_results_page()
      File "C:\Anaconda\lib\site-packages\xgoogle-1.3-py2.7.egg\xgoogle\search.py", line 207, in _get_results_page
        raise SearchError, "Failed getting %s: %s" % (e.url, e.error)
    xgoogle.search.SearchError: Failed getting http://www.google.com/search?hl=en&q=about&btnG=Google+Search: [Errno 10054]

The last line shows that the failed request went to an http:// URL. It seems Google now only supports HTTPS access, but xgoogle is still using HTTP.

Error with Google Translate API

xgoogle fails with a "Suspected Terms of Service Abuse" error:

    File "./cool.py", line 9, in <module>
      print pars.unescape(translate("ayuda", lang_to="en"))
    File "/usr/local/lib/python2.7/dist-packages/xgoogle-1.3-py2.7.egg/xgoogle/translate.py", line 49, in translate
      raise TranslationError, "Failed translating: %s" % data['responseDetails']
    xgoogle.translate.TranslationError: Failed translating: Suspected Terms of Service Abuse. Please see http://code.google.com/apis/errors

This happens when trying to run this:

    #!/usr/bin/python

    import sys
    import HTMLParser
    from xgoogle.translate import Translator

    pars = HTMLParser.HTMLParser()
    translate = Translator().translate
    print pars.unescape(translate("traducir esto", lang_to="en"))
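
The unescaping step in the script above depends on the Python 2 "HTMLParser"
module. As a side note, and only as a sketch, the same step can be written so
that it runs on both Python 2 and Python 3; it does not address the Terms of
Service error itself, and the sample string is made up for illustration.

```python
try:
    from html import unescape            # Python 3.4 and later
except ImportError:
    from HTMLParser import HTMLParser    # Python 2
    unescape = HTMLParser().unescape

# Sample input chosen for illustration; a translated string would go here.
text = unescape(u"Tom &amp; Jerry &quot;hello&quot;")
```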
