mysolr's Introduction

mysolr

Fast Python binding for Solr. See the full documentation for details.

Features

  • Full query syntax support
  • Facets support
  • Highlighting support
  • Spellchecker support
  • More like this support
  • Stats support
  • Concurrent searches
  • Python 3 compatible

Installation

From source code:

python setup.py install

From pypi:

pip install mysolr

Usage

from mysolr import Solr

# Default connection to localhost:8080
solr = Solr()

# All solr params are supported!
query = {'q' : '*:*', 'facet' : 'true', 'facet.field' : 'foo'}
response = solr.search(**query)

# do stuff with documents
for document in response.documents:
    # modify field 'foo'
    document['foo'] = 'bar'

# update index with modified documents
solr.update(response.documents, commit=True)

mysolr's People

Contributors

dcrosta, dohque, fertapric, fjavieralba, moliware, msabramo, rabad

mysolr's Issues

Python 2.6 support

We are very close to supporting it, but we still have to fix:

  • compat.py : the way that we're checking versions is not compatible with 2.6
  • ordereddict is needed in this version

Provide a utility function to convert datetimes into solr-friendly format?

I'm guessing we can't have auto-handling of datetimes like in pysolr without adding some cruft and a performance hit.
Would it be possible to have a utils module that provides some helper functions for convenience, e.g. converting a datetime to a UTC string?
A simple str(datetimeObj) doesn't work because it's not in the right format for Solr. Converting to a UTC string isn't as straightforward as it should be in Python.
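
A helper of the kind requested here could look like the sketch below. The name to_solr_datetime is hypothetical and not part of mysolr. It assumes Python 3, treats naive datetimes as already being in UTC, and emits the same form as the created_at value ('2012-03-26T17:47:08Z') that appears in another issue on this page.

```python
from datetime import datetime, timezone


def to_solr_datetime(value):
    """Format a datetime as a UTC string such as 2012-03-26T17:47:08Z.

    Hypothetical helper, not part of mysolr. Naive datetimes are
    assumed to already be in UTC; aware ones are converted to UTC.
    """
    if value.tzinfo is not None:
        value = value.astimezone(timezone.utc)
    return value.strftime('%Y-%m-%dT%H:%M:%SZ')


# Usage: a naive datetime is taken as UTC.
example = to_solr_datetime(datetime(2012, 3, 26, 17, 47, 8))
```

Sub-second precision is dropped in this sketch; whether that is acceptable depends on the field definitions in the schema.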

Dependencies Documentation

I think we should document the gevent dependency when using async search.

In fact, it's a requests dependency, but if you have not installed gevent, mysolr.async_search will not work.

Deployment error with MySolr

When I try to deploy my server with mysolr I get this error: TypeError: 'NoneType' object has no attribute '__getitem__'. It seems that either system_info.raw_content or system_info.raw_content['lucene'] is None.

Full Log Message
Traceback (most recent call last):                                                                   
File "./app/__init__.py", line 70, in <module>                                                     
from app.search.views import search                                                              
File "./app/search/views.py", line 8, in <module>                                                  
solr_search = Search()                                                                           
File "./app/search/search.py", line 81, in __init__                                                
self.solr = Solr(SOLR_URL)                                                                       
File "/xxx/lib/python2.7/site-packages/mysolr/mysolr.py", line 45, in __init__ 
self.version = self.get_version()                                                                
File "/xxx/lib/python2.7/site-packages/mysolr/mysolr.py", line 225, in get_version                                                                                                  
version = system_info.raw_content['lucene']['solr-spec-version']                                 
TypeError: 'NoneType' object has no attribute '__getitem__'                                          
unable to load app 0 (mountpoint='') (callable not found or import error)                            
*** no app loaded. going in full dynamic mode ***
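
One way to make the failing lookup more forgiving is to check each level before indexing into it. The sketch below is an assumption about how that could look, not mysolr's actual code: extract_solr_version is a hypothetical name, and its argument stands for the decoded system-info payload, which may be None when the body could not be parsed.

```python
def extract_solr_version(raw_content):
    """Return the 'solr-spec-version' string, or None if it is missing.

    Hypothetical helper: raw_content stands for the decoded system-info
    response, which may be None when the body could not be parsed.
    """
    if not isinstance(raw_content, dict):
        return None
    lucene = raw_content.get('lucene')
    if not isinstance(lucene, dict):
        return None
    return lucene.get('solr-spec-version')
```

A caller could then decide whether a missing version is fatal instead of failing at import time.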

Problem with facet.pivot in requests

Hi,

I'd like to add "facet.pivot" to the request. I can add it to a query just fine, but mysolr fails when parsing the response.

I get the following error:

ValueError
Exception Value: need more than 1 value to unpack
Exception Location: Python34/lib/_collections_abc.py in update, line 583

This seems related to:
mysolr/response.py in parse_facets

Any suggestions on getting the facet.pivot response to work?

Thanks!
-Eric
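
As a possible starting point, pivot results can be walked recursively rather than treated as flat value/count pairs. The sketch below assumes the nested shape Solr uses for pivot entries in its JSON output (dicts with 'field', 'value', 'count' and an optional 'pivot' list). The function name flatten_pivot and the sample data are made up for illustration and are not part of mysolr.

```python
def flatten_pivot(entries, path=()):
    """Yield (path, count) pairs from a list of pivot entries.

    Assumes each entry is a dict with 'field', 'value', 'count' and an
    optional nested 'pivot' list, as in Solr's JSON output.
    """
    for entry in entries:
        current = path + ((entry['field'], entry['value']),)
        yield current, entry['count']
        # Recurse into deeper pivot levels when present.
        for item in flatten_pivot(entry.get('pivot', []), current):
            yield item


# Made-up sample data in the assumed shape.
sample = [
    {'field': 'cat', 'value': 'electronics', 'count': 3,
     'pivot': [{'field': 'inStock', 'value': True, 'count': 2},
               {'field': 'inStock', 'value': False, 'count': 1}]},
]
rows = list(flatten_pivot(sample))
```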

Release new version to pypi

It looks like it's been a while since the last release, and there are some useful changes (especially #30) that are in the repo but not in 0.8. Would you please consider creating a new release with these updates and posting it to PyPI? Thanks.

Support solr 1.4

To support Solr 1.4, mysolr must be able to post XML correctly. mysolr._get_add_xml doesn't support multivalued fields, and it doesn't handle encoding or character escaping well.
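
A minimal sketch of how an add message could handle both points, using the standard library's ElementTree so that escaping is done by the serializer. This is not mysolr._get_add_xml; build_add_xml is a hypothetical name, and the code assumes Python 3 and documents given as plain dicts.

```python
import xml.etree.ElementTree as ET


def build_add_xml(documents):
    """Build an <add> message; list values become repeated <field> tags."""
    add = ET.Element('add')
    for document in documents:
        doc = ET.SubElement(add, 'doc')
        for name, value in document.items():
            # Treat a list or tuple as a multivalued field.
            values = value if isinstance(value, (list, tuple)) else [value]
            for item in values:
                field = ET.SubElement(doc, 'field', name=name)
                field.text = str(item)  # ElementTree escapes &, < and >
    return ET.tostring(add, encoding='unicode')


xml = build_add_xml([{'id': 1, 'tags': ['a&b', 'c']}])
```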

Dangerous network data parsing using eval()

Using eval() to parse data received from the network is a huge security hole. There are 2 ways to fix this:

  • the ast.literal_eval() function should happily parse Solr's pythonic output,
  • use the json module on all python versions since mysolr seems to support python 2.6 and up.

I have opted for the second solution: since the code was already there, I assumed it worked properly on newer Python versions.

NB: I have tested this patch on python 2.6 against solr 3.6, on Debian Squeeze.
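
The difference between the two options can be illustrated with the standard library alone. The response body below is a made-up example in the shape of a Solr JSON reply; nothing here is mysolr code.

```python
import ast
import json

# Shape of a Solr JSON (wt=json) body; the values are made up.
body = '{"responseHeader": {"status": 0, "QTime": 1}}'
parsed = json.loads(body)

# literal_eval accepts only Python literals, so an expression that
# eval() would execute is rejected instead.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```

Either approach avoids executing code received from the network, which is the core of the problem with eval().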

Failure to handle HTML response

In case of an error, Solr returns plain HTML, and mysolr silently swallows it, which makes it impossible even to log the error. Please return the raw body when it cannot be parsed. Thanks!
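
The requested behaviour could be sketched as follows, assuming the response is expected to be JSON. parse_body is a hypothetical helper name, not an existing mysolr function.

```python
import json


def parse_body(text):
    """Return (parsed, raw): parsed is None when text is not JSON.

    Hypothetical helper; the raw body is always kept so that an HTML
    error page can still be logged by the caller.
    """
    try:
        return json.loads(text), text
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return None, text
```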

Domain expired

The domain for the documentation is no longer available. Moving it to github pages or a separate location seems like a good idea.

Problems with eval(), Unicode and Python3

Hi!

I found a bug in mysolr 0.5.1. When I tried to perform a simple query against my server, an exception was raised at mysolr.py, line 48.

Here is the traceback:

Traceback (most recent call last):
File "solr_test.py", line 4, in <module>
response = solr.search(q='*:*',rows=1)
File "/usr/local/lib/python3.2/dist-packages/mysolr-0.5.1-py3.2.egg/mysolr/mysolr.py", line 67, in search
solr_response = self.__build_response(response)
File "/usr/local/lib/python3.2/dist-packages/mysolr-0.5.1-py3.2.egg/mysolr/mysolr.py", line 48, in __build_response
response_object = eval(response.content)
File "<string>", line 1
{'responseHeader':{'status':0,'QTime':0,'params':{'wt':'python','q':'*:*','rows':'1'}},'response':{'numFound':100,'start':0,'maxScore':50.392296,'docs':[{'id':'184335691025625089','user_name':'Microsoft','text':u'RT @bing: We\u2019re joining forces with @bullymovie to help stop bullying in schools. Join us: http://t.co/pSSnht1K','insensitive_text':u'RT @bing: We\u2019re joining forces with @bullymovie to help stop bullying in schools. Join us: http://t.co/pSSnht1K','user_followers':214438,'created_at':'2012-03-26T17:47:08Z','retweet_count':63,'user_nick':'Microsoft','user_statuses':2285,'score':50.392296}]}}

The problem is that one field of the response has a unicode character and that field is encoded as u'field-text' and Python3 does not know how to interpret that.

Using eval() is not a good practice to parse the JSON response from the server, json.loads() or json.load() should be used instead.

I solved the problem, but I have not tested my changes much. Here is what I've done:

  1. Change the value of the 'wt' parameter to 'json' instead of 'python'. The Python response from Solr was designed for Python2, but I am using Python3 and there were problems with the quotation marks. (function mysolr.py:__build_request)
  2. Substitute eval() in __build_response(), this is the result:
    def __build_response(self, response):
        """ Build a SolrResponse from http request made with requests. """
        content_str = response.content.decode('utf-8')  # Bytes to string
        response_object = json.loads(content_str)  # needs `import json` at module level
        return SolrResponse(response_object)

As I said, this is just a quick fix. But I hope that it helps to solve the problem.

Congratulations for this module!

mysolr and Apache SOLR 4.0 compatibility issue

I'm using mysolr with Solr 4.0 RC1 and found just one issue: the commit parameter waitFlush is no longer used, so Solr reports it as an unknown attribute and returns an error.

I fixed this as follows (should this be in core?):

a) added a version param to the constructor

def __init__(self, base_url='http://localhost:8080/solr/', auth=None, version=3):
    self.version=version

b) and in commit function do this ...

if self.version == 3:
    xml = '<commit waitFlush="%s" waitSearcher="%s" expungeDeletes="%s" />' % (
        'true' if wait_flush else 'false',
        'true' if wait_searcher else 'false',
        'true' if expunge_deletes else 'false')
else:
    xml = '<commit waitSearcher="%s" expungeDeletes="%s" />' % (
        'true' if wait_searcher else 'false',
        'true' if expunge_deletes else 'false')

Multiple facets in single request

Currently, in mysolr 0.7.1, there is no way to define multiple facet fields in the same search request. The params are passed as a dict, but Solr requires the parameter to be repeated once for each facet field (http://wiki.apache.org/solr/SimpleFacetParameters#facet.field).

A solution might be to take the params as a list of tuples, something along the lines of this:
solr = Solr()
query = [('q' , '*:*'), ('facet' , 'true'), ('facet.field' , 'foo'), ('facet.field' , 'bar')]
response = solr.search(query)
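
To show why a list of tuples helps, here is how repeated keys encode into a query string with the Python 3 standard library. This only illustrates the encoding step; how mysolr would accept such input is left open, and the field names foo and bar are placeholders taken from the example above.

```python
from urllib.parse import parse_qs, urlencode

# A list of (name, value) pairs can repeat a key, unlike a dict.
query = [('q', '*:*'), ('facet', 'true'),
         ('facet.field', 'foo'), ('facet.field', 'bar')]
encoded = urlencode(query)

# Alternative: keep a dict but give the repeated key a list value.
as_dict = {'q': '*:*', 'facet': 'true', 'facet.field': ['foo', 'bar']}
encoded_from_dict = urlencode(as_dict, doseq=True)
```

The dict-with-lists form would keep the existing keyword-style call working, at the cost of treating list values specially.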

removing the need for a /update/json request handler

As of Solr 4.0, updates can be sent in JSON format via the default /update request handler.
If a user does not have the /update/json request handler configured, the update will fail.

I have fixed this by changing line 309 in mysolr.py from

    url = urljoin(self.base_url, 'update/json')

to

    url = urljoin(self.base_url, 'update?wt=json')

Should this change be applied for everyone?
Thanks!
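
For reference, this is what the two urljoin calls resolve to, using the Python 3 standard library and a base URL assumed for illustration. The last line shows that the result depends on the base URL ending with a slash.

```python
from urllib.parse import urljoin

base_url = 'http://localhost:8080/solr/'  # assumed for illustration
old_url = urljoin(base_url, 'update/json')
new_url = urljoin(base_url, 'update?wt=json')

# Without the trailing slash, the last path segment is replaced.
no_slash = urljoin('http://localhost:8080/solr', 'update?wt=json')
```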

SolrQuery refactor

SolrQuery must allow the user to have control over the fields to use in the query.

A SolrQuery must receive a dictionary used as a direct mapping for Solr API.

SolrResponse parser

A SolrResponse object must have the following attributes:

  • status: query status
  • qtime: query time
  • total_results: number of results of the query
  • documents: list of documents returned for the query
  • facets: accessible facets
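
The attributes listed above could be filled from a decoded Solr JSON response roughly as follows. This is an illustrative sketch, not mysolr's SolrResponse: the class name SolrResponseSketch is hypothetical, and the code assumes facet fields arrive in Solr's flat [value, count, ...] list form.

```python
class SolrResponseSketch(object):
    """Illustrative only; not mysolr's SolrResponse implementation."""

    def __init__(self, raw):
        header = raw.get('responseHeader', {})
        body = raw.get('response', {})
        self.status = header.get('status')
        self.qtime = header.get('QTime')
        self.total_results = body.get('numFound')
        self.documents = body.get('docs', [])
        # Solr lists facet fields as [value, count, value, count, ...].
        fields = raw.get('facet_counts', {}).get('facet_fields', {})
        self.facets = {
            name: dict(zip(flat[::2], flat[1::2]))
            for name, flat in fields.items()
        }


# Made-up response data for the usage example.
raw = {
    'responseHeader': {'status': 0, 'QTime': 2},
    'response': {'numFound': 1, 'start': 0, 'docs': [{'id': '1'}]},
    'facet_counts': {'facet_fields': {'foo': ['a', 3, 'b', 1]}},
}
response = SolrResponseSketch(raw)
```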

Remove pinning of requests package to specific version

Right now, the requests version specifier is pinned to 0.12.1 (in a17df4f). There have been 11 releases since then. Pinning to a specific version makes it hard to use this package in a project that uses the latest requests (I created a fork just to remove the version pin).

Can this version pinning be removed and left up to the end user to get the proper version if they are using Python 3?

Add auth to requests

It would be great if there were an extra 'auth' parameter when creating the Solr connection:

def __init__(self, base_url='http://localhost:8080/solr/', auth=None):

Then the requests would use the auth parameter. E.g.

def search(self, resource='select', **kwargs):
    query = build_request(kwargs)

    http_response = requests.get(
            urljoin(self.base_url, resource),
            params=query,
            auth=self.auth)

The 'auth' parameter would allow for both HTTP Basic Auth and Digest Auth:

http://docs.python-requests.org/en/latest/user/quickstart/#basic-authentication

Python Booleans support

It would be nice if the following query worked:

query = {'q': '*:*', 'facet': True, 'facet.field': 'manu'}

It currently fails with an HTTP 400 error from Solr: "invalid boolean value: True".
So this is not exactly a mysolr issue, but I think it would be convenient to support native Python booleans in queries.

P.S.:

It's the same error that you get when you try the following query in your browser:

http://localhost:8080/solr/select/?q=*:*&start=0&rows=10&indent=on&facet=True&facet.field=manu
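
A small normalization step before the request is built would be enough to support this. The sketch below uses a hypothetical helper name, normalize_params, and is not part of mysolr. It rewrites only real bool values and leaves everything else untouched.

```python
def normalize_params(params):
    """Return a copy of params with bools as 'true' / 'false' strings."""
    normalized = {}
    for key, value in params.items():
        if isinstance(value, bool):
            value = 'true' if value else 'false'
        normalized[key] = value
    return normalized


query = normalize_params({'q': '*:*', 'facet': True, 'facet.field': 'manu'})
```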
