
streamcrab's People

Contributors

boorad, cyhex

streamcrab's Issues

Small issue on running python tests/moodClientServerTest.py

The directions say basically to start in the tracker directory. I get this error when doing that:

Traceback (most recent call last):
File "tests/moodClientServerTest.py", line 4, in <module>
from tracker.lib.moodClassifierClient import MoodClassifierTCPClient
ImportError: No module named tracker.lib.moodClassifierClient
dwmcqueen@dwmcqueen-VirtualBox:~/smm/tracker$
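
One common cause of this ImportError is that the directory containing the tracker package is not on sys.path when the script is launched from inside tracker/. A minimal sketch of a workaround, assuming the layout implied by the traceback (the helper name add_parent_to_path and the levels_up default are hypothetical, not part of the project):

```python
import os
import sys


def add_parent_to_path(script_path, levels_up=2):
    """Put an ancestor directory of script_path at the front of sys.path.

    levels_up=2 is an assumption: for <root>/tracker/tests/script.py it
    yields <root>, the directory that contains the tracker package.
    """
    target = os.path.abspath(script_path)
    # One dirname() strips the file name, each further one climbs a level.
    for _ in range(levels_up + 1):
        target = os.path.dirname(target)
    if target not in sys.path:
        sys.path.insert(0, target)
    return target
```

Calling add_parent_to_path(__file__) at the top of the test script, before the tracker import, would be one way to use it.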

Classifier NoneType Error

Hi

I ran python start-classifier.py and the program gave this error:

Traceback (most recent call last):
File "start-classifier.py", line 14, in <module>
pool = ClassifierWorkerPool()
File "/home/ilkay/Masaüstü/streamcrab-master/smm/classifier/pool.py", line 50, in __init__
self.trained_classifier = row.get_classifier()
AttributeError: 'NoneType' object has no attribute 'get_classifier'

What should I do?

Thanks

Reacting to special signs in Tweets

Loaded maxEntTestCorpus
Classify: Bloomberg –He's the man of the Year!
/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py:37: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if t in stopwords:
/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py:275: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if word[-1] == 's':
Traceback (most recent call last):
File "toolbox/shell-classifier.py", line 34, in <module>
features = config.classifier_tokenizer.getFeatures(txt)
File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 144, in getFeatures
return dict.fromkeys(cls.getClassifierTokens(text), 1)
File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 131, in getClassifierTokens
tokes = cls.stemm(tokes)
File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 152, in stemm
tokens[i] = stemmer.stem(t)
File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 633, in stem
stem = self.stem_word(word.lower(), 0, len(word) - 1)
File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 591, in stem_word
word = self._step1ab(word)
File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 289, in _step1ab
if word.endswith("ied"):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
Feneks-MacBook-Pro:streamcrab fenek$ python toolbox/shell-classifier.py maxEntTestCorpus
exit: ctrl+c

Loaded maxEntTestCorpus
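
The 0xe2 byte in the traceback is the first byte of the UTF-8 encoding of the en dash in the input, which suggests that a raw byte string reached the stemmer. A minimal sketch of decoding input to text before tokenizing, assuming the input is UTF-8 (the helper name to_text is hypothetical, not part of textprocessing.py):

```python
def to_text(value, encoding='utf-8'):
    """Return value as a text string, decoding it if it is raw bytes."""
    if isinstance(value, bytes):
        # errors='replace' keeps going on malformed input instead of raising.
        return value.decode(encoding, errors='replace')
    return value


# The en dash from the failing input, as UTF-8 bytes (assumed encoding).
raw = b"Bloomberg \xe2\x80\x93He's the man of the Year!"
text = to_text(raw)
```

Passing the decoded text, rather than the raw bytes, to the tokenizer is one way to keep the stemmer's string comparisons from falling back to the ASCII codec.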

mongoengine.connection.ConnectionError: Cannot connect to database default : False is not a read preference.

/dev/streamcrab$ sudo python toolbox/collect-tweets.py happy 2000
Traceback (most recent call last):
File "toolbox/collect-tweets.py", line 28, in <module>
models.connect()
File "/home/venkat/dev/streamcrab/smm/models.py", line 125, in connect
mongoengine.connect(**conf)
File "build/bdist.linux-x86_64/egg/mongoengine/connection.py", line 173, in connect
File "build/bdist.linux-x86_64/egg/mongoengine/connection.py", line 135, in get_connection
mongoengine.connection.ConnectionError: Cannot connect to database default :
False is not a read preference.

Not able to create classifier using "toolbox/train-classifier.py"

While running "toolbox/train-classifier.py" I get the following exception:

Traceback (most recent call last):
File "toolbox/train-classifier.py", line 13, in <module>
from smm import models
File "/usr/local/lib/python2.7/dist-packages/nltk-2.0.4-py2.7.egg/nltk/classify/maxent.py", line 315, in train
gaussian_prior_sigma, **cutoffs)
File "/usr/local/lib/python2.7/dist-packages/nltk-2.0.4-py2.7.egg/nltk/classify/maxent.py", line 1440, in train_maxent_classifier_with_scipy
model.fit(algorithm=algorithm)
File "/usr/lib/python2.7/dist-packages/scipy/maxentropy/maxentropy.py", line 1026, in fit
return model.fit(self, self.K, algorithm)
File "/usr/lib/python2.7/dist-packages/scipy/maxentropy/maxentropy.py", line 226, in fit
callback=callback)
File "/usr/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 636, in fmin_cg
gfk = myfprime(x0)
File "/usr/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 176, in function_wrapper
return function(x, *args)
File "/usr/lib/python2.7/dist-packages/scipy/maxentropy/maxentropy.py", line 420, in grad
G = self.expectations() - self.K
ValueError: operands could not be broadcast together with shapes (800) (1636)

Could you please help me create the classifier?

Wrong sentiment

Classify: today is good

Classification: negative with 64.81%

Feature          negativ   positiv
----------------------------------
today==1 (1)       0.325    -0.631
good==1 (1)       -0.044     0.031
----------------------------------
TOTAL:             0.281    -0.600
PROBS:             0.648     0.352
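
The PROBS row is consistent with normalizing 2 raised to each TOTAL, i.e. the totals behave like base-2 log weights. A small sketch that reproduces the numbers above (the function name is hypothetical; it re-derives the arithmetic and is not the library's implementation):

```python
def probs_from_totals(totals):
    """Normalize 2**total per label so the results sum to 1."""
    weights = {label: 2.0 ** total for label, total in totals.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}


# Per-label totals are the sums of the feature weights listed above.
totals = {'negative': 0.325 + -0.044, 'positive': -0.631 + 0.031}
probs = probs_from_totals(totals)
print({label: round(p, 3) for label, p in probs.items()})
```

Read this way, the "negative" verdict comes almost entirely from the weights attached to "today", which outweigh the small contribution from "good".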

SMM - is there a missing file?

Hi

Firstly, I have to say this SMM is fantastic and I'm looking forward to being able to implement some of the classifiers you mention...

I have a question though, and hope you can help.

I've run all the tests per your readme and everything is 'OK'.

When I try to start the program on my local machine, though, I get the following error:

python tests/moodClientServerTest.py

Traceback (most recent call last):
File "tests/moodClientServerTest.py", line 4, in <module>
from tracker.lib.moodClassifierClient import MoodClassifierTCPClient
ImportError: No module named tracker.lib.moodClassifierClient

Can you provide any insight, please?

Thanks in advance

Install Twitter using pip

On OSX remember to install Twitter (or the examples won't work in the default config) --
sudo pip install twitter

other language

Timor, how can I add Russian in the settings to analyze Russian tweets?

tweetClassifier.py

Hi again,

Did anyone manage to get the tweetClassifier.py script working? I don't think the current documentation mentions the prerequisite modules. Also, it links to hard-coded data?!

Thanks, Ahmed

NumPy, SciPy, NLTK

Hello,

I wonder what role these libraries play in the project.

That is, are the classifiers part of NumPy and SciPy, or of NLTK?

tweets_positive_test.dat & tweets_negative_test.dat

Hi,

I would like to quickly try out smm but I can't due to these two missing data files:

tweetsPFile = "/home/gx/Sites/SMM/trunk/tracker/data/tweets_positive_test.dat"
tweetsNFile = "/home/gx/Sites/SMM/trunk/tracker/data/tweets_negative_test.dat"

Did you generate them yourself? If so, is it possible to commit them too?

Thanks !

AttributeError: 'str' object has no attribute 'get'

I have installed all the modules and followed all the steps to set up SMM. When I run the client Python program to connect to the server, the server side dumps the error below.

I'm currently stuck on this:

tracker # python moodClassifierd.py debug
starting debug mode...

OK

Exception happened during processing of request from ('127.0.0.1', 51861)
Traceback (most recent call last):
File "/usr/lib/python2.6/SocketServer.py", line 560, in process_request_thread
self.finish_request(request, client_address)
File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.6/SocketServer.py", line 617, in __init__
self.handle()
File "moodClassifierd.py", line 60, in handle
raise e

AttributeError: 'str' object has no attribute 'get'

moodClassifierd.py is throwing an exception on line 60. Do you know what is happening?

    try:
        data_to_send = []
        for r in recvData:
            text = r.get('text')
            r['x_lang'] = self.server.langCls.detect(text)[0]
            r['x_mood'] = self.server.moodCls.classify(text, r['x_lang'])
            data_to_send.append(r)
    except Exception, e:
        raise e  # HERE: line 60 re-raises the exception
        return False

    self._send(data_to_send)
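
One possible cause: if recvData arrives as a single dict rather than a list of dicts, the for loop iterates over its keys, which are strings, and r.get fails with exactly this AttributeError. A minimal sketch of normalizing the payload first, under that assumption (as_records is a hypothetical helper, not part of moodClassifierd.py):

```python
def as_records(payload):
    """Return payload as a list of dicts, wrapping a single dict in a list."""
    if isinstance(payload, dict):
        return [payload]
    records = list(payload)
    for record in records:
        # Fail early with a clear message instead of a later AttributeError.
        if not isinstance(record, dict):
            raise TypeError('expected a dict per record, got %r' % type(record))
    return records
```

The handler loop could then iterate over as_records(recvData), so a client that sends a single {'text': ...} dict and one that sends a list of them are treated the same way.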

RuntimeError caused by inappropriate multithreading implementation on Microsoft Windows

RuntimeError:
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.

            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:

                if __name__ == '__main__':
                    freeze_support()
                    ...

            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

The calls that trigger it are:
pool.start()
worker.start()

which are called when running start-classifier.py

These function calls also cause the following error:

TypeError: can't pickle thread.lock objects
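
On Windows, multiprocessing starts each child by importing the main module afresh, so code that starts processes has to sit behind the __main__ guard, and everything handed to a child has to be picklable. A minimal sketch of the idiom named in the error message (square, worker and main are illustrative names, not the project's pool code):

```python
import multiprocessing


def square(n):
    """Toy stand-in for real classifier work."""
    return n * n


def worker(numbers, results):
    """Runs in a child process and sends its results back through a queue."""
    for n in numbers:
        results.put((n, square(n)))


def main():
    results = multiprocessing.Queue()
    numbers = [1, 2, 3]
    proc = multiprocessing.Process(target=worker, args=(numbers, results))
    proc.start()
    # Drain the queue before joining so the child can flush its data.
    squares = dict(results.get(timeout=10) for _ in numbers)
    proc.join()
    print(squares)


if __name__ == '__main__':
    # Only needed when the script is frozen into a Windows executable.
    multiprocessing.freeze_support()
    main()
```

Only plain picklable values (a list and a multiprocessing.Queue) are passed to the child here. An object that holds a thread lock cannot be pickled, which would match the second error above.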

Error on run of test client

After starting the daemon in debug mode and running the test client (python moodClientServerTest.py in the "test" directory), I see this error in the client window:

Traceback (most recent call last):
File "moodClientServerTest.py", line 11, in <module>
print MCC.classify(test_data, 'search')
File "../../tracker/lib/moodClassifierClient.py", line 57, in classify
self._readResults()
File "../../tracker/lib/moodClassifierClient.py", line 28, in _readResults
dataLen = int(dataLen)
ValueError: invalid literal for int() with base 10: ''

With this on the daemon window:

Exception happened during processing of request from ('127.0.0.1', 44979)
Traceback (most recent call last):
File "/usr/lib/python2.7/SocketServer.py", line 582, in process_request_thread
self.finish_request(request, client_address)
File "/usr/lib/python2.7/SocketServer.py", line 323, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.7/SocketServer.py", line 639, in __init__
self.handle()
File "moodClassifierd.py", line 61, in handle
raise e
RuntimeError: dictionary changed size during iteration
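
Python raises this RuntimeError when keys are added to or removed from a dict while a loop is iterating over it. Which dict the daemon mutates is not visible in the traceback, so the following is only a generic sketch of the usual fix, iterating over a snapshot of the keys (tag_fields is a made-up example function):

```python
def tag_fields(record):
    """Add an 'x_<key>' entry for every existing key of record."""
    # list(record) copies the keys first, so adding entries below is safe.
    for key in list(record):
        record['x_' + key] = len(str(record[key]))
    return record
```

Looping over record directly instead of list(record) would fail on the first insertion, because the dict grows while it is being iterated.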

Consume in .net Application

Hi,
I want to consume it from a .NET application. I just want a function to which I give a string and which returns a score or a positive/negative result.

thanks

i can't run this

I can't find:
import StopStemmTwitterProcessor, StopTwitterProcessor

pool.py

Line 50

49 row = TrainedClassifiers.objects(name=config.classifier).first()
50 self.trained_classifier = row.get_classifier()

The element row is not recognized as a TrainedClassifiers object, so the get_classifier() method can't be called on it.

Thank you
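
In MongoEngine, .first() returns None when the query matches no document, so the AttributeError reported for line 50 would also appear if no classifier with the configured name has been saved yet. A minimal sketch of a guard, using a stand-in lookup function rather than the project's TrainedClassifiers model (all names below are hypothetical):

```python
def load_classifier(find_first, name):
    """Fetch the row for name and fail with a clear message if it is missing.

    find_first is any callable that returns a row object or None, standing
    in for a query such as Model.objects(name=name).first().
    """
    row = find_first(name)
    if row is None:
        raise LookupError(
            'no trained classifier named %r; train and save one first' % name)
    return row.get_classifier()
```

With a check like this, a missing row surfaces as an explicit "no trained classifier" message instead of a NoneType error further down.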

Polarity it is not working

Hi, I have built everything successfully and have been able to create the databases with the following commands.
To build the training dataset:

python collector/trainer/twitterCollector.py
python collector/trainer/tweetClassifier.py

I have already created these files in the /data dir:

-rw-r--r-- 1 root root 293204 Oct 11 20:53 mood_traing_150k_1k_0.6.dat
-rw-r--r-- 1 root root 6104719 Oct 11 20:52 tweets_negative_raw.dat
-rw-r--r-- 1 root root 6581580 Oct 11 20:47 tweets_positive_raw.dat

My service is OK:

domU-12-31-39-06-8E-37 tracker # ./moodClassifierd.py start
starting...
OK

tests # more moodClientServerTest.py

# -*- coding: utf-8 -*-

import sys
sys.path.append('../../')
from tracker.lib.moodClassifierClient import MoodClassifierTCPClient

MCC = MoodClassifierTCPClient('127.0.0.1',6666)

test_data = {'text':'I am sad because i have a bad iphone So the 4S is announced yet preorder is sold out? alright then'}

print MCC.classify(test_data, 'search')

The output is:


[{'text': u'I am sad because i have a bad iphone So the 4S is announced yet preorder is sold out? alright then', 'x_mood': 0.0, 'x_lang': 'en'}]

I'm not able to tell whether the polarity is POSITIVE or NEGATIVE. What am I doing wrong? The result looks neutral. I am assuming that I should get something like -.xxx for negative and close to 1 for positive?

results

Do you have a technical paper about the result of applying those classifiers on data? Is the data read from the public stream api?
