cyhex / streamcrab

Real-time Twitter sentiment analyzer engine

Home Page: http://www.streamcrab.com

CSS 21.74% JavaScript 12.16% Python 66.10%

streamcrab's Issues

Incorrect stream for Twitter Stream API

collector/trainer/twitterCollector.py uses this stream URL:

http://stream.twitter.com/1/statuses/filter.json

It should be this:

https://stream.twitter.com/1/statuses/filter.json

(notice the https)

Small issue on running python tests/moodClientServerTest.py

The directions say to run this from the tracker directory. Doing so gives this error:

Traceback (most recent call last):
File "tests/moodClientServerTest.py", line 4, in <module>
from tracker.lib.moodClassifierClient import MoodClassifierTCPClient
ImportError: No module named tracker.lib.moodClassifierClient
dwmcqueen@dwmcqueen-VirtualBox:~/smm/tracker$
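
One possible explanation, assuming the test script extends sys.path with a path relative to the current working directory (such as '../../'): the import then only resolves when the script is started from the tests directory. A minimal sketch of resolving the path from the script's own location instead. The helper name package_parent and the /home/user prefix are hypothetical and not part of streamcrab:

```python
import os
import sys


def package_parent(script_path, package_name="tracker"):
    """Return the directory that contains `package_name`, searching upward
    from the directory of `script_path`."""
    current = os.path.dirname(os.path.abspath(script_path))
    while True:
        if os.path.basename(current) == package_name:
            return os.path.dirname(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            raise ValueError("no %r directory above %s" % (package_name, script_path))
        current = parent


# Hypothetical layout modelled on the prompt in the report: ~/smm/tracker/tests/<script>
root = package_parent("/home/user/smm/tracker/tests/moodClientServerTest.py")
if root not in sys.path:
    sys.path.insert(0, root)  # lets `import tracker.lib...` resolve from any cwd
```

Inside the real test script, `__file__` would take the place of the hard-coded path.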

Classifier NoneType Error

Running python start-classifier.py gives the following error:

Traceback (most recent call last):
File "start-classifier.py", line 14, in <module>
pool = ClassifierWorkerPool()
File "/home/ilkay/Masaüstü/streamcrab-master/smm/classifier/pool.py", line 50, in __init__
self.trained_classifier = row.get_classifier()
AttributeError: 'NoneType' object has no attribute 'get_classifier'

What should I do?

Thanks

Reacting to special signs in Tweets

Loaded maxEntTestCorpus
Classify: Bloomberg –He's the man of the Year!
/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py:37: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if t in stopwords:
/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py:275: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if word[-1] == 's':
Traceback (most recent call last):
File "toolbox/shell-classifier.py", line 34, in <module>
features = config.classifier_tokenizer.getFeatures(txt)
File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 144, in getFeatures
return dict.fromkeys(cls.getClassifierTokens(text), 1)
File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 131, in getClassifierTokens
tokes = cls.stemm(tokes)
File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 152, in stemm
tokens[i] = stemmer.stem(t)
File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 633, in stem
stem = self.stem_word(word.lower(), 0, len(word) - 1)
File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 591, in stem_word
word = self._step1ab(word)
File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 289, in _step1ab
if word.endswith("ied"):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
Feneks-MacBook-Pro:streamcrab fenek$ python toolbox/shell-classifier.py maxEntTestCorpus
exit: ctrl+c

Loaded maxEntTestCorpus

mongoengine.connection.ConnectionError: Cannot connect to database default : False is not a read preference.

/dev/streamcrab$ sudo python toolbox/collect-tweets.py happy 2000
Traceback (most recent call last):
File "toolbox/collect-tweets.py", line 28, in <module>
models.connect()
File "/home/venkat/dev/streamcrab/smm/models.py", line 125, in connect
mongoengine.connect(**conf)
File "build/bdist.linux-x86_64/egg/mongoengine/connection.py", line 173, in connect
File "build/bdist.linux-x86_64/egg/mongoengine/connection.py", line 135, in get_connection
mongoengine.connection.ConnectionError: Cannot connect to database default :
False is not a read preference.

Not able to create classifier using "toolbox/train-classifier.py"

While running "toolbox/train-classifier.py" I get the following exception:

Traceback (most recent call last):
File "toolbox/train-classifier.py", line 13, in <module>
from smm import models
File "/usr/local/lib/python2.7/dist-packages/nltk-2.0.4-py2.7.egg/nltk/classify/maxent.py", line 315, in train
gaussian_prior_sigma, **cutoffs)
File "/usr/local/lib/python2.7/dist-packages/nltk-2.0.4-py2.7.egg/nltk/classify/maxent.py", line 1440, in train_maxent_classifier_with_scipy
model.fit(algorithm=algorithm)
File "/usr/lib/python2.7/dist-packages/scipy/maxentropy/maxentropy.py", line 1026, in fit
return model.fit(self, self.K, algorithm)
File "/usr/lib/python2.7/dist-packages/scipy/maxentropy/maxentropy.py", line 226, in fit
callback=callback)
File "/usr/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 636, in fmin_cg
gfk = myfprime(x0)
File "/usr/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 176, in function_wrapper
return function(x, *args)
File "/usr/lib/python2.7/dist-packages/scipy/maxentropy/maxentropy.py", line 420, in grad
G = self.expectations() - self.K
ValueError: operands could not be broadcast together with shapes (800) (1636)

Could you please help me create the classifier?

StopStemmTwitterProcessor

classifier_tokenizer = StopStemmTwitterProcessor not found

Wrong sentiment

Classify: today is good

Classification: negative with 64.81%

Feature             negativ  positiv
today==1 (1)          0.325
good==1 (1)          -0.044
today==1 (1)                  -0.631
good==1 (1)                    0.031

TOTAL:                0.281   -0.600
PROBS:                0.648    0.352
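
The PROBS row is numerically consistent with summing the weights per label and then normalizing 2**TOTAL across labels. This reading is inferred from the figures in the report, not confirmed by the source. A minimal sketch that reproduces the arithmetic (label_probabilities is an illustrative name):

```python
# Per-label feature weights copied from the explanation output in the report.
weights = {
    "negative": {"today==1": 0.325, "good==1": -0.044},
    "positive": {"today==1": -0.631, "good==1": 0.031},
}


def label_probabilities(weights):
    """Sum the weights per label, then normalize 2**total over all labels."""
    totals = {label: sum(feats.values()) for label, feats in weights.items()}
    scores = {label: 2.0 ** total for label, total in totals.items()}
    norm = sum(scores.values())
    probs = {label: score / norm for label, score in scores.items()}
    return totals, probs


totals, probs = label_probabilities(weights)
for label in ("negative", "positive"):
    print("%-8s total=%6.3f prob=%.3f" % (label, totals[label], probs[label]))
```

Under this reading, "today" carries most of the weight toward negative (0.325 vs. -0.631) while "good" is close to neutral for both labels, which points at the training data rather than at the arithmetic.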

SMM - is there a missing file?

Firstly, I have to say this SMM is fantastic and I'm looking forward to being able to implement some of the classifiers you mention...

I have a question though, and hope you can help.

I've run all the tests per your readme and everything is 'OK'.

When I try to start the program on my local machine, though, I get the following error:

python tests/moodClientServerTest.py

Can you provide any insight, please?

Thanks in advance

Not sure what changes to make to twitterCollector.py

What fields should be replaced? I see username / password - is this Twitter username / password? Anything else?

Install Twitter using pip

On OS X, remember to install the twitter package (or the examples won't work in the default config):
sudo pip install twitter

classifier

Hi,

Would you like to compare your classifier with the other teams that competed at SemEval-2014 and tell me the score?
http://alt.qcri.org/semeval2014/task9/
paper and results: http://alt.qcri.org/semeval2014/cdrom/pdf/SemEval2014009.pdf

Thanks,
Gerald

other language

Timor, how can I add Russian in the settings to analyze Russian tweets?

tweetClassifier.py

Hi again,

Did anyone manage to get the tweetClassifier.py script working? I don't think the current documentation mentions the prerequisite modules. Also, it links to hard-coded data?!

Thanks, Ahmed

NumPy, SciPy, NLTK

Hello,

I wonder what role these libraries play in the project.

That is, are the classifiers part of NumPy and SciPy, or of NLTK?

tweets_positive_test.dat & tweets_negative_test.dat

Hi,

I would like to quickly try out smm but I can't due to these two missing data files:

tweetsPFile = "/home/gx/Sites/SMM/trunk/tracker/data/tweets_positive_test.dat"
tweetsNFile = "/home/gx/Sites/SMM/trunk/tracker/data/tweets_negative_test.dat"

Did you generate them yourself? If so, is it possible to commit them too?

Thanks!

AttributeError: 'str' object has no attribute 'get'

I have installed all the modules and followed all the steps to work with SMM. When I run the client Python program to connect to the server, the server side dumps this error.

I'm currently stuck on this:

tracker # python moodClassifierd.py debug
starting debug mode...

OK

Exception happened during processing of request from ('127.0.0.1', 51861)
Traceback (most recent call last):
File "/usr/lib/python2.6/SocketServer.py", line 560, in process_request_thread
self.finish_request(request, client_address)
File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.6/SocketServer.py", line 617, in __init__
self.handle()
File "moodClassifierd.py", line 60, in handle
raise e

AttributeError: 'str' object has no attribute 'get'

moodClassifierd.py is throwing an exception on line 60. Do you know what is happening?

    try:
        data_to_send = []
        for r in recvData:
            text = r.get('text')
            r['x_lang'] = self.server.langCls.detect(text)[0]
            r['x_mood'] = self.server.moodCls.classify(text, r['x_lang'])
            data_to_send.append(r)
    except Exception, e:
        raise e  # <-- line 60: the exception is re-raised here
        return False

    self._send(data_to_send)
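
One reading of this error, assuming recvData arrives as a single dict rather than a list of dicts: iterating over a dict yields its keys, which are strings, so r.get('text') fails with exactly this AttributeError. A minimal sketch that normalizes the payload before the loop. as_records is a hypothetical helper, not part of moodClassifierd.py:

```python
def as_records(payload):
    """Return the payload as a list of dicts, wrapping a single dict."""
    if isinstance(payload, dict):
        return [payload]
    return list(payload)


single = {"text": "today is good"}

# Iterating the dict directly yields the key "text", a str without .get().
keys_seen = [r for r in single]

# After normalizing, each item is a dict again and .get() works.
texts = [r.get("text") for r in as_records(single)]
```

The same helper leaves a list of dicts unchanged, so a handler could call it regardless of what the client sent.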

RuntimeError caused by inappropriate multithreading implementation on Microsoft Windows

RuntimeError:
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.

            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:

                if __name__ == '__main__':
                    freeze_support()
                    ...

            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

The function calls that trigger it are:
pool.start()
worker.start()

which are called when running start-classifier.py

These function calls also cause the following error:

TypeError: can't pickle thread.lock objects
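
The message itself names the usual remedy: keep process startup behind an `if __name__ == '__main__':` guard so that child processes can import the main module without re-running it. A minimal sketch of that idiom with the standard multiprocessing module. worker_label, run_worker and main are placeholder names, not streamcrab's ClassifierWorkerPool, and whether start-classifier.py needs exactly this change is an assumption:

```python
from multiprocessing import Process, freeze_support


def worker_label(index):
    """Build a plain-string worker name (strings pickle without problems)."""
    return "worker-%d" % index


def run_worker(label):
    # Placeholder for the real work; only picklable arguments are passed in.
    print("%s running" % label)


def main():
    procs = [Process(target=run_worker, args=(worker_label(i),)) for i in range(2)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return [proc.exitcode for proc in procs]


if __name__ == "__main__":
    freeze_support()  # only matters for frozen Windows executables
    main()
```

On Windows the arguments handed to a child process are pickled, which is why objects holding locks (as in the TypeError reported here) cannot be passed directly. Creating such objects inside the worker is one common workaround.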

Error on run of test client

After starting the daemon in debug mode and running the test client (python moodClientServerTest.py in the "test" directory), I see this error in the client window:

Traceback (most recent call last):
File "moodClientServerTest.py", line 11, in <module>
print MCC.classify(test_data, 'search')
File "../../tracker/lib/moodClassifierClient.py", line 57, in classify
self._readResults()
File "../../tracker/lib/moodClassifierClient.py", line 28, in _readResults
dataLen = int(dataLen)
ValueError: invalid literal for int() with base 10: ''

And this in the daemon window:

Exception happened during processing of request from ('127.0.0.1', 44979)
Traceback (most recent call last):
File "/usr/lib/python2.7/SocketServer.py", line 582, in process_request_thread
self.finish_request(request, client_address)
File "/usr/lib/python2.7/SocketServer.py", line 323, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.7/SocketServer.py", line 639, in __init__
self.handle()
File "moodClassifierd.py", line 61, in handle
raise e
RuntimeError: dictionary changed size during iteration
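
Python raises this RuntimeError when keys are added to or removed from a dict while it is being iterated. The handler code from this report is not shown, so the following is only a generic sketch of the failure and of iterating over a snapshot of the keys instead. Both function names are illustrative:

```python
def annotate_unsafe(record):
    for key in record:
        record["x_" + key] = None  # inserting while iterating the same dict
    return record


def annotate_safe(record):
    for key in list(record):  # iterate over a copy of the keys
        record["x_" + key] = None
    return record


try:
    annotate_unsafe({"text": "hi"})
    failed = False
except RuntimeError:
    failed = True

safe = annotate_safe({"text": "hi"})
```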

Consume in .net Application

Hi,
I want to consume it from a .NET application. I just want a function that takes a string and returns a score or a positive/negative result.

Thanks

i can't run this

I can't find the classes named in this import:
import StopStemmTwitterProcessor, StopTwitterProcessor

pool.py

Line 50

49 row = TrainedClassifiers.objects(name=config.classifier).first()
50 self.trained_classifier = row.get_classifier()

The element row is not recognized as a TrainedClassifiers object, so the get_classifier() method can't be called on it.

Thank you
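
In mongoengine, .first() returns None when no document matches the query, so row being None on line 50 most likely means that no trained classifier with the name in config.classifier has been saved to the database yet. That is an inference from the traceback, not something the source confirms. A minimal sketch of failing with a clearer message. load_classifier, ClassifierNotTrained and the stand-in row class are hypothetical:

```python
class ClassifierNotTrained(RuntimeError):
    """Raised when the requested classifier is missing from the database."""


def load_classifier(row, name):
    """Return row.get_classifier(), or explain why it cannot be loaded."""
    if row is None:
        raise ClassifierNotTrained(
            "no trained classifier named %r found; train and save one first" % name
        )
    return row.get_classifier()


class FakeRow(object):
    """Stand-in for a TrainedClassifiers document, for illustration only."""

    def get_classifier(self):
        return "classifier-object"


loaded = load_classifier(FakeRow(), "maxEntTestCorpus")

try:
    load_classifier(None, "maxEntTestCorpus")
    message = ""
except ClassifierNotTrained as exc:
    message = str(exc)
```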

Polarity is not working

Hi, I have built everything successfully and have been able to create the databases with the following commands:
To Build Training Dataset

python collector/trainer/twitterCollector.py
python collector/trainer/tweetClassifier.py

I have already created these files in the /data dir:

-rw-r--r-- 1 root root 293204 Oct 11 20:53 mood_traing_150k_1k_0.6.dat
-rw-r--r-- 1 root root 6104719 Oct 11 20:52 tweets_negative_raw.dat
-rw-r--r-- 1 root root 6581580 Oct 11 20:47 tweets_positive_raw.dat

My service is OK:

domU-12-31-39-06-8E-37 tracker # ./moodClassifierd.py start
starting...
OK

tests # more moodClientServerTest.py

# -*- coding: utf-8 -*-

import sys
sys.path.append('../../')
from tracker.lib.moodClassifierClient import MoodClassifierTCPClient

MCC = MoodClassifierTCPClient('127.0.0.1',6666)

test_data = {'text':'I am sad because i have a bad iphone So the 4S is announced yet preorder is sold out? alright then'}

print MCC.classify(test_data, 'search')

The output is:

[{'text': u'I am sad because i have a bad iphone So the 4S is announced yet preorder is sold out? alright then', 'x_mood': 0.0, 'x_lang': 'en'}]

I can't tell whether the polarity is positive or negative. What am I doing wrong? The result looks neutral. I assume I should get something like -.xxx for negative and a value close to 1 for positive?

results

Do you have a technical paper about the result of applying those classifiers on data? Is the data read from the public stream api?
