fennb / phirehose
PHP interface to Twitter Streaming API
In consume example at line 92:
while ($rawStatus = fgets($fp, 4096)) {
Some tweets are bigger than 4k, this limit has to be increased
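One way to avoid picking a buffer size at all is to omit the length argument, in which case fgets() reads up to the end of the line. The following is a minimal sketch, assuming the queue file holds one JSON status per line; readStatusLines() is a hypothetical helper, not part of the example scripts.

```php
<?php
/**
 * Yield each newline-terminated record from a queue file, whatever its length.
 * readStatusLines() is a hypothetical helper, not part of Phirehose.
 */
function readStatusLines(string $path): Generator
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("Unable to open $path");
    }
    try {
        // With no length argument, fgets() keeps reading until end of line.
        while (($rawStatus = fgets($fp)) !== false) {
            $rawStatus = rtrim($rawStatus, "\r\n");
            if ($rawStatus !== '') {
                yield $rawStatus;
            }
        }
    } finally {
        fclose($fp);
    }
}
```

Comparing against false, rather than relying on truthiness, also keeps the loop from stopping early on a line that PHP would treat as falsy.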
Hello guys.
I was working on a project that uses the Streaming API (specifically GET statuses/sample), but since the changes in API v1.1 I have had difficulty getting OAuth 1.0a to work. Could anyone give me a simple example of how to implement it?
I created the signature with the OAuth tool, but I have no idea where or how to use it.
I am using the Phirehose library and the enqueueStatus method to collect public statuses.
The call on the API was: $sc = new SampleConsumer('theochry', 'myCode', Phirehose::METHOD_SAMPLE);
I can get Phirehose to run fine in my development server but cannot get it to run on the production servers. Is there something special that I need to have on the server to get it to work?
My production server is configured using http://forge.laravel.com with the php info available on http://neon.19degrees.io
The production server uses a similar vagrant image as http://laravel.com/docs/4.2/homestead (nginx + php5.6).
The following are the errors I am getting in the production log. It goes on until the 21st retry and then gives up.
Phirehose: Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'track' => '19degreesdevtest',)
Phirehose: Resolved host stream.twitter.com to 199.59.148.138
Phirehose: Connecting to ssl://199.59.148.138, port=443, connectTimeout=5
Phirehose: TCP failure 1 of 20 connecting to stream: (0). Sleeping for 1 seconds.
Phirehose: Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'track' => '19degreesdevtest',)
Phirehose: Resolved host stream.twitter.com to 199.59.148.138
Phirehose: Connecting to ssl://199.59.148.138, port=443, connectTimeout=5
Phirehose: TCP failure 2 of 20 connecting to stream: (0). Sleeping for 2 seconds.
Phirehose: Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'track' => '19degreesdevtest',)
Phirehose: Resolved host stream.twitter.com to 199.59.148.138
Phirehose: Connecting to ssl://199.59.148.138, port=443, connectTimeout=5
Phirehose: TCP failure 3 of 20 connecting to stream: (0). Sleeping for 4 seconds.
Phirehose: Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'track' => '19degreesdevtest',)
Phirehose: Resolved host stream.twitter.com to 199.59.148.138
Phirehose: Connecting to ssl://199.59.148.138, port=443, connectTimeout=5
Phirehose: TCP failure 4 of 20 connecting to stream: (0). Sleeping for 8 seconds.
Recently, all of a sudden, my Phirehose process running in the background started failing with the error PHP Fatal error: Allowed memory size of 125829120 bytes exhausted (tried to allocate 2147483641 bytes) in Phirehose/Phirehose.php on line 409
Which would be this piece of code:
<?php
if ($statusLength > 0) {
    // Read status bytes and enqueue
    $bytesLeft = $statusLength - strlen($this->buff);
    while ( $bytesLeft > 0
        && $this->conn !== NULL
        && !feof($this->conn)
        && ($numChanged = stream_select($this->fdrPool, $fdw, $fde, 0, 20000)) !== FALSE
        && (time() - $lastStreamActivity) <= $this->idleReconnectTimeout) {
        $this->fdrPool = array($this->conn); // Reassign
        $this->buff .= fread($this->conn, $bytesLeft); // Read until all bytes are read into buffer
        $bytesLeft = ($statusLength - strlen($this->buff));
    }
    // Accrue/enqueue and track time spent enqueuing
    $enqueueStart = microtime(TRUE);
    $this->enqueueStatus($this->buff);
    $this->enqueueSpent += (microtime(TRUE) - $enqueueStart);
    $this->statusCount++;
} else {
    // Timeout/no data after readTimeout seconds
}
More specifically, line 409 is the fread() call: $this->buff .= fread($this->conn, $bytesLeft);
Also, as you can see, for some reason I am getting a length of about 2 GB of incoming data, which I am currently investigating. EDIT: the interesting thing is that it's always around 2 GB.
I tried a quick workaround: if $bytesLeft is more than 8192 (the fread() maximum chunk size), read in 8192-byte chunks. Unless I am missing something (which I probably am), the full length of $this->conn should be read anyway, since fread() will just split it up; or perhaps the payload is too big and takes far too long to read. It looks like this:
if (8192 < $bytesLeft) {
    $this->buff .= fread($this->conn, 8192);
} else {
    $this->buff .= fread($this->conn, $bytesLeft);
}
However, it seems that when I manually set the fread() chunk size, tweets are sometimes not processed at all, and I get no error.
Any ideas (@fennb @DarrenCook) ?
Thanks,
Marko
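A possible way to combine both ideas is to reject an implausible length before allocating anything and then read the rest in bounded chunks. This is only a sketch under assumptions: readExactly() and the 1 MB $maxLength cap are hypothetical, and the stream is assumed to be in blocking mode (on a non-blocking socket the loop would still need the stream_select() and idle-timeout checks from the original code).

```php
<?php
/**
 * Read exactly $length bytes from $stream in small chunks.
 * readExactly() and $maxLength are hypothetical, not part of Phirehose.
 */
function readExactly($stream, int $length, int $maxLength = 1048576): string
{
    // Reject implausible lengths (such as ~2 GB) before trying to allocate them.
    if ($length < 0 || $length > $maxLength) {
        throw new UnexpectedValueException("Implausible status length: $length");
    }
    $buffer = '';
    while (strlen($buffer) < $length && !feof($stream)) {
        // Never ask for more than 8192 bytes, and never more than is still missing.
        $chunk = fread($stream, min(8192, $length - strlen($buffer)));
        if ($chunk === false) {
            break; // Read error: the caller can compare strlen() with $length.
        }
        $buffer .= $chunk;
    }
    return $buffer;
}
```

Because the loop keeps going until the buffer holds $length bytes, capping the chunk size on its own should not drop data; a length value that was misparsed in the first place seems the more likely culprit.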
I ran into an issue yesterday where I could no longer process the txt files from ghetto-queue-collect with ghetto-queue-consume. In my case it looks like for whatever reason each status tweet was being appended to the file without adding a new line. Subsequently when trying to process, line 91 of the consume script (while ($rawStatus = fgets($fp, 8192))) was pulling data chunks that contained multiple status tweets which resulted in incomplete json arrays causing the script to break.
The short term fix I made was to change the collector to add PHP_EOL to each status save(line 70):
fputs($this->getStream(), $status . PHP_EOL);
What I'm wondering is if anyone else has had this problem and second, would this be better to move into Phirehose.php where enqueueStatus is called?
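As a sketch of the same fix in a form that is safe whether or not the status already ends in a line break, the collector can strip trailing newlines before appending exactly one. writeStatus() is a hypothetical stand-in for the collector's save step, not an existing Phirehose method.

```php
<?php
/**
 * Append one status per line so a line-based consumer can split them again.
 * writeStatus() is a hypothetical stand-in for the collector's save step.
 */
function writeStatus($fp, string $status): void
{
    // Strip any trailing line break first so records never end up double-spaced.
    fputs($fp, rtrim($status, "\r\n") . PHP_EOL);
}
```

Doing this in the collector rather than in Phirehose.php keeps the library's enqueueStatus() input unchanged for subclasses that do not write line-based files.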
Hello,
If you are tracking multiple keywords, how can you get which word triggered a status?
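The status payloads quoted elsewhere in this thread carry no field naming the matching term, so one workaround is to re-check the keywords locally. The sketch below is an approximation only: matchedKeywords() is a hypothetical helper, and its rule (case-insensitive substring match, every word of a phrase present in the text) is an assumption rather than Twitter's exact matching logic.

```php
<?php
/**
 * Return the tracked keywords that appear in a tweet's text.
 * matchedKeywords() is a hypothetical helper; the rule used here
 * (case-insensitive, every word of a phrase present) is an approximation.
 */
function matchedKeywords(array $keywords, string $text): array
{
    $matched = [];
    foreach ($keywords as $keyword) {
        $allWordsPresent = true;
        foreach (preg_split('/\s+/', trim($keyword)) as $word) {
            if ($word === '' || stripos($text, $word) === false) {
                $allWordsPresent = false;
                break;
            }
        }
        if ($allWordsPresent) {
            $matched[] = $keyword;
        }
    }
    return $matched;
}
```

A tweet may also match on content outside its text (for example an expanded URL), in which case this check returns an empty array.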
Migrated from code.google.com:
Following Twitter's recommendations, the application should first connect to the Twitter stream with the new predicates, and only then disconnect the current session. This is meant to minimize the probability of losing tweets.
First of all, sorry for my bad english and, second, THANKS for this awesome code.
I was having some problems when working with big amounts of tweets. I was getting this error code:
Object of class stdClass could not be converted to string
After searching for hours, i found this:
foreach ($entities->hashtags as $hashtag) {
    $where = 'tweet_id=' . $tweet_id . ' ' . 'AND tag="' . $hashtag->text . '"';
    if(! $oDB->in_table('tweet_tags',$hashtag)) {
        $field_values = 'tweet_id=' . $tweet_id . ', ' . 'tag="' . $hashtag->text . '"';
        $oDB->insert('tweet_tags',$field_values);
    }
}
After changing
if(! $oDB->in_table('tweet_tags',$hashtag)) {
for
if(! $oDB->in_table('tweet_tags',$where)) {
it began working perfectly.
Is this right?
Thanks
My application broke after merging in recent changes to Phirehose that add HTTP 1.1 support and chunked encoding in favor of delimited:length.
My application is no longer able to parse the JSON for most tweets.
I'm using code borrowed from ghetto-queue-consume.php, and I'm still debugging to get to the bottom of the issue. But the line in question appears to be
while ($rawStatus = fgets($fp, 8192))
I'm logging each $rawStatus. I see that each one is exactly 8192 bytes. (I don't remember this being the case before; I thought the 8192 referred to the maximum number of bytes per chunk.) Obviously this leads to the JSON statuses starting and ending at random places.
I hope with more digging I can clarify this ticket, but my top-level question is: Does ghetto-queue-consume.php (which hasn't been updated in some time) still work with the recent changes to support HTTP 1.1?
Issue migrated from code.google.com:
To replicate:
$locations = array(
array(
37.784317,
-122.401855,
30
)
);
I believe this is due to the bounds not being sorted; the bounding box apparently needs to be southwest first, northeast second, and setLocationsByCircle does not consider this.
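If the ordering really is the cause, a small normalisation step before the box is sent would address it. The sketch below assumes boxes in the [longitude, latitude, longitude, latitude] order used by the setLocations() samples in this thread; normalizeBoundingBox() is a hypothetical helper and does not handle boxes that cross the antimeridian.

```php
<?php
/**
 * Reorder a bounding box so the south-west corner comes first.
 * Input and output are [lon, lat, lon, lat]; normalizeBoundingBox() is hypothetical.
 */
function normalizeBoundingBox(array $box): array
{
    list($lon1, $lat1, $lon2, $lat2) = $box;
    return [
        min($lon1, $lon2), min($lat1, $lat2), // south-west corner
        max($lon1, $lon2), max($lat1, $lat2), // north-east corner
    ];
}
```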
Didn't do any debugging, but on OSX, when running sample.php, I can start the stream, then disconnect the wifi for 10 seconds and reconnect, the stream will come back. If I disconnect for 20-30 seconds or a minute then reconnect, the stream hangs and never reconnects. Snow leopard, OSX, php 5.3.8, streaming the spritzer stream. sample.php is unaltered other than username/password.
$myobj->setTrack(array('abc', 'rose'));
$myobj->consume();
If I want to modify the tracking keywords without disconnecting, how is that possible?
USER_AGENT is referenced in a few places in the UserstreamPhirehose.php class, but is not defined at the top of the class. I attempted to define it in my subclass, but it still didn't find it. (PHP 5.3.6)
I'm referencing this StackExchange post. Basically, all of the scripts appear to run fine with logging and reports being made. But I can't find the data! Where are these files written?
http://stackoverflow.com/questions/22153930/where-are-these-files-twtitter-userstream-via-phirehose
Hi,
What license is this source code covered under? Is it MIT?
Thanks,
Jatin
Hello,
I have been using this library for years but I am now experiencing an issue with it. I am running the example stream file on a macbook pro running mavericks. I am using the OSX version of PHP which has always worked fine. I am doing inserts to a mysql database on another home server (I am doing this while streaming data in, not using a queue yet). Even when I comment out inserts (to troubleshoot if this is a mysql connection issue) I still experience the same issue. I am using the latest release. There is never a set time interval in which this occurs. It is always random. I have tried everything I can think of to fix this but I am at a loss. Has anyone else experienced this issue? Does anyone know how to fix it.
@fennb thanks for this awesome library, I actually had a data scientist at Twitter recommend it to me the other day
Simply changing from:
const URL_BASE = 'http://stream.twitter.com/1/statuses/';
to:
const URL_BASE = 'https://stream.twitter.com/1/statuses/';
fails.
From the source:
All our Streaming API products are now supporting SSL and we've just updated the Streaming API Methods [1], User Streams [2] and Site Streams [3] documentation pages accordingly.
As we're planning to sunset HTTP support in about a month, we strongly encourage you to switch to SSL (HTTPS) as soon as possible, especially if you're still authenticating your Streaming API requests with Basic Auth. On a related note, Basic Auth will eventually be deprecated on the Streaming API, and we recommend that you switch to OAuth well in advance.
Implementation changes on your side should be as simple as changing http protocol for https
Just like for api.twitter.com, we are using Verisign SSL certificates on these domains. If for some reason you need the Verisign Root CA Certificate, you can obtain it directly from Verisign, or from this link: http://curl.haxx.se/ca/cacert.pem
On September 29th, the Streaming API will turn SSL only. While one month’s time is our plan, please be sure to give us your feedback on this discussion thread [4] if you think you needed more time. As always, if you have questions about the Streaming API, let us know on our Developers Discussions board.
[1] https://dev.twitter.com/docs/streaming-api/methods
[2] https://dev.twitter.com/docs/streaming-api/user-streams
[3] https://dev.twitter.com/docs/streaming-api/site-streams
[4] https://dev.twitter.com/docs/security/public-key
Blog URL: https://dev.twitter.com/blog/streaming-api-turning-ssl-only-september-29th
Please do not reply to this message; it was sent from an unmonitored email address. Visit https://dev.twitter.com/user to manage your subscriptions.
I just downloaded the latest version today. Using the ghetto example, the consumer doesn't get the full JSON tweet, only part of it. This is an example:
18026,"created_at":"Thu Mar 08 21:44:33 +0000 2012","utc_offset":7200,"time_zone":"Amsterdam","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"0084B4","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/509391853473775620\/dtSNKnQD_normal.jpeg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/509391853473775620\/dtSNKnQD_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/518908009\/1410299354","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweeted_status":{"created_at":"Wed Oct 01 07:58:12 +0000 2014","id":517221930462883840,"id_str":"517221930462883840","text":"Today\u2019s #competition Just favourite and retweet this tweet to be in with a chance of winning. Winner chosen at random tomorrow. 
GO!","source":"\u003ca href=\"https:\/\/about.twitter.com\/products\/tweetdeck\" rel=\"nofollow\"\u003eTweetDeck\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":25534147,"id_str":"25534147","name":"Volvo Trucks UK ","screen_name":"VolvoTrucksUK","location":"Warwick, United Kingdom","url":"http:\/\/www.volvotrucks.co.uk","description":"Total Commercial Vehicle Solution Provider, providing quality Volvo Trucks within the UK market http:\/\/www.volvotrucks.co.uk","protected":false,"verified":false,"followers_count":12956,"friends_count":1114,"listed_count":102,"favourites_count":584,"statuses_count":12151,"created_at":"Fri Mar 20 16:55:44 +0000 2009","utc_offset":3600,"time_zone":"London","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"FFFFFF","profile_background_image_url":"http:\/\/pbs.twimg.com\/profile_background_images\/676413883\/7b8aa0d3bee4d2f57b4da2cf10387585.jpeg","profile_background_image_url_https":"https:\/\/pbs.twimg.com\/profile_background_images\/676413883\/7b8aa0d3bee4d2f57b4da2cf10387585.jpeg","profile_background_tile":false,"profile_link_color":"2FC2EF","profile_sidebar_border_color":"FFFFFF","profile_sidebar_fill_color":"FFFFFF","profile_text_color":"0E083D","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/441162209654108160\/rRINCdyl_normal.jpeg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/441162209654108160\/rRINCdyl_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/25534147\/1405412142","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":216,"favorite_count":107,"entities":{"hashtags":[{"text":"c
ompetition","indices":[8,20]}],"trends":[],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en"},"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"competition","indices":[27,39]}],"trends":[],"urls":[],"user_mentions":[{"screen_name":"VolvoTrucksUK","name":"Volvo Trucks UK ","id":25534147,"id_str":"25534147","indices":[3,17]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"medium","lang":"en","timestamp_ms":"1412234939794"}
To me, it seems to be a problem with newlines, because the files generated by the collector don't contain any.
Phirehose docs state, "Phrases, keywords with spaces, are not supported.".
This might have been the case before, however it seems Twitter now supports this, and this repository appears to support it as well.
https://dev.twitter.com/docs/streaming-apis/parameters#track
I don't know if this is the correct place to ask this; if not, please delete it.
Could you give me an example of how to use Setlocation(), getlocation(), etc.? Some app or something. I don't know how to add it to my code.
thanks!
Hi, I'm having an issue while opening multiple user streams, I don't know if twitter could have an issue with this eventually, the problem is the following:
Application 1 starts and opens a user stream for User1 (everything goes as expected)
Application 2 starts and opens a user stream for User2 however this user stream is returning the timeline for both User1 and User2, I can filter the tweets to match only User2 mentions (in this case)
Application 3 starts and opens a user stream for User3 and I get also the streams of User1 and User2 as well, so I have these 3 apps on different terminals to supervise them and if User3 gets a mention I'll see the update pop on Application 3 window only, but if User1 gets a mention I'll see it pop on Application1, Application2 and Application3 windows.
Is this expected behavior? The more user streams I open, the more of the earlier streams I get in each new stream as well, while the old streams are still running.
I'm using Phirehose, but I have a problem with case sensitivity. When I search for, say, the hashtag #askdemi, the Streaming API returns exactly this hashtag; it doesn't find, for example, #AskDemi or #ASKDEMI. Can you help me? Where is the problem? Thanks.
Using this package is it possible to only get a users favourites?
Or in other words get a specific event?
Migrated from code.google.com:
->setLocations($locations)
$locations = array(
array(-122.75, 36.8, -121.75, 37.8),
array(-74, 40, -73, 41)
);
Error 401 UNAUTHORIZED
When running Phirehose using setTrack, I get the expected responses from Twitter and Phirehose. When using setLocation with the above data I always get a 401 Unauthorized error. I have used several different co-ordinate groups but this doesn't change the response. The above data is the sample data included in the Phirehose code base. I have checked the Twitter documentation for help but with no success.
Can you do a release so that 1558e18 gets released as part of stable?
Since the recent updates to the Streaming API (SSL), we have been struggling to get the latest version of Phirehose to work behind our proxy.
After making minor changes to the Phirehose script to work with our proxy, the error log we are getting is:
Error: [Tue Oct 18 13:59:37 2011] [error] [client 194.80.32.10] Phirehose: Connecting to twitter stream: https://stream.twitter.com/1/statuses/filter.json with params: array ( 'delimited' => 'length', 'track' => '#x$
[Tue Oct 18 13:59:37 2011] [error] [client 194.80.32.10] Phirehose: Resolved host stream.twitter.com to 199.59.148.138
[Tue Oct 18 13:59:37 2011] [error] [client 194.80.32.10] Phirehose: Connecting to 199.59.148.138
[Tue Oct 18 13:59:37 2011] [error] [client 194.80.32.10] Phirehose: Full URL: ssl://199.59.148.138:443
[Tue Oct 18 13:59:37 2011] [error] [client 194.80.32.10] Phirehose: TCP failure 20 of 20 connecting to stream: No route to host (113). Sleeping for 16 seconds.
Changes : $opts = array('http' => array('proxy' => 'tcp://wwwcache.lancs.ac.uk:8080', 'request_fulluri' => true));
$context = stream_context_create($opts);
//@$this->conn = fsockopen($scheme . $streamIP, $port, $errNo, $errStr, $this->connectTimeout,$context);
@$this->conn = stream_socket_client($scheme . $streamIP . ":" .$port, $errNo, $errStr, $this->connectTimeout, STREAM_CLIENT_CONNECT, $context);
We now think the cause could be that we open an insecure connection to the proxy, which then attempts to forward the plain-text request to Twitter. We were thinking it could perhaps be re-coded using cURL instead of fsock, but we aren't sure about this.
Thanks
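One approach that stays with PHP streams rather than cURL is to ask the proxy for a raw tunnel with an HTTP CONNECT request and only then start TLS on the socket. The following is a rough sketch under assumptions: all three functions are hypothetical (none is Phirehose code), the proxy is assumed to allow CONNECT to port 443 without authentication, and it has not been tested against the proxy mentioned above.

```php
<?php
/** Build the request that asks an HTTP proxy to open a raw tunnel. */
function buildConnectRequest(string $host, int $port): string
{
    return "CONNECT $host:$port HTTP/1.1\r\nHost: $host:$port\r\n\r\n";
}

/** True when the proxy's status line reports a 2xx response. */
function tunnelEstablished(string $statusLine): bool
{
    return preg_match('#^HTTP/\d\.\d 2\d\d\b#', $statusLine) === 1;
}

/**
 * Open a TLS connection to $host:$port through a plain-HTTP proxy.
 * openTlsViaProxy() is hypothetical; $proxy looks like "tcp://proxy.example:8080".
 */
function openTlsViaProxy(string $proxy, string $host, int $port, int $timeout = 5)
{
    $conn = stream_socket_client($proxy, $errNo, $errStr, $timeout);
    if ($conn === false) {
        throw new RuntimeException("Proxy connection failed: $errStr ($errNo)");
    }
    fwrite($conn, buildConnectRequest($host, $port));
    $statusLine = (string) fgets($conn);
    if (!tunnelEstablished($statusLine)) {
        fclose($conn);
        throw new RuntimeException('Proxy refused tunnel: ' . trim($statusLine));
    }
    // Discard the rest of the proxy's response headers (up to the blank line).
    while (($line = fgets($conn)) !== false && rtrim($line) !== '') {
        // Nothing to do with these headers in this sketch.
    }
    // Verify the certificate against the real host name, then start TLS.
    stream_context_set_option($conn, 'ssl', 'peer_name', $host);
    if (stream_socket_enable_crypto($conn, true, STREAM_CRYPTO_METHOD_TLS_CLIENT) !== true) {
        fclose($conn);
        throw new RuntimeException('TLS handshake failed');
    }
    return $conn;
}
```

With this shape the HTTP request to Twitter is written only after the handshake, so the proxy never sees it in plain text.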
Hi, we have an application based on the Twitter API, but since 14/01/14 we have been experiencing problems due to the upgrade of their API. Specifically, we use the Phirehose framework to retrieve streams for certain users, but we can no longer retrieve tweets as we did before that date. We receive some tweets sporadically, but not consistently. The log we see when connecting to Twitter is as follows:
04/03 19:51:09 - Consume rate: 1 status/sec (55 total), avg enqueueStatus(): 0.24ms, avg checkFilterPredicates(): 0ms (10 total) over 62 seconds, max stream idle period: 5 seconds.
04/03 19:52:11 - Consume rate: 1 status/sec (41 total), avg enqueueStatus(): 0.17ms, avg checkFilterPredicates(): 0ms (11 total) over 62 seconds, max stream idle period: 5 seconds.
04/03 19:53:11 - Consume rate: 1 status/sec (51 total), avg enqueueStatus(): 11.93ms, avg checkFilterPredicates(): 0ms (10 total) over 60 seconds, max stream idle period: 5 seconds.
04/03 19:54:11 - Consume rate: 1 status/sec (51 total), avg enqueueStatus(): 8.2ms, avg checkFilterPredicates(): 0ms (10 total) over 60 seconds, max stream idle period: 4 seconds.
04/03 19:55:12 - Consume rate: 1 status/sec (59 total), avg enqueueStatus(): 2.04ms, avg checkFilterPredicates(): 0ms (12 total) over 61 seconds, max stream idle period: 5 seconds.
04/03 19:56:12 - Consume rate: 1 status/sec (49 total), avg enqueueStatus(): 0.09ms, avg checkFilterPredicates(): 0ms (12 total) over 60 seconds, max stream idle period: 4 seconds.
04/03 19:57:12 - Consume rate: 1 status/sec (59 total), avg enqueueStatus(): 2.67ms, avg checkFilterPredicates(): 0ms (11 total) over 60 seconds, max stream idle period: 4 seconds.
04/03 19:58:12 - Consume rate: 1 status/sec (56 total), avg enqueueStatus(): 0.1ms, avg checkFilterPredicates(): 0ms (11 total) over 60 seconds, max stream idle period: 5 seconds.
04/03 19:59:12 - Consume rate: 1 status/sec (54 total), avg enqueueStatus(): 0.09ms, avg checkFilterPredicates(): 0ms (10 total) over 60 seconds, max stream idle period: 5 seconds.
04/03 20:00:01 - Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'follow' => '781293,807095,1169241,1652541,4157261,4898091,7520262,7556392,7557352,8354962,9825392,9974322,...,)
04/03 20:00:01 - Resolved host stream.twitter.com to 199.16.156.20, 199.16.156.110
04/03 20:00:01 - Connecting to ssl://199.16.156.20, port=443, connectTimeout=5
04/03 20:00:02 - Connection established to 199.16.156.20
04/03 20:00:02 - POST /1.1/statuses/filter.json HTTP/1.1
Host: stream.twitter.com:443
Connection: Close
Content-type: application/x-www-form-urlencoded
Content-length: 2825
Accept: */*
Authorization: OAuth realm="",oauth_consumer_key="xxxxxxxxxxxxxxxxxx",oauth_nonce="12a852fc6ca90588971d7eab192aa709",oauth_signature_method="HMAC-SHA1",oauth_timestamp="1393959602",oauth_version="1.0A",oauth_token="xxxxxxxxxxxxxxx15-wxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",oauth_signature="powpMeuX2bL99ORYfE9FIhyBUyg%3D"
User-Agent: Phirehose/1.0RC +https://github.com/fennb/phirehose
Please we need your help to resolve this issue.
Thanks in advance
Got this warning today (and an infinite loop):
E_WARNING. Error message was "socket_last_error(): supplied resource is not a valid Socket resource" in file /var/www/tjournal.ru/protected/classes/vendor/fennb/phirehose/lib/Phirehose.php at line 522.
I am using the sample file, i.e. phirehose_0.2.4\example\sample.php (after setting the username and password values to my Twitter credentials), but I keep getting the following kind of errors:
......
[Sat Dec 24 14:11:09 2011] [error] [client 127.0.0.1] Phirehose: Connecting to 199.59.148.138
[Sat Dec 24 14:11:12 2011] [error] [client 127.0.0.1] Phirehose: TCP failure 5 of 20 connecting to stream: No connection could be made because the target machine actively refused it.\r\n (10061). Sleeping for 16 seconds.
[Sat Dec 24 14:11:28 2011] [error] [client 127.0.0.1] PHP Fatal error: Maximum execution time of 30 seconds exceeded in C:\wamp\www\phirehose_0.2.4\lib\Phirehose.php on line 482
.......
Could you please tell me what the issue could be? Sorry if this kind of problem is not meant to be posted here, but I could not find any discussion forum for it.
Thanks
This might be the most useless issue report ever, and I'm still investigating, however:
Running the latest version of Phirehose on Laravel's Homestead VM, which uses PHP 5.5.18, I receive a 401 when I try to connect to an OAuth filter stream. However, the exact same code works when run locally on OS X with PHP 5.4.24.
I would assume the issue is either with the VM (potentially with SSL, though given that I get a well-formed response I wouldn't have thought so) or with PHP 5.5.
I am going to try upgrading to 5.5 locally to see whether that reproduces the problem and rules out the VM, but if anyone has encountered this before, let me know.
WORKING
Phirehose: Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'track' => 'goodnight,hello,morning,the',)
Phirehose: Connecting to ssl://stream.twitter.com, port=443, connectTimeout=5
Phirehose: Connection established to stream.twitter.com
Phirehose: POST /1.1/statuses/filter.json HTTP/1.1
Host: stream.twitter.com:443
Connection: Close
Content-type: application/x-www-form-urlencoded
Content-length: 39
Accept: */*
Authorization:
BROKEN
Phirehose: Connecting to twitter stream: https://stream.twitter.com/1.1/statuses/filter.json with params: array ( 'track' => 'goodnight,hello,morning,the',)
Phirehose: Connecting to ssl://stream.twitter.com, port=443, connectTimeout=5
Phirehose: Connection established to stream.twitter.com
Phirehose: POST /1.1/statuses/filter.json HTTP/1.1
Host: stream.twitter.com:443
Connection: Close
Content-type: application/x-www-form-urlencoded
Content-length: 39
Accept: */*
Authorization:
I made a mistake entering my twitter credentials but instead of throwing exception, filter-oauth.php example entered an infinite loop trying to authorize with wrong credentials and getting:
HTTP/1.1 401 Authorization Required
cache-control: must-revalidate,no-cache,no-store
connection: close
content-length: 266
content-type: text/html
date: Tue, 03 Feb 2015 16:51:53 UTC
server: tsa
www-authenticate: OAuth realm="Firehose"
x-connection-hash: 870557403e3c5879d660975bc7e5d976
<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>\n<title>Error 401 Unauthorized</title>
</head>
<body>
<h2>HTTP ERROR: 401</h2>
<p>Problem accessing '/1.1/statuses/filter.json'. Reason:
<pre> Unauthorized</pre>
</body>
</html>
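A response like the one above will not change on the next attempt, so one option is to classify the status code and stop instead of reconnecting. This is a sketch only: parseHttpStatus(), shouldRetry() and the list of codes treated as permanent are assumptions for illustration, not existing Phirehose behaviour.

```php
<?php
/** Extract the numeric status code from an HTTP status line. */
function parseHttpStatus(string $statusLine): int
{
    if (preg_match('#^HTTP/\d\.\d (\d{3})#', $statusLine, $m) !== 1) {
        throw new UnexpectedValueException("Not an HTTP status line: $statusLine");
    }
    return (int) $m[1];
}

/**
 * Decide whether a failed connection attempt is worth retrying.
 * The list of permanent codes is an assumption for illustration.
 */
function shouldRetry(int $httpStatus): bool
{
    // Credential and request errors will fail the same way on every attempt.
    $permanent = [400, 401, 403, 404, 406, 413, 416];
    return !in_array($httpStatus, $permanent, true);
}
```

A caller could throw an exception when shouldRetry() returns false, which would surface wrong credentials immediately rather than after an endless loop.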
My script that consumes statuses seems to run fine for a couple hours (or days) using only 1-3% CPU - but then suddenly it jumps to 100% CPU usage and no longer records anything from the stream. Since the library seems to cover all the basics like reconnect, I'm not sure where to start debugging. No errors, no output.
I buffer 100 statuses, and then just dump them to file. Files rotate every hour. Problem seems to be with the connection.
I am using the OAuthPhirehose library to track keywords on Twitter, however when tracking any keyword with a space a HTTP ERROR 401: Unauthorized error would be returned.
It seems Phirehose does not correctly encode the post data for Twitter. http_build_query < PHP 5.4 does not support PHP_QUERY_RFC3986. As a quick hack, I added the following line after the http_build_query call to change the encoding of spaces. This works for me, but your mileage may vary.
$postData = http_build_query($requestParams, NULL, '&');
$postData = str_replace('+', '%20', $postData); // MODIFICATION: Convert + to %20 as per PHP_QUERY_RFC3986
$authCredentials = $this->getAuthorizationHeader();
I don't have time to submit a patch, but hopefully someone else will.
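For installations on PHP 5.4 or later, the same result can be had without the str_replace() step by passing PHP_QUERY_RFC3986 as the fourth argument. A minimal sketch, with made-up keywords as the only assumption:

```php
<?php
// Hypothetical request parameters, for illustration only.
$requestParams = ['track' => 'hello world,rose'];

// Default encoding (PHP_QUERY_RFC1738): spaces become "+".
$legacy = http_build_query($requestParams, '', '&');

// PHP 5.4+: RFC 3986 encoding, spaces become "%20".
$postData = http_build_query($requestParams, '', '&', PHP_QUERY_RFC3986);
```

Here $legacy is track=hello+world%2Crose and $postData is track=hello%20world%2Crose. A literal "+" inside a keyword is already encoded as %2B by http_build_query(), which is why the str_replace() hack above only touches encoded spaces.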
Migrated from code.google.com:
Replace the code line (line 458):
throw new ErrorException("Unable to resolve hostname: '" . $urlParts['host'] . '"');
With the following code:
$tcpRetry = ($tcpRetry < self::TCP_BACKOFF_MAX) ? $tcpRetry * 2 : self::TCP_BACKOFF_MAX;
$this->log("Unable to resolve hostname '" . $urlParts['host'] . "' Sleeping for ". $tcpRetry . " seconds.");
sleep($tcpRetry);
continue;
Please note: others already indicated that 457 also has an error. This line originally reads
if (count($streamIPs) == 0) {
and should be changed into:
if (empty($streamIPs)) {
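The doubling in the replacement code can exceed the cap when the current value is not a power of two, so a clamped variant may be a little safer. A sketch, assuming a 16-second cap as seen in the logs quoted in this thread; nextBackoff() is a hypothetical helper and does not read TCP_BACKOFF_MAX.

```php
<?php
/**
 * Next sleep interval for a capped exponential backoff.
 * nextBackoff() is hypothetical; the default 16-second cap mirrors the quoted logs.
 */
function nextBackoff(int $current, int $max = 16): int
{
    // Treat zero or negative input as 1 so the sequence can always grow.
    return min(max(1, $current) * 2, $max);
}
```

Starting from 1, repeated calls give 2, 4, 8, 16 and then stay at 16.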
I want to get the timeline for a different user than the one who created the app. I get the OAUTH_TOKEN and OAUTH_SECRET for that user and use them in the script (userstream-simple.php), but the list of friends returned is not correct. It seems to be the list of friends of the user who created the app on Twitter.
My goal is to use User Streams to get timelines for multiple users.
Thanks,
Petre Tudor
I just upgraded to the latest Phirehose code after the Twitter SSL/TLS change on Jan 14th made my phirehose script stop connecting (403 Forbidden).
So it's working again now, but the latest code seems to have a bug in the way it processes the buffer, making it call enqueueStatus() with an empty string ("\r\n", technically).
I'm not sure if this is an existing bug in the Phirehose code, or if something changed with the Twitter API along with the connection security enforcement.
Whatever the exact cause, it appears that sometimes the fread() call on the stream loads in an extra '\r\n', and $s becomes '\r\n\r\n', so even after one set of line breaks is trimmed off by substr() a second one still remains, which is fed in to enqueueStatus() as an empty string.
I'm not sure why it's happening though - maybe the Twitter stream is sending an empty message containing just this extra '\r\n' sequence? I'm new to stream processing so I'm not sure of the best way to get a view at what is happening.
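Whatever the source of the extra line break (it may simply be a keep-alive newline on the stream), a defensive check at the top of enqueueStatus() avoids processing it. A minimal sketch; isBlankStatus() is a hypothetical guard, not part of Phirehose.

```php
<?php
/**
 * True when a raw status carries no payload (for example a bare "\r\n").
 * isBlankStatus() is a hypothetical guard, not part of Phirehose.
 */
function isBlankStatus(string $status): bool
{
    return trim($status) === '';
}
```

Inside an enqueueStatus() override this would be used as an early return: if (isBlankStatus($status)) { return; }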
hi guys :)
does anyone know how to get the coordinates of a tweet using phirehose enqueueStatus (streaming) ?
thanks in advance
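In the status JSON, the coordinates field is either null or a GeoJSON point whose inner coordinates array is ordered longitude, then latitude. A sketch of reading it from the raw string that enqueueStatus() receives; tweetCoordinates() is a hypothetical helper.

```php
<?php
/**
 * Extract longitude and latitude from a raw status.
 * Returns null when the tweet carries no point; tweetCoordinates() is hypothetical.
 */
function tweetCoordinates(string $status): ?array
{
    $data = json_decode($status, true);
    if (!is_array($data) || !isset($data['coordinates']['coordinates'])) {
        return null;
    }
    // GeoJSON order: longitude first, then latitude.
    list($lon, $lat) = $data['coordinates']['coordinates'];
    return ['lon' => $lon, 'lat' => $lat];
}
```

Many tweets have "coordinates":null (as in the sample payload earlier in this thread), so callers should expect null often.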
Migrated from code.google.com:
What steps will reproduce the problem?
consume() hangs in the while loop:
while ($bytesLeft > 0 && $this->conn !== NULL && !feof($this->conn)...
because $bytesLeft never reaches zero and the server still believes the stream is open.
Fixed by putting extra check in while() statement:
(the part && (time() - $lastStreamActivity) <= $this->idleReconnectTimeout) -- is new)
while ($bytesLeft > 0 && $this->conn !== NULL && !feof($this->conn) && ($numChanged = stream_select($this->fdrPool, $fdw, $fde, 0, 20000)) !== FALSE && (time() - $lastStreamActivity) <= $this->idleReconnectTimeout) {
$this->fdrPool = array($this->conn); // Reassign
$this->buff .= fread($this->conn, $bytesLeft); // Read until all bytes are read into buffer
$bytesLeft = ($statusLength - strlen($this->buff));
}
Also added a log entry just after the while() { } area to track issue:
if ((time() - $lastStreamActivity) > $this->idleReconnectTimeout) {
$this->log('Idle timeout: Stream died in the middle of package reception. '.$bytesLeft.' bytes left. Data='.$this->buff);
}
Is there a way to specify parameters such as with, replies, delimited, etc., as defined by Twitter (https://dev.twitter.com/streaming/reference/get/user), in Phirehose?
Could anyone help me out with that?
Hi
I found that there is no CPU problem with any filter condition other than setLocations.
This is the filter condition I use:
$sc -> setLocations(array(
array(-180,-90,180,90)//Any geotagged Tweet
));
In enqueueStatus(), I just print statuses like this :
$data = json_decode($status, true);
if (is_array($data) && isset($data['user']['screen_name'])) {
print $data['user']['screen_name'] . ': ' . urldecode($data['text']) . "\n";
}
Result : CPU immediately 100%
server environment is :
OS : RHEL6
PHP : 5.4
Thanks
Hello,
I have the Phirehose PHP interface running smoothly with the Twitter Streaming API. I am using the filter method and setting the location that I want to log tweets from.
The issue is that I only receive tweets and no retweets. Is this a limitation of the API itself, or is there anything I can do in the code to get the retweets (I only need the retweets)? Any tips?
Thank you,
I've built an app that will capture tweets that are sent as replies to a specific account.
I've noticed that this is pretty spotty at best. There are times when it captures things perfectly, and others when enqueueStatus() just doesn't seem to get called or passed any data.
Has anyone else experienced this? The very first thing I do inside my method is this
if(is_array($data) && $data['in_reply_to_status_id'] !== null && isset($data['id_str'])) {
Logger::output($data['user']['screen_name'] . ': ' . urldecode($data['text']));
}
Where Logger::output() is just a method that prints out the string. There are times when this simply doesn't happen even though a reply has been sent. Does this mean Phirehose is failing to get the tweet, or is the Streaming API simply not capturing it for some reason (spam filter perhaps)?
Anyone else experience this?
https://github.com/reactphp/
http://reactphp.org/
ReactPHP provides a framework for asynchronous event-driven PHP programming. This would allow a script to do more than just parse tweets. It would also move a lot of the logic out of Phirehose, allowing it to be more lean and easy to understand and modify.
I'd be willing to help with this effort if you think it'd be a good direction to go.
I'm testing the source code and it's great (running the example script), but if I change:
$stream->setTrack(array('recipe')); // to
$stream->setTrack(array('spanish string')); /* i have:
date error| /script..php-> insert | You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 | INSERT INTO json_cache SET raw_tweet = "Tjs=", tweet_id =
*/
any idea ?
There are a few broken links on the Introduction page of the Wiki to old Twitter docs.
I've updated the links here: https://github.com/jdbevan/phirehose/wiki/Introduction
Stream doesn't seem to be running today from Twitter... anyone else experiencing?
https://github.com/fennb/phirehose/blob/master/example/filter-oauth.php#L26
At least, it is the only example that contains an exit.
Hi guys, I recently tried to run the filter-oauth.php example and I get this error:
Fatal error: Uncaught exception 'PhirehoseConnectLimitExceeded' with message 'TCP failure limit exceeded with 21 failures. Last error: Unable to find the socket transport "ssl" - did you forget to enable it when you configured PHP?' in C:\wamp\www\strtt\Phirehose.php on line 580
any ideas?
thanks in advance
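That message usually means PHP was built or configured without OpenSSL, so the ssl:// transport is not registered. A quick check, as a sketch; sslTransportAvailable() is a hypothetical helper.

```php
<?php
/** Report whether this PHP build can open ssl:// sockets. */
function sslTransportAvailable(): bool
{
    return extension_loaded('openssl')
        && in_array('ssl', stream_get_transports(), true);
}
```

If this returns false on a WAMP install, enabling the OpenSSL extension in php.ini and restarting the server is the likely fix, though the exact setting name depends on the PHP version.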
How do I get results for keywords containing a single quote (apostrophe)?
$sc->setTrack(array('hasan'));
Tracking hasan returns results, but tracking hasan'ın does not.
Thank you for your help.
It would be very nice if you could tag every change made. Without tags, it is harder to use with Composer reliably and conveniently.