keithamoss / scremsong

Screm song! Screm song!
This view allows a person to triage the stream of tweets, photos, et cetera by performing a few key actions:
The view is a collection of columns representing a stream of content (Twitter mentions, Twitter searches, Instagram searches). One column represents one API call.
Content should be updated in near-real time.
When a new tweet arrives for a thread that's been marked as DONE, we want to set it back to PENDING and let the assigned user know. This will facilitate more complex use of threads, e.g. multiple discussions, or favouriting/retweeting a nice no-action-required reply that we get.
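The re-open behaviour above might look something like this minimal sketch. The dict shape, status strings, and `notify_user` callback are all illustrative assumptions, not Scremsong's actual code:

```python
def on_new_thread_tweet(assignment, notify_user):
    """Re-open a DONE assignment when a new tweet arrives for its thread,
    and let the assigned user know. (Hypothetical names throughout.)"""
    if assignment["status"] == "DONE":
        assignment["status"] = "PENDING"
        notify_user(assignment["user_id"], "One of your threads has a new reply")
    return assignment
```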
Are web sockets still the blessed way of pushing data to sync up state in multiple remote clients? Where do service workers come in?
Right now we're hackily pulling data back via `window.setInterval` and it's all a bit awful. Before we go too much further down that path, let's investigate the blessed way to push data to clients.
We'll need to make it work for:

- Triagers: `loadMoreRows` (these always need to be stored)
- Reviewers
- Application
Resources
- Break `app.tsx` up into its respective modules (re-ducks and redux-typed-modules - "Strongly typed Redux modules made easy!")
- `assignment_id` for easier updating in future
- `reviewer_id` and `review_status`
- `getState()`
- Serializers on the backend

Just visually hidden (data will still be received).
Show the user profile picture in the column header.
Simple select menu to choose between seeing all columns or just your columns.
Relates to #24 down the line.
Show "+ X more new tweets" below/next to the tweet.

Misc
We don't use Tweepy much yet - only for streaming and backfill - but we're going to be using it more for threading. Make sure we're following best practice for catching and handling rate limits errors.
http://docs.tweepy.org/en/latest/code_snippet.html#handling-the-rate-limit-using-cursors
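The pattern from the linked docs, adapted slightly so the sleep is injectable for testing. `RateLimitError` here is a self-contained stand-in for `tweepy.RateLimitError`:

```python
import time

class RateLimitError(Exception):
    """Stand-in for tweepy.RateLimitError so the sketch is self-contained."""

def limit_handled(cursor, window_seconds=15 * 60, _sleep=time.sleep):
    """Wrap a cursor/iterator, sleeping out the 15-minute rate-limit window
    whenever Twitter rejects a request, then resuming where we left off."""
    while True:
        try:
            yield next(cursor)
        except RateLimitError:
            _sleep(window_seconds)
        except StopIteration:
            return
```

With real Tweepy this would wrap a `tweepy.Cursor(...).items()` iterator, per the linked snippet.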
Link from each tweet to its twitter.com reply/favourite/retweet page on:
We'll fall back to modals using web intents.
c.f. #40
Is this actually an issue? (We're seeing stale data, but so what?)
Tweepy looks good.
We'll need to handle:
Required database tables:
Documentation:
Streaming With Tweepy
Consuming streaming data
Streaming message types
Filter realtime Tweets
Standard streaming API request parameters
Some potentially useful results for the future:
Twitter Stream re-connect best practices (For python-twitter, not Tweepy - but the principles are the same.)
Tweepy. Make stream run forever
Twitter Streaming API limits?
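A sketch of the reconnect principle those threads describe: keep the stream running forever, reconnecting with exponential back-off on network errors. The function names and the 320-second cap are assumptions based on Twitter's published reconnect guidance, not our code:

```python
import time

def run_stream(connect, max_backoff=320, _sleep=time.sleep):
    """Keep a streaming connection alive. `connect()` blocks while the stream
    is healthy; on network errors we reconnect with doubling back-off.
    Returning False from connect() stops the loop (handy for testing)."""
    backoff = 5
    while True:
        try:
            if connect() is False:
                return
            backoff = 5  # a healthy session resets the back-off
        except ConnectionError:
            _sleep(backoff)
            backoff = min(backoff * 2, max_backoff)
```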
- `shouldComponentUpdate` OR a tweet selector on `UserReviewQueueViewContainer` and `TweetColumnContainer` so that they only update when the assignments/column change (which hold the ids of the tweets), not when any tweet object changes
- assignments, or just that there's a `tweet_assignment`?
- `TweetColumn`: look for performance improvements
- Does `loadMoreRows` work for future tweets (scrolling up)? How does it know that tweets are being added at the top and not elsewhere?

To me this feels like my preferred mode of triage, but others want to see everything.
As a user I should be able to toggle between seeing all tweets and only seeing tweets that haven't been marked as DONE.

Some considerations:

- (walk the `tweet_ids` array back up to find the closest one, maybe?)
- This should be part of a new settings screen and allow for fine-grained control for that user, i.e. checkboxes for seeing Dismissed, Assigned, Done, Closed.
Edit: Decision made not to implement this - would cause more problems (e.g. needing to undo, making mistakes) than it's worth.
Right now we rely entirely on `loadMoreRows` from react-virtualized to handle loading tweets. React-virtualized has the option to show "placeholders" - for example, as the user is scrolling or as `loadMoreRows` is loading data. As we move towards issues like #26 we'll potentially need to sync in a large number of tweets.

Let's investigate this further and see how we can use it to improve the UX: showing at least something for the tweet placeholders rather than blank spaces.

We could avoid the need to have all tweet objects client-side if we instead showed placeholder tweets in the columns as the user is scrolling (up or down), fired events up to the store, debounced those events OR waited until the user stops scrolling, then fetched the tweets as a batch, shoved them into the store, and replaced the placeholder tweets with the actual tweets (recalculating the cell heights).
This would handily remove issues around having to send potentially huge payloads when the user is scrolled a long way down the column and we're trying to send them all tweets between now and their current position.
Relates to #26.
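The batching idea above could be sketched like this (Python for illustration, even though this logic would live in the TypeScript frontend; all names are hypothetical):

```python
class PlaceholderBatcher:
    """Collect ids of placeholder tweets scrolled into view, then fetch them
    in one batch once scrolling goes idle (debouncing happens upstream)."""

    def __init__(self, fetch_batch):
        self.fetch_batch = fetch_batch  # callable: [ids] -> {id: tweet}
        self.pending = set()

    def on_placeholder_visible(self, tweet_id):
        self.pending.add(tweet_id)

    def on_scroll_idle(self):
        """Fetch everything seen since the last idle, in one request."""
        if not self.pending:
            return {}
        batch, self.pending = sorted(self.pending), set()
        return self.fetch_batch(batch)
```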
In addition to the standard set of tweet actions and our Scremsong actions (assign to user, pre-canned reply, mark as read), we may like to have further custom actions in the future to:
This is required if we can't get an increased rate limit per #21. Refer to #6.
Just a visual at this stage. Call the API directly.
Actually, given the rate limit endpoint itself is rate limited we might want to cache it locally too. Then we can make use of it in stuff like threading where we want to start backing off or changing behaviour based on our API usage. Think about what our fallback position is for "we've been rate limited in this 15 minute period and the queue of tweets to process is building up" (e.g. In that "mode" we can assign sub-tweets to people as individual tweets that have no thread relationship).
Actually, we should be refreshing and logging our rate limits pretty frequently so we can do post-election analysis of how close we came. This would be paired with logging from the functions that use the API to give us a clear picture of how we're using the API and help inform future enhancements/changes in application behaviour.
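Since the rate-limit-status endpoint is itself rate limited, a small local cache would let us consult (and log) it freely. A sketch, where the names and the 60-second TTL are assumptions:

```python
import time

class RateLimitCache:
    """Cache the rate-limit-status payload locally, refreshing at most
    once per `ttl` seconds, so frequent consumers don't burn API calls."""

    def __init__(self, fetch, ttl=60, _now=time.monotonic):
        self.fetch, self.ttl, self._now = fetch, ttl, _now
        self._value = None
        self._fetched_at = float("-inf")

    def get(self):
        # Refresh only when the cached copy has expired
        if self._now() - self._fetched_at >= self.ttl:
            self._value = self.fetch()
            self._fetched_at = self._now()
        return self._value
```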
Future enhancements:
https://developer.twitter.com/en/docs/tweets/compliance/overview
Will need to handle:
- `shouldComponentUpdate` in `TweetColumn`
- `classes: any`
Manually dispatching tweets is tiresome.
Set up an API endpoint that takes the following options:
It will then:
`tweet_id` in the data store and add `n`
Add a `scremsong_social_media_cache` table to cache the results of appropriate API calls:

- Id
- Platform (e.g. Twitter, Instagram, ...)
- Request Type (e.g. GET, POST)
- Endpoint (e.g. /searches/tweets)
- Parameters (URL encoded parameter string)
- Last Requested (Timestamp)
- Result (BLOB)
- TTL (Timestamp)
- In Progress (Boolean)
- Pruning Strategy (Age, # of Items)
- Pruning Threshold (Per Above)
- TTL Base
- TTL Increment
- TTL Max

The TTL timestamp will increase using a back-off approach if we receive no new data. `sleep` and wait a few times, then fail back to the client.

`marginTop` error

`hashtag` with the same value as `text` ]) on rendering each card

Use the Docker container approach we took with Ealgis. Only rebuild and redeploy when there's a new version minted.
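The TTL Base / TTL Increment / TTL Max trio suggests expiry logic along these lines (a sketch; the units, defaults, and the miss-counter approach are assumptions):

```python
from datetime import datetime, timedelta

def next_ttl(last_requested, got_new_data, misses,
             ttl_base=60, ttl_increment=60, ttl_max=900):
    """Return (new TTL timestamp, updated miss count). Each refresh that
    yields no new data stretches the TTL by ttl_increment seconds, capped
    at ttl_max; fresh data snaps it back towards ttl_base."""
    misses = 0 if got_new_data else misses + 1
    seconds = min(ttl_base + misses * ttl_increment, ttl_max)
    return last_requested + timedelta(seconds=seconds), misses
```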
(a "nuke and rebuild" approach)
A neat feature for the triage view for really busy elections would be keyboard shortcuts to:

- move `up` and `down` tweets in a column
- move `left` and `right` between columns
- assign a tweet (`a`) to a specific user (`1`, `2`, ... `n`) or unassign it (`u`)
- (`d`)
- (`r`)

This should help a lot with really high volume elections.

Implementation will be interesting since we'll need to pass messages down to columns ("the currently selected tweet is id `blah` - style that as selected") and have actions fire off that listen to keyboard events. It feels like wrapping the triage view in a new `KeyboardShortcuts` component, or having it sit in `TriageView` (like `TweetColumnAssigner`), might be the easiest way to go.
Comment: Not all triagers would use this, and those that would may not have enough time to develop and reinforce the muscle memory that makes it useful.
We already send back assignments, users, et cetera on `connect`, so I think we just need to handle sending back new tweets that have arrived since the user was disconnected.

We could do this as part of an `onreconnect` action called in `onconnect` if `app.disconnected` was `false` - we'd send the highest tweetId we knew about and the backend could just send back everything that's happened since.

`onconnect`?

We might need to break the `onconnect` data sending out into a separate resync process. That could look something like:
See also #24
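The resync step might look like this on the backend (a sketch; the message shape and `tweet_store` are hypothetical stand-ins for whatever holds recent tweets server-side):

```python
def on_reconnect(client_highest_tweet_id, tweet_store):
    """Send back every tweet that arrived while the client was disconnected,
    given the highest tweet id the client says it has seen."""
    missed = [t for t in tweet_store if t["tweet_id"] > client_highest_tweet_id]
    return {
        "msg_type": "resync",
        "tweets": sorted(missed, key=lambda t: t["tweet_id"]),
    }
```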
#democracysausage
Wiping CloudFront removed the old JS files, but then people got a blank page until they refreshed again: it just sends index.html back for requests for those assets.

Do we have no-cache on index.html so it doesn't serve up old files?

Hacky workaround: some code embedded in index.html that refreshes the page if assets aren't loaded.
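One way to avoid stale pages is to split cache headers by asset type at deploy time. A sketch; the exact header values are a common convention for hashed-asset builds, not necessarily what Scremsong ships:

```python
def cache_headers(path):
    """index.html must never be cached (it points at the hashed asset names);
    the content-hashed JS/CSS bundles can be cached effectively forever."""
    if path.endswith("index.html"):
        return {"Cache-Control": "no-cache"}
    return {"Cache-Control": "public, max-age=31536000, immutable"}
```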
Thanks to the strict rate limiting on Twitter we may need to have an API usage view for admins to examine the prettified output of rate-limit-status. We may also need log requests so we have the underlying data required to analyse and tweak our caching strategies.
When replying to a post it should be possible to select from a set of pre-canned responses that will pre-fill the reply box. Users will still be able (and encouraged) to edit them, and they won't post automatically.
Users should be able to define their own additional pre-canned responses.
Ideas for improving how we collect, use, and display tweet threads.
The view for regular users exists to process tasks that have been assigned to them - e.g. tweets to reply to.
Tasks will be displayed as native content (tweets, grams) with relevant actions: reply, quote + retweet. Once an action is taken the user should be prompted to "mark as done". Tasks will be able to be marked as done separately (to allow for "no action required" scenarios).
Replying will happen in-line if we're under the rate limit for this 15-minute block. If not, and the backend returns a rate limit error, we'll fall back to modals using web intents.
Threads will need to be updated to include replies. Here is one approach. To conserve rate limits we may have to require the user to take an action to see updates - e.g. clicking refresh, or only seeing one assigned tweet at a time.
Notifications of new content should be displayed to the user in some fashion e.g.
Any user can switch to the view of any other user and 'act' as them to action their tweets. (e.g. Makes clearing queues easier if they go out to vote.)
Users should be able to "go offline", indicating they're not currently able to process tweets. This flag would show next to their name in the assignment list.
If possible (within rate limits) replies should also be updated in near-real time and shown. If not possible, we'll have to re-think the UI and UX.
Reading:
The idea for the future is to allow users to:
Right now the backend is handling all of the:
We had an idea about moving this logic all to the frontend, so the backend just sends all of the information about columns and any tweets and the frontend:
Handling this client side might look like adding some selectors in between receiving the action from the web socket and dispatching the action. Selectors seem like a better fit because reducers should just be simply modifying the store, not making decisions about what goes in the store.
Considerations:
Clearing `get_social_columns_cached()` in `twitter_streaming.py` will need to be handled. Not sure how that sort of cross-process clearing would even work given Django is separate to Celery.

Our workaround to the react-virtualized issue of adding tweets at the top of an infinitely loading list is to:

`componentDidUpdate`
Tweet threads are hard - there's no API concept for them. We're getting replies to our replies streamed in by virtue of including @DemSausage
as one of our search terms - BUT Twitter's API has removed all of the streaming endpoints for tweets from a given account. As such, we can't get our replies to people and this stops us following the tweet chain down from the assigned tweet through to all replies.
Threads will need to be updated to include replies. Here is one approach. To conserve rate limits we may have to require the user to take an action to see updates - e.g. clicking refresh, or only seeing one assigned tweet at a time.
- `resolve_tweet_parents(tweetId)`: finds all of the parent tweet objects (from local, from remote) for a given tweet. Returns `tweets[]`.
- `resolve_tweet_children(tweetId)`: given a parent tweet (a tweet that has no in_reply_to), finds all children of all tweets (except to @DemSausage). Returns `tweets[]`.
- `build_relationship(tweets[])`: given a set of tweets that represent a complete relationship, builds a new relationship object (per below).
- `create_assignment(tweets[])`: given a set of tweets that represent a complete relationship, calls `build_relationship()` and saves a new assignment row. Always saves new tweet objects. Sets a `created_on` and `last_updated_on` date.
- `update_assignment(assignmentId)`: given an assignment, refreshes all of its tweets via `resolve_tweet_children()` and, if needed, updates the assignment in the database. If an assignment was marked as DONE, changes it back to PENDING. Always saves new tweet objects. Updates the `last_updated_on` date.

When a new tweet arrives, call `resolve_tweet_parents()`. If the parent is NOT part of an assignment, save the tweets we found and issue a `NEW_TWEET` event for the original tweet. If the parent IS part of an assignment, call `resolve_tweet_children()` to refresh the thread, pass it to `update_assignment()`, and then issue `NEW_TWEET`, `UPDATED_ASSIGNMENT`, and if necessary a `COMPLETED_ASSIGNMENT_WAS_UPDATED` event that sends all of the new tweet objects along. Show the assigned user a notification that one of their assignments has been updated, highlight that on the queue UI in some fashion, and show the tweet as already being part of an assignment in the triage UI.

When assigning a tweet, call `resolve_tweet_parents()` if necessary, pass that or the tweet itself to `resolve_tweet_children()`, pass that to `create_assignment()`, and then issue `NEW_ASSIGNMENT` events that send all of the new tweet objects along to update the queue and triage UIs.

Use a `dirty` flag to distinguish tweets we saved but couldn't resolve stuff for, and thus didn't send to the clients.

`get_tweets_from_user_since_tweet_from_api()`?

`userIds`
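`resolve_tweet_parents()` could be sketched as a walk up the `in_reply_to_status_id` chain. The data shapes are hypothetical; `local_tweets` and `fetch_tweet` stand in for the database and the Twitter API:

```python
def resolve_tweet_parents(tweet_id, local_tweets, fetch_tweet):
    """Collect the given tweet and every parent above it, checking the local
    store first and falling back to the API, caching anything fetched."""
    chain = []
    current_id = tweet_id
    while current_id is not None:
        tweet = local_tweets.get(current_id)
        if tweet is None:
            tweet = fetch_tweet(current_id)   # remote fallback
            local_tweets[current_id] = tweet  # cache it locally
        chain.append(tweet)
        current_id = tweet.get("in_reply_to_status_id")
    return chain  # the tweet itself first, the thread root last
```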
Enhancements

Edit: Ooh, a more elegant approach to the data fetching would be to have tweet streaming take care of filling in missing in-reply-to tweets before it writes the received tweet to the database. Then we just need to handle the UI side (which can be separate Tweet components with some of our own CSS applied).
Do a backend POC of the structure for one thread, wire it up to the frontend loosely, then build the frontend around that and make sure it works end-to-end. Then we can build out the rest of the backend logic (Celery queues, et cetera).
- `since_id`.
- `since_id`. Don't show tweets from us in the triage columns.
- "Your assigned tweets have some new replies".
- `<Tweet />` elements for each, without worrying about doing fancy threading.
- `parent_id`, `thread_id`, and `tweet_id`
- `parent_id`, `thread_id`, and `thread_data` (JSON field with `tweet_ids: string[]` and `relationships: <some json>`)
- `tweet_ids` as a flat array of all tweets in the relationship that we CAN index.

Relationships example:
[
"1",
"2",
{
"tweet_id": 3,
"children": [
"10",
{
"tweet_id": 11,
"children": ["20", "21"]
},
"12"
]
},
"4"
]
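Deriving the flat, indexable `tweet_ids` array from that nested structure is a small recursive walk (a sketch against the example shape above):

```python
def flatten_relationships(relationships):
    """Flatten the nested relationships structure (strings mixed with
    {"tweet_id", "children"} dicts) into the flat tweet_ids array."""
    flat = []
    for node in relationships:
        if isinstance(node, dict):
            flat.append(str(node["tweet_id"]))
            flat.extend(flatten_relationships(node.get("children", [])))
        else:
            flat.append(str(node))
    return flat
```

Against the example above this yields `["1", "2", "3", "10", "11", "20", "21", "12", "4"]`.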
Users can opt to stop receiving assignments (going inactive), but we're going to need to handle users going offline and disconnecting. This can happen by:
We don't want to spam folks with "`User` is offline/online" notices, so have a think about how to approach this. Off the top of my head:
Discussion from user testing:
We got rate limited once during the Victorian election, so we'll definitely have issues with bigger elections.
- `readonly` on interface properties
- `const`
Columns will display on the Triage view in the TweetDeck style. They'll be built from a set of pre-configured searches.
Individual users will be able to add their own columns based on custom searches and save them as a custom deck. Users will be able to load decks.
https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter.html
https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters#track
https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets
Tweepy not getting full text
Tweet updates
Thoughts and feedback from H:
When the user reloads the page (for whatever reason - technical issues, memory issues) we want them to come back to the same position they were at in each column.
Internally within each column we already track the tweet that's at the top of the column as the user scrolls, and then maintain that position when new tweets are added to the top.
This would involve extending that and:
- `onconnect`, so the user has the right slice of the tweet store for each column
- `componentDidMount` to set them up on load

A few considerations:
Ideas/Thoughts
- `onPositionUpdate` to prevent invalid input

Relates to #43
db-stack
Ref. #15
`ScremsongExceptions`, so that the calling code knows to handle them (e.g. marking tweets as dirty)

Right now we store all tweets we receive during a session and never discard any. This has implications for memory usage during a really busy election.
This can include:
- `sinceId`?

Important: I think discarding tweets above our current scroll position in a column breaks our assumptions around react-virtualized, because we never get a `loadMoreRows` call for them and so never get a chance to "get them back" if a user scrolls up and brings them into view.
One approach to this is to:
- `onRowsRendered`: their positions (top + bottom + overscan)
- `requestAnimationFrame`

We'd need to consider:
c.f. #30
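The discard idea could look like this sketch: keep a window of tweets around each column's visible range and drop the rest. The buffer size and data shapes are assumptions:

```python
def prune_tweet_store(column_tweet_ids, visible_range, buffer=50):
    """Return the ids to keep for one column: everything within `buffer`
    rows of the visible (top, bottom) slice; the caller discards the rest."""
    top, bottom = visible_range
    lo = max(0, top - buffer)
    hi = min(len(column_tweet_ids), bottom + buffer)
    return set(column_tweet_ids[lo:hi])
```

Per the note above, anything discarded this way still needs a path back into the store when the user scrolls it into view again.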
Tasks will be displayed as native content (tweets, grams) with relevant actions: reply, quote + retweet. Once an action is taken the user should be prompted to "mark as done".
Replying will happen in-line if we're under the rate limit for this 15-minute block. If not, and the backend returns a rate limit error, we'll fall back to modals using web intents.
If we can, let's cache our replies in the database to avoid extra API calls to show.status. (c.f. #15)
Let's use Celery, not our hacked together queueing solution.
- `Connection reset by peer` errors?
- `guest` still showing up in the logs?
- `logger`?

Standard search API
Using the standard search endpoint
Standard streaming API request parameters
Streaming tweets: Reconnecting best practice