Comments (10)
hey @syntonym I'll update the documentation tomorrow with some ideas for the direction. I've just been very busy as of late, but should free up soon. One thing that needs to be done is a scraper that takes a link and extracts the icon, title, and maybe a few lines of text that can be used as the summary for each post. I was going to try to build this on the server side with BeautifulSoup and then store the results in the datastore.
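A minimal sketch of that extraction, using only the standard library's html.parser so it runs anywhere (BeautifulSoup would make the selectors shorter); the page markup here is a made-up example, and the field names are placeholders:

```python
from html.parser import HTMLParser

class LinkPreviewParser(HTMLParser):
    """Pull the <title>, favicon link, and meta description out of a page."""
    def __init__(self):
        super().__init__()
        self.title = None
        self.icon = None
        self.summary = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "link" and "icon" in (attrs.get("rel") or ""):
            self.icon = attrs.get("href")
        elif tag == "meta" and attrs.get("name") == "description":
            self.summary = attrs.get("content")

    def handle_data(self, data):
        if self._in_title and self.title is None:
            self.title = data.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

# example page; in the real scraper this would come from fetching the URL
page = """<html><head><title>Example Post</title>
<link rel="shortcut icon" href="/favicon.ico">
<meta name="description" content="A few lines of summary text.">
</head><body></body></html>"""

parser = LinkPreviewParser()
parser.feed(page)
print(parser.title, parser.icon, parser.summary)
```

The same three fields (icon, title, summary text) are what would get stored in the datastore per post.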
from feedmyfriends.
I put together the scraper last night and ended up using lxml. Feel free to poke around the code and let me know if you have any questions.
Any specific reason to choose lxml?
Speed on the server side. After googling around, lxml is built on C libraries and is considered considerably faster than BeautifulSoup, which is pure Python. http://stackoverflow.com/questions/4967103/beautifulsoup-and-lxml-html-what-to-prefer
We can also consider doing the parsing on the client side with JavaScript.
My thinking for the process is:
user submits link > posts to server > server scrapes key fields > server stores the post in the db > server returns the JSON representation of the link to the view > renders as a new post in the view.
Since it has to go through those steps, I wanted to make sure the server-side portion was as fast as possible. It might be more efficient on the client side; I'm not sure what libraries are available in JavaScript.
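The steps above can be sketched end-to-end with an in-memory dict standing in for the datastore; scrape_fields is a stub here (the real scraper would fetch and parse the URL), and all names are illustrative:

```python
import json

posts = {}     # stand-in for the datastore
next_id = 0

def scrape_fields(url):
    # stub: a real implementation would fetch the URL and parse the page
    return {"icon": "/favicon.ico", "title": "Example", "summary": "..."}

def submit_link(url):
    """user submits link -> server scrapes key fields -> stores post -> returns JSON."""
    global next_id
    fields = scrape_fields(url)
    post = {"id": next_id, "url": url, **fields}
    posts[next_id] = post
    next_id += 1
    return json.dumps(post)   # JSON representation for the view to render

print(submit_link("http://example.com"))
```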
I don't think scraping the links client-side is a good idea, because you either have to a) validate the input, which probably takes the same amount of time as simply scraping, or b) trust that the user input is right, which opens up security flaws.
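To make (a) concrete: even a basic server-side check of client-scraped fields already means re-parsing the URL and bounding every field, at which point scraping on the server yourself is not much more work. A hypothetical validator with made-up limits:

```python
from urllib.parse import urlparse

def validate_submission(url, title, summary):
    """Reject obviously bad client-scraped data; limits are illustrative."""
    parsed = urlparse(url)
    # only allow real http(s) URLs, not javascript: or data: schemes
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # bound field sizes so a malicious client can't stuff the datastore
    if not title or len(title) > 200:
        return False
    if len(summary) > 500:
        return False
    return True

print(validate_submission("http://example.com", "A post", "short summary"))
print(validate_submission("javascript:alert(1)", "t", "s"))
```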
What do you think of something like this?
- user visits site and pulls content (via AJAX) to show
- user submits new link -> posts to server -> polls every x seconds to check if the link is up yet -> after the link is scraped successfully, show it in content
- server gets post for new link -> scrape it -> add it to content
I would suggest websockets for pushing new content to clients. This would turn the process into the following:
- user visits site -> user opens websocket to get content -> websocket gets new content whenever there is some
- user submits new link -> posts to server
- server gets post for new link -> scrape it -> add to content -> push content via websocket
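Independent of the transport, the push model in this flow is a queue per connected client: after scraping, the server writes the new item to every queue, and each open websocket just awaits its queue. A minimal stdlib sketch where asyncio.Queue stands in for the websocket connection (names are illustrative):

```python
import asyncio

clients = []   # one queue per connected "websocket"

async def client(name, queue):
    # stands in for: websocket stays open and receives pushed content
    item = await queue.get()
    return f"{name} got {item}"

async def push(content):
    # server side: after scraping a new link, push it to every client
    for q in clients:
        q.put_nowait(content)

async def main():
    q1, q2 = asyncio.Queue(), asyncio.Queue()
    clients.extend([q1, q2])
    receivers = [asyncio.create_task(client("a", q1)),
                 asyncio.create_task(client("b", q2))]
    await push("new link")
    return await asyncio.gather(*receivers)

print(asyncio.run(main()))
```

With a real websocket library the queue-draining loop would live inside each connection handler, but the fan-out shape is the same.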
I'll definitely look into using websockets. I've been using REST/AJAX for a while, so that's more comfortable for me, at least for prototyping. I'll finish constructing the basic functionality of the site (so that it at least starts to work), and then we can switch over.
I'm looking into coding up a websockets module as a handler for these sorts of events. Always good to learn new stuff, thanks for the link.
cool article:
http://lostechies.com/chrismissal/2013/08/06/browser-wars-websockets-vs-ajax/
Interesting read! I think it really depends on the application. Personally I find REST the way to go for most "basic" web stuff, but some problems are hard to solve with it or end up as kind of a hack. For example, I always found AJAX techniques beyond simple PUT/GET requests (like long polling or polling continuously) hackish. Websockets seem to be a clean solution.
I was reading your diagram (GitHub should build actual diagrams into the service, maybe the next app!).
In the step in bold:
user submits new link -> posts to server -> polls every x seconds to check if the link is up yet -> after the link is scraped successfully, show it in content
can you elaborate on what you mean by that?
I was thinking of something like this:
Server:
# I don't know off the top of my head what the semantics are for PUT,
# so I use POST here for the purpose of showing the idea
@route("/api/put_content", methods=["POST"])
def put():
    id = new_id()
    content[id] = request.content

@route("/content")
def give_content():
    return jsonify(content)
Clientside (pseudocode):
<form>
  <input></input>
  <button onclick:push_new_content>Push new content</button>
</form>
<script type="python">
# I know that Python does not work client-side in the browser :(
def push_new_content():
    ajax.post("/api/put_content", form.content)

def update_content_on_page():
    jquery.get("#content_container").delete_all_children()
    for c in content:
        jquery.get("#content_container").add_children(c)

# poll the server every 5 seconds and re-render if anything changed
while True:
    sleep(5)
    new_content = ajax.get("/content")
    if new_content != content:
        content = new_content
        update_content_on_page()
</script>