
raquo / hnapp

Hacker News faceted search engine, RSS & JSON feeds

Home Page: http://hnapp.com

License: MIT License

Languages: Python 53.87%, HTML 19.93%, CSS 14.18%, JavaScript 11.47%, PLpgSQL 0.54%
Topics: hacker-news, rss, search, search-engine


hnapp's Issues

Add subdomains to host:domain.com searches

Currently, searching for host:domain.com will only find submissions from domain.com and www.domain.com.

However, GitHub project pages are located on subdomain.github.io, and there is no way to search all of them.

Implement and use a reversed() functional index on the domain field to search for subdomains. Use the same query syntax, host:github.io (subdomain search is probably the desired behavior in most cases).
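
To illustrate why reversing helps, here is a minimal Python sketch of the matching rule only, not hnapp's actual code. Once both strings are reversed, "is a subdomain of" becomes a plain prefix check, which is the kind of lookup an index on the reversed domain could serve. The function names reversed_host and matches_host are made up for this example.

```python
def reversed_host(host: str) -> str:
    """Lowercase the host and reverse it character by character."""
    return host.lower()[::-1]


def matches_host(query_domain: str, item_host: str) -> bool:
    """Return True if item_host equals query_domain or is a subdomain of it."""
    q = reversed_host(query_domain)
    h = reversed_host(item_host)
    # Reversed, "raquo.github.io" becomes "oi.buhtig.ouqar", so a subdomain
    # match is a prefix match on the reversed query plus a dot separator.
    return h == q or h.startswith(q + ".")


if __name__ == "__main__":
    for host in ("github.io", "raquo.github.io", "notgithub.io"):
        print(host, matches_host("github.io", host))
```

The trailing dot in the prefix matters: without it, host:github.io would also match unrelated hosts such as notgithub.io.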

Feeds/data not updating or only periodically

Hi there,

Love the service, longtime user. Over the last few days I noticed the RSS feeds are not updating, or are updating only once a day instead of regularly during the day. For example, an RSS feed I am subscribed to (with a query for >score) showed no activity all day, and then at around 11 PM PST I noticed 20 new items had appeared in the feed.

Anything up? Maybe the new year rollover or something related? I wasn't sure how to reach out, Twitter or here, so I'm giving this a shot. I hope this can be rectified and the feeds can go back to regular updates during the day, if possible.

Appreciate it!

Searching for domain.com or username does not produce expected results without host: and author: prefixes

When searching for a domain name without specifying it via the host: prefix, hnapp should find items from that domain anyway. Same for username.

Examples of unsatisfactory searches:
http://hnapp.com/?q=type%3Astory+techcrunch.com (no stories from domain techcrunch.com)
http://hnapp.com/?q=patio11 (no comments by patio11, only about him)

Possible ways to fix that:

  • Implement a "did you mean" feature. If any word in the query exists as a username in the DB, or looks like a URL / domain name, show a link to a query that fixes that.
    • Make sure to only do this for non-dictionary words and non-technical terms, or it will get annoying.
  • Create a new tsv_meta TSVECTOR field and put the domain and username in it.
    • Must disable stemming for this one.
  • Amend the DB query to check every word in the query against domain and username fields. Inefficient.
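
The "did you mean" option could be sketched roughly as follows. This is a hedged illustration, not the implementation: suggest_query, DOMAIN_RE, and the known_usernames set (standing in for a lookup against the DB) are all hypothetical names.

```python
import re

# Hypothetical pattern for "looks like a domain name": dot-separated labels
# ending in an alphabetic TLD of at least two letters.
DOMAIN_RE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def suggest_query(query, known_usernames):
    """Return a rewritten query with host:/author: prefixes, or None."""
    changed = False
    words = []
    for word in query.split():
        if ":" in word:
            # Already prefixed (type:, host:, author:, ...); leave untouched.
            words.append(word)
        elif DOMAIN_RE.match(word):
            words.append("host:" + word)
            changed = True
        elif word in known_usernames:
            words.append("author:" + word)
            changed = True
        else:
            words.append(word)
    return " ".join(words) if changed else None


if __name__ == "__main__":
    print(suggest_query("type:story techcrunch.com", set()))
    print(suggest_query("patio11", {"patio11"}))
```

Note that this naive pattern would also treat a term like angular.js as a domain, which is exactly the annoyance the sub-bullet about dictionary words and technical terms warns about; a real version would need an exclusion list.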

Should hnapp support reddit?

reddit has orders of magnitude more content than HN, so having hnapp handle all of it would be a challenge. However, it's not impossible if done right, so if you want this please leave a comment here and let us know what kind of queries you'd like to run against reddit's posts or comments.

Highlight query words in search results

For example, this search should highlight all occurrences of "angular": http://hnapp.com/?q=angular

Make sure to use stemming so that all matching variations of the word are highlighted. tsv_body contains a list of stems with ranks before stop-words are removed, so it could be used to find positions of the word in the original body.

However, the original body is in HTML, which brings two problems: 1) the body should be converted to plaintext before it's converted to TSVECTOR, and 2) if we do get it plaintexted, the ranks lose meaning because HTML-to-plaintext is, strictly speaking, a one-way conversion.

Maybe the answer lies in how PostgreSQL stemming works. Read up some more on that.
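
As a rough illustration of the plaintext-then-highlight idea, here is a small Python sketch done entirely outside the database. It makes several assumptions: crude_stem is a deliberately naive stand-in for PostgreSQL's real stemmer, the ** markers are an arbitrary highlight format, and all names are invented for this example. It also sidesteps the position problem above by highlighting the plaintext rather than mapping back into the original HTML.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def crude_stem(word):
    """Naive suffix stripping; NOT equivalent to PostgreSQL's stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def highlight(html, query_word):
    """Wrap every word whose crude stem matches the query word's stem."""
    target = crude_stem(query_word)

    def mark(match):
        word = match.group(0)
        return "**" + word + "**" if crude_stem(word) == target else word

    return re.sub(r"[A-Za-z0-9]+", mark, html_to_text(html))


if __name__ == "__main__":
    print(highlight("<p>Angular apps and <b>angulars</b></p>", "angular"))
```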

wildcards

Wildcards don't seem to work. For example I want -ukrai* in my search. Thanks

Unable to search for "extension type:story"

Thanks for a great service, only one problem so far: the above search term invariably results in "Server error" message. No other term that I've used causes this behavior. Grateful if you could check it out.


Planning to terminate hnapp in late 2024.

Hey everyone, hnapp in its current incarnation has been up and running for 9 years now.

It has served us well, well me at least, and perhaps other people, since someone always notices when it goes down because it's overfilled its disk again (patent pending distributed server monitoring system, just you wait), but well, I haven't touched this code in forever, and if I want to run it anywhere other than its dusty old server, I need to port it to python 3, update all dependencies, set up its hosting anew, and I just don't have the energy for any of that. And, I realized that if hnapp went away, I wouldn't be all that sad. Perhaps it's better to let it go while it's still nice, before it becomes a chore.

Next year this version of hnapp will turn 10 years old (original is 12 years old already), and that's as good a time as any to pull the plug.

I don't run any analytics, and don't collect any personal data, so I don't know how many people still use hnapp, nor do I have a good way to contact any of them. I guess I'll put a banner on the website advising users of the impending shutdown at some point.

If anyone wants to help upgrade hnapp to python 3 and make it hostable in a docker container, or something like that, please let me know.

Add stemming for technical terms

I want to get everything about angular just by searching for angular, not angular | angular.js | angularjs; these should all be equivalent.

PostgreSQL supports search dictionaries, those should be useful: http://www.postgresql.org/docs/9.1/static/textsearch-dictionaries.html

How to get a list of technical terms? Get a list of words from HN comments that are not in the English dictionary, but are mentioned many times on github/stackoverflow? Actually, maybe use stackoverflow's tags or something? Or github project names?
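
The equivalence being asked for can be sketched in Python as a simple synonym map applied to each query word, which is roughly what a synonym-style search dictionary would do on the PostgreSQL side. This is only an illustration: TECH_SYNONYMS and normalize_terms are hypothetical, and the two entries are just the variants mentioned above.

```python
# Hypothetical synonym table; a real one would be built from a term list
# such as the tag or project-name sources suggested above.
TECH_SYNONYMS = {
    "angular.js": "angular",
    "angularjs": "angular",
}


def normalize_terms(query):
    """Map known technical-term variants to one canonical form."""
    return " ".join(TECH_SYNONYMS.get(word.lower(), word) for word in query.split())


if __name__ == "__main__":
    print(normalize_terms("AngularJS tutorial"))
```

For the search to behave consistently, the same normalization would have to be applied both to the indexed text and to incoming queries.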
