karpathy / arxiv-sanity-lite

arxiv-sanity lite: tag arxiv papers of interest and get recommendations of similar papers in a nice UI, using SVMs over tf-idf feature vectors based on paper abstracts.

Home Page: https://arxiv-sanity-lite.com

License: MIT License

Python 68.00% JavaScript 7.40% CSS 8.09% HTML 16.18% Makefile 0.32%
arxiv deep-learning flask machine-learning

arxiv-sanity-lite's Introduction

I like deep neural nets.

arxiv-sanity-lite's People

Contributors

ajdinre, karpathy

arxiv-sanity-lite's Issues

Connection reset by peer

When running arxiv_daemon, I mostly get the response "Connection reset by peer" from arxiv, which loops 1000 times before I get the message

"ok we tried 1,000 times, something is srsly wrong. exiting."

Is there a reason you set it to 1000? Why is this looped in the first place, is arxiv supposed to be finicky about this? Regardless, I feel like hammering arxiv so much is probably not preferred. Perhaps set it to a lower value?

The strange thing is, it doesn't always happen. Sometimes I do get a connection immediately and a proper response from arxiv. However, a proper response never arrives once a few loops of "Connection reset" have started. Then, if I try again a minute later, it loops the full 1000 times again. Is this an issue on arxiv's side (like I'm on a blocklist of one of their load-balancing servers), or is this an arxiv-sanity-lite issue? Any ideas?
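
One way to address the concern about hammering arxiv is a capped retry with exponential backoff. The sketch below is not the arxiv_daemon code: `fetch_with_backoff`, its parameters, and the default of 5 attempts are assumptions made for illustration, and `fetch` stands in for whatever function performs the actual request.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,        # hypothetical default, far below 1000
    base_delay: float = 1.0,      # seconds to wait after the first failure
    max_delay: float = 60.0,      # upper bound on the un-jittered wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, waiting longer after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except OSError:
            # ConnectionResetError and urllib.error.URLError are OSError subclasses.
            if attempt == max_attempts - 1:
                raise  # give up and surface the last error to the caller
            # Double the wait each time: 1s, 2s, 4s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so retries do not arrive in lockstep.
            delay += random.uniform(0, delay / 2)
            sleep(delay)
    raise AssertionError("unreachable")
```

With these defaults a persistent failure costs five requests spread over roughly 15 to 25 seconds instead of 1000 requests in a tight loop. The `sleep` parameter exists only so the waiting can be replaced in tests.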

bioRxiv integration

Hi, great site :) Would there be capacity to integrate bioRxiv articles in the future? I am aware of forks that have done this, but they seem to be offline.

AI based peer review

Here are 50 parameters that could be used for ranking and evaluating research papers in an AI peer review system:

  1. Relevance to the field
  2. Originality and novelty
  3. Clarity of research objectives
  4. Soundness of methodology
  5. Rigor of experimental design
  6. Quality of data collection
  7. Appropriateness of sample size
  8. Robustness of statistical analysis
  9. Validity of conclusions drawn
  10. Adequacy of literature review
  11. Contribution to existing knowledge
  12. Potential for real-world application
  13. Societal impact and significance
  14. Ethical considerations addressed
  15. Reproducibility of results
  16. Scalability of proposed solutions
  17. Generalizability of findings
  18. Interdisciplinary relevance
  19. Clarity of writing and presentation
  20. Logical flow and organization
  21. Adherence to formatting guidelines
  22. Grammar and linguistic quality
  23. Appropriate use of figures and tables
  24. Sufficient explanation of technical terms
  25. Balanced and unbiased reporting
  26. Acknowledgment of limitations
  27. Discussion of future research directions
  28. Practical implications discussed
  29. Theoretical contributions made
  30. Creativity and innovation
  31. Depth of analysis
  32. Breadth of scope
  33. Attention to detail
  34. Integration of multiple perspectives
  35. Engagement with counterarguments
  36. Persuasiveness of arguments
  37. Coherence of narrative
  38. Effective use of citations
  39. Quality of sources cited
  40. Appropriate level of technical detail
  41. Accessibility to non-specialist readers
  42. Potential for generating further research
  43. Timeliness and relevance of topic
  44. Alignment with journal scope and aims
  45. Adherence to ethical research practices
  46. Disclosure of conflicts of interest
  47. Effectiveness of abstract in summarizing key points
  48. Appropriateness of keywords chosen
  49. Suitability for target audience
  50. Overall impact and significance of the work

Based on this, a user can filter by any particular parameter, and the AI system can give each one a rating from 1 (lowest) to 10 (highest). We can then add up the ratings across all 50 parameters and pick the highest-scoring papers.

Here we can use different models to rate these 50 parameters, then take the average, or maybe a weighted average that considers each model's parameter count, and select the best research papers.

For this we can use an API on the website, such as OpenAI's, or a local Ollama-type LLM, so that users can use their own LLM to rank research papers.
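
The rating and averaging scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the model names, and the weights are hypothetical, every model is assumed to rate every criterion, and no actual LLM call is made (the ratings are passed in as plain numbers).

```python
from typing import Dict, Mapping


def weighted_criterion_scores(
    ratings: Mapping[str, Mapping[str, float]],
    model_weights: Mapping[str, float],
) -> Dict[str, float]:
    """Combine per-model ratings into one weighted mean per criterion.

    ratings[model][criterion] is a rating from 1 to 10.
    model_weights[model] is a relative weight, e.g. based on model size.
    Assumes every model rated every criterion.
    """
    total_weight = sum(model_weights[model] for model in ratings)
    if total_weight <= 0:
        raise ValueError("model weights must sum to a positive number")
    weighted_sums: Dict[str, float] = {}
    for model, scores in ratings.items():
        for criterion, score in scores.items():
            if not 1 <= score <= 10:
                raise ValueError(f"rating out of range for {criterion!r}: {score}")
            weighted_sums[criterion] = (
                weighted_sums.get(criterion, 0.0) + model_weights[model] * score
            )
    return {c: s / total_weight for c, s in weighted_sums.items()}


def paper_total(criterion_scores: Mapping[str, float]) -> float:
    """Add up the combined ratings across all criteria for one paper."""
    return sum(criterion_scores.values())


# Hypothetical example: two models, two of the 50 criteria.
example_ratings = {
    "model_a": {"Originality and novelty": 8, "Reproducibility of results": 6},
    "model_b": {"Originality and novelty": 4, "Reproducibility of results": 10},
}
example_weights = {"model_a": 3.0, "model_b": 1.0}
example_scores = weighted_criterion_scores(example_ratings, example_weights)
example_total = paper_total(example_scores)
```

Setting every weight to the same value reduces this to the plain average; papers would then be ranked by `paper_total`, or by a single criterion's score when a user filters on one parameter.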

Please implement this if possible; if not, make me a contributor and I will make a PR to implement it. This is purely in the name of 🔭 science.

papers.labml.ai

Hi @karpathy,

We built papers.labml.ai in May (introductory tweet) to discover research papers based on popularity on Twitter. We were using arxiv-sanity to discover papers and I started this as a side project inspired by it (partly because it was down from time to time).

We have worked on it on and off since May and have added a bunch of features, such as:

  • Popular papers based on Tweets
  • Links to source code, annotated implementations, videos, Reddit and Hacker News discussions, and other resources related to the paper
  • Conferences (ICLR 2022, NeurIPS 2021)
  • Short two-line summaries of the papers to quickly browse through lists of papers
  • Similar papers based on language model embeddings

And we are working on something very similar to tags on sanity-lite (which we call lists).

We would love to hear your feedback and suggestions. Thanks for releasing your work.

Missing paper(s)?

I noticed that our paper is missing from arxiv-sanity. It was stuck in moderation for a while, so maybe it couldn't be indexed properly? I assume there might be other papers affected by the same issue.

Link to missing paper

We are building Skim - inspired by arXiv Sanity with improvements :)

Hi @karpathy, thank you for introducing arXiv Sanity Lite!

A few of my peers and I are developing Skim (https://skimhq.tech), a Spotify for the ML world, inspired by arXiv Sanity.

Currently it supports:

  • Creating a list of papers as a "Rack" (similar to a Spotify playlist)
  • Seeing similar papers based on TF-IDF features
  • Seeing popular conferences and their racks (arXiv papers bundled into racks based on their yearly proceedings), along with conference statistics, giving you complete information about a conference :)
  • Searching across all papers, racks, conferences and the user base

We would like to discuss this further and share an invite with you, so that we can collaborate on this and improve it over time.
Please let us know - [email protected]
