Comments (3)
Thanks, that is an interesting idea. I wonder if we could make it always enabled and have the item in multiple sources.
We would probably need to replace the source column in the items table with an m:n association table. Will need to check the performance implications.
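To make the m:n idea concrete, here is a minimal sketch using SQLite from Python. It is not selfoss's actual schema: the items_sources table, its columns, and the helpers create_schema, add_item and sources_of are hypothetical names chosen for illustration, and it assumes the uid alone identifies an item.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create hypothetical tables; items no longer carry a source column."""
    conn.executescript(
        """
        CREATE TABLE sources (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE items (
            id INTEGER PRIMARY KEY,
            uid TEXT NOT NULL UNIQUE,
            title TEXT NOT NULL
        );
        -- Association table: one row per (item, source) pair.
        CREATE TABLE items_sources (
            item_id INTEGER NOT NULL REFERENCES items(id),
            source_id INTEGER NOT NULL REFERENCES sources(id),
            PRIMARY KEY (item_id, source_id)
        );
        """
    )


def add_item(conn: sqlite3.Connection, uid: str, title: str, source_id: int) -> int:
    """Store the item once per uid and link it to the given source."""
    conn.execute(
        "INSERT OR IGNORE INTO items (uid, title) VALUES (?, ?)", (uid, title)
    )
    (item_id,) = conn.execute(
        "SELECT id FROM items WHERE uid = ?", (uid,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO items_sources (item_id, source_id) VALUES (?, ?)",
        (item_id, source_id),
    )
    return item_id


def sources_of(conn: sqlite3.Connection, item_id: int) -> list:
    """Return the ids of all sources the item is linked to."""
    rows = conn.execute(
        "SELECT source_id FROM items_sources WHERE item_id = ? ORDER BY source_id",
        (item_id,),
    )
    return [row[0] for row in rows]
```

With this layout, adding the same uid from two feeds yields a single items row linked to both sources, so the item would not go missing from either of them. Listing the items of one source then costs an extra join, which is the part whose performance would need checking.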
from selfoss.
This is a very nice idea. What are you using as the identifier to deduplicate? The url?
What if the two feeds return different content? That should not be an issue if you're using the full text recovery, though.
from selfoss.
> what are you using as the identifier to deduplicate? The url?
The UID. Most commonly, this is the post URL, but that is not required. For example, blogger.com will use something like tag:blogger.com,1999:blog-6112936277054198647.post-403878284366003238.
> What if the two feeds return different content? That should not be an issue if you're using the full text recovery, though.
We could have findAll return the source id in addition to the item id, check whether the content and url match when the source id does not, and only deduplicate the item then.
That would also probably resolve the uid collisions.
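The check described above could look roughly like the following sketch. It is only one reading of the proposal, not selfoss code: the Item class and the is_duplicate function are hypothetical, and the sketch assumes that two entries with the same uid from the same source are always the same item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical subset of the fields findAll would need to return."""
    uid: str
    source_id: int
    url: str
    content: str


def is_duplicate(existing: Item, incoming: Item) -> bool:
    """Decide whether incoming should be merged into existing."""
    if existing.uid != incoming.uid:
        # Different uids never deduplicate.
        return False
    if existing.source_id == incoming.source_id:
        # Same uid within one source: treat as the same item (assumption).
        return True
    # Same uid across sources: merge only when url and content agree,
    # so an accidental uid collision between unrelated feeds is kept apart.
    return existing.url == incoming.url and existing.content == incoming.content
```

Two feeds that reuse a uid for unrelated posts would then stay separate, because their url or content differs.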
The issue that items will be missing from some of the sources will still remain, though, which is why I would like to test the performance impact of having the sources table in an m:n relation to items.
from selfoss.