Comments (7)
@hornc - Is there any additional feedback you want to solicit before closing this? When you're ready, please label it "Close: will not fix" and close it.
from openlibrary.
I wrote VacuumBot, which cleans up records. Recently I made it update formats and paginations, and I noticed that some records referred to authors of type redirect or delete. I noticed because the API rejects records that reference redirected or deleted pages.
I have been correcting these rejected records manually, but as of today VacuumBot checks for the existence of authors in linked Work records, and
- removes authors (without checking for correctness) from the Edition record if there are authors in the Work, or
- follows location properties in redirects to new author records, or
- undeletes deleted authors when it finds one following references from Editions.
This won't fix them all, but it's a start.
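The three steps above can be sketched roughly as follows. This is not VacuumBot's actual code, only a minimal illustration under stated assumptions: records live in a plain in-memory dict keyed by Open Library key, record types are given as `/type/redirect` and `/type/delete`, and the names `resolve_author` and `plan_edition_fix` are hypothetical. The order in which the three options are tried is also an assumption, and the sketch only plans a fix; it does not save anything.

```python
REDIRECT_TYPE = "/type/redirect"
DELETE_TYPE = "/type/delete"


def resolve_author(key, records):
    """Follow redirect `location` values from `key`.

    Returns (final_key, status), where status is one of
    "ok", "deleted", "missing" or "redirect-loop".
    """
    seen = set()
    while key not in seen:
        seen.add(key)
        record = records.get(key)
        if record is None:
            return key, "missing"
        type_key = record["type"]["key"]
        if type_key == REDIRECT_TYPE:
            key = record["location"]  # hop to the redirect target
        elif type_key == DELETE_TYPE:
            return key, "deleted"
        else:
            return key, "ok"
    return key, "redirect-loop"


def plan_edition_fix(edition, records):
    """Decide what to do with an edition's author references (no writes)."""
    authors = edition.get("authors", [])
    resolved = [resolve_author(ref["key"], records) for ref in authors]

    # Nothing to do if every reference already points at a live author.
    if all(
        status == "ok" and key == ref["key"]
        for ref, (key, status) in zip(authors, resolved)
    ):
        return {"action": "none"}

    # Option 1: the linked Work already has authors, so drop the Edition's.
    work_has_authors = any(
        records.get(work["key"], {}).get("authors")
        for work in edition.get("works", [])
    )
    if work_has_authors:
        return {"action": "drop_edition_authors"}

    # Options 2 and 3: point at redirect targets and list deleted authors
    # that would need to be undeleted.
    return {
        "action": "rewrite_authors",
        "authors": [{"key": key} for key, _ in resolved],
        "undelete": [key for key, status in resolved if status == "deleted"],
    }
```

Returning a plan instead of writing directly leaves room to review each proposed change before it is applied.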
@bencomp: How is VacuumBot run? Manually? In a cron job?
@bfalling I ran VacuumBot manually. The last time was some years ago.
@hornc is this just a matter of running the bot again, or do we need to improve the bot further for existing records?
I'm not convinced undeleting authors is a safe step to do automatically. I have had to resolve many 'please see' author names and also many 'DELETE' author names. Some of these have had many different original authors and erroneous entries merged into one item, so untangling the correct original individual is not always straightforward, or even possible. Ideally we should not delete real individual author records at all but use redirects. Some entries, however, are created for data fragments and do not represent authors at all, so if these are deleted they should remain so.
@tfmorris questioned this edit recently:
https://openlibrary.org/authors/OL2630272A/please_see_Leonard_Lee_Rue_III?m=history
I don't think it is correct for Import Bot to make this sort of change automatically.
I suggest closing this issue, as I'm not sure what value "Undelete authors linked to editions" across the board gives us. I imagine it'll do just as much harm as good, depending on the situation. Specific problems should be raised with examples so they can be addressed appropriately.
In general, deleted items were deleted for a reason. If dangling author references in editions are still a problem, examples should be identified and we can come up with a process for correcting the data. From what I have seen, such cases are likely to be symptoms of other problems that can't be fixed by simply undeleting.
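One way to collect such examples without changing any data is a report-only pass. The following is a rough sketch, not existing Open Library code: it assumes editions and author records are already loaded into plain Python structures, that record types are given as `/type/redirect` and `/type/delete`, and the function name `find_dangling_author_refs` is hypothetical.

```python
BAD_TYPES = {"/type/redirect", "/type/delete"}


def find_dangling_author_refs(editions, records):
    """List (edition_key, author_key, problem) tuples; never modifies data."""
    report = []
    for edition in editions:
        for ref in edition.get("authors", []):
            target = records.get(ref["key"])
            if target is None:
                # The referenced author record could not be found at all.
                report.append((edition["key"], ref["key"], "missing"))
                continue
            type_key = target["type"]["key"]
            if type_key in BAD_TYPES:
                report.append((edition["key"], ref["key"], type_key))
    return report
```

The resulting list could be reviewed by hand, so each case is judged on its own rather than undeleted across the board.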
"Undelete author" is defined in multiple places in the code:
https://github.com/internetarchive/openlibrary/search?q=%22undelete+author%22&unscoped_q=%22undelete+author%22