Comments (7)

brad2014 commented on May 7, 2024

@hornc - Is there any additional feedback you want to solicit before closing this? When you're ready, please label it "Close: will not fix" and close it.

from openlibrary.

bencomp commented on May 7, 2024

I wrote VacuumBot, which cleans up records. Recently I made it update formats and paginations, and found that some records referred to authors of type redirect or delete. This surfaced because the API rejects records with references to redirecting or deleted pages.

I have been correcting these rejects manually, but as of today VacuumBot checks for the existence of authors in linked Work records, and

  • removes authors (without checking for correctness) from the Edition record if there are authors in the Work, or
  • follows location properties in redirects to new author records, or
  • undeletes deleted authors when it finds one following references from Editions.

This won't fix them all, but it's a start.
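The three steps above can be sketched roughly as follows. This is not VacuumBot's actual code: it is a minimal illustration that assumes records live in a plain dict keyed by record key, that each record has a `type` of the form `{"key": "/type/..."}`, and that redirects carry a `location` field. The names `plan_author_fixes`, `follow_redirects`, `type_of` and `SAMPLE_STORE` are hypothetical. The sketch only plans changes and writes nothing, so each planned step can be reviewed before anything is applied.

```python
# Assumed type keys; the real record layout may differ.
AUTHOR, REDIRECT, DELETE = "/type/author", "/type/redirect", "/type/delete"


def type_of(store, key):
    """Return the type key of a record, or None if the record is missing."""
    record = store.get(key)
    return record["type"]["key"] if record else None


def follow_redirects(store, key, max_hops=10):
    """Follow `location` through redirects; None on a loop, gap, or too many hops."""
    seen = set()
    while key not in seen and len(seen) < max_hops:
        seen.add(key)
        kind = type_of(store, key)
        if kind is None:
            return None
        if kind != REDIRECT:
            return key
        key = store[key].get("location")
    return None


def plan_author_fixes(store, edition):
    """Return (action, key, detail) tuples for one edition; nothing is written."""
    bad = [ref["key"] for ref in edition.get("authors", [])
           if type_of(store, ref["key"]) != AUTHOR]
    if not bad:
        return []
    # Step 1: if a linked Work already has authors, drop the Edition's authors.
    work_has_authors = any(
        store.get(work["key"], {}).get("authors")
        for work in edition.get("works", [])
    )
    if work_has_authors:
        return [("drop_edition_authors", edition["key"], None)]
    plans = []
    for key in bad:
        kind = type_of(store, key)
        target = follow_redirects(store, key) if kind == REDIRECT else None
        if target and type_of(store, target) == AUTHOR:
            plans.append(("replace_with", key, target))  # Step 2: follow redirect
        elif kind == DELETE:
            plans.append(("undelete", key, None))        # Step 3: undelete
        else:
            plans.append(("needs_review", key, None))    # anything else
    return plans


# Hypothetical sample data for illustration only.
SAMPLE_STORE = {
    "/authors/OL1A": {"type": {"key": AUTHOR}, "name": "Example Author"},
    "/authors/OL2A": {"type": {"key": REDIRECT}, "location": "/authors/OL1A"},
    "/authors/OL3A": {"type": {"key": DELETE}},
    "/works/OL1W": {"type": {"key": "/type/work"},
                    "authors": [{"author": {"key": "/authors/OL1A"}}]},
    "/works/OL2W": {"type": {"key": "/type/work"}, "authors": []},
}
```

Calling `plan_author_fixes(SAMPLE_STORE, edition)` on an edition whose Work has no authors yields one planned action per problematic author reference, while an edition whose Work already lists authors yields a single `drop_edition_authors` action.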

bfalling commented on May 7, 2024

@bencomp: How is VacuumBot run? Manually? In a cron job?

bencomp commented on May 7, 2024

@bfalling I ran VacuumBot manually. The last time was some years ago.

sbshah97 commented on May 7, 2024

@hornc is this just a matter of running the bot again, or do we need to improve the bot further to handle existing records?

hornc commented on May 7, 2024

I'm not convinced that undeleting authors is a safe step to do automatically. I have had to resolve many 'please see' author names and also many 'DELETE' author names, some of which have had many different original authors and erroneous entries merged into one item, so untangling the correct original individual is not always straightforward, or even possible. Ideally we should not delete real individual author records at all, but use redirects instead. Some entries, however, are created for data fragments and do not represent authors at all, so if these are deleted they should remain so.

@tfmorris questioned this edit recently:
https://openlibrary.org/authors/OL2630272A/please_see_Leonard_Lee_Rue_III?m=history
I don't think it is correct for Import Bot to make this sort of change automatically.

I suggest closing this issue, as I'm not sure what value "Undelete authors linked to editions" across the board gives us. I imagine it'll do just as much harm as good, depending on the situation. Specific problems should be raised with examples so they can be addressed appropriately.

In general, deleted items were deleted for a reason. If dangling author references in editions are still a problem, examples should be identified and we can come up with a process for correcting the data. From what I have seen, such cases are likely to be symptoms of other problems that can't be fixed by simply undeleting.
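One way to gather such examples without changing any data is a read-only audit, sketched below. It assumes the same illustrative layout as an in-memory dict of records with a `type` of the form `{"key": "/type/..."}`, which may not match the real schema. The names `find_dangling_author_refs` and `summarize` are hypothetical.

```python
from collections import Counter

# Assumed mapping from record type to a short problem label.
PROBLEM_LABELS = {"/type/redirect": "redirect", "/type/delete": "deleted"}


def find_dangling_author_refs(store, editions):
    """List (edition_key, author_key, problem) for author refs that do not
    point at a live author record. Read-only: nothing is modified."""
    findings = []
    for edition in editions:
        for ref in edition.get("authors", []):
            record = store.get(ref["key"])
            if record is None:
                problem = "missing"
            else:
                problem = PROBLEM_LABELS.get(record["type"]["key"])
            if problem:
                findings.append((edition["key"], ref["key"], problem))
    return findings


def summarize(findings):
    """Count findings per problem label, e.g. to decide what to look at first."""
    return Counter(problem for _, _, problem in findings)
```

The resulting list gives concrete edition and author keys that can be reviewed case by case, rather than applying one fix across the board.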

hornc commented on May 7, 2024

"undelete author" is defined in multiple places in the code:
https://github.com/internetarchive/openlibrary/search?q=%22undelete+author%22&unscoped_q=%22undelete+author%22
