
Comments (12)

desjarlais commented on June 29, 2024

I've seen WordPad open Word documents that Word can't. I think what sometimes happens is that WordPad is a limited reader of DOCX files, so it ignores many aspects of the file, displays only the simpler data, and gets to skip certain content.

Is the idea here to handle the scenarios where no corruption is technically found but Word still won't open the file, on the grounds that giving the user an option to pass the file off to a client that will at least read some of the data is better than leaving them with the sense that the file has no issues?

from wordcorruptdocchecker.

socrtwo commented on June 29, 2024


desjarlais commented on June 29, 2024

I've already started integrating the Open XML SDK. So far it works fairly well, so I should be able to get this type of behavior into the program. I have some additional testing and verification to work through, then I'll push the changes to GitHub. Thanks for the report, but I'm closing this for now.


socrtwo commented on June 29, 2024


desjarlais commented on June 29, 2024

I downloaded the SDK and added a reference to it. I then use it to open the file; if the open fails, the file still has corrupt tags. The SDK works better than automating the client application for a scenario like this.


socrtwo commented on June 29, 2024


desjarlais commented on June 29, 2024

I'll keep looking into this, because I do see the same behavior where it still flagged the file as correct. I thought the SDK validated the XML on open, so I'll need to do some research.


desjarlais commented on June 29, 2024

I see what I did wrong: I forgot to try pulling the actual contents from the document.xml file. I was only opening the zip container, which is going to succeed regardless. It is reading the content of document.xml that will tell us whether the file still has bad tags. Fixed, and I pushed those changes.
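
The difference between opening the container and reading the main part can be shown with a small sketch. It uses Python's standard library, not the Open XML SDK that the tool is built on, and the helper names `document_xml_is_well_formed` and `make_package` are made up for illustration. It only tests whether `word/document.xml` is well-formed XML, which is an assumption about what "bad tags" means here and is narrower than whatever the SDK checks when it loads the part.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def document_xml_is_well_formed(docx_bytes: bytes) -> bool:
    """Return True only if word/document.xml parses as XML.

    Opening the zip container succeeds for both packages below;
    only parsing the part itself exposes the broken tag.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        xml_bytes = package.read("word/document.xml")
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


def make_package(document_xml: str) -> bytes:
    """Build a minimal zip holding only word/document.xml (hypothetical test data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document_xml)
    return buffer.getvalue()


# A well-formed part and one whose <w:body> tag is never closed.
good = make_package(f'<w:document xmlns:w="{W}"><w:body/></w:document>')
bad = make_package(f'<w:document xmlns:w="{W}"><w:body></w:document>')

print(document_xml_is_well_formed(good))
print(document_xml_is_well_formed(bad))
```

Both packages open as zip files without complaint, so a check that stops at the container would report the second one as fine.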


socrtwo commented on June 29, 2024


socrtwo commented on June 29, 2024


desjarlais commented on June 29, 2024

I don't think it removes any actual content. The XML elements in question are the AlternateContent (AC) blocks. Each AC block holds multiple representations of the same content, including a fallback. It is up to the reader/client to choose which version in the AC block to read. Removing the fallback just removes one "version" of the content.

The caveat here would be a file that had an AC block with ONLY a fallback. In that case, yes, the content would probably be deleted as well, but I have yet to see a corrupt file that had a bad fallback AND no other options in the AC block.
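
A rough sketch of that behavior, including the fallback-only caveat, is below. It is written in Python with the standard library rather than with the Open XML SDK, and the function name `strip_fallbacks` and the sample markup are invented for illustration. The assumption is that an AC block is an `mc:AlternateContent` element with `mc:Choice` and `mc:Fallback` children in the markup-compatibility namespace; the sketch leaves a block untouched when it has no `mc:Choice`, so the only remaining copy of the content is not deleted.

```python
import xml.etree.ElementTree as ET

MC = "http://schemas.openxmlformats.org/markup-compatibility/2006"


def strip_fallbacks(root):
    """Remove mc:Fallback from every AC block that also has an mc:Choice.

    Returns (fallbacks_removed, fallback_only_blocks_skipped).
    """
    removed = 0
    fallback_only = 0
    # Materialize the list first so removing children does not disturb iteration.
    for block in list(root.iter(f"{{{MC}}}AlternateContent")):
        choices = block.findall(f"{{{MC}}}Choice")
        fallbacks = block.findall(f"{{{MC}}}Fallback")
        if not choices:
            # Fallback is the only representation: keep it and report it.
            if fallbacks:
                fallback_only += 1
            continue
        for fallback in fallbacks:
            block.remove(fallback)
            removed += 1
    return removed, fallback_only


# Hypothetical markup: one normal AC block and one with only a fallback.
sample = (
    f'<root xmlns:mc="{MC}">'
    '<mc:AlternateContent>'
    '<mc:Choice Requires="wps"><a/></mc:Choice>'
    '<mc:Fallback><b/></mc:Fallback>'
    '</mc:AlternateContent>'
    '<mc:AlternateContent>'
    '<mc:Fallback><c/></mc:Fallback>'
    '</mc:AlternateContent>'
    '</root>'
)
root = ET.fromstring(sample)
result = strip_fallbacks(root)
print(result)
```

Skipping fallback-only blocks is a design choice of this sketch, not something the thread says the tool does.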


socrtwo commented on June 29, 2024

