desjarlais / wordcorruptdocchecker

I created this tool to fix corrupt Word documents in the Open XML formats (not the older binary format), mainly .docx files. It checks the file against a list of known corruptions and applies a fix whenever it finds one of them.

License: MIT License

C# 100.00%
corrupt corrupt-documents corruption document fix

wordcorruptdocchecker's Issues

Word Corrupt Doc Checker Reports Fixing a File That Won't Open in Word 2016

Word Corrupt Doc Checker reports fixing a file, which I will share privately with the author. It reports:
Invalid Tag: </w:txbxContent></w:pict></mc:Fallback></mc:AlternateContent>
Replaced With: </w:txbxContent></v:textbox></v:shape></w:pict></mc:Fallback></mc:AlternateContent>

The fixed file won't open in Word 2016, but will open if the </v:shape> tag is replaced with a </v:rect> one.
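
The observation above suggests that the closing tags to insert depend on which VML element was actually opened (v:rect versus v:shape). As a minimal sketch of that idea, and not the tool's actual C# logic, the Python below tracks which elements are still open and builds the closing tags from them. The function names `open_elements` and `missing_closing_tags` are hypothetical, and the regex assumes attribute values never contain `>`.

```python
import re

# Matches prefixed opening, closing, and self-closing tags,
# e.g. <v:rect id="r1">, </v:rect>, <w:p/>.
TAG = re.compile(r"<(/?)([A-Za-z0-9]+:[A-Za-z0-9]+)\b[^>]*?(/?)>")


def open_elements(xml_fragment):
    """Return the names of elements still open at the end of the fragment."""
    stack = []
    for closing, name, self_closing in TAG.findall(xml_fragment):
        if self_closing:
            continue  # <w:p/> opens and closes in one tag
        if closing:
            # Pop only when the tag closes the innermost open element.
            if stack and stack[-1] == name:
                stack.pop()
        else:
            stack.append(name)
    return stack


def missing_closing_tags(xml_fragment, until):
    """Build closing tags for everything opened after the innermost `until`.

    Raises ValueError if `until` is not among the open elements.
    """
    stack = open_elements(xml_fragment)
    idx = len(stack) - 1 - stack[::-1].index(until)
    return "".join(f"</{name}>" for name in reversed(stack[idx + 1:]))


fragment = (
    '<mc:AlternateContent><mc:Fallback><w:pict><v:rect id="r1">'
    "<v:textbox><w:txbxContent><w:p/></w:txbxContent>"
)
print(missing_closing_tags(fragment, "w:pict"))  # prints </v:textbox></v:rect>
```

With a fragment that opens `<v:shape>` instead, the same call yields `</v:textbox></v:shape>`, so the inserted tags follow the document rather than a fixed replacement string.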

Program does not show buttons when Windows text is magnified to 175%

Great program, Brandon! I have been recommending it in this Microsoft Community thread regarding Microsoft Word errors: "Unspecified Error, word/document.xml, Line:2, Column: 0".

I have a big monitor and my eyes are having "senior" problems, so in the "Customize your display" section of the System settings in PC settings, I set "Change the size of text, apps and other items" to 175%.
[Screenshot: display scaling set to 175%]

When I do this, wordcorruptdocchecker in the new GitHub releases, v1.2.0.4 and v1.2.0.5, doesn't scale proportionally for me. Instead, the right side of the program window, where the buttons are, is cut off. I do not have this problem with the Codeplex version, and when I change the magnification back to 100%, the problem goes away.
[Screenshot: GitHub release at 175%, right side cut off]

Here's what I see with the Codeplex version, even at 175%:
[Screenshot: Codeplex version at 175%]

When Word Corrupt Doc Checker Finds the Document Does Not Contain Invalid XML...

When Word Corrupt Doc Checker finds that a document does not contain invalid XML, it also reports the target document as inaccessible because it is in use. This can raise a possibly false hope that releasing the file from another program might allow it to be fixed.

For example, here is a recent error message for a file that the program says doesn't have invalid XML:

This document does not contain invalid xml.
ERROR: The process cannot access the file 'D:\Paul D Pruitt\Desktop\corrupt_docx_files\Unspecified error\math_tags\Japanese_characters_file(Fixed).docx' because it is being used by another process.

Suggestion: If No Invalid XML is Found, Try Opening the Non-Opening DOCX in WordPad

After zip repair, if Word Corrupt Doc Checker does not find any invalid XML, my suggestion is to try opening the file in WordPad. I have found that, surprisingly, once the zip structure is fixed, many if not most repairable DOCX files will open in WordPad without any further repair. I know this is not a prestigious solution, but sometimes simple is best.
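
To make the triage above concrete, here is a rough Python sketch that checks the two conditions mentioned: the zip structure is readable, and word/document.xml parses as well-formed XML. It is an illustration only, not part of the tool. The function name `triage_docx` and its return strings are made up for this example, and well-formed XML does not guarantee that Word (or WordPad) will open the file.

```python
import zipfile
import xml.etree.ElementTree as ET


def triage_docx(path_or_file):
    """Classify a .docx as 'bad zip', 'missing document.xml', 'invalid xml', or 'ok'."""
    try:
        with zipfile.ZipFile(path_or_file) as zf:
            # testzip() returns the first member with a bad CRC, or None.
            if zf.testzip() is not None:
                return "bad zip"
            try:
                data = zf.read("word/document.xml")
            except KeyError:
                return "missing document.xml"
    except zipfile.BadZipFile:
        return "bad zip"

    try:
        ET.fromstring(data)  # only checks well-formedness, not Word's schema
    except ET.ParseError:
        return "invalid xml"
    return "ok"
```

A result of "ok" corresponds to the case described above: the zip structure is sound and no invalid XML is found, so trying WordPad is the next cheap step.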

Corrupt Document is Only Partially Fixed by Program

A corrupt document I was recently asked to fix is only partially repaired by the program. The problems seem to revolve around the same usual suspect tags.

I'm not sure which tags are not being properly addressed. The repair was still incomplete even with the check box for removing all Fallback tags checked. I also ran the file through the program several times in succession, and this didn't seem to help either.

I will send the file to the author separately.
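
For readers unfamiliar with what removing the Fallback tags involves, the Python sketch below strips `mc:Fallback` blocks from the document XML and repeats until a pass changes nothing, which is also why extra cycles stop helping once a fixed point is reached. This is an assumption-laden illustration, not the tool's implementation: the name `strip_fallbacks` is hypothetical, the regex assumes the `mc:` prefix is used literally, and real documents may need a proper XML parser.

```python
import re

# Either a self-closing <mc:Fallback/> or an innermost
# <mc:Fallback>...</mc:Fallback> block (no nested opening tag inside).
FALLBACK = re.compile(
    r"<mc:Fallback\b[^>]*/>"
    r"|<mc:Fallback\b[^>]*>(?:(?!<mc:Fallback\b).)*?</mc:Fallback>",
    re.DOTALL,
)


def strip_fallbacks(document_xml):
    """Remove mc:Fallback blocks; return (cleaned_xml, passes_that_changed_text)."""
    passes = 0
    while True:
        cleaned = FALLBACK.sub("", document_xml)
        if cleaned == document_xml:
            return cleaned, passes  # fixed point: another cycle would do nothing
        document_xml = cleaned
        passes += 1
```

Matching only innermost blocks means nested Fallback elements are peeled off one level per pass instead of leaving a stray closing tag behind.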

Change the Name of the Project to the More Grammatically Correct "Corrupt Word DOCX Checker"

Word Corrupt DOC Checker sounds clumsy. This is probably because adjectives need to follow a certain order (see http://www.gingersoftware.com/content/grammar-rules/adjectives/order-of-adjectives/). Here "Corrupt" is a quality (#2 on the list), which should come before "Word", a proper adjective (#7), which in turn properly precedes, I think, "DOC", which I take to be a purpose or qualifier (#8 on the list).

If the program can only work with DOCX files, the name should replace DOC with DOCX. If it works with other variants, like the macro-enabled DOCM files, perhaps the long form "Document" should be reintroduced, but this may be a matter of taste.

Finally, Microsoft has usage guidelines for referring to Word (see https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/publications.aspx). This sentence, "You may not: use the Marks as the leading word or most prominent element in your publication, seminar, or conference title;" may be applicable to "Word Corrupt DOC Checker", whose leading word "Word" clearly refers to "Microsoft Word".
