How to Recover, Repair, or Salvage the Text from a Corrupt Word 2007 DOCX Format Document

Word 2007 files are really zipped collections of mostly XML files.  XML is not tolerant of file corruption.  The text of a Word 2007 document is stored in the document.xml file within the zipped collection.  Judging from the errors it generates, Word 2007 appears to use a corruption-tolerant unzipper but a standard, corruption-intolerant XML reading algorithm or module, even when it is only trying to salvage the text from this XML file inside a corrupt docx file.
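The point about XML being intolerant of corruption can be illustrated with a minimal Python sketch.  It assumes the main text part sits at its usual location, word/document.xml, inside the zip.  The helper names build_sample_docx and check_document_xml are made up for this illustration, and the sample archive is a toy stand-in rather than a real Word file.  It only shows how a strict XML parser reacts to damage, not what Word itself does internally.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def build_sample_docx(document_xml: bytes) -> bytes:
    """Hypothetical helper: pack XML bytes into a tiny docx-like zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()

def check_document_xml(docx_bytes: bytes):
    """Return (member names, parse error message or None) for a docx given as bytes."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        names = archive.namelist()
        raw = archive.read("word/document.xml")
    try:
        # A strict parser rejects the whole part on the first malformed byte.
        ET.fromstring(raw)
        return names, None
    except ET.ParseError as err:
        return names, str(err)
```

Feeding this a part whose last few bytes are missing yields a parse error for the entire part, even though nearly all of the text is still sitting in the file.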

However, one can still try to recover the text, and maybe even the formatting, in 6 different ways:

  1. The following method may recover text and formatting.  If that fails, you will also be given the choice to try to recover just the text.  Choose “Open” on the file menu as usual to open the corrupt file.  However, instead of double clicking on the target, select the file with a single left click and then click on the tiny arrow on the right hand side of the “Open” button in the lower right hand corner of the Open dialogue window.  This gives you several choices, one of which is “Open and Repair”.  Choose it.  If Word fails to repair the document and recover text and formatting, it will offer a 2nd choice to salvage just the text, as mentioned.  Take this 2nd choice if necessary.
  2. Another recovery method within Word 2007 itself recovers just the text if successful.  Again use the Open dialogue window, but this time click the down arrow in the file type list box instead.  It is above the “Cancel” button and in Word 2007 probably reads: “All Word Documents (*.docx”.  Look for “Recover Text from Any File (*.*)”.  Make this choice, then select your file and click the Open button as usual.
  3. I have written a piece of freeware which can be downloaded from here: http://www.s2services.com/hosted-freeware/dd2txt-1.0.zip.  It recovers just the text from docx files, without formatting except for paragraph indications.  It succeeds at recovering the text where Word 2007 fails because it does not use a standard XML reading algorithm.
  4. This next method will recover text and formatting if successful.  The strategy is to look for backup, deleted and temporary versions of your file.  There is at least one commercial program, “Recover My Files” from the GetData company, and maybe others, which can specifically do this.  A link to it can be found on my page of links for recovering from file corruption: http://www.s2services.com/corrupt-file-commercialware.htm.
  5. I’m making this a separate step, but it is the same strategy as step 4, only done manually instead of with a commercial program, and is therefore more difficult.  It will recover formatting and text if successful.  Do this by following Microsoft’s detailed instructions here: http://support.microsoft.com/kb/827099.
  6. Maybe the easiest and most effective way to recover from Word file corruption is to use commercialware to work on the file you do have, but of course that costs money.  In the best cases this strategy recovers formatting and text.  At worst it recovers just the text, perhaps in a similar way to the freeware from step 3.  Most of the major players are listed in links on my page mentioned in step 4.  One warning with commercialware: try the demos first!  See if the program will work before buying, which almost goes without saying.  A 2nd warning is that only about 2/3 of the commercial programs from the companies listed on the page have added the ability to recover the new docx format.  So look carefully at the program descriptions of the Word recovery software to see if they have added Word 2007 docx format recovery capabilities.
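As a rough illustration of the idea behind step 3, recovering text without a standard XML reader, here is a minimal Python sketch.  It is not the code of the freeware mentioned above.  It simply assumes that visible text lives in w:t elements, that paragraphs end with a closing w:p tag, and that the damaged part can still be read as bytes.  The function name salvage_text is made up for this example.

```python
import html
import re

# Matches an opening <w:t> or <w:t attr="..."> tag and captures the text after it.
# It deliberately does not match other tags that start the same way, such as <w:tab/>.
TEXT_RUN = re.compile(r"<w:t(?:\s[^>]*)?>([^<]*)")

def salvage_text(raw_xml: bytes) -> str:
    """Pull paragraph text out of document.xml bytes without parsing them as XML."""
    xml_text = raw_xml.decode("utf-8", errors="replace")
    paragraphs = []
    # Treat each closing paragraph tag as a paragraph boundary.
    for chunk in xml_text.split("</w:p>"):
        runs = TEXT_RUN.findall(chunk)
        if runs:
            # Turn entities such as &amp; back into plain characters.
            paragraphs.append(html.unescape("".join(runs)))
    return "\n".join(paragraphs)
```

The bytes could come from something like zipfile.ZipFile(path).read("word/document.xml"), although Python's standard zipfile module may itself refuse a badly damaged archive, in which case a more tolerant unzipper would be needed first.  Because the scan is pattern based, a truncated or malformed tail only costs the text near the damage rather than the whole document.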

About the Author

Paul Pruitt has 12 years of experience in computer support and 6 years in data recovery. He holds A+, Net+, MCP, HDI Help Desk, and ITIL Foundation certifications, as well as bachelor’s and master’s degrees in biology. His website, which lists freeware and free resources for recovering data lost to file corruption, unwanted file deletion, failing disks and lost passwords, is here: http://www.s2services.com.
