Introduction to Microsoft's Open XML file formats
Microsoft have announced that Office 2007 will adopt file formats based on XML. Microsoft have named these the 'Microsoft Office Open XML' formats (MOOX), and they will apply to the following components of the Microsoft Office suite:
- Microsoft Office Word 2007
- Microsoft Office Excel 2007
- Microsoft Office PowerPoint 2007
It should be noted that Office 2007 can still save files in the binary formats used by previous versions of Office. However, the default will now be Open XML.
Microsoft's adoption of XML-based formats should provide a number of benefits to businesses, developers and individuals.
Smaller file sizes: All Microsoft Office Open XML files are compressed, potentially reducing file size by up to 75% and therefore reducing the disk space required to store them. Compression also reduces the bandwidth required to send files by email, FTP, over networks or across the Internet. Office 2007 compresses files automatically when saving and decompresses them when opening. No additional software is required, as compression is built into Microsoft Office 2007.
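Because the compression used is standard ZIP, the saving can be measured with ordinary tools. The following is a minimal Python sketch, not part of Office itself: the function names are illustrative, and the sample archive built in memory is only a stand-in for a real Office 2007 file, so the figure it prints says nothing about typical documents.

```python
import io
import zipfile


def compression_summary(package):
    """Return (uncompressed_bytes, compressed_bytes) totalled over all parts.

    `package` may be a file path or a file-like object holding a ZIP archive.
    """
    with zipfile.ZipFile(package) as archive:
        infos = archive.infolist()
        uncompressed = sum(info.file_size for info in infos)
        compressed = sum(info.compress_size for info in infos)
    return uncompressed, compressed


def build_sample_package():
    """Build a small stand-in archive in memory (not a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Repetitive markup compresses well, which is why XML benefits from ZIP.
        body = "<p>Hello, Open XML</p>" * 500
        archive.writestr("word/document.xml", "<document>" + body + "</document>")
    buffer.seek(0)
    return buffer


uncompressed, compressed = compression_summary(build_sample_package())
saving = 100 * (1 - compressed / uncompressed)
print(f"{uncompressed} bytes stored as {compressed} bytes ({saving:.0f}% smaller)")
```

To try it on a real file, pass the path of a saved document to compression_summary instead of the in-memory sample.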
Improved data recovery: Microsoft's Open XML files have a modular structure in which the different components of a document are stored separately. This means that if one component, such as a chart in the middle of a document, becomes corrupt, you should still be able to access the content before and after it.
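The idea that one damaged component need not affect the others can be illustrated with a short Python sketch. It is not Office's own recovery logic: it simply reads each part of a ZIP package in turn and notes which ones fail their checksum. The function names, part names and sample content are illustrative assumptions, and the sample archive is a stand-in rather than a valid Word document.

```python
import io
import zipfile


def split_parts(package):
    """Return (readable, damaged) lists of part names in a ZIP package."""
    readable, damaged = [], []
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            try:
                archive.read(name)  # reading a part verifies its CRC-32 checksum
                readable.append(name)
            except zipfile.BadZipFile:
                damaged.append(name)
    return readable, damaged


def build_damaged_sample():
    """Build a two-part stand-in archive, then damage the chart part."""
    buffer = io.BytesIO()
    # ZIP_STORED keeps the bytes uncompressed so one part is easy to damage.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr("word/document.xml", "<document>text before and after</document>")
        archive.writestr("word/charts/chart1.xml", "<chart>GOOD-CHART-DATA</chart>")
    # Overwrite part of the chart's bytes without changing the archive layout.
    raw = buffer.getvalue().replace(b"GOOD-CHART-DATA", b"BAD!-CHART-DATA", 1)
    return io.BytesIO(raw)


readable, damaged = split_parts(build_damaged_sample())
print("readable:", readable)
print("damaged:", damaged)
```

In this sample the document part is still readable even though the chart part is reported as damaged.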
Better privacy: Using Microsoft's Document Inspector, you can now check files for potentially sensitive information, such as the document's author, comments and file location, and remove it before sharing the file.
Data integration and interoperability: Because the files are based on XML, their data can be accessed by any application that supports XML and ZIP compression.
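As a rough illustration of this point, the Python sketch below reads text from a package using only a ZIP reader and an XML parser from the standard library. It assumes the main body of a Word document is stored in a part named word/document.xml, with text held in 't' elements in the WordprocessingML namespace. The function names are illustrative, and the sample package is a hand-built stand-in, not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:t> (text runs).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_text_runs(package, part="word/document.xml"):
    """Return the text of every <w:t> element in the given part."""
    with zipfile.ZipFile(package) as archive:
        root = ET.fromstring(archive.read(part))
    return [node.text or "" for node in root.iter(f"{{{W_NS}}}t")]


def build_sample_package():
    """Build a minimal stand-in package in memory (not a complete Word file)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello from Open XML</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


runs = extract_text_runs(build_sample_package())
print(runs)
```

No Office installation is involved at any point, which is the practical meaning of the interoperability benefit.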
Macro identification: Under Office 2007, it will be easier to determine which files contain macros (which are potentially dangerous). Office files whose extension ends in 'x' (e.g. '.docx') cannot contain macros. Only Office files whose extension ends in 'm' (e.g. '.docm') can contain macros.
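The naming rule lends itself to a simple check, sketched below in Python. The function name and the returned labels are illustrative. The Excel and PowerPoint extensions are included on the assumption that they follow the same 'x' and 'm' pattern as Word. The check looks only at the file name and does not inspect the file's contents.

```python
from pathlib import PurePath

# Extensions for the three applications covered above.
MACRO_FREE = {".docx", ".xlsx", ".pptx"}
MACRO_ENABLED = {".docm", ".xlsm", ".pptm"}


def macro_status(filename):
    """Classify a file by its extension alone; contents are not examined."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in MACRO_ENABLED:
        return "may contain macros"
    if suffix in MACRO_FREE:
        return "macro-free"
    return "unknown"  # e.g. older binary formats or unrelated files


for name in ("report.docx", "budget.xlsm", "legacy.doc"):
    print(name, "->", macro_status(name))
```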