Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Parsing and generating Microsoft Office 2007 files (.docx, .xlsx, .pptx)

I have a web project where I must import text and images from a user-supplied document, and one of the possible formats is Microsoft Office 2007. There's also a need to generate documents in this format.

The server runs CentOS 5.2 and has PHP/Perl/Python installed. I can execute local binaries and shell scripts if I must. We use Apache 2.2 but will be switching over to Nginx once it goes live.

What are my options? Has anyone had experience with this?

Fight a dragon for too long and you become a dragon yourself; gaze into the abyss for too long and the abyss gazes back…

1 Reply

The Office 2007 file formats are open and well documented (Office Open XML, standardized as ECMA-376). Roughly speaking, every new format ending in "x" is a ZIP archive of XML parts, plus any embedded media such as images. For example:

To open a Word 2007 XML file:

1. Create a temporary folder in which to store the file and its parts.
2. Save a Word 2007 document, containing text, pictures, and other elements, as a .docx file.
3. Add a .zip extension to the end of the file name.
4. Double-click the file. It will open in the ZIP application. You can see the parts that comprise the file.
5. Extract the parts to the folder that you created previously.
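
The manual steps above can also be scripted. Here is a minimal sketch in Python (which the question says is installed on the server), using only the standard `zipfile` module. The function names `list_parts` and `extract_parts` are made up for illustration, and renaming the file to .zip is not needed because `zipfile` ignores the extension.

```python
import tempfile
import zipfile
from pathlib import Path


def list_parts(docx_path):
    """Return the names of the parts stored inside the package."""
    with zipfile.ZipFile(docx_path) as package:
        return package.namelist()


def extract_parts(docx_path, target_dir=None):
    """Unpack every part into target_dir (a fresh temp folder by default)."""
    if target_dir is None:
        target = Path(tempfile.mkdtemp(prefix="docx_"))
    else:
        target = Path(target_dir)
    with zipfile.ZipFile(docx_path) as package:
        package.extractall(target)
    return target
```

Something like `extract_parts("upload.docx")` would return the folder holding the unpacked parts. Since the documents are user-supplied, it is probably wise to check the part names and total uncompressed size before extracting.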

The other file formats are roughly similar. I don't know of any open source libraries for interacting with them yet, but depending on your exact requirements, reading and writing simple documents doesn't look too difficult. It should certainly be a lot easier than with the older binary formats.
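
To give a rough idea of what "simple documents" could mean in practice, here is a hedged sketch that writes a text-only .docx and reads text and images back, using only the Python standard library. The helper names `write_docx` and `read_docx` are hypothetical. The sketch assumes the main part lives at `word/document.xml` and that images sit under `word/media/`, which is the usual layout but is strictly determined by the package relationships. It ignores styles, tables, headers and everything else.

```python
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Points the package at its main document part.
ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)


def write_docx(path, paragraphs):
    """Write a text-only .docx with one plain paragraph per string."""
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>' % escape(text)
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", document)


def read_docx(path):
    """Return (paragraph texts, {image part name: raw bytes})."""
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("word/document.xml"))
        images = {
            name: package.read(name)
            for name in package.namelist()
            if name.startswith("word/media/")
        }
    paragraphs = [
        "".join(t.text or "" for t in p.iter("{%s}t" % W_NS))
        for p in root.iter("{%s}p" % W_NS)
    ]
    return paragraphs, images
```

A round trip would look like `write_docx("out.docx", ["Hello"])` followed by `read_docx("out.docx")`. Generating anything richer (images, formatting) means adding more parts and relationships, which is where the published specification becomes necessary.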

If you need to read the older formats, OpenOffice has an API and can read and write Office 2003 and older documents with more or less success.
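
One way to drive that from a script is to shell out to the office suite in headless mode. The sketch below is an assumption-heavy illustration, not the OpenOffice API itself: it assumes a `soffice` binary on the PATH that accepts the `--headless`, `--convert-to` and `--outdir` options, as current LibreOffice builds (a descendant of OpenOffice) do. Older OpenOffice installs may need a UNO script instead. The function names are hypothetical.

```python
import subprocess


def build_convert_command(source, out_dir, target_format="docx"):
    """Build the argument list for a headless conversion (assumed CLI)."""
    return [
        "soffice",
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(out_dir),
        str(source),
    ]


def convert(source, out_dir, target_format="docx", timeout=120):
    """Run the conversion; raises CalledProcessError if soffice fails."""
    subprocess.run(
        build_convert_command(source, out_dir, target_format),
        check=True,
        timeout=timeout,
    )
```

For instance, `convert("legacy.doc", "/tmp/converted")` would be expected to leave a converted file in that folder, after which the newer format can be handled like any other ZIP-based document.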


...