Embedding Foreign Objects (PDF, TXT, GIF, etc…) into Microsoft Word using OpenXml 2.0
(Well, in collaboration with COM)
I got a lot from this site, so here I asked and answered my own question in order to give back a little on a topic in which I had difficulty finding answers on, hope it helps people.
There are several examples out there that show how to embed an Office Document into another Office Document using OpenXml 2.0, what’s not out there and easily understandable is how to embed just about any file into and Office Document.
I have learned a lot from other people’s code, so this is my attempt to contribute. Since I am already using OpenXml to generate documents, and I am in need of embedding other files into Word, I have decided use a collaboration of OpenXml and COM (Microsoft Office 2007 dll’s) to achieve my goal. If you are like me, “invoking the OLE server application to create an IStorage” doesn’t mean much to you.
In this example I’d like to show how I use COM to PROGRMATICALLY get the OLE-binary data information of the attached file, and then how I used that information within my OpenXml document. Basically, I am programmatically looking at the OpenXml 2.0 Document Reflector to get the information I need.
My code below is broken down into several classes, but here is an outline of what I am doing:
- Create an OpenXml WordProcessingDocument, get the System.IO.FileInfo for the file you want to Embed
- Create a custom OpenXmlEmbeddedObject object (this is what holds all the binary data)
- Use the binary data from the above step to create Data and Image Streams
- Use those Streams as the File Object and File Image for your OpenXml Document
I know there is a lot of code, and not much explanation… Hopefully it is easy to follow and will help people out ?
Requirements:
? DocumentFormat.OpenXml dll (OpenXml 2.0)
? WindowsBase dll
? Microsoft.Office.Interop.Word dll (Office 2007 – version 12)
? This the main class that starts everything, opens a WordProcessingDocument and class to have the file attached
using DocumentFormat.OpenXml.Packaging;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;
public class MyReport
{
private MainDocumentPart _mainDocumentPart;
public void CreateReport()
{
using (WordprocessingDocument wpDocument = WordprocessingDocument.Create(@"TempPathMyReport.docx", WordprocessingDocumentType.Document))
{
_mainDocumentPart = wpDocument.AddMainDocumentPart();
_mainDocumentPart.Document = new Document(new Body());
AttachFile(@"MyFilePathMyFile.pdf", true);
}
}
private void AttachFile(string filePathAndName, bool displayAsIcon)
{
FileInfo fileInfo = new FileInfo(filePathAndName);
OpenXmlHelper.AppendEmbeddedObject(_mainDocumentPart, fileInfo, displayAsIcon);
}
}
? This class in an OpenXml helper class, holds all the logic to embed an object into your OpenXml File
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;
using DocumentFormat.OpenXml.Wordprocessing;
using OVML = DocumentFormat.OpenXml.Vml.Office;
using V = DocumentFormat.OpenXml.Vml;
public class OpenXmlHelper
{
/// <summary>
/// Appends an Embedded Object into the specified Main Document
/// </summary>
/// <param name="mainDocumentPart">The MainDocument Part of your OpenXml Word Doc</param>
/// <param name="fileInfo">The FileInfo object associated with the file being embedded</param>
/// <param name="displayAsIcon">Whether or not to display the embedded file as an Icon (Otherwise it will display a snapshot of the file)</param>
public static void AppendEmbeddedObject(MainDocumentPart mainDocumentPart, FileInfo fileInfo, bool displayAsIcon)
{
OpenXmlEmbeddedObject openXmlEmbeddedObject = new OpenXmlEmbeddedObject(fileInfo, displayAsIcon);
if (!String.IsNullOrEmpty(openXmlEmbeddedObject.OleObjectBinaryData))
{
using (Stream dataStream = new MemoryStream(Convert.FromBase64String(openXmlEmbeddedObject.OleObjectBinaryData)))
{
if (!String.IsNullOrEmpty(openXmlEmbeddedObject.OleImageBinaryData))
{
using (Stream emfStream = new MemoryStream(Convert.FromBase64String(openXmlEmbeddedObject.OleImageBinaryData)))
{
string imagePartId = GetUniqueXmlItemID();
ImagePart imagePart = mainDocumentPart.AddImagePart(ImagePartType.Emf, imagePartId);
if (emfStream != null)
{
imagePart.FeedData(emfStream);
}
string embeddedPackagePartId = GetUniqueXmlItemID();
if (dataStream != null)
{
if (openXmlEmbeddedObject.ObjectIsOfficeDocument)
{
EmbeddedPackagePart embeddedObjectPart = mainDocumentPart.AddNewPart<EmbeddedPackagePart>(
openXmlEmbeddedObject.FileContentType, embeddedPackagePartId);
embeddedObjectPart.FeedData(dataStream);
}
else
{
EmbeddedObjectPart embeddedObjectPart = mainDocumentPart.AddNewPart<EmbeddedObjectPart>(
openXmlEmbeddedObject.FileContentType, embeddedPackagePartId);
embeddedObjectPart.FeedData(dataStream);
}
}
if (!displayAsIcon && !openXmlEmbeddedObject.ObjectIsPicture)
{
Paragraph attachmentHeader = CreateParagraph(String.Format("Attachment: {0} (Double-Click to Open)", fileInfo.Name));
mainDocumentPart.Document.Body.Append(attachmentHeader);
}
Paragraph embeddedObjectParagraph = GetEmbeededObjectParagraph(openXmlEmbeddedObject.FileType,
imagePartId, openXmlEmbeddedObject.OleImageStyle, embeddedPackagePartId);
mainDocumentPart.Document.Body.Append(embeddedObjectParagraph);
}
}
}
}
}
/// <summary>
/// Gets Paragraph that includes the embedded object
/// </summary>
private static Paragraph GetEmbeededObjectParagraph(string fileType, string imageID, string imageStyle, string embeddedPackageID)
{
EmbeddedObject embeddedObject = new EmbeddedObject();
string shapeID = GetUniqueXmlItemID();
V.Shape shape = new V.Shape() { Id = shapeID, Style = imageStyle };
V.ImageData imageData = new V.ImageData() { Title = "", RelationshipId = imageID };
shape.Append(imageData);
OVML.OleObject oleObject = new OVML.OleObject()
{
Type = OVML.OleValues.Embed,
ProgId = fileType,
ShapeId = shapeID,
DrawAspect = OVML.OleDrawAspectValues.Icon,
ObjectId = GetUniqueXmlItemID(),
Id = embeddedPackageID
};
embeddedObject.Append(shape);
embeddedObject.Append(oleObject);
Paragraph paragraphImage = new Paragraph();
Run runImage = new Run(embeddedObject);
paragraphImage.Append(runImage);
return paragraphImage;
}
/// <summary>
/// Gets a Unique ID for an XML Item, for reference purposes
/// </summary>
/// <returns>A GUID string with removed dashes</returns>
public static string GetUniqueXmlItemID()
{
return "r" + System.Guid.NewGuid().ToString().Replace("-", "");
}
private static Paragraph CreateParagraph(string paragraphText)
{
Paragraph paragraph = new Paragraph();
ParagraphProperties paragraphProperties = new ParagraphProperties();
paragraphProperties.Append(new Justification()
{
Val = JustificationValues.Left
});
paragraphProperties.Append(new SpacingBetweenLines()
{
After = Convert.ToString(100),
Line = Convert.ToString(100),
LineRule = LineSpacingRuleValues.AtLeast
});
Run run = new Run();
RunProperties runProperties = new RunProperties();
Text text = new Text();
if (!String.IsNullOrEmpty(paragraphText))
{
text.Text = paragraphText;
}
run.Append(runProperties);
run.Append(text);
paragraph.Append(paragraphProperties);
paragraph.Append(run);
return paragraph;
}
}
? This is the most important part of this process, it is using Microsoft's internal OLE Server, creates the Binary DATA and Binary EMF information for a file. All you have to here is call the OpenXmlEmbeddedObject constructor and all get’s taken care of. It will mimic the process that goes on when you manually drag any file into Word; there is some kind of conversion that goes on when you do that, turning the file into an OLE object, so that Microsoft can recognize the file.
o The most imporant parts of this class are the OleObjectBinaryData and OleImageBinaryData properties; they contain the 64Bit string binary info for the file data and ‘.emf’ image.
o If you choose to not display the file as an icon, then the ‘.emf’ image data will create a snapshot of the file, like the first page of the pdf file for example, in which you can still double-click to open
o If you are embedding an image and choose not to display it as an Icon, then the OleObjectBinaryData and OleImageBinaryData properties will be the same
using System.Runtime.InteropServices;
using System.Xml;
using System.Diagnostics;
using System.IO;
using System.Drawing;
using Microsoft.Office.Interop.Word;
public class OpenXmlEmbeddedObject
{
#region Constants
private const string _defaultOleContentType = "application/vnd.openxmlformats-officedocument.oleObject";
private const string _oleObjectDataTag = "application/vnd";
private const string _oleImageDataTag = "image/x-emf";
#endregion Constants
#region Member Variables
private static FileInfo _fileInfo;
private static string _filePathAndName;
private static bool _displayAsIcon;
private static bool _objectIsPicture;
private object _objectMissing = System.Reflection.Missing.Value;
private object _objectFalse = false;
private object _objectTrue = true;
#endregion Member Variables
#region Properties
/// <summary>
/// The File Type, as stored in Registry (Ex: a GIF Image = 'giffile')
/// </summary>
public string FileType
{
get
{
if (String.IsNullOrEmpty(_fileType) && _fileInfo != null)
{
_fileType = GetFileType(_fileInfo, false);
}
return _fileType;
}
}
private string _fileType;
/// <summary>
/// The File Context Type, as storered in Registry (Ex: a GIF Image = 'image/gif')
/// * Is converte