Creating a Template Open XML Document in Memory

One of the most effective ways to generate Open XML documents, spreadsheets, or presentations is to start with a ‘template’ document and then modify it.  For instance, you can start with a blank word processing document that is set up with your desired styles, and then add paragraphs and tables as necessary.  Or you can start with a blank PowerPoint presentation that is set up with your desired theme, and add slides.  But this means that you need a template document on disk that you can open and modify, which is a bit inconvenient and messy.  There is another approach: you can create the template document in memory, modify it, and then serialize it to your desired location.

The gist of this approach is:

  • Create your desired template document using Office 2007 (or Office 2003 with the compatibility pack installed).
  • Run a little program that reads the template document into memory as a byte array, converts the byte array to a base64 string, and writes a file that contains a simple C# assignment statement.
  • Paste this assignment statement into your application.  As necessary, you can convert this base64 string to an in-memory Open XML document, modify the document, and serialize it to wherever you want.

If you use this approach, you don’t need to keep a template document around that you can read and modify.  Your C# application is self-contained.

Note that you can use this approach to embed any binary file into a C# program.  This technique will work, of course, with DOCX, XLSX, or PPTX Open XML documents.
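The round trip behind this technique is small enough to sketch on its own. The class and method names below are hypothetical and are not part of the attached code; only `Convert.ToBase64String` and `Convert.FromBase64String` come from the .NET Framework.

```csharp
using System;

// Hypothetical helper (not from the attached code): embeds any binary
// content as base64 text and recovers the original bytes from it.
public static class EmbeddedBinary
{
    // Encode arbitrary bytes as a base64 string that can be pasted into source code.
    public static string Encode(byte[] bytes)
    {
        return Convert.ToBase64String(bytes);
    }

    // Decode the embedded string back into the original bytes.
    public static byte[] Decode(string base64)
    {
        return Convert.FromBase64String(base64);
    }
}
```

Open XML packages are ZIP files, and a ZIP file begins with the bytes 0x50 0x4B 0x03 0x04 ("PK" followed by 3 and 4), so the encoded form of a DOCX, XLSX, or PPTX file begins with `UEsD`. That is a quick way to check that you embedded the right thing.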

There is a slight performance improvement using this approach – you don’t have to go to disk to read the template document.  But more importantly, your application is cleaner, and deployment of your application is simpler.  You don’t need to deploy your ‘template’ document – it’s contained right there in your C# code.

This approach is related to the approach of working with in-memory Open XML documents.

Here is a simple C# program to convert a template document to the C# assignment statement.  This code uses the LINQ technique of chunking a collection into a list of arbitrarily long chunks:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("Template.docx");

        // the following expression creates the base64String, then chunks
        // it into lines of 76 characters
        string base64String = (System.Convert.ToBase64String(byteArray))
            .Select(
                // pair each character with the number of the line it belongs to
                (c, i) => new
                {
                    Character = c,
                    Chunk = i / 76
                }
            )
            .GroupBy(c => c.Chunk)
            .Aggregate(
                new StringBuilder(),
                // append each group as one line, followed by a newline
                (s, i) =>
                    s.Append(
                        i.Aggregate(
                            new StringBuilder(),
                            (seed, it) => seed.Append(it.Character),
                            sb => sb.ToString()
                        )
                    )
                    .Append(Environment.NewLine),
                s =>
                {
                    // remove the newline that follows the last line
                    s.Length -= Environment.NewLine.Length;
                    return s.ToString();
                }
            );

        // output a C# assignment statement to create a string that contains the template
        // document
        string chunkedString =
            String.Format("string templateDocumentBase64String = @\"{0}\";", base64String);
        File.WriteAllText("chunk.cs", chunkedString);
    }
}
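If you would rather not chunk the string by hand, the .NET Framework can insert the line breaks for you. The sketch below is an alternative to the LINQ expression above, not what the attached code does. The class name, method name, and `variableName` parameter are hypothetical. `Base64FormattingOptions.InsertLineBreaks` inserts a line break after every 76 characters.

```csharp
using System;

// Hypothetical helper (not from the attached code): builds the same kind of
// assignment statement, letting the framework break the base64 text into lines.
public static class TemplateEncoder
{
    public static string BuildAssignment(byte[] bytes, string variableName)
    {
        // InsertLineBreaks adds a line break after every 76 characters.
        string base64 = Convert.ToBase64String(
            bytes, Base64FormattingOptions.InsertLineBreaks);

        // Emit a verbatim string literal so that the line breaks can stay
        // inside the literal, as in the generated chunk.cs file.
        return String.Format("string {0} = @\"{1}\";", variableName, base64);
    }
}
```

One difference to be aware of: this option always breaks lines with a carriage return and line feed rather than `Environment.NewLine`. A consumer that strips both `'\r'` and `'\n'` before decoding handles either form.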

When you run this program, it creates a small file that contains the assignment statement:

string templateDocumentBase64String = @"UEsDBBQABgAIAAAAIQDd/JU3ZgEAACAFAAATAAgCW0NvbnRlbnRfVHlwZXNdLnhtbCCiBAIooAAC
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
... lots of lines elided
ZG9jUHJvcHMvYXBwLnhtbFBLAQItABQABgAIAAAAIQAdnTy9ewEAAPcCAAARAAAAAAAAAAAAAAAA
AO0ZAABkb2NQcm9wcy9jb3JlLnhtbFBLAQItABQABgAIAAAAIQBZ7/GiEgcAAPg5AAAPAAAAAAAA
AAAAAAAAAJ8cAAB3b3JkL3N0eWxlcy54bWxQSwUGAAAAAAsACwDBAgAA3iMAAAAA";

The following code shows how to convert this string to an Open XML document that you can then modify.  This code uses the Open XML SDK (either V1 or V2):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;
        using (StreamReader sr = new StreamReader(part.GetStream()))
        using (XmlReader xr = XmlReader.Create(sr))
            xdoc = XDocument.Load(xr);
        part.AddAnnotation(xdoc);
        return xdoc;
    }

    public static void PutXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.GetXDocument();
        if (xdoc != null)
        {
            // Serialize the XDocument object back to the package.
            using (XmlWriter xw = XmlWriter.Create(
                part.GetStream(FileMode.Create, FileAccess.Write)))
                xdoc.Save(xw);
        }
    }
}

public static class W
{
    // The WordprocessingML namespace name uses the http scheme. It is an
    // identifier, not a link, and must match exactly.
    public static XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static XName body = w + "body";
    public static XName r = w + "r";
    public static XName p = w + "p";
    public static XName t = w + "t";
}

class Program
{
    static void Main(string[] args)
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
string templateDocumentBase64String = @"UEsDBBQABgAIAAAAIQDd/JU3ZgEAACAFAAATAAgCW0NvbnRlbnRfVHlwZXNdLnhtbCCiBAIooAAC
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
... lots of lines elided
AO0ZAABkb2NQcm9wcy9jb3JlLnhtbFBLAQItABQABgAIAAAAIQBZ7/GiEgcAAPg5AAAPAAAAAAAA
AAAAAAAAAJ8cAAB3b3JkL3N0eWxlcy54bWxQSwUGAAAAAAsACwDBAgAA3iMAAAAA";
            char[] base64CharArray = templateDocumentBase64String
                .Where(c => c != '\r' && c != '\n').ToArray();
            byte[] byteArray =
                System.Convert.FromBase64CharArray(base64CharArray,
                    0, base64CharArray.Length);
            memoryStream.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
            {
                // we now have an open in-memory WordprocessingDocument that we can modify
                XDocument xDocument = doc.MainDocumentPart.GetXDocument();
                // add a paragraph to the document
                XElement body = xDocument.Root.Element(W.body);
                body.AddFirst(
                    new XElement(W.p,
                        new XElement(W.r,
                            new XElement(W.t, "Hello World!")
                        )
                    )
                );
                doc.MainDocumentPart.PutXDocument();
            }

            // at this point, the MemoryStream contains the modified document.
            // We could write it back to a SharePoint document library or serve
            // it from a web server.

            // in this example, we'll serialize back to the file system to verify
            // that the code worked properly.
            using (FileStream fileStream = new FileStream("Test.docx",
                System.IO.FileMode.Create))
            {
                memoryStream.WriteTo(fileStream);
            }
        }
    }
}
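Two details of the decoding step can be isolated in a small helper. This is a sketch, and the class and method names are hypothetical. First, `Convert.FromBase64String` ignores white space in its input, including carriage returns and line feeds, so the chunked string can be decoded directly. Second, the bytes are written into a `MemoryStream` created with the parameterless constructor, as in the program above. That stream is resizable, whereas one created with `new MemoryStream(byteArray)` is not, and a document that you modify usually grows.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the attached code): turns the chunked
// base64 text into a resizable in-memory stream.
public static class TemplateDecoder
{
    public static MemoryStream ToExpandableStream(string base64WithLineBreaks)
    {
        // White space, including '\r' and '\n', is ignored by FromBase64String.
        byte[] bytes = Convert.FromBase64String(base64WithLineBreaks);

        // The parameterless constructor gives a stream that can grow;
        // new MemoryStream(bytes) would give a fixed-size one.
        MemoryStream stream = new MemoryStream();
        stream.Write(bytes, 0, bytes.Length);

        // Rewind so that a caller reading the stream starts at the beginning.
        stream.Position = 0;
        return stream;
    }
}
```

The returned stream can then be passed to `WordprocessingDocument.Open(stream, true)` in the same way as `memoryStream` in the program above.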

Code is attached.

TemplateOpenXmlDocumentInMemory.zip
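A reader comment below suggests storing the template as an embedded assembly resource instead of a base64 string, and the reply agrees that this is a better solution. Here is a minimal sketch of that alternative. It assumes the template was added to the project with its build action set to Embedded Resource; the class name, method names, and resource name are hypothetical. The resource stream is read-only, so it is copied into a resizable `MemoryStream` before the document is opened for editing.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper (not from the attached code): loads an embedded
// resource into a resizable in-memory stream.
public static class ResourceTemplate
{
    // Copies any readable stream into a new, resizable MemoryStream.
    public static MemoryStream CopyToExpandableStream(Stream source)
    {
        MemoryStream copy = new MemoryStream();
        source.CopyTo(copy);
        copy.Position = 0;
        return copy;
    }

    // resourceName is typically "<default namespace>.<file name>",
    // for example "MyApp.Template.docx" (a made-up name).
    public static MemoryStream Load(Assembly assembly, string resourceName)
    {
        using (Stream resource = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null when the name is not found.
            if (resource == null)
                throw new FileNotFoundException(
                    "Embedded resource not found: " + resourceName);
            return CopyToExpandableStream(resource);
        }
    }
}
```

The stream returned by `Load` can be opened with `WordprocessingDocument.Open(stream, true)` and modified in the same way as the base64 version. The template then stays an ordinary file in the project, which keeps it easy to edit and to track in source control.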

Comments

  • Anonymous
    February 28, 2009
    Hi, why don't you simply put the template into the assembly as an embedded resource? Then there is no need for large chunks of base64 code inside your source, and you can easily change it, check it into revision control, etc. You can read it with Assembly ASM = Assembly.GetExecutingAssembly(); var stream = ASM.GetManifestResourceStream("Namespace of your Assembly.Name of template.docx"); using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, false)) ... OK, you have to make an in-memory copy to modify it, so either clone it or copy it to a memory stream.

  • Anonymous
    February 28, 2009
    You're right, that's a much better solution. -Eric

  • Anonymous
    February 28, 2009
    Hi, what about putting the template into a resource file? This way one can have language dependent templates too. Martin
