Showing posts with label OpenXML. Show all posts

Thursday, October 16, 2014

About the OpenXML SDK... Episode 020 of the Office 365 Developer Podcast

O365 Dev Podcast - Office 365 Developer Podcast: Episode 020 about Open XML SDK

In episode 20, Jeremy Thake chats to Doug Mahugh, Eric White and Chris Rae about the Open XML SDK.

...

Show Notes

...

image

..."

I've blogged about OpenXML enough (as you can see below) that I thought it was pretty cool to see the podcast post. (Does that make me weird? Nah... there's a bunch ELSE that makes me weird... lol :) )

 

Related Past Post XRef:
Did you know you could update/contribute to some (OpenXML for now) MSDN Doc's via a GitHub repo?
Using the OpenXML SDK Productivity Tool to "decompile" Office documents (Turn *X files into the C# OpenXML SDK code that would generate them)

Open Sesame - Open XML SDK is now open source

Using OpenXML to load an Excel Worksheet into a DataTable (or just how different OpenXML is from the old Excel API we're used to)

Using OpenXML SDK to generate Word documents via templates (and without Word being installed)
Checking for Microsoft Word DocX/DocM Revisions/Track Changes without using Word... (via OpenXML SDK, LINQ to XML or XML DOM)
LINQ to XlsX... Using VB.Net, LINQ, the OpenXML SDK and a little C# helper, to query an Excel XlsX
Using native OpenXML to create an XlsX (Which provides an example of why I highlight tools that make OpenXML easier...)
Generating Xlsx's on the Server? You're using OpenXML, right? With help from the PowerTools for OpenXML?

Official boat-load, as in supertanker, sized OpenXML content list (Insert "One OpenXML content list to rule them all" here)
So how do I get from here to OpenXML? Got a map for you, an Open XML SDK Blog Map…
Where to go to scratch your OpenXML dev info itch…
"Open XML Explained" Free eBook (PDF)
The Noob's Guide to Open XML Dev (If you know how to spell OpenXML but that's about it, this is your Getting Started guide...)

Reusing the PowerShell PowerTools for Open XML in your C# or VB.Net world
PowerShell, OpenXML, WMI and the PowerTools for OpenXML = Doc generation for our inner geek
Because it’s a PowerShell kind of day… PowerTools for Open XML V1.1 Released
OpenXML PowerTools updated – Cell your Excel via PowerShell
Powering into OpenXML with PowerShell

Open XML SDK 2.0 for Microsoft Office Released – Automate Office documents without Office

Open XML 2.0 Code Snippets for VS2010 (and VS2008 too)
Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008 – 52 C#/VB Code Snippets to help ease your Open XML coding
Open XML File Format Code Snippets for Visual Studio 2005 (Office 2007 NOT required)

Open XML SDK v1 Released

OpenXML Viewer 1.0 Released – Open source DocX to HTML conversion, with IE, Firefox and Opera (and/or command line) support

Wednesday, July 30, 2014

Did you know you could update/contribute to some (OpenXML for now) MSDN Doc's via a GitHub repo?

When writing my last post, Using the OpenXML SDK Productivity Tool to "decompile" Office documents (Turn *X files into the C# OpenXML SDK code that would generate them), I came across this:

image

I'm like, "What?" No...

Yep!

OfficeDev/office-content

Contains content from dev.office.com that is openly editable by the public.

Ways to contribute

You can contribute to Office developer documentation in a few different ways:

*We're only taking documentation contributions for the OpenXML Conceptual content at this time

Repository organization

The content in the office-content repository is grouped first by article language, then by topic. The README.md file at the root of each topic directory specifies the structure of the articles within the topic.

Articles within each topic are named by MSDN GUID rather than title name. This is a side effect of our document management process and cannot be changed at this time. We highly recommend using the table of contents within each topic directory (see links below) to navigate to the files you wish to view or edit.

Articles in this repository

Open XML

Before we can accept your pull request

...

image

Now that's cool...

Using the OpenXML SDK Productivity Tool to "decompile" Office documents (Turn *X files into the C# OpenXML SDK code that would generate them)

Ode To Code - Easily Generate Microsoft Office Files From C#

"...

These days, Office files are no longer in a proprietary binary format, and we can create the files directly without using COM automation. A .docx Word file, for example, is a collection of XML documents zipped into a single file. The official name of the format is Open XML.

There is an SDK to help with reading and writing OpenXML, and a Productivity Tool that can generate C# code for a given file. All you need to do is load a document, presentation, or workbook into the tool and press the “Reflect Code” button.

image

The downside to this tool is that even a simple document will generate 4,000 lines of code. Another downside is that the generated code assumes it will write directly to the file system, however it is easy to pass in an abstract Stream object instead.

So while this code isn’t perfect, the code does produce valid document and..."

I've been blogging about the OpenXML SDK for years now, but I think this is the first time I've seen this part of it, this utility. And like he says, 4K LoC is, well, a lot, but it does look like an awesome way to learn the low-level OpenXML SDK ins and outs.

 


Wednesday, July 16, 2014

Using OpenXML to load an Excel Worksheet into a DataTable (or just how different OpenXML is from the old Excel API we're used to)

dotnet thoughts - Read Excel as DataTable using OpenXML and C#

In the current project we were using OpenXML extensively for reading Excel files. Here is the code snippet, which will help you to read / convert Excel files to DataTable.

image

..."

You've heard me whine that, while OpenXML is cool and it's nice that we can access Office 2007+ files without Office or third-party apps, the API is pretty darn different for traditional Office Object Model users? This screenshot shows why... Parts, SharedStringTables, oh my... It's not hard, it just takes a while to wrap your head around.
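To make the Parts and SharedStringTable point a little more concrete, here's a rough sketch of the same indirection. It is Python using only the standard library, not the OpenXML SDK and not the linked article's C# code. The function names are mine, and it assumes the worksheet lives at xl/worksheets/sheet1.xml (a real reader would resolve that through the workbook's relationships). It also ignores empty-cell gaps, inline strings, dates and styles.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace, used by sharedStrings.xml and the sheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_shared_strings(package):
    """Return the shared string table as a list; string cells store an index into it."""
    if "xl/sharedStrings.xml" not in package.namelist():
        return []
    root = ET.fromstring(package.read("xl/sharedStrings.xml"))
    # Each <si> is one string; rich text splits it across several <t> runs.
    return ["".join(t.text or "" for t in si.iter("{%s}t" % NS["m"]))
            for si in root.findall("m:si", NS)]


def read_sheet_rows(xlsx_file, sheet_part="xl/worksheets/sheet1.xml"):
    """Read one worksheet part into a list of rows (each a list of strings).

    sheet_part is an assumed default location, not something the format guarantees.
    """
    with zipfile.ZipFile(xlsx_file) as package:
        shared = read_shared_strings(package)
        sheet = ET.fromstring(package.read(sheet_part))
    rows = []
    for row in sheet.findall("m:sheetData/m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            value = cell.find("m:v", NS)
            text = value.text if value is not None and value.text else ""
            if cell.get("t") == "s":
                # t="s" means the value is an index into the shared string table.
                text = shared[int(text)]
            values.append(text)
        rows.append(values)
    return rows
```

Calling read_sheet_rows("book.xlsx") (a hypothetical file name) gives you a list of lists, which is about as close to a DataTable as the standard library gets.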

Thursday, June 26, 2014

Being open to opening OpenXML documents in Visual Studio with the now open source Open XML Package Editor for VS 2012/2013

OpenXML Developer - Open XML Package Editor Released for VS2012 and VS2013

image

Chris Rae recently announced on his blog that we have released a new version of the Open XML Package Editor, which now works on Visual Studio 2012 and 2013!

As anyone knows who has seen any of my screen-casts, the Open XML Package Editor is my go-to tool for opening and editing Open XML documents. It is a vital tool for Open XML Developers. After installing, you can drag and drop Open XML documents onto Visual Studio, navigate through the various parts, open parts for editing in the very excellent XML editor that is in Visual Studio, and modify any relationship in the package. Unfortunately, until this release, you had to keep a copy of Visual Studio 2010 around in order to use the tool, a pain to say the least. Well, no more. Now it works with the latest versions of Visual Studio, and furthermore, we will never get into the situation again where it only works for previous versions of Visual Studio. Since it is open source, you, I, or anyone else can quickly do the port to new versions of VS. It now supports Visio's new VSDX format and has some other minor fixes and enhancements.

We have published the code on GitHub under the Apache 2.0 license. If you just want to download the new version of the Package Editor, it's here on the Visual Studio Gallery. [GD: Post Leached in Full]

We all know that OpenXML documents (DocX, XlsX, PptX, *X, etc, etc) are really just zip file containers with standardized manifests, contents and packaging, right? (Don't believe me? Rename a .DocX to .zip and see).
And sure, you can open and spelunk the unzipped contents of the document, but it's not the easiest. Instead you're better off with an OpenXML explorer, like this one, the Open XML Package Editor. And hey, you can even stay in your favorite tool of choice (Visual Studio of course!). And now that it's open source, it's even cooler!
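If you'd rather not rename anything, the "it's just a zip" claim is easy to check from a script. This is a minimal sketch in Python (standard library only, nothing to do with the Package Editor itself). The function names are mine, and read_part assumes the part is UTF-8 XML, which is the usual case but not guaranteed.

```python
import zipfile


def list_package_parts(path_or_file):
    """Return (name, uncompressed size) for every entry in an Open XML package."""
    with zipfile.ZipFile(path_or_file) as package:
        return [(info.filename, info.file_size) for info in package.infolist()]


def read_part(path_or_file, part_name):
    """Return one part, for example "word/document.xml", as text (assumes UTF-8)."""
    with zipfile.ZipFile(path_or_file) as package:
        return package.read(part_name).decode("utf-8")
```

Point list_package_parts at any .docx, .xlsx or .pptx and you'll see [Content_Types].xml, the _rels folders and the rest of the parts the Package Editor shows as a tree.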

 


Wednesday, June 25, 2014

Open Sesame - Open XML SDK is now open source

Open XML SDK goes open source

Brian Jones is the principal GPM of the Office Development Platform.

Today is an exciting day for Office developers—we’re open sourcing the Open XML SDK on GitHub! We’re eager to work with the community on continual improvements to the SDK’s functionality and scalability, and to explore new platforms and technologies to support developer platforms such as Mono, an open source implementation of .NET Framework. It’s been over seven years since we released the initial preview of the Open XML SDK, and over that time it’s been one of the key tools developers have used for building solutions that consume, create, and modify Office documents.

I encourage you to head over to GitHub and take a look at the project. We’d love your participation! We posted it under the .NET Foundation. In addition to the SDK itself, we opened all of the Open XML conceptual documentation in MSDN for public review/contributions. A living copy of the docs is now in GitHub for you to edit and review. Pull requests welcome!

The Open XML SDK is a key piece of our overall developer platform. The trends around mobile apps connected to the cloud have expanded the role that Office documents can play in solutions. Many of our Fortune 100 customers have built solutions leveraging the SDK, especially in the banking and health care sectors. We average over 10,000 downloads a month, and the SDK is also widely distributed in other software packages, such as accounting tools.

...

In another post, we provided a great drilldown into the architecture of the SDK and a ton of great examples.

As you’ve probably noticed lately, we’re making a big push to open a lot of our developer technologies to the community. We have a few really cool projects already in GitHub, like the Office 365 SDK for Android Preview, as well as the Open XML package editor. We’ve shifted the Office extensibility model to use open standards like HTML and JavaScript, and we’re exposing Office 365 data (documents, mail, and calendars) through RESTful APIs leveraging oAuth. You’ll see us continue to do more of this, and we’d love to hear any feedback you might have on our UserVoice.

If you’re already an Open XML developer, this is definitely an exciting day. If you haven’t built solutions yet on Open XML, I strongly encourage you to go take a look and try out some of the examples. You’ll be surprised by what you can build.

image..."

The Microsoft open source wagon just keeps on rolling! The OpenXML spec has been open for a while and now the SDK is too. Heck I wonder what else is going to be opened up? The Fluid UI? Windows Live Writer (please, please, please)? Guess 2014 is going to officially be "The Year Microsoft Opened"...

 


Friday, February 21, 2014

Microsoft Open Specifications Posters v2 released (Think "Wow, that's a lot of specs" Posters)

Microsoft Downloads - Open Specifications Posters

The Open Specifications Posters (PDF format) make it easy for interoperability developers to explore the Open Specifications overview documents for Office client, Lync, SharePoint, Office file formats, Exchange Server, SQL Server, and Windows.

Version: 5.0

Date Published: 2/21/2014

ExchangeOpenSpecPoster.pdf, 556 KB

MicrosoftOpenSpecPoster - Accessiblility Version.pdf, 336 KB

OfficeLyncOpenSpecPoster.pdf, 669 KB

SharePointOpenSpecPoster.pdf, 606 KB

SQLOpenSpecPoster.pdf, 1,011 KB

WindowsOpenSpecPoster.pdf, 1.0 MB

The Open Specifications Posters (PDF format) make it easy for interoperability developers to explore the Open Specifications overview documents for Office client, Lync, SharePoint, Office file formats, Exchange Server, SQL Server, and Windows. The posters display, by functional area, the protocols, file formats, and related technologies, as described in each overview document. A high-contrast poster that contains listings for all functional areas is also provided for those with visual accessibility needs.

Some cube art to help when you get visited by the "Microsoft is closed and the devil" guy (I know you know that guy...)

Here's a snap of the Windows PDF;

image

 

Related Past Post XRef:
Office/Exchange File Format,Specification and Protocol Documentation refreshed
Microsoft Format and Specification Documentation 0712 Refresh (Think Office 2013 CP update). Oh and some SharePoint Doc's too
Microsoft Format and Specification Documentation Refresh ("Significantly changed technical content") [Updated: Includes updates for Office 15 Technical Preview ]
Microsoft Office File Formats and Microsoft Office Protocols Documentation Refreshed
Microsoft Office File Formats and Protocols documentation updated for Office 2010 (Think “Now with added ‘X’ flavor… DocX, PptX, XlsX, etc”)

Microsoft Open Specifications Poster

XAML Language Specification (as in the full XAML, WPF and Silverlight XAML Specs)

"Microsoft SQL Server Data Portability Documentation"

MS-PST file format specification released. Yep, the full and complete specification for Outlook PST’s is now just a download away.
Microsoft Office (DOC, XLS, PPT) Binary File Format Specifications Released – We’re talking the full technical specification… (The [MS-DOC].pdf alone is 553 pages of very dense specification information)
DOC, XLS and PPT Binary File Format Specifications Released (plus WMF, Windows Compound File [aka OLE 2.0 Structured Storage] and Ink Serialized Format Specifications and Translator to XML news)

Thursday, January 16, 2014

Third Party Office Library or OpenXML?

CodePlex - Aspose for OpenXML

The Open XML SDK for Office simplifies the task of manipulating Open XML packages and the underlying Open XML schema elements within a package. The classes in the Open XML SDK encapsulate many common tasks that developers perform on Open XML packages, so that you can perform complex operations with just a few lines of code.

Using the classes in the Open XML SDK 2.5 is simple. When you have installed the Open XML SDK 2.5, open your existing project or application in Visual Studio, or create a new project or application. Then, in your project or application, add references to the following components:

  • DocumentFormat.OpenXml
  • WindowsBase

To add a reference in a Microsoft Visual Studio project

  • In Solution Explorer, right-click References and then click Add Reference. If the References node is not visible, click Project and then click Show All Files.
  • In the Add Reference dialog box, click .NET.
  • In the Component Name column, select the components (scroll if you need to), and then click OK.

This project covers the following topics:

What is the use of Aspose .NET Products?

Aspose are file format experts and provide APIs and components for various file formats including MS Office, OpenOffice, PDF and Image formats. These APIs are available on a number of development platforms including .NET frameworks – the .NET frameworks starting from version 2.0 are supported. If you are a .NET developer, you can use Aspose’s native .NET APIs in your .NET applications to process various file formats in just a few lines of codes. All the Aspose APIs don’t have any dependency over any other engine. For example, you don’t need to have MS Office installed on the server to process MS Office files. Below is a list of products we support for .NET developers:

..."

I've mentioned OpenXML in the past, and how cool it is that you can use it to get at all the deep, deep data in Office *x files? Then you've also heard me say what a pain it can be if you're used to a more traditional Office Object Model. It's a completely different way of thinking about your documents... And doing that hurts my brain. So I go out of my way to find libraries that make it easier. One such library, which we've bought at my day job, is Aspose. If you've read any MS dev mag, you've seen the ads for them.

I ran across this and, sure, it's sales-ware, but it's still useful to OpenXML devs and does a good job of showing the differences between the two approaches...

OpenXML SDK Word Processing Code Snippets - Create a word processing document

image

IMHO, if you can, use a third party library, free or commercial. OpenXML might get the job done and it is free, but the time you spend on it isn't. (And remember, friends don't let friends Office interop!)
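For a feel of what sits underneath both approaches, here's a rough sketch that writes a bare-bones .docx by hand. It is Python with only the standard library, so it uses neither the Open XML SDK nor Aspose, and the function name is mine. It writes just three parts: the content types manifest, the package relationships, and the main document. That should be enough for Word to open it, but it skips styles, settings and everything else a real document carries.

```python
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

PACKAGE_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w='
    '"http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def create_minimal_docx(target, text):
    """Write a one-paragraph .docx to target (a path or a writable binary file)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        # Escape &, < and > so the paragraph text cannot break the XML.
        package.writestr("word/document.xml", DOCUMENT.format(text=escape(text)))
```

Three hand-written XML strings for one paragraph is pretty much the argument for a library, whichever one you pick.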

Thursday, December 05, 2013

Free Export DataSet/DataTable/List<T> to Excel (without using or even having Excel installed)

Code Project - A free "Export to Excel" C# class, using OpenXML

It's amazing that even now, in 2013, there are so many developers still asking for help on how to write C# and VB.Net code, to export their data to Excel.

Even worse, a lot of them will stumble on articles suggesting that they should write their data to a comma-separated file, but to give the file an .xls extension.

So, today, I'm going to walk through how to use my C# "Export to Excel" class, which you can add to your C# WinForms / WPF / ASP.Net application, using one line of code.

Depending on whether your data is stored in a DataSet, DataTable or List<>, you simply need to call one of these three functions, and tell them what (Excel) filename you want to write to.

  • public static bool CreateExcelDocument<T>(List<T> list, string ExcelFilename)
  • public static bool CreateExcelDocument(DataTable dt, string ExcelFilename)
  • public static bool CreateExcelDocument(DataSet ds, string ExcelFilename)

...

And that's all you have to do. The CreateExcelDocument function will create a "real" Excel file for you.

For example, if you had created a DataSet containing three DataTables called

  • Drivers
  • Vehicles
  • Vehicle Owners

...then here's what your Excel file would look like. The class would create one worksheet per DataTable, and each worksheet would contain the data from that DataTable.

image

...

Look, friends don't let friends use Office InterOp... (omg, especially for server/automated ops!). There are any number of options now available, many free or reasonably priced. Just... don't.... do... it...

Thursday, February 21, 2013

Excel with Excel without Excel... Seven Excel/XLS Libraries

Ginktage - 7 Libraries for Reading and Writing from/to Excel File in C#

A few months back, I was doing some R&D on the possibilities of reading from and writing to Excel files from .NET (C#). At that point, I came across various libraries and SDKs available for reading and writing Excel files in C#.

In this blog post, I will list some of the libraries used for reading from and writing to Excel sheets using C#. Note that some of the libraries are free/open source and a few are commercial ones.

...

image..."

If you've ever Automated/Inter-op'd Excel, you know that it can be "fun." The primary issue isn't the Object Model, it's that it's COM, and with .Net it can be a challenge to get the Releasing right. Then there are the license requirements, versions, etc.

So in short, if you don't need it, don't use it. As shown above, there's a number of libraries that you can use to read/write Excel files without having Excel installed...

Friday, October 19, 2012

Do you DSOFile? Tips for using it on an x64 OS and with Open XML (Office 2007+ XML formats)

Visual Studio Office Development (VSOD) Support Team - Considerations for using Dsofile on 64 bit OS

"There is a well-documented sample program, DSOFile, that enables reading and writing Office document properties (both old format files like *.xls, *.doc and *.ppt, as well as the new open xml formats like *.xlsx, *.docx and *.pptx). The DSOFile sample is compiled as 32 bit.

If you are using this sample in a 32 bit application on a machine with Office 2007 SP2, Windows 64 bit, then DsoFile will not be able to fetch the properties of Open Xml format files. This is because Office 2007 SP2 did not ship the 32 bit version of msoshext.dll (shell extension handler), which is the component DSOFile uses to read/write properties from Open XML files.

This issue is fixed in hotfix KB 2483216. This is also included in the Office 2007 Cumulative Update for February 2011 and subsequently in Office 2007 SP3.

The Office 2010 version of the hotfix is KB 2483230. This is also included in Office 2010 Cumulative Update for February 2011 and subsequently in Office 2010 SP1.

If you wish to use DSOFile from a 64 bit program, then you should recompile the DSOFile to target for 64 bit.

An alternative approach to using Dsofile would be to use Open Xml SDK (or System.IO.Packaging). A sample that demonstrates this is given below :-..."

Can you believe I've been using DSOFile for 8+ years? I first blogged about it in 2004... I find it interesting that it's still around and kind of, sort of, mostly supported. One thing I didn't know was that it supported (kind of, sort of) Open XML doc formats (if Office 2007/2010 is installed or the given msoshext.dll is available, which kind of defeats the purpose of a "no Office needed" COM component to access Doc Properties, but still...).

In any case, if you need access to the COM Doc/Summary Properties from Office binary doc's and/or the Open XML doc's, this COM component has been pretty rock solid for me for many years...
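For the Open XML side only, the "no Office, no shell extension" route mentioned in the quote comes down to reading one XML part out of the package. Here's a rough sketch in Python (standard library only, not DSOFile, not the Open XML SDK, and not the sample from the linked post). The function name and the returned key names are mine. It assumes the core properties live at the conventional docProps/core.xml location rather than resolving them through the package relationships, and it does not touch the old binary formats at all.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part of an Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly key (my naming) -> element in the core properties part.
FIELDS = {
    "title": "dc:title",
    "subject": "dc:subject",
    "creator": "dc:creator",
    "keywords": "cp:keywords",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(path_or_file, part_name="docProps/core.xml"):
    """Return the core document properties that are present, as a dict of strings.

    part_name is the conventional location and is an assumption of this sketch.
    """
    with zipfile.ZipFile(path_or_file) as package:
        if part_name not in package.namelist():
            return {}
        root = ET.fromstring(package.read(part_name))
    properties = {}
    for key, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            properties[key] = element.text
    return properties
```

Custom properties sit in a separate part and would need their own handling.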

What is DSO file again? The Dsofile.dll files lets you edit Office document properties when you do not have Office installed

The Dsofile.dll sample file is an in-process ActiveX component for programmers that use Microsoft Visual Basic .NET or the Microsoft .NET Framework. You can use this in your custom applications to read and to edit the OLE document properties that are associated with Microsoft Office files, such as the following:

  • Microsoft Excel workbooks
  • Microsoft PowerPoint presentations
  • Microsoft Word documents
  • Microsoft Project projects
  • Microsoft Visio drawings
  • Other files that are saved in the OLE Structured Storage format

The Dsofile.dll sample file is written in Microsoft Visual C++. The Dsofile.dll sample file demonstrates how to use the OLE32 IPropertyStorage interface to access the extended properties of OLE structured storage files. The component converts the data to Automation friendly data types for easier use by high level programming languages such as Visual Basic 6.0, Visual Basic .NET, and C#. The Dsofile.dll sample file is given with full source code and includes sample clients written in Visual Basic 6.0 and Visual Basic .NET 2003 (7.1).

...

Information about OLE document properties
Every OLE compound document can store additional information about the document in persistent property sets. These are collectively called the "Document Summary Properties." These property sets are managed by "COM/OLE" so that third-party clients can read this information without the aid of the main application that is responsible for the file.
To help developers that are interested in reading document properties, we have provided the following two interfaces to manage property sets:
  • IPropertySetStorage
  • IPropertyStorage
However, some high-level programming languages may have trouble using these interfaces because the interfaces are not Automation-compatible. To resolve this problem, developers can use an ActiveX DLL, such as the "DsoFile sample", to read and to write the most common properties that are used in OLE compound documents. This applies particularly to those that are used by Microsoft Office applications.
Use the DsoFile component from your custom application
The Dsofile.dll sample file reads and writes to both the standard properties and the custom properties from any "OLE Structured Storage" file. This includes, but is not limited to, the following:
  • Word documents
  • Excel workbooks
  • PowerPoint presentations

Because of the size and the speed of the Dsofile.dll sample file, the DLL can be much more efficient than trying to Automate Office to read document properties

...

 

Related Past Post XRef:
Download details: Developer Support OLE File Property Sample (DSOFILE) (DSOFile.DLL 2.0)
DSOFile.dll 2.0

Monday, April 23, 2012

Using native OpenXML to create an XlsX (Which provides an example of why I highlight tools that make OpenXML easier...)

CodeProject - Creating basic Excel workbook with Open XML

The purpose of this article is to describe how to create an Excel workbook using solely DocumentFormat.OpenXml.dll (namespace is DocumentFormat.OpenXml).

In order to test the samples you have to download and install the Open XML SDK 2.0 from Download Center.

The demo is created for both C# and Visual Basic.

These standards define the structure and the elements for the Office files. The Office files (like xlsx for Excel) themselves are zipped files that contain a specific directory and file structure. The files that hold the content of a spreadsheet are xml files like any other xml files.

In case of Excel files a basic xlsx file contains for example following files:

  • /[Content_Types].xml: Defines parts and extensions for the spreadsheet
  • /xl/workbook.xml: For example, sheets that are included in the workbook
  • /xl/styles.xml: Styles used in the worksheets
  • /xl/sharedStrings.xml: Strings that are shared among cells
  • /xl/worksheets/sheet1.xml...: The actual worksheets

The actual package contains more files but in the scope of this article these are the most interesting ones. The demo projects included show few operations that are done to produce and modify these files.

About the project

The project itself is very simple. It consists of two classes: MainWindow class and a static Excel Class. The Excel class is responsible of all the operations done against the Excel spreadsheet. It's kinda utility class, but note that it's nowhere near ready. It's supposed to be used as a learning tool or a seed to an actual implementation.

When writing this demo I found out that Excel is very picky on the XML files. One surprise was that the order of the elements in XML files is very important. For example elements in style sheet such as fonts, fills, borders, cellStyleXfs, cellXfs etc must be in specific order. Otherwise the document is interpreted as corrupted.

Another observation was that the indexes of the elements are quite often used (for example the index of a shared string). However there is no support in the library to fetch the indexes so the collections have to be looped in order to calculate the index of a desired element.
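
The "loop to find the index" chore looks roughly like this in Python. This is a sketch against the raw sharedStrings.xml part, not the SDK's object model; `shared_string_index` is a hypothetical helper, and phonetic runs inside an item are not handled.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": MAIN_NS}


def shared_string_index(shared_strings_xml, text):
    """Return the zero-based index of `text` in the shared string table, or None."""
    root = ET.fromstring(shared_strings_xml)
    for index, item in enumerate(root.findall("s:si", NS)):
        # An <si> holds either one <t> or several rich-text runs (<r><t>..</t></r>).
        # Simplification: phonetic runs, if present, would be joined in too.
        value = "".join(t.text or "" for t in item.iter("{%s}t" % MAIN_NS))
        if value == text:
            return index
    return None


SAMPLE = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>Name</t></si>"
    "<si><t>Amount</t></si>"
    "<si><r><t>Rich </t></r><r><t>text</t></r></si>"
    "</sst>"
)

if __name__ == "__main__":
    print(shared_string_index(SAMPLE, "Amount"))
```

A cell that refers to "Amount" would then store that index as its value and mark its type as a shared string.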

So one of the best tools when building this was a utility to extract data from the xlsx (=zip) file to see what the actual content is.

image

Yes, that creates an XlsX. Stuff that will cause a minor brain explosion for anyone who has used the Excel Object Model. Parts, Packaging, SharedStringTables, oh my...

This is a great example of why I keep my eyes open for examples and wrappers that make the promise of Open XML a little more accessible to mere mortals.

Related Past Post XRef:
Generating Xlsx's on the Server? You're using OpenXML, right? With help from the PowerTools for OpenXML?

Wednesday, March 28, 2012

Generating Xlsx's on the Server? You're using OpenXML, right? With help from the PowerTools for OpenXML?

OpenXML Developer - Quick Generation of Spreadsheet Data and Cell Styles

"This example looks at a couple of OpenXML spreadsheet topics. I have been working with the cell styles a lot lately and this is a first example showing how to add some of the named styles to a spreadsheet cell. I plan to include even more style options in my next example and blog post. Also, after I posted my example for generating a pivot table, some very helpful people mentioned that it was quite slow with large amounts of data. This example also shows an alternative method of generating large amounts of data in a worksheet.

The example code can be found at PowerTools for OpenXML on Codeplex. Look for the 2.2.3 release of the PowerTools Core.

The screen-cast is divided into two parts. The first half introduces the example and shows how to use the methods in the PowerTools Core library. The second half shows the details of how the XML is generated. If you are only interested in using the code as is, then you can skip the second part.

image

PowerTools for Open XML - PowerTools for OpenXML 2.2 (Note you want 2.2.3...)

image

Here's a snap of all you need to create an XlsX. If you've ever used OpenXML, you'll know just how much time this can save you (and how much more sense this makes). OpenXML is great, but it's NOT the kind of Office API you're used to. This kind of library makes it that much easier to use...

image

No Excel, no Office, all .NET.

image

Related Past Post XRef:
Reusing the PowerShell PowerTools for Open XML in your C# or VB.Net world
PowerShell, OpenXML, WMI and the PowerTools for OpenXML = Doc generation for our inner geek
Because it’s a PowerShell kind of day… PowerTools for Open XML V1.1 Released
OpenXML PowerTools updated – Cell your Excel via PowerShell

Monday, January 02, 2012

Using OpenXML SDK to generate Word documents via templates (and without Word being installed)

Application design and programming in .NET - Utility to generate Word documents from templates using Visual Studio 2010 and Open Xml 2.0 SDK

This utility generates Word documents from templates using Content controls. The utility will be enhanced later as per feedback and source code is available for download at http://worddocgenerator.codeplex.com/. It has been created in Visual Studio 2010 and uses Open Xml 2.0 SDK which can be downloaded from http://www.microsoft.com/download/en/details.aspx?id=5124.

The purpose of creating this utility was to use the Open Xml 2.0 SDK to generate Word documents based on predefined templates using minimum code changes. These documents can either be refreshable or non-refreshable. I’ll explain this difference later. Also there is no dependency that Word should be installed.

A few samples for generating Word 2010 documents have been provided. More samples can be added later as per feedback. The screenshots below display the sample template and the document generated out of this template using this utility.

...

image..."

In short, generate Word documents on servers, in automated processes, etc, without Word being installed. [Insert lame "Friends don't let Friends use the Word Automation on Servers" statement here]
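
To give a feel for the content control idea, here's a small Python sketch that fills tagged content controls in a document.xml string. It is not the utility's code or its approach, just an illustration against the raw WordprocessingML. The `fill_content_controls` name, the `CustomerName` tag, and the sample markup are all made up.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
ET.register_namespace("w", W)  # keep the familiar "w:" prefix on output


def fill_content_controls(document_xml, values):
    """Replace the text of each content control whose tag is a key in `values`."""
    root = ET.fromstring(document_xml)
    for sdt in root.iter("{%s}sdt" % W):
        tag = sdt.find("w:sdtPr/w:tag", NS)
        if tag is None:
            continue
        key = tag.get("{%s}val" % W)
        if key not in values:
            continue
        texts = sdt.findall("w:sdtContent//w:t", NS)
        if not texts:
            continue
        # Put the new value in the first text node and blank out the rest.
        texts[0].text = values[key]
        for extra in texts[1:]:
            extra.text = ""
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:sdt>"
    '<w:sdtPr><w:tag w:val="CustomerName"/></w:sdtPr>'
    "<w:sdtContent><w:p><w:r><w:t>[name]</w:t></w:r></w:p></w:sdtContent>"
    "</w:sdt></w:body></w:document>"
)

if __name__ == "__main__":
    print(fill_content_controls(SAMPLE, {"CustomerName": "Contoso"}))
```

In a real template you would read word/document.xml out of the DocX package, run something like this, and write the part back into a copy of the package.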

Thursday, October 13, 2011

The Noob's Guide to Open XML Dev (If you know how to spell OpenXML but that's about it, this is your Getting Started guide...)

OpenXML Developer - Getting Started with Open XML Development

"This blog post introduces the first in a series of screen-casts that are specifically for a developer starting development with Open XML for the first time. It is a project that I've been meaning to work on for some time, and I recently received the mandate that this should get done, so this is the start of it. In this video, I discuss the Open XML standard from a high level, discuss the resources that helped me get started, and point you to places to find additional resources. I've already recorded the second video, in which I discuss the various tools that you will want to be familiar with in order to do Open XML development. In the third video, I'll discuss the various typical development scenarios for Open XML. In the fourth video, I'll discuss platforms, languages, and libraries, and in the fifth, I'm going to discuss my current thoughts on development approaches. (At least, this is my current plan. We'll see how it proceeds.)

If you are an experienced Open XML developer, this first video in the series is probably not for you. This first video is targeted towards developers who know Open XML is a document format based on XML, and maybe not much more. Experienced developers may get something from subsequent videos, though.

..."

There are some things I love about the OpenXML SDK/format and some I hate (mostly how different the SDK APIs are from the Office APIs), but the like easily overrides the dislike. Having an open format that's fully documented and easy(er) to spelunk is a night and day difference over trying to work directly with the Office binary formats.

So if you're a dev interested in seeing what this OpenXML SDK thing is, or you need to programmatically read/write OpenXML files (DocX, XlsX, etc.), this video and series might be just the thing you need...

Wednesday, September 14, 2011

ODF/OpenXML, an update on the story...

Eric White's Blog - New Paper published by Peter O’Kelly – Revisiting Open Document Format and Office Open XML: The Quiet Revolution Continues

"Three years ago Peter O’Kelly wrote a paper titled, “What’s Up, .DOC? Open XML Formats, OpenDocument Format, and the Revolutionary Implications of XML in Productivity Applications.” That paper was a part of an industry-wide debate about Open XML and ODF. He has recently published a new paper that analyzes the current state of document formats.

This new paper, Revisiting Open Document Format and Office Open XML: The Quiet Revolution Continues [GD: click through for the actual link], discusses:

  • The business value of standardized, XML based document formats
  • A brief history of Open XML and ODF
  • The 2008 Open XML ISO controversy, and the response to Peter’s “What’s Up, .DOC?” Paper
  • An assessment of current Open XML and ODF market dynamics
  • Current standards activity
  • Projections into the future

..."

image

Here's a snip from the document;

"Synopsis
It has been several years since the lively and highly polarized market debate about the relative merits and standards significance of the Open Document Format (ODF) and Office Open XML (OOXML) file format standards. Although ODF and OOXML have since largely faded from the mainstream technology industry press and blogosphere radar, both standards have continued to evolve and gain market support, with significant benefits for all organizations seeking to optimize their use of information contained in documents created with productivity applications.

This document provides an overview of the status and significance of ODF and OOXML. It starts with a summary of the business value of open and XML-based document formats, along with a review of the ODF/OOXML historical debate, including a recap of a widely-discussed January 2008 Burton Group report which included what were, at that time, considered provocative conclusions and market projections.

The document continues with a summary of some of the most impactful ODF- and OOXML-related industry changes during recent years, including Microsoft’s (surprising, to many market observers) commitment to support and contribute to both ODF and OOXML, as well as Oracle’s acquisition of Sun Microsystems, and the acquisition’s ramifications for OpenOffice.org (which served as the starting point for ODF, in 2000).

The analysis concludes with some market projections about likely next steps, as both ODF and OOXML continue to evolve.

..."

The story of the battle has been pretty quiet recently, with OpenXML seemingly slowly making its way deeper into, and natively used by, the business world (think DocX, XlsX, etc.)

Monday, August 29, 2011

1 page, 101 Office 2010 Code Samples

Office Developer Center - Office 101 Code Samples

"Microsoft Office 2010 gives you the tools needed to create powerful applications. These Visual Basic for Applications (VBA) code samples can assist you in creating your own applications that perform specific functions or as a starting point to create more complex solutions.

Each code sample consists of approximately 5 to 50 lines of code demonstrating a distinct feature or feature set in VBA. Each sample includes comments describing the sample, and setup code so that you can run the code with expected results or the comments will explain how to set up the environment so that the sample code runs.

...

image..."

That's an official boat load of Office 2010 code samples.... :)

Thursday, July 28, 2011

Open XML Opens Office Document Metadata (without Office)

Microsoft has been releasing a number of Open XML SDK v2 samples on the MSDN Code Gallery in the past few days, samples that interested me professionally, so I wanted to round them up for easy reference.

The key point is that these are examples of doing things that, for the old binary formats, required using Office COM, whereas now we can do it all without installing Office. Just another reason to love the more modern approach taken with the Open XML file format.

Here's an example of one of the above links;

Retrieving Comments from Word 2010 Documents by Using the Open XML SDK 2.0

"The sample provided with this article includes the code necessary to retrieve the XML block that contains all the comments from a Word 2007 (or later) document. The following sections walk you through the code, in explicit detail. When you use the sample code to retrieve the comments, the procedure returns an XML element, named w:comments, which contains the XML block of information from the original document. It's up to you (and your application) to interpret the results of retrieving the comments.

...
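
The same idea, pulling out the w:comments block without Word, can be sketched in Python with nothing but the standard library. This is not the article's SDK code. It assumes the comments live in word/comments.xml, which is where Word conventionally puts them (strictly, you should follow the package relationships). The `get_comments_element` and `build_stand_in_docx` names are made up, and the stand-in package contains only a comments part.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
COMMENTS_PART = "word/comments.xml"  # assumed conventional location


def get_comments_element(docx_file):
    """Return the w:comments element of a DocX package, or None if it has none."""
    with zipfile.ZipFile(docx_file) as package:
        if COMMENTS_PART not in package.namelist():
            return None
        return ET.fromstring(package.read(COMMENTS_PART))


def build_stand_in_docx():
    """Build an in-memory zip holding only a comments part (not a valid DocX)."""
    comments_xml = (
        '<w:comments xmlns:w="%s">'
        '<w:comment w:id="0" w:author="Reviewer">'
        "<w:p><w:r><w:t>Check this figure.</w:t></w:r></w:p>"
        "</w:comment></w:comments>" % W
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(COMMENTS_PART, comments_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    comments = get_comments_element(build_stand_in_docx())
    for comment in comments.findall("{%s}comment" % W):
        text = "".join(t.text or "" for t in comment.iter("{%s}t" % W))
        print(comment.get("{%s}author" % W), "-", text)
```

As with the original sample, what you do with the returned element is up to you and your application.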

image

Related Past Post XRef:
Checking for Microsoft Word DocX/DocM Revisions/Track Changes without using Word... (via OpenXML SDK, LINQ to XML or XML DOM)
LINQ to XlsX... Using VB.Net, LINQ, the OpenXML SDK and a little C# helper, to query an Excel XlsX

Official boat-load, as in supertanker, sized OpenXML content list (Insert "One OpenXML content list to rule them all" here)
So how do I get from here to OpenXML? Got a map for you, an Open XML SDK Blog Map…
Where to go to scratch your OpenXML dev info itch…
"Open XML Explained" Free eBook (PDF)

Open XML SDK 2.0 for Microsoft Office Released – Automate Office documents without Office
Opening OpenXML, the Open XML Package Editor Power Tool for Visual Studio 2010
Open XML 2.0 Code Snippets for VS2010 (and VS2008 too)
Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008 – 52 C#/VB Code Snippets to help ease your Open XML coding

OpenXML Viewer 1.0 Released – Open source DocX to HTML conversion, with IE, Firefox and Opera (and/or command line) support

Powering into OpenXML with PowerShell

Microsoft Office File Formats and Microsoft Office Protocols Documentation Refreshed
Microsoft Office File Formats and Protocols documentation updated for Office 2010 (Think “Now with added ‘X’ flavor… DocX, PptX, XlsX, etc”)

Monday, July 18, 2011

LINQ to XlsX... Using VB.Net, LINQ, the OpenXML SDK and a little C# helper, to query an Excel XlsX

OpenXML Developer - Query Open XML Spreadsheets in VB.NET using LINQ

"When working with SpreadsheetML, one of the most common needs is to retrieve the data from a worksheet or a table in as easy a fashion as possible. There has been a fair amount written for C# developers to do this, but not nearly as much for VB.NET. Some time ago, I wrote a blog post, Using LINQ to Query Excel Tables, which introduced a few C# classes and extension methods that make it easy to query SpreadsheetML. This post presents a super-easy way to use that code from VB.NET.

To make it as easy as possible to get going using LINQ with VB to access SpreadsheetML, I've recorded the following screen-cast that walks through the process of building a VB.NET application that uses the code from that blog post. Here is the video:

...

image

...

image

..."
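
For a rough feel of "query a worksheet like a collection", here's a Python sketch that reads a sheet's rows into dictionaries keyed by the header row and then filters them. It is not Eric White's C#/VB.NET code. It works on the raw sheet and shared string XML, assumes every cell carries an `r` reference (Excel writes one, but the format does not require it), and `read_rows` is a hypothetical helper.

```python
import re
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": MAIN_NS}


def read_rows(sheet_xml, shared_strings_xml):
    """Return the data rows as dicts keyed by the values in the first row."""
    strings = [
        "".join(t.text or "" for t in si.iter("{%s}t" % MAIN_NS))
        for si in ET.fromstring(shared_strings_xml).findall("s:si", NS)
    ]
    rows = []
    for row in ET.fromstring(sheet_xml).findall("s:sheetData/s:row", NS):
        cells = {}
        for cell in row.findall("s:c", NS):
            column = re.match(r"[A-Z]+", cell.get("r")).group()  # "B2" -> "B"
            v = cell.find("s:v", NS)
            raw = v.text if v is not None else ""
            # t="s" means the value is an index into the shared string table.
            cells[column] = strings[int(raw)] if cell.get("t") == "s" else raw
        rows.append(cells)
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [{header[c]: val for c, val in r.items() if c in header} for r in data]


STRINGS = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>Item</t></si><si><t>Qty</t></si>"
    "<si><t>Apples</t></si><si><t>Pears</t></si></sst>"
)
SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"><sheetData>'
    '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>2</v></c><c r="B2"><v>3</v></c></row>'
    '<row r="3"><c r="A3" t="s"><v>3</v></c><c r="B3"><v>12</v></c></row>'
    "</sheetData></worksheet>"
)

if __name__ == "__main__":
    records = read_rows(SHEET, STRINGS)
    print([r["Item"] for r in records if float(r["Qty"]) > 5])
```

Numbers come back as strings here, so the query converts them; a fuller version would also deal with dates, inline strings, and empty cells.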

Nice example code and video that I've not seen elsewhere...

Wednesday, June 15, 2011

Official boat-load, as in supertanker, sized OpenXML content list (Insert "One OpenXML content list to rule them all" here)

OpenXMLDeveloper.org - New list of Open XML content now available

"There is a whole lot of content (articles, screen-casts, and blog posts) on Open XML. Over the last couple of months, I have been making a list of as much Open XML content as I can find. I have found and categorized some 380 pieces of content, and organized them by keyword and author. Keywords (and author names) in the content list are hyperlinks, so it is easy to navigate around and, for instance, look at the list of Open XML content that pertains to SharePoint, or all of the content associated with content controls. I have made this list so it is very easy to maintain, so if you are an author of some good Open XML content, please drop me an email (eric at ericwhite.com) and I'll be happy to include it in the list. And of course, as I discover (or write) new content, I'll be adding it to the list. If you need to find an article or screen-cast that explains how to accomplish some particular task, this should be the first place you start. Click on the following link to see the content list. This link will remain the same as I update the list, so feel free to bookmark it.

Open XML Content List

The way that I created this list (and keep it updated) is interesting in and of itself. I'm maintaining this list as a table in an Open XML spreadsheet. I used the code that I presented in the post on Using LINQ to Query Excel Tables. I didn't have to modify that code at all to make it work with my table in my worksheet. Then, I wrote some LINQ code to transform the results of the query into HTML tables, with all appropriate links in the right place, and so on. To update the pages in the wiki on OpenXMLDeveloper.org, I need to simply paste two sets of generated HTML code into two pages in the wiki.

I'll be writing an article on the approach I took (published here on OpenXMLDeveloper.org of course).

..."

OpenXMLDeveloper.org - Open XML Content List

image

We're talking one massive list... a list 126 printed pages long. Meaning pretty much if you're looking for OpenXML information, start here.