Pages Menu
Categories Menu

Posted by on 20th November, 2008

SharePoint and InfoPath Data Extraction…

SharePoint and InfoPath Data Extraction…

I have worked over the last few years much with business process automation and especially with SharePoint. Here is a somewhat straightforward solution/approach to automate a much used, but manual process in SharePoint.

Let’s say you have a document library that users upload documents to. Unless you subscribe to the library and want to be notified, not much really happens when the document lands in the library.

So, what if you could have SharePoint take action immediately and initiate, let’s say a Workflow, or simply store the values somewhere else, or email a user with the details…or anything else that you want?

And not just with the fact that a document was uploaded (that is already built-in), but, even better, with data from the actual document…

Let me walk you through the scenario: we start out with one of the sample InfoPath forms: Asset Tracking and we publish a Form Library based off this InfoPath form to SharePoint. We end up with an empty Form Library as shown below.

User clicks on New to add a new document and InfoPath launches the corresponding form…

The user then fills out the form and chooses to close and save the form…

The form is saved to the library and SharePoint alerts our Feature of the ItemAdded event being triggered.

In our Feature we override that method and our custom code runs, goes into the InfoPath document, finds the XML elements it is expected to find and extracts its contents.

Then, in this case and to show the concept, we generate a text file that writes out that content we just collected…

If we open up that text file we see that our Feature collected the data of four fields on the form…

as we can see, this is exactly the data we filled our form with…

Cool, huh? Now, of course there is no reason to write any data to a text file, but imagine now being able to do anything with that data…tying an event in SharePoint to any external application and completely transparent to the user.

Let’s how it is done…

As you know, a SharePoint Feature contains two xml files: feature.xml and elements.xml.

The only change we make is in the elements.xml file where the ListTemplateID=115, instead of 101. The entire list of codes is a long one, but 115 is for InfoPath forms and so our elements.xml file will look like this:

This is needed for SharePoint to run the code in our Feature only when an event is fired in a XML Form Library, which ours is…

Since we are going to tell our code to dive into the InfoPath form, we need to now what to look for. So let’s open up an empty form and see what it looks like:

As you can see, it contains XML elements for every one of the form fields…

Below, I am pointing out the elements we want to extract the data from. In this case, four fields, but it could be as many as you want…

Ok, let’s look at our class: It is pretty much the same class used in the Developing a SharePoint 2007 Feature…a step by step tutorial., with some more meat added.

In general terms, these are the steps taken in the class:
[list style=”arrow”]

  • Get a reference to the site and item that fired the event
  • Create XmlTextReader to hold the raw XML (using MemoryStream)
  • Use a XmlResolver for authenitacion purposes
  • Load the raw XML into XmlDocument
  • Navigate the nodes to obtain proper elements
  • Write content of elements into text file

That’s it!


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using Microsoft.SharePoint;
namespace SharePointEventHandler
 public class ItemEventReceiver : SPItemEventReceiver
  private StreamWriter writer;
  public override void ItemAdded(SPItemEventProperties properties)
    //gets a reference to the SharePoint Site and the Item (InfoPath form)
    //that fired the event: we get this from the properties parameter...
    SPWeb thisWeb = new SPSite(properties.WebUrl.ToString()).OpenWeb();
    SPFile thisItem = thisWeb.GetFile(properties.AfterUrl.ToString());

    //We use a memory stream to strip out/ignore any linebreaks (nbsp) 
    //characters that InfoPath leaves in the served XML on the server.
    MemoryStream inStream = new MemoryStream(thisItem.OpenBinary());
    XmlTextReader reader = new XmlTextReader(inStream);
    XmlDocument myXml = new XmlDocument();

    //needed to authenticate the .load method when accessing secured 
    //document on SharePoint site. Will not work without...
    XmlUrlResolver resolver = new XmlUrlResolver();
    resolver.Credentials = System.Net.CredentialCache.DefaultCredentials;
    reader.XmlResolver = resolver;
    myXml.XmlResolver = resolver;

    //needed to handle the namespace in the InfoPath XML (my:)
    XmlNamespaceManager nsmgr = new XmlNamespaceManager(myXml.NameTable);
    nsmgr.AddNamespace("my", "[...]12:27");

    //load the raw XML into the XML object...

    //extracts the main XmlNodes
    XmlNode node_Employee=
     myXml.DocumentElement.SelectSingleNode("my:employee", nsmgr);
    XmlNode node_Assets=
    myXml.DocumentElement.SelectSingleNode("my:assets", nsmgr);

    //Now write out all the data to our text file or do anything else you want with it...
    writer.WriteLine("Event Details");
    writer.WriteLine("CurrentUserId: " + properties.CurrentUserId.ToString());
    writer.WriteLine("BeforeUrl: " + properties.BeforeUrl.ToString());
    writer.WriteLine("SiteId: " + properties.SiteId.ToString());
    writer.WriteLine("ListId: " + properties.ListId.ToString());
    writer.WriteLine("Form Data Content");
     "Employee Name:"+node_Employee["my:employeeName"].InnerText);
     "Employee Department:"+ node_Employee["my:employeeDepartment"].InnerText);
     "Asset ID: "+node_Assets["my:asset"]["my:assetID"].InnerText);
     "Asset Description: "+node_Assets["my:asset"]["my:assetDescription"].InnerText);
   catch (Exception ex)
    properties.Cancel = true;
    properties.ErrorMessage = ex.Message;
    //write out our exception and innerException