In my last post I wrote about how to extract InfoPath processing instructions from a message inside an orchestration. While that seems to work all right, this kind of plumbing has nothing to do with business logic, so an orchestration is not the right home for it. A better place is a pipeline. So I wrote a simple pipeline that is basically identical to the default XMLReceive with one exception - in the decode stage it has a component that extracts processing instructions from an XML document and writes them into the message context.
Solution files can be downloaded here.
A few notes about the component, which I called PIExtractor:
1. It is designed for the decode stage. I think that is quite an appropriate place for it :-).
2. It extracts only the processing instructions that are immediate children of the document root (/). Processing instructions nested inside the document element are not extracted.
3. It places the processing instructions into the XMLNORM.ProcessingInstruction property, which is a standard property, so you don't need to deploy a custom property schema. XMLNORM.ProcessingInstruction is mostly used for outgoing messages, but it seems all right to make use of it for incoming messages as well.
4. It does not promote XMLNORM.ProcessingInstruction but writes it, which means that you cannot use this property in a filter expression for subscribers. The property is not promoted mainly because processing instructions may be longer than 256 characters: promoted properties are limited to 256 characters in length, while written properties have no length limitation. Anyway, I can hardly imagine why anyone would filter on processing instructions :-).
5. To handle large XML documents relatively efficiently, PIExtractor uses XPathDocument and XPathNavigator. XPathDocument has a bug (feature?) - its constructor closes the passed stream. To work around this, I wrote a simple wrapper class around Stream with an empty implementation of the Close method.
6. Working with streams in pipelines is somewhat complicated, so the output message is cloned from the input message and then its context is altered. No streams are copied - I didn't want to degrade performance with redundant memory allocation.
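To make notes 2 and 5 more concrete, here is a rough sketch of the same two ideas in Python. It is not the component's C# code, only an illustration under a few assumptions: the names NonClosingStream, parse_and_close and top_level_pis are made up for this example, parse_and_close is a stand-in for a parser that closes its input the way the XPathDocument constructor does, the sample document and its href value are invented, and the stream is assumed to be seekable so it can be rewound for the next reader. Unlike the real component, the sketch builds a full DOM, so it says nothing about performance on large documents.

```python
import io
from xml.dom import Node, minidom


class NonClosingStream:
    """Hypothetical wrapper: forwards reads, but close() does nothing."""

    def __init__(self, inner):
        self._inner = inner

    def read(self, size=-1):
        return self._inner.read(size)

    def close(self):
        # Deliberately empty, so a consumer that closes its input
        # cannot close the wrapped stream.
        pass


def parse_and_close(stream):
    """Stand-in for a parser that closes the stream it was given."""
    try:
        return minidom.parse(stream)
    finally:
        stream.close()


def top_level_pis(stream):
    """Return processing instructions that are direct children of the document."""
    start = stream.tell()
    doc = parse_and_close(NonClosingStream(stream))
    stream.seek(start)  # assumption: the stream is seekable
    return [
        node.toxml()
        for node in doc.childNodes  # document-level nodes only, not descendants
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]


# Invented sample: two document-level instructions and one nested instruction.
SAMPLE = (
    b'<?mso-infoPathSolution solutionVersion="1.0.0.1" href="form.xsn"?>'
    b'<?mso-application progid="InfoPath.Document"?>'
    b'<root><?nested ignored?><child/></root>'
)

source = io.BytesIO(SAMPLE)
pis = top_level_pis(source)
print("\n".join(pis))
```

After the call, source is still open and positioned where it started, so a later stage could read the whole document again, and the nested instruction inside root is not in the result.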
The component and the pipeline were tested in a BizTalk 2004 SP1 environment.