I have some legacy XML documents stored in a database as blobs, and they are not well-formed XML. I'm reading them from a SQL database and, as I'm using C#/.NET, would ultimately like to load each one into an XmlDocument.
When I try to do this, I get an XmlException. Having looked at the XML documents, they all fail because specific XML nodes use namespace prefixes that are never declared.
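For reference, a minimal sketch that reproduces the failure on one of the fragments below. The class and method names (UndeclaredPrefixRepro, LoadError) are made up for illustration and are not part of my real code:

```csharp
using System;
using System.Xml;

public static class UndeclaredPrefixRepro
{
    // Hypothetical helper: returns the parser's error message,
    // or an empty string if the XML loads cleanly.
    public static string LoadError(string xml)
    {
        try
        {
            var doc = new XmlDocument();
            doc.LoadXml(xml);
            return string.Empty;
        }
        catch (XmlException ex)
        {
            return ex.Message;
        }
    }

    public static void Main()
    {
        // The "tem" prefix is used but never bound with xmlns:tem="...",
        // so the document is not namespace-well-formed and LoadXml throws.
        Console.WriteLine(LoadError("<tem:GetRouteID></tem:GetRouteID>"));
    }
}
```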
I am not concerned with any of the XML nodes which have these prefixes, so I can ignore the prefixes or throw them away. So basically, before I load the string into an XmlDocument, I would like to remove the prefix from the string, so that
<tem:GetRouteID>
<tem:PostCode>postcode</tem:PostCode>
<tem:Type>ItemType</tem:Type>
</tem:GetRouteID>
becomes
<GetRouteID>
<PostCode>postcode</PostCode>
<Type>ItemType</Type>
</GetRouteID>
and this
<wsse:Security soapenv:actor="">
<wsse:BinarySecurityToken>token</wsse:BinarySecurityToken>
</wsse:Security>
becomes this:
<Security soapenv:actor="">
<BinarySecurityToken>token</BinarySecurityToken>
</Security>
I have one solution, which does this like so:
<appSettings>
<add key="STRIP_NAMESPACES" value="wsse;tem" />
</appSettings>
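// STRIP_NAMESPACES holds the appSettings value above, e.g. "wsse;tem"
if (STRIP_NAMESPACES != null)
{
    // A plain string split is enough here; no regex is needed for a single-character separator
    string[] namespaces = STRIP_NAMESPACES.Split(';');
    foreach (string ns in namespaces)
    {
        str2 = str2.Replace("<" + ns + ":", "<");   // Replace opening tag
        str2 = str2.Replace("</" + ns + ":", "</"); // Replace closing tag
    }
}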
Ideally, though, I would like a generic approach, so that I don't have to keep configuring the prefixes I want to remove.
How can I achieve this in C#/.NET? I am assuming that a regex is the way to go here?
UPDATE 1
Ria's regex below works well for the requirement above. However, how would I need to change the regex so that it also converts this
<wsse:Security soapenv:actor="">
<BinarySecurityToken>authtoken</BinarySecurityToken>
</Security>
to this?
<Security>
<BinarySecurityToken>authtoken</BinarySecurityToken>
</Security>
UPDATE 2
I think I've worked out the updated version myself, based on Ria's answer:
<(/?)\w+:(\w+/?) ?(\w+:\w+.*)?>
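A minimal sketch of how I'm applying that pattern. The class name PrefixStripper and the replacement string "<$1$2>" are my own choices for illustration, not something taken from Ria's answer. One caveat I'm aware of: the greedy .* in the third group runs to the last > on the same line, so this assumes one tag per line whenever a prefixed attribute is present, as in my samples.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

public static class PrefixStripper
{
    // Pattern from UPDATE 2:
    //   group 1 = optional "/" of a closing tag
    //   group 2 = local element name (plus optional "/" of a self-closing tag)
    //   group 3 = optional prefixed attribute and the rest of the tag (dropped)
    private static readonly Regex PrefixedTag =
        new Regex(@"<(/?)\w+:(\w+/?) ?(\w+:\w+.*)?>");

    public static string Strip(string xml)
    {
        // Rebuild each matched tag from groups 1 and 2 only.
        return PrefixedTag.Replace(xml, "<$1$2>");
    }

    public static void Main()
    {
        string raw =
            "<wsse:Security soapenv:actor=\"\">\n" +
            "<wsse:BinarySecurityToken>token</wsse:BinarySecurityToken>\n" +
            "</wsse:Security>";

        string cleaned = Strip(raw);

        // With the prefixes gone, the string loads without an XmlException.
        var doc = new XmlDocument();
        doc.LoadXml(cleaned);
        Console.WriteLine(doc.DocumentElement.Name);
    }
}
```

Elements that have a prefixed attribute but no prefix on the element name itself are left untouched by this pattern, since it requires a prefix directly after the opening bracket.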