In part one of this article series, we started the journey of transforming CSV files to XML with PowerShell. Let’s start dipping our toes into the XML waters. One of the reasons I’m moving from CSV to XML is to capture type information for different properties.
For demonstration purposes, I’m going to import a CSV file and type the incoming object properties using the functions I shared in my previous article.
The variable $data is now a collection of typed objects, imported from the CSV file. To get this into an XML file that I can use outside of PowerShell, I initially thought I'd have to show you how to create an XML document from scratch. Instead, we can take advantage of a cmdlet called ConvertTo-XML. This cmdlet will create an XML document in memory.
To create the file, I need to invoke the Save() method, and I recommend using a complete file name and path.
Here's a sample of the end result. You'll notice that there is a type attribute for each property node. That will come in handy when we attempt to reimport this XML into PowerShell. You can also see that the nodes are generic, i.e., Object. Although you may happy with that, I'm never satisfied. Instead, I want to rename nodes so that, in this file, I have <Processes> and <Process> since that is what they are. Renaming an XML node is a bit tricky because there's no simple method to rename a node. Instead, you'll need to follow a process. First, I'm going to re-create my XML document from $data.
I changed variable names to remind me that I'm working with in-memory XML. I have not saved it to a file, because I need to copy existing nodes with a new name. I'll start with all of the actual objects.
I'm using an XPath query to select everything that is <Object>…</Object>. Be careful, as XML is case-sensitive. Here's the tricky part. For each node in the collection of child nodes, I need to create an empty node with my new name, i.e., Process. Next, I'll copy the existing node's InnerXML property to the new node, which has the effect of copying it. I also need to copy the type attribute for each node. I can then insert the new node into the document and delete the original node.
Looking at the XML document, I now have this: I have to repeat the process for the top-level, Objects, node.
Finally all I need to do is save the XML document.
Now the XML is improved. Considering I started with a CSV file this is pretty good. In the screenshot, you can see that some properties have no type, and that's because there is no value for those properties. If you wanted the XML file to be more complete, you could manually edit the file, or you could try something else. Suppose you have some XML data that's converted from PowerShell.
By default, everything is a string. Let's fix that using a type map, like I used with my CSV files. If you don't have a CSV file to start from, then you'll manually need to create the hashtable from the properties. Next, I wrote a function to go through each child node and apply a type based on the mapping table.
If you rename nodes from the defaults Objects, you'll need to specify the new node type. My sample hasn't been renamed, so all I need to do is this:
And save the results to a file.
All nice and orderly. Next time we'll look at importing XML files like this and writing objects to the pipeline, and we'll try to come with a final set of commands for transforming CSV files to XML.