Last Update: Sep 04, 2024 | Published: Dec 17, 2014
I see a lot of posts, questions and scripts that involve importing a CSV file into PowerShell. On one hand, working with a CSV file couldn’t be easier. On the other hand, PowerShell will happily turn any imported CSV file into a collection of objects, where working with those objects can be a little confusing if you don’t completely understand the process. I think part of the problem PowerShell beginners face is that they are still thinking about text, especially when importing a CSV file. Let me show you what I’m talking about.
Here’s a sample CSV file that includes test data I want to use in PowerShell.
"Date","Name","Service","Key","Size","Test" "11/18/2014 6:45:30 AM","Data_1","ALG","","25","false" "11/18/2014 2:17:30 AM","Data_2","AppIDSvc","2","-30","false" "11/17/2014 11:57:30 PM","Data_3","Appinfo","3","42","true" "11/17/2014 11:21:30 AM","Data_4","AppMgmt","4","-7","true" "11/17/2014 11:33:30 AM","Data_5","aspnet_state","","28","true" "11/16/2014 8:51:30 AM","Data_6","AudioEndpointBuilder","6","35","false" "11/16/2014 6:00:30 PM","Data_7","AudioSrv","7","12","false" "11/16/2014 3:41:30 AM","Data_8","AxInstSV","8","","true" "11/16/2014 11:39:30 AM","Data_9","BDESVC","9","32","true" "11/16/2014 11:03:30 PM","Data_10","BFE","10","15","false" "11/18/2014 4:03:30 AM","Data_11","BITS","11","28","true" "11/15/2014 7:57:30 AM","Data_12","Browser","12","31","true" "11/14/2014 12:23:30 PM","Data_13","bthserv","13","16","true" "11/17/2014 10:49:30 PM","Data_14","CertPropSvc","14","-39","false" "11/14/2014 5:33:30 AM","Data_15","cfWiMAXService","15","20","true" "11/13/2014 12:13:30 AM","Data_16","clr_optimization_v2.0.50727_32","16","42","true" "11/15/2014 10:54:30 PM","Data_17","clr_optimization_v2.0.50727_64","17","28","true" "11/16/2014 11:57:30 PM","Data_18","clr_optimization_v4.0.30319_32","18","-45","true" "11/14/2014 2:38:30 AM","Data_19","clr_optimization_v4.0.30319_64","19","24","true" "11/11/2014 11:13:30 AM","Data_20","cmdAgent","20","44","true" "11/16/2014 1:33:30 AM","Data_21","COMSysApp","21","24","true" "11/17/2014 8:37:30 AM","Data_22","ConfigFree Gadget Service","22","41","true" "11/11/2014 6:28:30 PM","Data_23","ConfigFree Service","23","33","true" "11/13/2014 4:45:30 PM","Data_24","CryptSvc","24","47","true" "11/9/2014 6:28:30 PM","Data_25","CscService","25","26","true"
I’ll import once and save the results to a variable so I don’t have to keep re-importing the data.
$data = import-csv c:scriptstestdata.csv
As you can see, each value in the CSV header became a property name.
I recommend you make sure the column entries in the header line do not have any spaces. It isn’t illegal, but it makes working with the resulting properties a bit more difficult as you have to quote the property name.
Next, let’s say I want to get results where the size is less than 0. Looking at the output, you might think this will work:
$data | where {$_.size -lt 0}
Given the first screenshot you would expect to see at least Data_2 with a size value of -30. But this isn’t the case.
The only result is an object with nothing defined for the size property. When you face this situation, this is where Get-Member is useful. Exactly what type of objects are in $data?
Notice that all of the properties are treated as strings. This means that any operation that relies on something other than a string will fail to run properly.
The date sort is wrong, and I shouldn’t be seeing anything with a test value of false. One way around this issue is to tell PowerShell to treat properties as a certain type. Let’s go back to my first filtering expression.
$data | where {$_.size -as [int] -lt 0} | format-table
I’m using the –as operator and telling PowerShell to treat the size property as an integer. Now I get the result I expect.
You have to do this sort of type conversion whenever you are doing something with the property. However, using –as doesn’t always work. If you try this with my test data, then you’ll see that it won’t work.
$data | where {$_.Test -as [boolean]} | format-table
An alternative is to use one of the built-in methods to convert the value.
You’ll see that all of these methods require a parameter value, which helps PowerShell determine how to format the value. For our purpose, this System.IFormatProvider method comes down to two choices: a number or data.
[System.Globalization.NumberFormatInfo]::CurrentInfo
[System.Globalization.DateTimeFormatInfo]::CurrentInfo
The easiest way is to reference them directly with the .NET Framework. We already know that the size property will be treated as a string. But with a little help, we can tell PowerShell to treat it as an integer.
Now I can run a command like this:
$data | where {$_.Test.Toboolean([System.Globalization.NumberFormatInfo]::CurrentInfo)} |
sort {$_.Date.todateTime([System.Globalization.DateTimeFormatInfo]::CurrentInfo)} -descending |
format-table
I will admit this is a bit advanced, but it works.
It is important for you to remember that you haven’t changed the underlying objects in $data, only how they are being processed in your PowerShell expression.
As you can see, I still have string properties.
One way to avoid all of this is to give your newly imported objects the proper type from the very beginning. Allow me to share an advanced function of mine called Import-TypedCSV.
#requires -version 3.0 Function Import-TypedCSV { <# .Synopsis Import a CSV file with typed property values. .Description PowerShell will happily import any CSV file and turn it into an object. Unfortunately, PowerShell will treat everything as a string. This command will attempt to dynamically assign a type to each imported property. If a property is missing a value you will be prompted to provide one. .Example PS C:> Import-TypedCSV c:workdata.csv | sort starttime -Descending | Select Name,StartTime,VM | format-table Name StartTime VM ---- --------- -- thunderbird 12/4/2014 4:22:50 PM 640290816 powershell_ise 12/4/2014 1:42:36 PM 1101860864 SnagitEditor 12/4/2014 8:05:33 AM 522137600 waterfox 12/4/2014 8:05:32 AM -1445900288 Dropbox 12/4/2014 8:05:26 AM 356761600 Snagit32 12/4/2014 8:05:23 AM 466067456 SugarSync 12/4/2014 8:05:18 AM 321622016 googledrivesync 12/4/2014 8:05:17 AM 213491712 BoxSync 12/4/2014 8:05:11 AM 880230400 explorer 12/4/2014 8:04:53 AM 1332060160 Import-CSV would have treated StartTime as a string which would have resulted in an inaccurate sort. .Example PS C:> Import-TypedCSV c:workdata.csv | where {$_.vm -ge 800mb} Name : powershell_ise Path : C:WINDOWSSystem32WindowsPowerShellv1.0powershell_ise.exe StartTime : 12/4/2014 1:42:36 PM VM : 1101860864 WS : 212324352 Name : BoxSync Path : C:Program FilesBoxBox SyncBoxSync.exe StartTime : 12/4/2014 8:05:11 AM VM : 880230400 WS : 134922240 Name : explorer Path : C:windowsExplorer.EXE StartTime : 12/4/2014 8:04:53 AM VM : 1332060160 WS : 118648832 .Example PS C:> dir c:workreport5.csv | import-typedcsv | sort lastmodified -Descending | format-table Modified Name Path Size Type -------- ---- ---- ---- ---- 10/31/2014 8:43:03 AM databits.txt C:workdatabits.txt 2282783 .txt 8/26/2014 8:55:56 AM nwork.xml C:worknwork.xml 3053890 .xml 8/8/2014 4:30:36 PM modules.xml C:workmodules.xml 9161022 .xml 6/5/2014 5:37:14 PM LWAPlugin64BitInstaller32.msi C:workfooLWAPlugin64BitInstaller32.msi 7622656 .msi .Notes Last Updated: December 4, 2014 Version : 2.0 Learn more about PowerShell:Essential PowerShell Learning Resources.Inputs [string] .Outputs [PSCustomobject] .Link Import-CSV #> Param ( [Parameter(Position=0,Mandatory=$True,HelpMessage="Enter the CSV filename and path", ValueFromPipeline,ValueFromPipelineByPropertyName)] [ValidateScript({Test-Path $_})] [Alias("PSPath")] [string]$Path, [ValidateNotNullorEmpty()] [string]$Delimiter="," ) Begin { Write-Verbose -Message "Starting $($MyInvocation.Mycommand)" Write-Verbose "Using delimiter: $delimiter" } #begin Process { Write-Verbose "Importing $path" #import the data $data=Import-Csv -Path $path -delimiter $delimiter #get the first element Write-Verbose "Returning the first item out of $($data.count)" $item=$data[0] #get the property names $properties=$item | get-member -MemberType NoteProperty | Select -ExpandProperty Name Write-Verbose "Found $($properties.count) properties" #Define some regular expression patterns to identify types based on values $datetime="^d{1,2}/d{1,2}/d{2,4}|^d{1,2}:d{1,2}:d{1,2}" $number="^d+$|^-d+$" $double="^d+.d+$|^-d+.d+$" $boolean="^True$|^False$" #define a variable for the custom properties $custom=@() foreach ($property in $properties) { $value=$item.$property #cast each $property specifically as a string and pull out any spaces [string]$property=$property.Replace(" ","") Write-Verbose ("Testing property {0}. Sample value is {1}" -f $property,$value) if ($value) { Switch -regex ($value) { $datetime { Write-Verbose "Treating as Datetime" $type="datetime" $exp=$ExecutionContext.InvokeCommand.NewScriptBlock("[$type]`$_.$property") } $number { #further test value to determine an appropriate class Write-Verbose "Testing $value" Switch ($value) { {$_ -as [int32]} { Write-Verbose "Treating as Int32" $type="Int32" Break } {$_ -as [uint32]} { Write-Verbose "Treating as Uint32" $type = "uint32" Break } {$_ -as [int64]} { Write-Verbose "Treating as int64" $type = "int64" Break } {$_ -as [uint64]} { Write-Verbose "Treating as Uint64" $type = "uint64" Break } Default { #treat as a string if nothing matches Write-Verbose "Can't determine numeric value." $type = "string" } } #switch <# old code if ($value -as [int32]) { Write-Verbose "Treating as Int32" $type="Int32" } elseif ($value -as [uint32]) { Write-Verbose "Treating as Uint32" $type = "uint32" } elseif ($value -as [int64]) { Write-Verbose "Treating as int64" $type = "int64" } elseif ($value -as [uint64]) { Write-Verbose "Treating as Uint64" $type = "uint64" } else { #treat as a double Write-Verbose "Treating as a Double" $type = "double" } #> $exp = $ExecutionContext.InvokeCommand.NewScriptBlock("[$type]`$_.$property") } $double { Write-Verbose "Treating as Double" $type = "Double" $exp = $ExecutionContext.InvokeCommand.NewScriptBlock("[$type]`$_.$property") } $boolean { Write-Verbose "Treating as a Boolean" $type = "boolean" #convert True and False strings into valid Boolean values $exp = $ExecutionContext.InvokeCommand.NewScriptBlock("if (`$_.$property -match 'True') {`$v=1} else {`$v=0} ; `$v -as [$type]") } Default { Write-Verbose "Treating as String" $type = "String" } } #switch } #if item.property else { #there must be an empty value in the first object so prompt for the type $msg = ("There was no value found for {0}. What type do you want to use? For example Int32, Datetime or String" -f $property) $type = Read-Host $msg $exp = $ExecutionContext.InvokeCommand.NewScriptBlock("if (`$_.$property) {[$type]`$_.$property} else {[null]`$_.$property}") Write-Verbose "Treating $property as $type. User-provided value." } #add the property to the custom property array if ($type -eq "String") { $custom+=$Property } else { #create a scriptblock from the property name $custom+=@{Name="$Property";Expression=$exp} } } #foreach #Now take data and 're-type' the objects $data | Select-Object -Property $custom } #process End { Write-Verbose "Ending $($MyInvocation.Mycommand)" } #end } #end function
Without going into extreme detail, I’ve tried to thoroughly document the function, where my command looks at the first line of data in the CSV file and attempts to determine the appropriate object type using regular expressions.
If any of the properties for the first object have no value, you will be prompted to specify the object type.
But now that everything is properly typed, using the data is much easier.
I do have yet one more alternative you might want to consider, especially if you often import the same type of CSV file repeatedly. It is definitely an advanced technique, which I’ll guide you through in a future article.