How to Use PowerShell Grep: Select-String and RegEx Real World Examples


Grep (Global Regular Expression Print) is a commonly used Linux command for searching strings of characters in text files. There’s no such thing as PowerShell grep. But naturally, you can get the same functionality in PowerShell. You’ll just need to use the Select-String cmdlet instead.

In this article, I’ll show you how to get your text string into a PowerShell object, and then use Select-String to search the string for patterns.

PowerShell grep – Can I grep in PowerShell?

Grep is used in Linux to search for regular expressions in text strings and files. There’s no grep cmdlet in PowerShell, but the Select-String cmdlet can be used to achieve the same results. The Windows command line has the findstr command, a grep equivalent for Windows. But it’s better to use Select-String when working with PowerShell.
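
As a rough orientation, here is a minimal sketch of how a few common grep habits map onto Select-String. The sample lines are made up for illustration and are not taken from the report file used later in this article.

```powershell
# Hypothetical sample lines standing in for a log or report file.
$lines = @(
    'c0d0  Ok      SAS'
    'c0d1  Failed  SAS'
    'c0d2  ok      SATA'
)

# Roughly: grep -i 'ok'  (Select-String is case-insensitive by default)
$ok = $lines | Select-String -Pattern 'ok'

# Roughly: grep 'ok'     (-CaseSensitive requires an exact-case match)
$okExact = $lines | Select-String -Pattern 'ok' -CaseSensitive

# Roughly: grep -v 'ok'  (-NotMatch returns the lines that do not match)
$notOk = $lines | Select-String -Pattern 'ok' -NotMatch

@($ok).Count        # 2
@($okExact).Count   # 1
$notOk.Line         # c0d1  Failed  SAS
```

Each result is a MatchInfo object rather than plain text, which is why the last line can ask for the Line property.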

Once you have objects, you can then take full advantage of PowerShell. I’ll admit that using regular expressions (RegEx) can be a bit vertigo-inducing, so let’s look at using the Select-String cmdlet first. We’ll cover RegEx later.

Related article: Use a PowerShell Substring to Search Inside a String

The assumption is that the text output is in a predictable and known format, where you don’t have any null or empty values. I’m going to use a text file that has RAID and disk information.

This is what the file looks like:

Figure 1: Text-Based Output. (Image Credit: Jeff Hicks)
$file = "c:\work\raidreport.txt"

Return limited set of objects using Select-String

I only want the last part of the file turned into objects.

Figure 1: Text output to convert. (Image Credit: Jeff Hicks)

The column headings will become the property names. The heading line could be anywhere in the file, so the technique you use to find it might vary. I’ll use a simple pattern with Select-String.

$h = Get-Content $file | Select-String "^ID\s+Chassis"
Figure 2: Captured headings. (Image Credit: Jeff Hicks)

Removing spaces, or other characters, from property names using a pattern

Personally, I don’t like spaces or other characters in the property names, so I need to do a little cleaning with regular expressions. First, I’m going to replace any parentheses or angle brackets with nothing.

$h = $h.ToString() -replace "\(|\)|<|>", ""

Note that because parentheses are special regular expression characters, I need to escape them with a backslash.

If I wanted to cover more possibilities, then I could use this type of pattern:

$h = $h.ToString() -replace "[^\s\w]", ""

The pattern says to find anything that isn’t a space and isn’t a word character, and replace it with nothing. In either event, $h is now one step cleaner.

Figure 3: Removing extra characters. (Image Credit: Jeff Hicks)

To get rid of the spaces inside a heading, I need to turn to one more slightly advanced regex pattern.

$h = $h.ToString() -replace "(?<=\S)\s{1}(?=\S)", "_"

This is tricky, because I want to leave the runs of spaces between columns, such as between ID and Chassis, but replace the single space inside a name, such as between RAID and ID. So I have to use a lookbehind and a lookahead. The pattern says: if the current character is a single space, look behind for a non-space character and look ahead for another non-space character. If both are found, replace the space with an underscore.

Figure 4: Removing spaces within property names. (Image Credit: Jeff Hicks)

All that remains to build the list of property names is to split this on spaces.

$names = $h -split "\s+"
Figure 5: The array of refined property names. (Image Credit: Jeff Hicks)
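
To see the whole heading cleanup in one place, here is a minimal sketch run against a made-up heading line. The heading text is an assumption modeled loosely on the report, not a copy of the actual file, and the character class used in step 1 is `[^\s\w]`.

```powershell
# Assumed heading line; the real report may differ.
$heading = 'ID    Chassis  Slot  RAID ID  Status  Type  Media  Spare  Size (GB)'

# 1. Drop anything that is neither whitespace nor a word character.
$clean = $heading -replace '[^\s\w]', ''

# 2. Replace a single space that sits between two non-space characters.
$clean = $clean -replace '(?<=\S)\s(?=\S)', '_'

# 3. Split on runs of whitespace to get the property names.
$names = $clean -split '\s+'

$names -join ','
# ID,Chassis,Slot,RAID_ID,Status,Type,Media,Spare,Size_GB
```

The wide gaps between columns survive step 2 because neither space in a run of two or more has a non-space character on both sides.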

Use Get-Content and Select-String to parse a text file

You can certainly create the array of property names manually, especially if you want to use something other than the originals. Next, I use Get-Content and Select-String to pull out the data lines, which I select here with a pattern for lines that start with c0d.

$d = Get-Content $file | Select-String "^c0d"

In much the same way as before, I’m going to turn each line into a separate object. To do that, I need to split each line into an array using the spaces as a delimiter.

Create a hashtable and convert each line of text into an object

Next, I can loop through the list of property names and create a hashtable using the corresponding value from the split line. The hashtable is then easily turned into a custom object.

$data = foreach ($line in $d) {
    $info = $line -split "\s+"
    $hash = [ordered]@{}
    for ($i = 0; $i -lt $names.Count; $i++) {
        $hash.Add($names[$i], $info[$i])
    } #end For
    [pscustomobject]$hash
} #end Foreach

The end result is a collection of objects.

Figure 6: New objects displayed in a table. (Image Credit: Jeff Hicks)

There is one potential drawback: every property is a string.

Figure 7: Converted object properties are strings. (Image Credit: Jeff Hicks)
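
Why that matters shows up as soon as you sort. The following is a small sketch with made-up slot numbers, not values from the report.

```powershell
# Hypothetical slot numbers as they come out of -split: all strings.
$asText = '2', '10', '9'

# The same values cast to integers.
$asInt = $asText | ForEach-Object { [int]$_ }

($asText | Sort-Object) -join ','   # 10,2,9  (text order)
($asInt  | Sort-Object) -join ','   # 2,9,10  (numeric order)
```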

Build a mapping hashtable

One solution is to build a mapping hashtable.

$typehash = @{}
foreach ($name in $names) {
    $typehash.Add($name, (Read-Host "What type is $name, e.g. string or datetime"))
}
Figure 8: Creating a Typename map. (Image Credit: Jeff Hicks)

The process to convert each line of text into an object is very similar to what I just showed you with the addition of converting each value into the necessary type.

$data = foreach ($line in $d) {
    $info = $line -split "\s+"
    $hash = [ordered]@{}
    for ($i = 0; $i -lt $names.Count; $i++) {
        Switch ($typehash.Item($names[$i])) {
            "string"   { $Value = [convert]::ToString($info[$i]) }
            "int"      { $Value = [convert]::ToInt16($info[$i]) }
            "int32"    { $Value = [convert]::ToInt32($info[$i]) }
            "int64"    { $Value = [convert]::ToInt64($info[$i]) }
            "datetime" { $Value = [convert]::ToDateTime($info[$i]) }
            "double"   { $Value = [convert]::ToDouble($info[$i]) }
            "boolean"  { $Value = [convert]::ToBoolean($info[$i]) }
            # Keep the raw text for any type name that isn't listed above
            default    { $Value = $info[$i] }
        }
        $hash.Add($names[$i], $Value)
    } #end For
    [pscustomobject]$hash
} #end ForEach

I’ve inserted a switch construct based on the corresponding typehash entry. So if $i is 1, then $names[$i] is ‘Chassis’, which has a corresponding value of ‘int’.

Figure 9: Testing the type hashtable. (Image Credit: Jeff Hicks)

In this case, the value will be converted to Int16. You can add other types or conversion commands as necessary. But once I run the text through, I now have properly typed objects in $data.

Figure 10: Verifying property types. (Image Credit: Jeff Hicks)
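
If you’d rather not answer prompts, the same idea works with a type map you write by hand. This is only a sketch: the names, the map, and the data line are assumptions for illustration, and the switch covers just the types this example needs.

```powershell
# Assumed names, type map, and data line; adjust to your own report.
$names    = 'ID', 'Slot', 'SizeGB'
$typeHash = @{ ID = 'string'; Slot = 'int'; SizeGB = 'double' }
$line     = 'c0d1  3  465.25'

$info = $line -split '\s+'
$hash = [ordered]@{}
for ($i = 0; $i -lt $names.Count; $i++) {
    $value = switch ($typeHash[$names[$i]]) {
        'int'    { [convert]::ToInt16($info[$i]) }
        # Parse with the invariant culture so "465.25" reads the same everywhere
        'double' { [convert]::ToDouble($info[$i], [cultureinfo]::InvariantCulture) }
        default  { $info[$i] }   # leave anything unmapped as a string
    }
    $hash.Add($names[$i], $value)
}
$disk = [pscustomobject]$hash

$disk.ID.GetType().Name       # String
$disk.Slot.GetType().Name     # Int16
$disk.SizeGB.GetType().Name   # Double
```

Piping the object to Get-Member is another quick way to confirm the property types.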

Sort and format your results

Now commands like this will work properly.

$data | Sort-Object RAID_ID,Slot -Descending | Format-Table -GroupBy RAID_ID -Property ID,Status,Media,Slot,Size* -AutoSize
Figure 11: Sorted and formatted results. (Image Credit: Jeff Hicks)

Although I used more regular expressions in this article than I thought I would, most of these uses were to select the text I wanted out of a larger file and to clean up names. If your text is simple and clean, all you need to do is build a list of names, split each line into an array, and create a custom object joining names with values. In fact, the process can be very simple. Here’s command output that’s pretty close to complete.

Figure 12: A clean text file. (Image Credit: Jeff Hicks)

Replace multiple spaces with a comma

I’m using a text file, but this could just as well be the result of running a command line tool. PowerShell has a cmdlet, ConvertFrom-CSV, which would easily turn this into a set of objects. The tricky part with the current output is that each entry is separated by a number of spaces, and ConvertFrom-CSV is looking for a single character delimiter. Not a problem. I’ll replace the multiple spaces with a comma and then convert the result.

(Get-Content C:\work\raidreport3.txt) -replace "\s+","," | ConvertFrom-Csv
Figure 13: Using ConvertFrom-CSV. (Image Credit: Jeff Hicks)
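
Here is the same technique as a self-contained sketch you can paste into a console. The lines are invented sample data, and the Trim() call is my own addition: it guards against leading or trailing spaces, which would otherwise turn into stray commas.

```powershell
# Hypothetical clean output: one heading line, then data lines.
$lines = @(
    'ID    Chassis  Slot  Status'
    'c0d0  0        0     Ok'
    'c0d1  0        1     Failed'
)

# Trim each line, collapse runs of whitespace into a comma, then convert.
$disks = $lines.Trim() -replace '\s+', ',' | ConvertFrom-Csv

@($disks).Count                                     # 2
($disks | Where-Object Status -eq 'Failed').ID      # c0d1
```

This only works when no heading or value contains a space of its own, and every property still comes back as a string.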

It can really be that simple. If you have any problems getting these techniques to work or have questions, please leave a comment.

Using RegEx

If you look at the output of the text file we were trying to parse above, you can imagine two different types of objects being displayed. I’m going to focus on the second part of the output, which is the “Disks in Use” section. Remember this is all text. I’m going to deviate a little from the original need for the sake of education.

Get specific information from a text string using Select-String

First, let’s say I only want to get the disks with a failed status.

The quickest solution is to use Select-String.

Get-Content c:\work\raidreport.txt | Select-String "Failed"

No denying it works, but we lost the heading.

Figure 2: Using Windows PowerShell's Select-String cmdlet to find matching text. (Image Credit: Jeff Hicks)

I could use a slightly more complex pattern to include the header. The pipe character means “or”, so a line matches if it is the header line or if it contains Failed.

Get-Content c:\work\raidreport.txt | Select-String "^ID\s+Chassis|Failed"
Figure 3: Adding the header. (Image Credit: Jeff Hicks)
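
A quick way to try this without the report file is to pipe a few sample lines through the same kind of pattern. The lines below are invented, and the second form, which passes two patterns instead of one alternation, is simply another option that Select-String supports.

```powershell
# Hypothetical excerpt of the report.
$report = @(
    'Disks in Use'
    'ID    Chassis  Slot  Status'
    'c0d0  0        0     Ok'
    'c0d1  0        1     Failed'
)

# One pattern with alternation: the header line OR any line containing Failed.
$hits = $report | Select-String -Pattern '^ID\s+Chassis|Failed'

# Equivalent here: -Pattern accepts several patterns and matches any of them.
$hits2 = $report | Select-String -Pattern '^ID\s+Chassis', 'Failed'

$hits.Line
# ID    Chassis  Slot  Status
# c0d1  0        1     Failed
```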

But this is all still text, and you know how I feel about PowerShell. I think it would be more helpful to turn this text back into a set of objects. The header text makes a natural list of properties. One solution is to use a regular expression pattern that includes named captures.

Create a pattern to describe the data to process

With a named capture, you can reference matching groups by a name. So I’ll create a pattern to describe the data I want to process.

[regex]$pattern = "(?<ID>\w+)\s+(?<Chassis>\d)\s+(?<Slot>\d)\s+(?<RAIDID>\w+)\s+(?<Status>\w+)\s+(?<Type>\w+)\s+(?<Media>\w+)\s+(?<Spare>\S+)\s+(?<SizeGB>\d+)"

I’m not a big fan of spaces or special characters in names, so I have modified a few items. The text inside the angle brackets will be the name of each matching group. The pattern that follows each name should capture the corresponding data. With this, I can match the text from the text file.

Use the pattern with Select-String

$m = get-content C:\work\raidreport.txt | select-string -Pattern $pattern

The variable $m is now a collection of MatchInfo objects. Each one has a Matches property that holds the regular expression match, including the named groups.

Figure 4: Match objects. (Image Credit: Jeff Hicks)

Knowing the capture names, I can access individual properties.

Figure 5: Obtaining the value of a named capture in Windows PowerShell. (Image Credit: Jeff Hicks)
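
In code, reaching a named capture looks something like the sketch below. To keep it short I’m using a cut-down, hypothetical pattern with three groups and a single made-up line rather than the full report.

```powershell
# Shortened, hypothetical pattern with three named captures.
[regex]$pattern = '(?<ID>\w+)\s+(?<Slot>\d+)\s+(?<Status>\w+)'

# A single made-up data line.
$m = 'c0d1  3  Failed' | Select-String -Pattern $pattern

# MatchInfo -> Matches -> Groups -> Value
$m.Matches[0].Groups['ID'].Value       # c0d1
$m.Matches[0].Groups['Status'].Value   # Failed
```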

All I need to do is go through each match and enumerate each named capture. Fortunately I don’t have to hard code any names. I can get them from the regex object.

Figure 6: Listing named capture names. (Image Credit: Jeff Hicks)
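
For reference, this is roughly what listing the group names looks like, again with a shortened, hypothetical pattern instead of the full one.

```powershell
# Shortened, hypothetical pattern with three named captures.
[regex]$pattern = '(?<ID>\w+)\s+(?<Slot>\d+)\s+(?<Status>\w+)'

# Group 0 is the whole match; the named groups follow in pattern order.
$all = $pattern.GetGroupNames()

# Skip group 0 to keep only the names you defined.
$named = $all | Select-Object -Skip 1

$all -join ','     # 0,ID,Slot,Status
$named -join ','   # ID,Slot,Status
```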

Create a custom PowerShell object for each pattern match

I don’t want the first name, 0, which is always there and represents the entire match. My intention is to create a custom object for each match.

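$data = $m.Matches | ForEach-Object -Begin {
    # Skip group 0, which is the whole match rather than a named capture
    $names = $pattern.GetGroupNames() | Select-Object -Skip 1
} -Process {
    # Build one ordered hashtable per match, keyed by capture name
    $hash = [ordered]@{}
    foreach ($name in $names) {
        $hash.Add($name, $_.Groups[$name].Value)
    }
    [pscustomobject]$hash
}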

The end result is a variable, $data, with a collection of custom objects.

Figure 7: Displaying converted matches to objects. (Image Credit: Jeff Hicks)

Because I now have objects, I can use PowerShell cmdlets.

Figure 8: Using custom objects in PowerShell. (Image Credit: Jeff Hicks)
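
As a small illustration of what objects buy you, here is a sketch that filters and groups a hand-built stand-in for $data. The three disks are invented so the snippet runs on its own.

```powershell
# Hypothetical stand-in for the $data collection built from the matches.
$data = @(
    [pscustomobject]@{ ID = 'c0d0'; RAIDID = 'c0r0'; Status = 'Ok' }
    [pscustomobject]@{ ID = 'c0d1'; RAIDID = 'c0r0'; Status = 'Failed' }
    [pscustomobject]@{ ID = 'c0d2'; RAIDID = 'c0r1'; Status = 'Ok' }
)

# Filter on a property instead of searching text.
$failed = $data | Where-Object Status -eq 'Failed'

# Group the disks by the array they belong to.
$byRaid = $data | Group-Object -Property RAIDID

$failed.ID                      # c0d1
$byRaid | Select-Object Name, Count
```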

The only way all of this works is if you know what your data will look like. There’s also an assumption that there are no blanks or null values. If there were some gaps in the text output, my solution would most likely fail.
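
One way to notice such gaps, sketched here with a shortened, hypothetical pattern and two invented lines, is to run the same pattern again with -NotMatch and look at what was left out.

```powershell
# Shortened, hypothetical pattern anchored to the whole line.
[regex]$pattern = '^(?<ID>\w+)\s+(?<Slot>\d+)\s+(?<Status>\w+)$'

# The second line is missing its Slot value.
$rows = 'c0d0  0  Ok', 'c0d1     Failed'

$matched = $rows | Select-String -Pattern $pattern
$skipped = $rows | Select-String -Pattern $pattern -NotMatch

@($matched).Count   # 1
$skipped.Line       # c0d1     Failed
```

Anything that lands in $skipped is a line the pattern could not describe, so it never became an object.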

I hope you see the value in working with objects instead of text. Yes, I had to jump through a few hoops to convert the text, and I won’t disagree that regular expressions can be a tough nut to crack.

If you’d like a tool to make this process easier, then take a look at a function I published last year on my blog and see if that helps. As with most things PowerShell, there’s usually more than one answer, and next time I’ll show you another approach that you might find a bit easier.