PowerShell foreach loops and ForEach-Object

From Svendsen Tech PowerShell Wiki

In PowerShell, foreach is a keyword used for looping over collections such as arrays (technically, over the output of a pipeline). The cmdlet ForEach-Object is used for processing objects coming in via the pipeline. Each of these objects is available in turn in the special variable "$_" inside the (process) script block you pass to ForEach-Object.

A bit confusingly (but conveniently), you can use both the shorthand (aliases) "%" and "foreach" instead of ForEach-Object in a pipeline. The logic behind that is as follows: If the word foreach is used at the beginning of a statement (or after a label), it's considered the foreach loop keyword, while if it's anywhere else, it'll be recognized as the ForEach-Object cmdlet/command. A cmdlet is a type of command.

The "regular" PowerShell for loop is covered in a separate article on this wiki.

This article was written about PowerShell version 2, which is the default in Windows Server 2008 R2 and Windows 7.



The foreach Keyword

As I mentioned, the foreach loop is used to iterate, or loop over, collections. I will present a few examples here, as well as a diagram formally explaining the syntax.

The foreach loop enumerates/processes the entire collection/pipeline before starting to process the block (statement list / code), unlike the pipeline-optimized Foreach-Object cmdlet which processes one object at a time.

You cannot specify a type for the variable in a foreach loop. If you try, you will get this error:

PS E:\temp> foreach ([int]$var in 1..3) { $var }
Missing variable name after foreach.
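
If the loop variable needs a particular type, one possible workaround is to cast the collection up front, or to cast inside the loop body. The following is only a sketch; the variable names and sample values are made up for illustration:

```powershell
# Hypothetical input: numbers that arrive as strings.
$Strings = '1', '2', '3'

# Cast the whole collection, so every element is already an [int].
$Total = 0
foreach ($Number in [int[]]$Strings) {
    $Total += $Number           # integer addition, not string concatenation
}
$Total                          # 6

# Alternatively, cast inside the loop body.
$Doubled = foreach ($Item in $Strings) {
    [int]$Typed = $Item
    $Typed * 2
}
"$Doubled"                      # 2 4 6
```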

The foreach Loop's Syntax

The syntax of the foreach loop is described in the following diagram:

(Image: Foreach-loop-syntax.jpg)

Some foreach Loop Examples

Here are some examples.

A Basic foreach Loop Example

This loop just gets the numbers from 1 to 3, specified as a comma-separated list, and emits them to the pipeline. In this case they're printed to the console. I also demonstrate that the last value the variable had (the last element in the collection) is still available after the loop.

PS C:\> foreach ($Num in 1, 2, 3) { $Num }
1
2
3
PS C:\> $Num
3

You can also assign the results to an array, like this:

PS C:\> $Array = foreach ($i in 1..3) { $i }
PS C:\> $Array
1
2
3
PS C:\>
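
One thing to keep in mind when assigning loop output to a variable: if the loop emits exactly one object, you get that object back rather than a one-element array. A small sketch of this, and of wrapping the loop in the array subexpression operator @( ) when an array is always wanted (the variable names are just for illustration):

```powershell
# With several results, the assignment yields an array.
$Many = foreach ($i in 1..3) { $i }
$Many -is [array]               # True

# With exactly one result, PowerShell hands back the bare object instead.
$One = foreach ($i in 1..1) { $i }
$One -is [array]                # False

# Wrapping the loop in @( ) gives an array no matter how many results there are.
$Always = @(foreach ($i in 1..1) { $i })
$Always -is [array]             # True
$Always.Count                   # 1
```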

Another foreach Loop Example

The loop below gets all numbers between 1 and 50 that are divisible by 10, using the modulus operator. The modulus operator (%) returns zero when there is no remainder and the number is evenly divisible. Zero is considered false - unless it's a string ("0" or '0'). I should mention that if PowerShell sees a "%" in the middle of a pipeline, it will interpret it as the ForEach-Object cmdlet. "%" is only the modulus operator when it's used as a binary operator.

The range operator ".." (two periods) enumerates ranges of integers between the start and end integer specified. If you specify a decimal number, it will be cast to the [int] type. PowerShell rounds N.5 numbers to the closest even integer, so 1.5 is rounded to 2 and 2.5 is also rounded to 2.

PS C:\> foreach ($Integer in 1..50) { if ( -not ($Integer % 10) ) { "$Integer is like totally divisible by ten" } }
10 is like totally divisible by ten
20 is like totally divisible by ten
30 is like totally divisible by ten
40 is like totally divisible by ten
50 is like totally divisible by ten
PS C:\>

A way that I personally find a bit easier to understand for this specific case is using "-eq 0":

PS C:\> foreach ($Integer in 1..10) { if ($Integer % 10 -eq 0) { "$Integer is like totally divisible by ten" } }
10 is like totally divisible by ten
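
The truthiness and rounding rules described above can be checked directly. This is a small sketch; $Start, $End and $Range are illustrative names only:

```powershell
# Zero is false as a number, but the non-empty string '0' is true.
[bool]0                         # False
[bool]'0'                       # True

# Casting to [int] rounds N.5 to the nearest even integer.
[int]1.5                        # 2
[int]2.5                        # 2
[int]3.5                        # 4

# The range operator applies the same conversion to its endpoints.
$Start = 1.5
$End = 3.5
$Range = $Start..$End           # same as 2..4
"$Range"                        # 2 3 4
```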

Example With A Pipeline

In the diagram above, it says you can have a pipeline, so let's test it. Here the pipeline is "dir -Path e:\temp | Where { -not $_.PSIsContainer }". This uses the Where-Object cmdlet, which filters out all elements where the last statement/expression within the block returns false, and only passes on the ones that evaluate to a true value. In this example, it filters out directories and only lets through files (non-directories). The Where-Object cmdlet is covered in more detail in a separate article.

PS C:\> $Sum = 0
PS C:\> foreach ($File in dir -Path e:\temp | Where { -not $_.PSIsContainer }) { $Sum += $File.Length }

Let's look at the sum.

PS C:\> $Sum
362702012

PowerShell has built-in numeric suffixes such as "MB" (1MB is 1048576), which let you easily determine that the size is about 346 MB:

PS C:\> $Sum/1MB
345.899593353271
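
For a tidier figure, the result can be rounded. A minimal sketch using the byte count from the example above; $SizeInMB is just an illustrative name:

```powershell
$Sum = 362702012                # the byte count from the example above

# The size suffixes are plain numeric multipliers.
1KB                             # 1024
1MB                             # 1048576
1GB                             # 1073741824

# Round to two decimals for display.
$SizeInMB = [math]::Round($Sum / 1MB, 2)
"$SizeInMB MB"                  # 345.9 MB
```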

Iterating A Variable

Of course you can also specify a variable such as an array in place of the "pipeline".

PS C:\> $NegativeNumbers = -3..-1
PS C:\> foreach ($NegNum in $NegativeNumbers) { $NegNum * 2 }
-6
-4
-2

The continue And break Keywords

To skip processing of the remainder of the block and move on to the next iteration in the loop, you can use the keyword continue. If you use the keyword break, you will break out of the (innermost) loop entirely (unless you specify a label).

Here's a quick demonstration:

PS C:\> $Numbers = 4..7
PS C:\> foreach ($Num in 1..10) { if ($Numbers -Contains $Num) { continue }; $Num }
1
2
3
8
9
10

And here's an example of break:

PS C:\> foreach ($Num in 1..10) { if ($Numbers -Contains $Num) { break }; $Num }
1
2
3
PS C:\>

If you have nested loops that aren't labelled, continue and break will act on the innermost loop, as demonstrated here:

PS C:\> foreach ($Num in 1..3) {
    foreach ($Char in 'a','b','c') {
        if ($Num -eq 2) { continue }
        $Char * $Num # send this to the pipeline
    }
}

a
b
c
aaa
bbb
ccc
PS C:\>

The Special $foreach Variable

Here I demonstrate the special $foreach variable, which exists only within a foreach loop and holds the loop's enumerator. Its .MoveNext() method advances to the next element and returns a boolean, which you will probably want to suppress by piping to Out-Null, assigning to the $null variable or casting to the [void] type accelerator.

This will give you odd numbers between 1 and 10:

PS C:\> foreach ($Int in 1..10) { $Int; $foreach.MoveNext() | Out-Null }
1
3
5
7
9
PS C:\>

Another use for it is when you have an array and need to get the next item after some condition is met. It can save you from setting flags, although it's about as clunky. Here's an example where I want the first letter after I've matched "c" (so I get "d").

PS C:\> $Array = 'a', 'b', 'c', 'd', 'e', 'f', 'g'
PS C:\> "$Array"
a b c d e f g
PS C:\> foreach ($Letter in $Array) { if ($Letter -ieq 'c') {
>> $null = $foreach.MoveNext(); $foreach.Current; break } }
>>
d
PS C:\>

To suppress the unwanted boolean output from $foreach.MoveNext(), you can also cast the results to the void type accelerator:

[void] $foreach.MoveNext()

I'll toss in a real-world example where you have an array whose elements alternate between keys and values for a hashtable:

PS C:\> $Array = 'Name', 'John', 'Surname', 'Doe', 'Occupation', 'Homeless'
PS C:\> "$Array"
Name John Surname Doe Occupation Homeless

To add this to a hash, you can use something like this:

PS C:\> $Hash = @{}
PS C:\> foreach ($Temp in $Array) {
>> $Key = $Temp; [void] $foreach.MoveNext();
>> $Value = $foreach.Current; $Hash.Add($Key, $Value) }
>>
PS C:\> $Hash

Name                           Value
----                           -----
Name                           John
Surname                        Doe
Occupation                     Homeless

PS C:\>

That was meant to be explicit. In the real world you would probably rather write something like this:

PS C:\> $Hash = @{}
PS C:\> foreach ($t in $Array) { [void] $foreach.MoveNext(); $Hash.$t = $foreach.Current }
PS C:\> $Hash

Name                           Value
----                           -----
Name                           John
Surname                        Doe
Occupation                     Homeless

Side note: In Perl you can simply assign an array or list to a hash and this is done automatically - with a warning if there's an odd number of elements in the list/array. The Perl code would simply be: my %hash = @array;
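
The PowerShell loops above don't notice an odd number of elements; the last key simply ends up with an empty value. Since .MoveNext() returns false when there is nothing left, its return value can be used as a guard. A rough sketch, with made-up sample data and variable names:

```powershell
# Hypothetical input with a dangling key at the end.
$Pairs = 'Name', 'John', 'Surname', 'Doe', 'Orphan'

$Hash = @{}
$Unpaired = @()
foreach ($Key in $Pairs) {
    if ($foreach.MoveNext()) {
        # There was a next element, so use it as the value.
        $Hash[$Key] = $foreach.Current
    }
    else {
        # Nothing left to pair with this key.
        $Unpaired += $Key
    }
}
$Hash.Count                     # 2
"$Unpaired"                     # Orphan
```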

Labelled foreach Loops

Here's an example of some code where the outermost loop is labelled with the ":OUTER " part before the foreach keyword, indicating the name of the label (OUTER). When you use the continue or break keywords, you pass this label name as the first argument, to indicate the block to break out of (if no argument is passed, the innermost block is targeted).

:OUTER foreach ($Number in 1..15 | Where { $_ % 2 }) {
    "Outer: $Number"
    foreach ($x in 'a', 'b', 'c') {
        if ($Number -gt 5) {
            continue OUTER
        }
        $x * $Number
    }
}

This will produce output like this:

PS D:\temp> .\labeled-loop.ps1
Outer: 1
a
b
c
Outer: 3
aaa
bbb
ccc
Outer: 5
aaaaa
bbbbb
ccccc
Outer: 7
Outer: 9
Outer: 11
Outer: 13
Outer: 15
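
The break keyword takes a label in the same way. Here is a small sketch that stops both loops as soon as a match is found; the names $Row, $Col and $Found are only for illustration:

```powershell
$Found = $null
:OUTER foreach ($Row in 1..3) {
    foreach ($Col in 1..3) {
        if ($Row * $Col -eq 4) {
            $Found = "$Row x $Col"
            break OUTER         # leaves both loops, not just the inner one
        }
    }
}
$Found                          # 2 x 2
```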

The ForEach-Object cmdlet

The ForEach-Object cmdlet, which can also be written as "%" or "foreach", is basically a type of foreach loop designed to work in a pipeline. After the command name or alias you choose (among the three mentioned), you specify a script block, delimited by an opening and a closing curly bracket, in which you place code that's run once for each object coming in via the pipeline. It's much like the map function in Perl.

Inside the block with the code/statements, the current object in the pipeline is accessible in a so-called automatic variable called "$_". So to access the "Name" property of an object inside the ForEach-Object script block, you would use "$_.Name" (or "$_ | Select -Expand Name").
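
As a quick sketch of accessing a property and calling a method on the current pipeline object through "$_" (the sample strings are arbitrary):

```powershell
$Words = 'apple', 'kiwi', 'banana'

# Read a property of each incoming object.
$Lengths = $Words | ForEach-Object { $_.Length }
"$Lengths"                      # 5 4 6

# Call a method on each incoming object.
$Upper = $Words | ForEach-Object { $_.ToUpper() }
"$Upper"                        # APPLE KIWI BANANA
```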

The ForEach-Object cmdlet's Syntax

The syntax is as follows:

<pipeline> | ForEach-Object { <ListOfStatements> } [| <optional pipeline>]

It can also take a begin and end block:

<pipeline> | ForEach-Object -Begin { <ListOfStatements> } -Process { <ListOfStatements> } -End { <ListOfStatements> } [| <optional pipeline>]

Pipeline Objects Are Generated And Processed One At A Time

An important difference between foreach as a keyword (looping construct) and the command/cmdlet ForEach-Object, is that the keyword/loop generates the entire collection specified in the pipeline before processing it. Foreach-Object on the other hand gets one object from the pipeline, processes it and passes it on, so it operates on one object at a time. This should give reduced memory usage in some cases.
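
The difference in ordering can be made visible by logging when each object is produced and when it is consumed. This is only a sketch; Get-Number is a hypothetical helper function and $Log an illustrative variable, neither of which is part of PowerShell:

```powershell
$Log = New-Object System.Collections.ArrayList

# Hypothetical producer that records when each number is emitted.
function Get-Number {
    foreach ($n in 1..3) {
        [void]$Log.Add("produce $n")
        $n
    }
}

# foreach keyword: the whole collection is produced before the body runs.
foreach ($x in Get-Number) { [void]$Log.Add("consume $x") }
$LoopOrder = $Log -join ', '

$Log.Clear()

# ForEach-Object: each object is consumed as soon as it is produced.
Get-Number | ForEach-Object { [void]$Log.Add("consume $_") }
$PipeOrder = $Log -join ', '

$LoopOrder   # produce 1, produce 2, produce 3, consume 1, consume 2, consume 3
$PipeOrder   # produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
```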


Spanning Multiple Lines

To span multiple lines, you can use it just like the regular foreach loop, since it takes a script block where you can put any valid PowerShell code. For shorter/simpler stuff, you can also separate statements with semicolons, which act like newlines.

It can look like the following:

(Image: Foreach-object-example-multiple-lines.png)
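
In text form, a multi-line ForEach-Object call might look like the sketch below. It is not a transcription of the screenshot, just an illustration with made-up variable names:

```powershell
$Report = 1..5 | ForEach-Object {
    $Square = $_ * $_
    if ($Square % 2) {
        "$_ squared is $Square (odd)"
    }
    else {
        "$_ squared is $Square (even)"
    }
}
$Report

# The same idea on one line, with semicolons separating the statements.
$Short = 1..3 | ForEach-Object { $Square = $_ * $_; "$_ squared is $Square" }
$Short
```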

Multiple Script Blocks

Another feature of ForEach-Object is that it can take multiple script blocks. If you specify two script blocks, the first will be treated as the so-called begin block, while the second is the process block that's run once for each pipeline object. If you specify three, it's the same, and the third is the end block. There's more about this in the examples. The logic here is a bit clunky: ForEach-Object technically receives multiple process blocks and treats the first of them as the begin block if there are two blocks and no explicit -Begin parameter has been specified.

The begin block is executed before pipeline object processing begins, then the objects from the pipeline are processed one at a time in the process block, and finally the end block is run after all objects have been processed.

If you specify a ForEach-Object -Begin { } -End { } and then add more script blocks, they will be treated as multiple process blocks.
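
A small sketch of that last case, with explicit -Begin and -End blocks followed by two more script blocks that both run for every pipeline object (the strings are arbitrary):

```powershell
# Both unnamed blocks are treated as process blocks and run, in order, per object.
$Output = 1..2 | ForEach-Object -Begin { 'start' } -End { 'end' } { "A$_" } { "B$_" }
"$Output"                       # start A1 B1 A2 B2 end
```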

Flattening Of Arrays

I should probably also mention that nested arrays will be "flattened" when passed through Foreach-Object. I won't go into much detail on this, but you can work around it by prepending the unary comma operator before the automatic variable, like this:

$NestedArray2 = $NestedArray1 | %{ , $_ }

That would preserve the nesting. The process of flattening the array is referred to as "unraveling", and it was a deliberate design choice by the PowerShell development team. Most of the time it's what you want when working at the command line.

Examples of how this preserves nesting:

PS D:\temp> ( @(1,2,3), @(4,5,6) | %{ , $_ } ).Count
2

And with unraveling:

PS D:\temp> ( @(1,2,3), @(4,5,6) | %{ $_ } ).Count
6

Some ForEach-Object Examples

Here are some examples.

A Basic ForEach-Object Example

Here I create a collection of numbers and just emit them to the pipeline in the ForEach-Object.

PS C:\> 2..4 | ForEach-Object { $_ }
2
3
4
PS C:\>

To store them in a variable for later use after the processing, just assign to a variable at the start of the line, like this:

PS C:\> $MyVariable = 0..3 | foreach { $_ * 2 }
PS C:\> $MyVariable
0
2
4
6

Random Example

Here I list files starting with "test.". Of course, specifying a filter to dir (which is an alias for Get-ChildItem) would be better in this case, but it's for the sake of demonstration - and we get pretty colors! This specific example is also really more suited for the Where-Object cmdlet.

PS C:\> dir -Path E:\temp | % { if ($_.Name -like 'test.*') { Write-Host -Fore Yellow $_.Name } }
test.cmd
test.pl
test.ps1
test.txt
test.zip
PS C:\>

(Image: Foreach-object-example.png)

Multiple Script Blocks Passed To ForEach-Object

I mentioned in the introductory text that ForEach-Object can take multiple script blocks. Here's an example that demonstrates the begin block, with two script blocks specified for ForEach-Object.

PS C:\> $Sum = 10
PS C:\> 2..4 | ForEach-Object { $Sum += $_ }
PS C:\> $Sum
19

PS C:\> # omg, wrong

PS C:\> 2..4 | ForEach-Object { $Sum = 0 } { $Sum += $_ }
PS C:\> $Sum
9

PS C:\> # that's more like it

There you can see that I initialize $Sum to 0 in the begin block in the second ForEach-Object example.

With An End Block As Well

Now, if you want to do something after all the objects in the pipeline have been processed, you can put it in the end block. Here's the same example as above including an end block.

PS C:\> 2..4 | ForEach-Object { $Sum = 0 } { $Sum += $_ } { $Sum }
9

(Image: Foreach-object-example-end-block.png)

Using Explicit Parameter Names For The Script Blocks

PS C:\> dir -path c:\windows -filter *.log | foreach -Begin { $Cnt=0; $Size=0 } `
>> -Process { $Cnt++; $Size += $_.Length } `
>> -End { "Total log files: $Cnt - Total size: " + ($Size/1MB).ToString('N') + ' MB' }
>>
Total log files: 9 - Total size: 2.68 MB
PS C:\>

(Image: Foreach-object-example-end-block-explicit.png)

Hacking The continue Keyword In A ForEach-Object

Since ForEach-Object runs a script block rather than a loop, the continue keyword doesn't skip to the next object the way it does in foreach and for loops. Use the return statement instead to achieve that. If you do use continue, it acts like break does in a foreach, while or for loop: processing of the remaining pipeline objects stops. If you use break in a ForEach-Object, it acts the same as continue. Note that the end block will also be skipped this way, if present. Both keywords really act on the nearest enclosing loop, so when there isn't one (for example in a script file), they can end up stopping the calling script and not just the pipeline.

PS C:\> $Numbers = 4..7
PS C:\> 1..10 | %{ if ($Numbers -contains $_) { return }; $_ }
1
2
3
8
9
10

And with continue we see that it stops processing the remainder of the pipeline:

PS C:\> $Numbers = 4..7
PS C:\> 1..10 | %{ if ($Numbers -contains $_) { continue }; $_ }
1
2
3
PS C:\>
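
If you do want to stop the pipeline early from inside a script, one known workaround is to wrap the pipeline in a dummy loop, so that break has a loop to act on. This is a sketch of that idea rather than a recommended pattern; $Seen and $After are illustrative names:

```powershell
$Seen = @()

# The do/while runs exactly once; it only exists to give break a target.
do {
    1..10 | ForEach-Object {
        if ($_ -gt 3) { break }   # acts on the enclosing do loop
        $Seen += $_
    }
} while ($false)

# Execution carries on here after the pipeline has been stopped.
$After = 'still running'
"$Seen"                         # 1 2 3
$After                          # still running
```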