PowerShell Cmdlet for Splitting an Array

From Svendsen Tech PowerShell Wiki

I wrote a couple of advanced functions (it's just a few lines of code) for splitting an array/collection into chunks/parts for later processing in the pipeline (or otherwise) as smaller arrays. It's pretty simple, but quite useful for a number of scenarios. I've found myself needing this from time to time, so I finally decided to write a generic version of it.

Performance on very large collections probably won't be great, although I haven't benchmarked or tested it extensively (yet). PowerShell does not do array concatenation efficiently: arrays have a fixed size, so every += creates a new array and copies the existing elements into it. Array concatenation seemed difficult to avoid and is used in this cmdlet, so there will be a performance penalty that grows with the size of the input.
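For comparison, here is a minimal sketch of one way to sidestep the repeated array copying. It is not one of the two published versions: the function name Split-CollectionList and the $Buffer variable are made up for illustration, and it assumes a generic List[object] is an acceptable buffer. It also differs in behaviour, since it emits each chunk as soon as it is full instead of collecting all chunks and emitting them at the end.

```powershell
function Split-CollectionList {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline=$true)] $Collection,
        [Parameter(Mandatory=$true)][ValidateRange(1, 2147483647)][int] $Count)
    begin {
        # Growable buffer (hypothetical replacement for the += arrays);
        # Add() appends in amortized constant time.
        $Buffer = New-Object 'System.Collections.Generic.List[object]'
    }
    process {
        foreach ($e in $Collection) {
            $Buffer.Add($e)
            if ($Buffer.Count -eq $Count) {
                # The unary comma keeps the chunk together as one pipeline object.
                , $Buffer.ToArray()
                $Buffer.Clear()
            }
        }
    }
    end {
        # Emit the remainder, if any, as a final shorter chunk.
        if ($Buffer.Count -gt 0) { , $Buffer.ToArray() }
    }
}
```

It is called the same way as the versions on this page, for example 1..10 | Split-CollectionList -Count 3. Whether it is actually faster for a given workload would still need to be measured.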

I'm putting up two versions. The version called Split-Collection.ps1.txt works only with pipeline input. The other one, called Split-CollectionParam.ps1.txt, also works correctly on a collection passed in as the parameter -Collection. In the latter version I use foreach () on the collection, which requires the whole collection to be in memory when it is passed as a parameter, so it could potentially be more memory-intensive than the version that works strictly with pipeline input, one element at a time. On the other hand, you might see a performance gain, as foreach loops are generally a lot faster than ForEach-Object. The version with the parameter is more versatile, since it works with both pipeline input and input passed as a parameter, and it is generally the one I'd recommend unless you know you need the other one.
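If you want to check the foreach versus ForEach-Object difference on your own machine, a rough sketch like the following can be used. The variable names and the collection size are arbitrary choices for illustration, and the timings will vary with PowerShell version and hardware, so treat the result as an indication only.

```powershell
$Data = 1..100000

# Language-level foreach statement: iterates the in-memory collection directly.
$LoopTime = Measure-Command {
    foreach ($n in $Data) { $null = $n * 2 }
}

# ForEach-Object cmdlet: invokes the script block once per pipeline object.
$PipeTime = Measure-Command {
    $Data | ForEach-Object { $null = $_ * 2 }
}

'foreach:        {0:N0} ms' -f $LoopTime.TotalMilliseconds
'ForEach-Object: {0:N0} ms' -f $PipeTime.TotalMilliseconds
```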




Examples

It works like this:

PS C:\> 1..10 | Split-Collection -Count 2 | %{ $_ -join ', ' }
1, 2
3, 4
5, 6
7, 8
9, 10

PS C:\> (1..10 | Split-Collection -Count 2 | %{ $_ -join ', ' }).Count
5

PS C:\> 1..10 | Split-Collection -Count 5 | %{ $_ -join ', ' }
1, 2, 3, 4, 5
6, 7, 8, 9, 10

PS C:\> (1..10 | Split-Collection -Count 5 | %{ $_ -join ', ' }).Count
2

And you also get the remainder (requires its own logic...) if it doesn't divide evenly:

PS C:\> 1..10 | Split-Collection -Count 3 | %{ $_ -join ', ' }
1, 2, 3
4, 5, 6
7, 8, 9
10

PS C:\> (1..10 | Split-Collection -Count 3 | %{ $_ -join ', ' }).Count
4
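The chunk counts shown above follow from simple arithmetic, sketched here with illustrative variable names that are not part of the cmdlet: the number of chunks is the total divided by -Count, rounded up, and the last chunk holds whatever is left over.

```powershell
$Total = 10   # number of elements in the input
$Count = 3    # requested chunk size

# Number of chunks: total divided by chunk size, rounded up.
$Chunks = [int][math]::Ceiling($Total / $Count)

# Size of the final chunk: the remainder, or a full chunk if it divides evenly.
$Remainder = $Total % $Count
$LastChunkSize = if ($Remainder -eq 0) { $Count } else { $Remainder }

"$Chunks chunks, the last one with $LastChunkSize element(s)"
```

For 10 elements and a chunk size of 3 this gives 4 chunks with 1 element in the last one, matching the output above.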

Here's an example where I dot-source Split-CollectionParam.ps1 ("the second version") and pass in a collection as a parameter.

PS D:\> . D:\Dropbox\PowerShell\Split-CollectionParam.ps1

PS D:\> Split-Collection -Collection (1..10) -Count 5 | %{ $_ -join ', ' }
1, 2, 3, 4, 5
6, 7, 8, 9, 10

PS D:\> (Split-Collection -Collection (1..10) -Count 5 | %{ $_ -join ', ' }).Count
2


Code

Split-CollectionParam.ps1.txt

function Split-Collection {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline=$true,ValueFromPipelineByPropertyName=$true)] $Collection,
        # Upper bound is [int]::MaxValue (2147483647).
        [Parameter(Mandatory=$true)][ValidateRange(1, 2147483647)][int] $Count)
    begin {
        $Ctr = 0
        $Array = @()
        $TempArray = @()
    }
    process {
        foreach ($e in $Collection) {
            if (++$Ctr -eq $Count) {
                $Ctr = 0
                $Array += , @($TempArray + $e)
                $TempArray = @()
                continue
            }
            $TempArray += $e
        }
    }
    end {
        # Test the element count: a one-element remainder such as @(0) or @('') is falsy.
        if ($TempArray.Count -gt 0) { $Array += , $TempArray }
        $Array
    }
}

Split-Collection.ps1.txt

function Split-Collection {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline=$true)] $Collection,
        # Upper bound is [int]::MaxValue (2147483647).
        [Parameter(Mandatory=$true)][ValidateRange(1, 2147483647)][int] $Count)
    begin {
        $Ctr = 0
        $Arrays = @()
        $TempArray = @()
    }
    process {
        if (++$Ctr -eq $Count) {
            $Ctr = 0
            $Arrays += , @($TempArray + $_)
            $TempArray = @()
            return
        }
        $TempArray += $_
    }
    end {
        # Test the element count: a one-element remainder such as @(0) or @('') is falsy.
        if ($TempArray.Count -gt 0) { $Arrays += , $TempArray }
        $Arrays
    }
}
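
A note on detecting the leftover chunk in the end block: testing an array directly in an if statement depends on its contents, not just on whether it is empty. The following sketch (variable names are illustrative only) shows why comparing the element count is the more robust check when the remainder might be a single 0, an empty string or $false.

```powershell
$EmptyArray = @()
$SingleZero = @(0)
$SingleOne  = @(1)

[bool]$EmptyArray    # False
[bool]$SingleZero    # False, although the array holds one element
[bool]$SingleOne     # True

# Comparing the element count does not depend on the contents.
$SingleZero.Count -gt 0    # True
```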

Download

2014-01-11: Put up two versions and documented briefly.

2014-01-11: Renamed the parameter -ChunkSize to -Count.