Swift’s Lazy Collections and Sequences

Warning: this article is out of date as of Swift 1.0 beta 4. Read this updated version instead.

The Swift non-member function map acts much like the Array.map member function, except you can pass in any kind of sequence or collection (click here for an article on how to write an algorithm like this).

But if you look at what map returns, it isn’t an array of results. It’s an object (a MapColectionView or a MapSequenceView).

That’s because map evaluates its results lazily. When you call map, no mapping takes place. Instead, the input and mapping function are stored for later, and values are only mapped as and when they are needed.

To see this happening, run the following code:

let r = 1...3
let mapped = map(r) {
    (i: Int) -> Int in
    println("mapping (i)")
    return i*2


for i in mapped {
    println("mapped (i)")


println("index 2 = (mapped[2])")

// prints out
mapping 1
mapped 2
mapping 2
mapped 4
mapping 3
mapped 6
mapping 2
index 2 = 4

The returned object from map mimics the properties from the object passed in. If you pass in a sequence, then you get back a sequence. If the collection you passed in only supported a bidirectional index (such as a Dictionary or a String, which can’t be indexed randomly), then you won’t be able to subscript arbitrarily into the mapped collection, you’ll have to walk through it.

map is not the only lazy function. filter and reverse also return these “view” classes. Combining these different lazily evaluated classes means you can build up quite complicated expressions without sacrificing efficiency.

Want to search backwards through an array or string? find(reverse(myarray), value) won't actually reverse the array to search through it, it just iterates over the ReverseView class, which serves up the array in reverse order.

Want the first 3 odd numbers that are a minimum of two arrays at each position? first_n(filter(map(Zip2(a1, a2), min) {$0%2 == 1}, 3) wont do any more zipping or minning or modding than necessary.123

There are also several lazily generated sequences you can also use:

enumerate numbers each item in a collection — it returns an EnumerateGenerator, that serves up those numbered elements on demand. This is useful in a for (count, val) in enumerate(seq) loop where you want to know both the value and how many values so far.

PermutationGenerator is initialized with a collection and a collection of indices into that collection, and serves up each element at each index in that order (thus allowing you to lazily reorder any collection – for example, PermutationGenerator(elements: a, indices: reverse(a.startIndex..<a.endIndex)) would be the reverse of a.4

GeneratorOf just takes a closure and runs it each time to generate the next value. So var i = 0;var naturals = GeneratorOf({ ++i }) sets naturals to a sequence that counts up from 1 to infinity (or rather, to overflow).

These infinite virtual sequences can be pretty useful. For example, to do the same thing as enumerate, you could write Zip2(naturals, somecollection). Or you could pass your infinite sequence into map or filter to generate further inifinite sequences.

(at the opposite end, if you have a single value, but what a function needs is a Sequence, you can use GeneratorOfOne(value) to create a quick sequence that will just serve up just your one value)

It's not all rainbows and unicorn giggles though. Imagine you did the following:

let r = 1...10_000
mapped = map(r) { megaExpensiveOp($0) }

let a1 = mapped
let a2 = mapped

Here the lazy evaluation will work against you. megaExpensiveOp will be run not ten but twenty thousand times.

“Shouldn't map cache the data?” you ask. Well that leads to the next complication. Take this code:

var i =0
let mapped = map(1...5) {
    i += 1
    return $0 + i

Every time you iterate over mapped now, you'll get different results.5

This behaviour might be put to very good use (say you wanted to peturb a data set with small random values). But if you weren't expecting it, that could be one nasty bug to track down.

So its good to know when you're using these functions what they are actually doing under the hood. Just keep in mind these caveats and an exciting life of lazy evaluation awaits you in the off-world colonies.

  1. OK I cheated, first_n isn't a Swift standard library function. But it should be! 
  2. This code would look a lot nicer if Swift came with a pipe operator 
  3. Yeah, this is a pretty rubbish example, I should have put the time in to think of a better real-life use case. 
  4. Obviously, reverse(a) would work better, but a more useful example is tough to fit on one line… 
  5. If you're unclear on how come, try reading this article

Changes to the Swift Standard Library in Beta 3

Xcode Beta 3 came out today, and with it a new version of the Swift standard library.

By far the majority of changes are to account for the publicized updates to the language:

  • replacing Type[] with [Type], and making use of the new [Key:Val] sugar for dictionaries
  • switching uncheckedArithOp for arithOpWithOverflow in all the numeric types
  • adding explicit bigEndian and littleEndian initializers to the integers
  • replacing succ and pred with successor and predecessor1
  • adding sorted as an operation on arrays and as a standalone algo, since sort now sorts in-place2

copy and unshare have been removed from Array, since they’re redundant now arrays have full value semantics. Assigning an array to another variable copies it. Or rather, it lazily copies it – the array will only by physically copied when you alter one of the elements. So really, the equivalent of unshare is called automatically when any value in the copied array or the copy is altered. You can still detect when this happens with the === operator, which returns true if two arrays are pointing to the same underlying store:

var a = [1, 2, 3, 4]
let b = a  // 'a' is "copied"
// but really, it's still the same
a === b    // returns true
a[2] = 2   // until you alter 'a'
// and now they're different
a === b    // returns false

This is really important to understand, in case you start to panic about all the unnecessary copying that might happen when you pass an array into a function as a parameter.

There are also a handful of other changes/additions that aren’t mentioned in the release notes.

Range now supports map as a member, so you can write (1...5).map { $0 * 2 } much like with an Array, rather than map(1...5) { $0 * 2 }. Note, Range.map returns an Array, unlike the map algorithm, which returns a lazily-computed view class.

Lots more types (such as Range) now have a getMirror() method.

min and max, which previously took two or more parameters (the “more” part being a variadic parameter) now have an overload with just two parameters. This means you can pass them as an argument to another function that expects a binary operator. So the following now compiles where previously it would throw an error:3

let a = [5, 3, 1, 7, 9]
let b = [2, 4, 1, 1, 7]

// now you can write this:
let mins = map(Zip2(a, b), min)

// instead of being forced to write this:
let mins = map(Zip2(a, b)) { min($0, $1) }

Finally, the already mysterious insertionSort got a bit more mysterious.

Previously there were two versions, both of which required a collection with a bidirectional index. One just required that the collection contents be comparable. The other allowed the caller to supply a predicate to compare elements with. That’s a pretty standard pattern, but oddly that predicate is declared inout. Why a sort algorithm needs to be able to change the caller’s comparison function I do not know, but it can!4

The new insertionSort that doesn’t take a predicate now requires a random-access index, rather than a bidirectional index. The version that takes the predicate still only requires a bidirectional index. It’s predicate is still inout though.

There are a few other changes, relating to unicode character processing and Objective-C interfacing, that I either don’t feel qualified to comment on or haven’t delved into. Feel free to mention antyhing interesting about them in the comments.

  1. Bit odd, that, I preferred the shorter versions. 
  2. I really hate they did this to the non-member version, as it means you can’t chain it. 
  3. If you don’t follow what this does – Zip2 takes two sequences and “zips” them together, producing a new sequence that has pairs of elements from the same point in both sequences. map then takes each element of this sequence and applies a function to each one. That function, min, takes the pairs and returns the minimum. So the whole expression returns a sequence of the minimum of each position in either array. 
  4. It does mean you can’t call it with a trailing closure expression. You have to create a variable and point that at a function and then pass that variable in. That can’t be the reason, can it?