Protocols and Assumptions

edit: subsequent to this article being written, the Swift standard library has been updated, and documentation-comments above the relevant methods of RangeReplaceableCollectionType now explicitly state: “Invalidates all indices with respect to self.” Bear this in mind as you read on:

What does it mean to implement a protocol? Possibly more than supporting methods with specific names.

Obviously the methods ought to do what their names imply – isEmpty shouldn’t fire the torpedoes. But as well as basic functionality, there are also guarantees about things like the method’s complexity, and possibly wider implications for how the class itself behaves.

More importantly, protocols might not guarantee a behaviour you think they do and are relying on. Using protocols with generics can sometimes give you the illusion of more guarantees than you actually have.

Suppose you want to write a function remove that removes entries from a collection in-place – that is, similar to the sort function, it takes a collection as an inout parameter, and removes from it all the entries that match a given value. 1

Unlike sort, which just requires MutableCollection, remove would need to use the new-in-beta6 RangeReplaceableCollectionType, which includes a removeAtIndex method. Armed with this, you might write the following: 2

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, value: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == value {
                collection.removeAtIndex(idx)
            }
            else {
                ++idx
            }
        }
}

Embedded in this code are a lot of assumptions about how collections and their indices behave when you mutate them, and some of these might not be valid, depending on the collection type. It works with String, the only explicitly range-replaceable collection in the standard library currently, as well as the secretly range-replaceable Array, ContiguousArray and Slice. But you could easily implement a new type of collection that was range-replaceable for which the above code would explode in flames.

The biggest assumption is that removing an element from a collection does not completely invalidate an index. That is, after you call collection.removeAtIndex(idx), idx remains a legitimate index into the collection rather than just becoming junk.

Next, there’s the assumption that when you remove a entry at an index, that index will now point to the next element in the collection. That’s why, after removing the entry, the code above just goes straight back around the loop without incrementing idx. You could put it another way – when you remove an element, the next element “moves down” to the position indexed by the removed element.

Finally, there’s the assumption that if the element that idx points to is the last element, then what idx will point to after you remove it will be endIndex. Or, to put it another way, as you remove the last element, endIndex “moves down” to equal the index of the last element.

By the way, this last assumption is why the code uses while idx != collection.endIndex rather than the more-elegant for idx in indices(collection). indices returns a Range object between the collection’s start and end, but it would be created before we start looping. Because endIndex is a moving target as we remove some entries from the collection, it won’t work for our purposes. A cleverer version of indices that returned something more dynamic might help, but that could have different undesirable side-effects.

Are all these assumptions legit? Well you can see they obviously are for the Array types, because these just use an integer for their index, with endIndex equal to count. When elements are removed, the rest of the array shuffles down. Even if that resulted in a full reallocation of the array’s memory, the assumptions would still hold because all the index does is represent a relative offset from the start of the array, not point to a specific location.

Strings are trickier, because their index type is opaque. Chances are it’s still an offset from the start of the string, but because Swift characters aren’t of uniform width, 3 that offset doesn’t necessarily increment by a constant amount each time. Still, if this is the implementation, the above assumptions would hold, and experimentation suggests they do.

What kind of collection might not adhere to these assumptions? Well, a very simple doubly-linked list implemention might not. 4 If the index for a linked list were a pointer to each node, then removing that node could leave the index pointing at a removed entry. You couldn’t just loop around without first pointing to the next node in the list:

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, element: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == element {
              // first grab the index of the next element
              let next = idx.successor()
              // then remove this one
              collection.removeAtIndex(idx)
              // and repoint
              idx = next
            }
            else {
                ++idx
            }
        }
}

But then this algorithm would no longer work correctly with arrays and strings! 5

So what’s the solution – should RangeReplaceableCollectionType mandate the kind of index validity behaviour our remove algorithm relies on? Or are the assumptions invalid and we need a better algorithm? (of which more in a later article) The Swift standard library is still evolving rapidly with each beta so it’s possibly a little early to tell. For now, be careful about the assumptions you make – just because all the current implementations of a particular protocol work a certain way doesn’t mean other implementations will.


  1. As opposed to a version that returned a copy, which would be called removed. I thought I didn’t like this convention at first, but I’m warming to it. 
  2. This code has an efficiency deficiency, which we’ll talk about in a later article. 
  3. For an in-depth explanation of Swift strings, see Ole Begemann’s article
  4. Singly-linked lists couldn’t implement removeAtIndex easily, and would probably have some kind of removeAfterIndex operation instead. 
  5. The C++ STL resolves this by having erase return an iterator for the entry just after the erased elements – along with a fairly draconian assertion that any removal invalidates all other iterators (not just ones at and beyond the removed value). But Swift’s removeAtIndex currently returns the removed value, rather than a new index. 

The case against making Array subscript return an optional

I got a lot of interesting feedback from my previous article, regarding the proposal to change array subscripts to return optionals. Some pro, some anti.

The pro came mostly from people who’d been burned by out of bounds exceptions often and wanted Swift to change in a way that would help them not crash.

The anti camp was obviously not pro crashing, but felt that using optionals was too strict – that it would end up being counterproductive because in fighting with the optionals, developers would likely introduce as many bugs as they eliminated. Their arguments are pretty convincing.

A question tied up in this is, what are optionals for? Should nil be rare like an exception, or commonplace? Should you avoid using optionals unless you absolutely have to, or use them often, to represent things like unknown information? For a good discussion on how they should be rare, see this article by @nomothetis.

There’s definitely a lot of optional mis-use out there, in sample code on the internet – optionals for delayed initialization of members (where lazy stored properties would be better), or for representing collections that might not have contents (rather than just returning an empty collection).

For guidance, we could look at the Swift standard library and how it uses optionals:

  • In sequences, optionals are routine and informational – nil means you’ve reached the end of the sequence.

  • In the case of String.toInt(), its more to return an error – you passed in a bad string. Using toInt() to detect whether a string is an integer feels iffy.

  • Array.first seems in-between – is getting back nil an error or information? If you were expecting your array to always have values, the former. If you’re writing something generic that needs to handle empty arrays, it’s the latter, a convenience for combining the check for empty and getting the value.

  • In find, nil means not found, because maybe the collection doesn’t contain that value. This isn’t an error. But there was an alternative – it could have returned Collection.endIndex, which is what the C++ find does. The optional forces you to check, whereas with endIndex you could forget/not bother.

Array subscripts are closest to the last example. You’ve got an index, but maybe it’s out of bounds, so you should be encouraged to check before you use it. So it seems like a good fit.

The problem is when it becomes annoying without benefit. Take this code:

for idx in indices(a) {
    // this is dumb, of course it has a value!
    if let val = a[idx] {
        // do something
    }
}

Faced with the above case too often, developers would probably start to get unwrap fatigue. They inspect the code, see that there’s no way the index could not be valid (there’s no index arithmetic going on there, just use of an index that is guaranteed to be within bounds), and just force unwrap instead.

This is a slippery slope. Once you start doing that, you do it all the time. Not just with guaranteed-safe cases but others that aren’t, and one time you use a closed rather than half-open range and bang, run-time error. Only it’s not a array bounds error, it’s a force-unwrap nil error. So we’re back to where we started, only with less helpful debug info and a bunch of exclamation marks all over our code.

So a better solution is needed – one that stays out of the way when doing things that can’t go wrong, but helps with handling the cases that can, like at the edges of ranges or when performing arithmetic on indices.

At this point, I’m pinning my hopes on the second option in my original article – a kind of index type that could never not point to a value in the collection. But implementing this is tricky – especially if it might involve changing the index type protocols, which would break everything that builds on top of them.

In the mean-time, if you feel your code would be better off with an optional array index function, you can of course extend array to provide it. The useful swiftz library has one, safeIndex, that you could re-use.

Null Pointer Exceptions fixed, next up…

Edit: there is a follow-up to this post, giving the case against this idea, which you should read after this one.

My not-statistically-proven assertion (after working in the LOB development mines for years) is that Null Pointer Exception is the #1 cause of crashes for apps written in memory-managed languages.1

The hope is, by introducing optionals, that this kind of error get pushed way down the leaderboard in Swift. Sure you can still force-unwrap a nil value. But you have to load the gun, cock it, and then point it at your foot. I love that the force-unwrap operator is an exclamation mark.

Who is the aspiring hopeful cause of crashes, eyeing that vacated top spot? I’m guessing the Array Out of Bounds exception.

This one is still very much at large in Swift. Int is the index for arrays, and there’s nothing stopping you blatting straight past the end, getting a runtime assertion or possibly even scribbling into memory depending on what you’re doing and how you’re compiling.

This is really just a step up from pointers.2 There’s just too much leeway for unreformed C programmers to write the same crappy old code:

// But it worked when I tested it!  
// (with an odd-sized array)
func reverseInPlace<T>(inout a: Array<T>) {
    var start = 0
    var end = a.count - 1
    while start != end {
        swap(&a[start], &a[end])
        start++; end--
    }
}

It doesn't have to be this way. It's totally fixable, just like null pointers were. There are two good options, an easy one and a (slightly) harder one.

The easy one

Make Array.subscript return an optional. Only if your index is within bounds will it return a value, otherwise it'll return nil. This is like the approach Dictionary takes with lookups. If your dictionary doesn't contain a particular key, you get a nil.

“But that's because dictionaries are different” you say. No they aren't. Dictionary has a method to check if a key is present. You could call that first and then, if it is, get the value. But people don't want to do that, so they skip straight to the getting out the value part, and because that's a bit risky, it returns nil if the key isn't there. Likewise, Array has methods for checking if the array is empty or if an index is beyond the end. You don't have to bother checking for that, but if you mess up, boom!

The most common case is probably getting the first element with a[0]. So common is this that beta 5 introduced Array.first. Which returns… an optional. If that doesn't convince you, I don't know what will.

I expect this change would elicit some moaning. Similar to the complaints about how you can't index randomly into the middle of a Swift string. But as with the string case, the question really is, how often do you need to do this anyway? Use for...in, use find, use first and last. Don't hack at your array like a weed.

Developers who hate it and want the old behaviour could just stick a ! after their subscript access and it'd be back to the way it was. Presumably these people are compiling -Ounchecked. Let's see whether their users find the snappy performance makes up for the random crashing.

The harder one

Stop using Int for indexing into random-access collections. Use a real index object.

Again, Dictionary already does this. To index over/into a dictionary you have to get a DictionaryIndex object, which has no public initializers so can only be fetched via methods on Dictionary.

If an index type is random-access (which Dictionary’s isn't), then they can be compared, added together, advanced by numeric amounts in O(n), just like an integer can.

To allow arrays to return a non-optional value when subscripted with this index, you'd need to tweak the index object to have successor() etc return an optional, with nil for if it's gone beyond the collection's bounds. This way, the index can be guaranteed to point to a valid entry.3

Again, this would make them more of a pain to use, in exchange for safety. But the other downside to this approach is to implement this, indices would need to know about the specific collection they index. Which means they'd need a reference to it, which introduces other complications like reference cycles and an increased size. Not a deal-breaker though.

Indices knowing about their container would have the side benefit of allowing them to become dereferenceable (see a previous post for more on this).

It would also allow them to guard against the following, which compiles and runs but is presumably a bad idea:4

let d1 = [1:1, 2:2, 3:3]
let d2 = [7:7, 2:2, 1:1, 4:4]

// get an index from d1
let idx = d1.startIndex.successor()
// and use it on d2...
d2[idx]  // returns (2,2)

No need to pick one

These two approaches are not mutually exclusive. Array could provide both an Int subscript and an index one – the former giving back an optional, the latter not.

I like that idea. The index option has downsides though. There's also a compromise, where the Int version is checked, but the index object version is unchecked, on the basis that if you're using indices you're thinking a bit more carefully about what you're doing.

But here's hoping at least the first option makes it into a subsequent beta before the current implementation's set in stone.

If there's a reason I'm missing that means these schemes are hopelessly naive, let me know at @airspeedswift on twitter.


  1. In non-managed code, the number one cause of crashes is crap, I still haven’t found the problem, where did my whole day go? 
  2. Obviously that’s a long step. You can’t do in Swift my favourite silly thing to do in C: char s[] = "hello"; int i = 3; cout<<i[s]<<endl; 
  3. All this is assuming the indices are into collections that aren't changing underneath them. Indexing mutating collections is a whole different ball-game. 
  4. Maybe it's ok? If the index is to a key common to both, it works. If it's an index to a key not in d2, you get an invalid index assertion. Still probably a bad idea. 

Jumping to conclusions

In an article yesterday, I mentioned that I wasn’t having much luck extending Dictionary to give it an initializer that took a sequence of tuples, because my attempt wasn’t compiling, even though I was sure that it should:

extension Dictionary {
    init<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        self.init()
        for (k,v) in seq {
          self[k] = v
        }
     }
}

Looked like the problem was with destructuring a typealiased tuple type into its consitutents. Like a good little beta tester, I settled in this morning with a fresh pot of coffee to file a bug report. My plan was to reproduce the problem without the Dictionary part, to try and clarify exactly where the issue lay. I tried writing a couple of classes that used typealiased tuples, one taking two generic placeholders and typealiasing the pair of them as a tuple, and then writing a function on one that took the other.

After about 20 minutes plugging away at this, I was still unable to reproduce it independently, and was geting quite frustrated. Then I realized hang on, what made me so sure this was a problem with tuples?1 I pushed aside my failed attempt and just tried compiling this, and got the exact same error:

extension Array {
    func someFunc<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        for e in seq {

        }
     }
}

Well, shit. That’ll teach me to overthink a problem. I scold myself for not applying Occam’s razor and assuming the cause was more complicated than it was.2 Once I stopped obsessing over the tuple part, the next step in fixing my code became obvious. for ... in is just shorthand for creating a generator and then looping while next() returns non-nil elements. So I tried this:

extension Array {
    func someFunc<S: Sequence
      where S.GeneratorType.Element == Element>
      (seq: S) {
        var gen = seq.generate()
        let elem = gen.next()
    }
}

Bingo! The call to gen.next() reports Cannot convert the expression's type 'Self.Element? to type 'Self.Element?'. Somehow type inference is borked. But if I qualify the declaration of elem as elem: Element? it works fine.

So now I know how to work around it, I can finally write my sequence initializer for dictionary.

But first: @rbrockerhoff replied yesterday with a gist that was very close to the Sequence initializer (it took MapCollectionView as an argument rather than the sequence directly, which somehow avoids the type inference issue in a similar same way that my version that took an Array did).

As part of this, he did something very smart, which is to separate out inserting the sequence into the dictionary into another method, and then calling it from init instead of having the loop in init itself. This is a great example of solving something generically first, and then specializing your generic solution to the immediate problem. So here’s the full solution:

extension Dictionary {
    init<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        self.init()
        self.merge(seq)
    }

    mutating func merge<S: Sequence
      where S.GeneratorType.Element == Element>
      (seq: S) {
        var gen = seq.generate()
        while let (k: KeyType, v: ValueType) = gen.next() {
            self[k] = v
        }
    }
}

let pairs = enumerate(["zero","one", "two"])
let d: [Int:String] = Dictionary(pairs)
d[1] // returns {Some "one"}... yes!

Just one wrinkle – you still have to give the dictionary variable a type, it won’t infer it. I’m not sure if this is a bug in my code, a Swift bug, or a feature not a bug. If you know, let me know.


  1. I was possibly nudged towards the tuple assumption because of another bug in playgrounds with mapping arrays of tuples. 
  2. I did the same thing yesterday when I convinced myself the only way to write my own Dictionary.init() was to create another dictionary and assign it to self, before a helpful comment from @NachoSoto suggested I just try self.init(). Duh. 

Extending the Swift language is cool, but be aware of what you are using

Features like the @auto_closure keyword in Swift offer a great opportunity to extend the language in ways that might not be possible in other languages. For example, in this article we used it to implement the Ruby ||= operator.

Swift doesn’t yet have a native implementation of unless or until. These are just versions of if or while, but for when the predicate is not true. Obviously you could just stick a ! in front of your if clause, but some programmers prefer the readability of the opposing versions. The implementation of ||= could be rewritten as:

@assignment func ||=<T>(inout lhs: T?, rhs: @auto_closure () -> T) {
    unless(lhs) {
        lhs = rhs()
    }
}

I know, you think, I can implement unless and until myself! What’s nice is you can make them look almost like the native flow control statements, because closure expressions can be outside the parentheses of function calls if they are the last argument.1 Implementing unless seems pretty straightforward:

func unless<L: LogicValue>(pred: L, block: ()->()) {
    if !pred {
        block()
    }
}

// doSomething only if condition is false
unless(condition) {
      doSomething()
}

until is a little trickier, because you need to execute the predicate multiple times as you loop. But @auto_closure can help you out.

func until<L: LogicValue>(pred: @auto_closure ()->L, block: ()->()) {
    while !pred() {
        block()
    }
}

// doSomething until condition becomes true
until(condition) {
    doSomething()
}

Now, the conditional expression passed as the first parameter to until will be automatically wrapped up into a closure expression and can be called each time around the loop. Any variable used in both the parentheses and the block will be captured by both, so altering it in the block will affect the result of the condition (hence the condition can become true over time).

Armed with your new unless and until functions, you write a function to search a collection for the index to the first occurrence that doesn’t match a predicate:2

func findIfNot
  <C: Collection, L: LogicValue>
  (col: C, pred: (C.GeneratorType.Element) -> L)
  -> C.IndexType? {

    var idx = col.startIndex
    until(idx == col.endIndex) {
        unless(pred(col[idx])) { return idx }
        idx = idx.succ()
    }
    return nil
}

let a = [1, 2, 3]
let isOdd = { $0%2 == 1}
// find the first non-odd number
let idx = findIfNot(a, isOdd)

Arguments about SESE and other stylistic preferences aside, this function should do the job. Except instead it generates a compiler error, 'C.IndexType' is not convertible to '()'.

Why? Because the return after the unless is a return from that closure expression between the braces, not a return from the findIfNot function. That unless closure expects no value to be returned, so when you return idx it’s an error. But if you were just blithely using this unless function you found in a library as if it were just like an if statement, you might not realize this and the compiler error may come as a shock.

Now say you instead implemented findIfNot to take in an index into the collection as an inout, and advanced that index to the first non-match, instead of returning a value (and returning the endIndex if every item matches). This is pretty bad code, but here it is:

func findIfNot
  <C: Collection, L: LogicValue>
  (col: C, inout idx: C.IndexType,
  pred: (C.GeneratorType.Element) -> L) {

    until(idx == col.endIndex) {
        unless(pred(col[idx])) { return }
        idx = idx.succ()
    }
}

let a = [1, 2, 3]
let isOdd = { $0%2 == 1}
var idx = a.startIndex 
findIfNot(a, &idx, isOdd)
// idx will not be what you would hope 

Now, no compiler error. Instead, this code will fail much more subtly, always returning as if no non-match was found. If we replace the unless with an if, the code will run forever because the until block will return before moving idx forward, and then loop again.

Unit tests should catch this of course. But as a user of built-in-like functions, you should always be aware they aren’t quite the part of the language they might seem. The bugs resulting might not be as obvious as in this case. Also, this is a good argument for using a more “functional“ approach, by writing general algorithms like findIfNot rarely and testing them thoroughly, and then reusing them as much as possible, rather than writing explicit loops.

Should Apple extend Swift to cover this kind of case? Maybe a keyword that causes a closure that doesn’t return a value to be able to pass the return statement out and act on the calling function?3 Until they do, this is definitely a gotcha to be aware of.


  1. Sadly your implementions will always be lacking one feature that if or while have – leaving off the brackets around the predicate. You can’t write unless so it can be used as if pred { } – unless Apple extends “if the last parameter is a closure expression you can have it outside the parens” to be “if there are two parameters and the last is a closure expression, you can leave the parens off the first parameter”. 
  2. If you don’t follow what the code here is doing, try reading this introduction to algorithms on collections in Swift. 
  3. If I’ve missed something huge here and there’s already a way around this issue, leave a comment! 

Swift’s for loops avoid an easy mistake with variable capture

This article from a while back points out “the biggest mistake everyone makes with closures” in Ruby, Python, Perl and Javascript.1

Try running this code in Ruby, which counts i up from 1 to 3 creating a closure each time which returns the value of i:

a = []
for i in 1..3
  a.push(lambda { i })
end

for f in a
  print "#{f.call()} "
end

and it prints out the possibly surprising value 3 3 3.

Using more “functional” constructs doesn’t always save you. This Python also prints all 3s:2

a = [lambda: i for i in xrange(1, 4)]
for a in f:
  print f()

The article goes on to suggest some workarounds.

Here is the equivalent in Swift:

var a: Array<()->Int> = []

for i in 1...3 {
  a.append( { i })
}

for f in a {
  println("\(f()) ")
}

and happily, this prints 1 2 3.

Note, this also works with references to objects, so if you subsituted a for over a sequence of objects instead of Ints in the code above, each closure would capture a reference to a different object.

Be careful though, this applies to for loops but not to other loops like while or the C-style for(;;), so:

var i: Int = 0
while(++i <= 3) {
    a.append( { i })
}

for f in a {
    print("\(f()) ")
}

prints out 3 3s.

Why is this? Is this a special hack to make for...in behave less surprisingly? To understand why it does this, you need to realize that this:

for i in 1...3 {
  a.append( { i })
}

is just a shorthand equivalent to writing this code using the source sequence’s generator:

var g = (1...3).generator()
while let i = g.next()
  a.append( { i })
}

The answer to our question lies in the let i =, which declares a fresh variable i for each iteration of the loop, hence if you capture i, it will retain its value at the end of the block. If you use this form of while loop, you’ll get the same results as with the for...in version.


  1. Go here for an epic list of how other languages handle this issue. 
  2. The Ruby each will do something similar in version 1.8, though it was altered in 1.9 to behave differently which is somehow more scary. 

Default Parameters in Swift – Dynamically or Statically Bound?

edit: after this post originally went up, the Swift dev team confirmed on the forums that default parameters should be dynamically bound. However, as of Swift 1.1, they’re still statically bound.

Quick quiz. What does the following code do?

class Shape {
  func draw(colour: String = "Red") { 
    println("\(colour) shape")
  }
}

let s1 = Shape()
s1.draw()

It prints Red shape, right? No argument supplied, so the default value of Red is used.

Ok what about this?

class Circle: Shape { 
  override func draw(colour: String = "Blue") { 
    println("\(colour) circle") 
  } 
}

let s2 = Circle()
s2.draw()

It should Blue circle. No surprises there. Again no argument, but the overridden function defaults to a different color and prints a different shape.

Ok finally, what about this?

let s3: Shape = Circle()
s3.draw()

Blue circle again, right? s3‘s static type is the base class, but it points to an inherited class. Polymorphism means the overridden function runs, so it should be just like the previous example.

If Swift bound default parameters dynamically, sure. But as of Swift 1.1, it behaves like C++,1 and you actually get Red circle. This is because parameters are bound statically not dynamically. Similarly, which overloaded function is chosen is also done statically – as seen in our series on Swift overload resolution

Why is C++ like this? Because default arguments are determined at compile time, not at run time. So while the correct function is applied, the default argument is looked up via the static type of s3, which is a Shape not a Circle. This could lead to some quite surprising behavior, which is why this is Item 37 of the Effective C++ book, which has more info on why it is this way (it’s related to run-time performance) and some workarounds to avoid it.


  1. I wrote the same code in C++ to double-check this and, holy cow, writing some simple C++ was painful after a week of Swift.