Generic Collections, SubSequences and Overloading

So, you’ve figured out the whole indices versus distances situation, and now you’re feeling pretty happy and you start trying out some more generic collection algorithms. And before long you smack your head against one of the more confusing things about generic collections: a sequence’s SubSequence doesn’t necessarily contain the same element type as that sequence. This usually manifests itself in one of those compiler errors that has you yelling “what are you talking about you infernal machine” at the compiler before eventually realizing what you’ve done wrong and sheepishly fixing it.

For example, suppose you want to find the location of the first occurrence of some other sequence within a collection. Like indexOf, but matching a whole sequence rather than a single element. The simple brute-force approach would be to iterate over every subsequence, trying to match it. Something like this – except it won’t compile:

extension CollectionType
  where 
  Generator.Element: Equatable {

    func search<
      Other: SequenceType 
        where 
        Other.Generator.Element == Generator.Element
    >(pattern: Other) -> Index? {

        for idx in self.indices {
            // error: cannot convert value of type 'Other' 
            // to expected argument type...
            if suffixFrom(idx).startsWith(pattern) {
                return idx
            }
        }
        return nil
    }
}

Until you know why, this can be very confusing. You’ve constrained the other sequence to have the same elements as the collection, so why can’t you use startsWith on it?

Except you’re not calling startsWith on Self. You’re calling it on Self.SubSequence, because that’s what suffixFrom returns – it’s a slicing operation. And SubSequence.Generator.Element isn’t guaranteed to be the same as Self.Generator.Element.
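
To make that concrete, here’s a tiny illustration (just a sketch) of the types involved when you slice an array:

let numbers = [1,2,3,4]
let tail = numbers.suffixFrom(1)
// tail is an ArraySlice<Int> – Array's SubSequence – not another Array<Int>.
// In generic code, all the compiler knows is that suffixFrom returns
// Self.SubSequence, and nothing in CollectionType ties
// SubSequence.Generator.Element back to Generator.Element.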

Here’s the relevant declaration:

protocol CollectionType {
    typealias SubSequence : Indexable, SequenceType = Slice<Self>
}

There’s nothing about this definition that forces the elements of SubSequence to be anything specific. Really, this needs a where clause on the typealias – something that Swift doesn’t currently support:1

protocol CollectionType {
    typealias SubSequence : Indexable, SequenceType
       // not (yet?) valid Swift
       where SubSequence.Generator.Element == Generator.Element
}

So, in the meantime, the way around this is to state any such requirements as constraints on the extension itself. A fix that might seem to work, but is over-constraining, would be to require the subsequence to be the same type as the collection:

extension CollectionType
  where
  Generator.Element: Equatable,
  SubSequence == Self {
    // etc
}

This would compile, and it would work fine with types whose slices are the same type as themselves, like strings:

// a slice of a String.CharacterView is
// a String.CharacterView, so this is fine 
"hello".characters.search("ll".characters)

but not for arrays, whose slices are of type ArraySlice, or for arbitrary collections, which get the default Slice implementation.

// "type of expression is ambiguous without more context", apparently
[1,2,3,4].search(2...4)
// "ambiguous reference to member '..<'" uhhh what?
(0..<5).search(2...4)

Instead, you can constrain the elements to be the same while allowing the slice type itself to differ:

extension CollectionType
  where
  Generator.Element: Equatable,
  // guarantee subsequences contain the same element
  SubSequence.Generator.Element == Generator.Element {
    
    func search<
      Other: SequenceType
        where
        Other.Generator.Element == Generator.Element
    >(pattern: Other) -> Index? {
            
            for idx in self.indices {
                // and now this line will happily compile...
                if suffixFrom(idx).startsWith(pattern) {
                    return idx
                }
            }
            return nil
    }
}

[1,2,3,4].search(2...4) // returns 1
(0..<5).search([3,4])   // returns 3

Optimizing for Random Access

It’s often the case that you can optimize a collection algorithm if you know it has random access. For example, you can tweak the above algorithm to avoid searching from positions where the pattern could not possibly fit – so long as both the collection and the pattern are random access, and therefore able to calculate their lengths (and distances between their indices) in constant time. OK, it’s not Boyer-Moore, but it’s better than nothing, and the code to do this is pretty simple – in fact, at this point it’s more constraints than code:

extension CollectionType
  where
  Generator.Element: Equatable,
  Index: RandomAccessIndexType,
  SubSequence.Generator.Element == Generator.Element {
    func search<
      Other: CollectionType
      where
      Other.Index: RandomAccessIndexType,
      Other.Index.Distance == Index.Distance,
      Other.Generator.Element == Generator.Element
    >(pat: Other) -> Index? {
            
        // if pattern is longer, this cannot match, exit early
        guard !isEmpty && pat.count <= count else { return nil }
        
        // otherwise, search from the start up to the 
        // end less the space for the pattern
        for i in startIndex...endIndex.advancedBy(-pat.count) {
            // check if a slice from this point
            // starts with the pattern
            if self.suffixFrom(i).startsWith(pat) {
                // if it does, we've found it
                return i
            }
        }
        // otherwise, not found
        return nil
    }
}

(one part of the where clause, Other.Index.Distance == Index.Distance, is maybe overconstraining, but the alternative would be fiddling around with casts – non-Int distances aren’t exactly common, even Bit’s distance is an Int).

We talked about overload priority a long time ago, and it still applies with protocol extensions. The most specific function wins. Here, if both collections are random access, this version will be called, because it is more tightly constrained than the first version. So:

// calls the version that works for any index
"hello".characters.search("ll".characters)
// calls the version optimized for random-access
[1,2,3,4].search(2...4) // returns 1

Now, there is a potentially nasty bug you could stumble into here, and the compiler won’t save you from it. Try removing the constraints requiring the indices to be random access from the second version. It still compiles! Except now this version will be called for strings as well. But for non-random-access indices it isn’t an optimized version – it’s potentially slower. The reason is that pat.count and endIndex.advancedBy aren’t constant time; they’re linear, because they involve incrementing the index one by one. It could be worse – at least it’s not accidentally quadratic – but other examples easily could be (imagine implementing a sort that assumed a constant-time advance but didn’t get one), so it’s something to watch out for.
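
To see why that bites for strings in particular, here’s a rough sketch of what those two calls cost on a character view (hypothetical strings, purely for illustration):

let needle = "needle".characters
let haystack = "looking for a needle in a haystack".characters
// character views only have bidirectional indices, so both of these
// walk the characters one at a time:
let n = needle.count                          // linear in the pattern's length
let limit = haystack.endIndex.advancedBy(-n)  // n calls to predecessor()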

Protocols, Extensions and Dynamic Dispatch

Once upon a time, it was harder to make this mistake. In earlier versions of the standard library there were two ways to advance an index: a free advance function, which was O(1) for random-access indices and O(n) otherwise, and an advancedBy method that existed only on RandomAccessIndexType.

As part of the great extensionizing, the free advance function went away, and advancedBy moved into ForwardIndexType. So now you can call it on any index type, but you’ll get different performance characteristics depending on the type you call it on.

If you look at the declaration of ForwardIndexType.advancedBy, you’ll see that it’s declared twice. Once inside the protocol:

public protocol ForwardIndexType : _Incrementable {
    func advancedBy(n: Self.Distance) -> Self
}

and then again just below, as an extension:

extension ForwardIndexType {
    public func advancedBy(n: Self.Distance) -> Self
}

This seems a little weird. Why declare it twice? Why not just declare it once as an extension? It has to do with static versus dynamic dispatch, and which actual implementation is called.

To show this, let’s create our own toy index, initially without random access:

struct MyIndex {
    let i: Int
}

extension MyIndex: BidirectionalIndexType {
    func successor() -> MyIndex {
        print("successing!")
        return MyIndex(i: i+1)
    }
    func predecessor() -> MyIndex {
        print("predecessing!")
        return MyIndex(i: i-1)
    }
}

// indices need to be Equatable
func == (lhs: MyIndex, rhs: MyIndex) -> Bool {
    return lhs.i == rhs.i
}

I’ve put prints in there so you can tell when the functions are called. Now, this index type will have the default advancedBy implementation, which just calls successor() multiple times. And because of the print, you can see it happening:

func castHolyHandGrenadeOfAntioch<
  I: ForwardIndexType
>(idx: I) {
    idx.advancedBy(3)
}

let idx = MyIndex(i: 1)
castHolyHandGrenadeOfAntioch(idx)
// prints “successing!” 3 times

OK, now we make the index random access by conforming it to RandomAccessIndexType, giving it advancedBy and distanceTo methods:

extension MyIndex: RandomAccessIndexType {
    func advancedBy(n: Int) -> MyIndex {
        return MyIndex(i: i+n)
    }
    func distanceTo(end: MyIndex) -> Int {
        return end.i - i
    }
}

and now, when we call castHolyHandGrenadeOfAntioch, we see nothing printed. Instead, even though the function only requires a forward index, it uses the better advancedBy implementation of the random-access index.

But… this only works because advancedBy had been declared in the original ForwardIndexType protocol declaration. If, on the other hand, we added another method as an extension to different index types – say, a retreatedBy function – this would not happen.

First, let’s do that for BidirectionalIndexType:

extension BidirectionalIndexType {
    // just an example, you don't have to do this!
    // since BidirectionalIndexType also supports
    // advancedBy(-distance)
    func retreatedBy(n: Distance) -> Self {
        var i = n
        var result = self
        // since Distance isn’t strideable, this
        // is a rare case where you might mourn the
        // loss of C-style for and the ++ operator…
        while i > 0 {
            result = result.predecessor()
            i -= 1
        }
        return result
    }
}

and then, an optimized version for RandomAccessIndexType:

extension RandomAccessIndexType {
    func retreatedBy(n: Distance) -> Self {
        return self.advancedBy(-n)
    }
}

But now, if we repeat our earlier experiment, calling this function instead, we get a different result:

// only requires bidirectional access
func runAway<I: BidirectionalIndexType>(idx: I) {
    idx.retreatedBy(10)
}

// and even though MyIndex is random access
let braveSirRobin = MyIndex(i: 10)

// the bidirectional version is called...
runAway(braveSirRobin)
// prints “predecessing!” 10 times

So we can see the difference between default implementations that are also declared as part of the protocol, and ones just created as extensions. With the former, generic functions get to take advantage of specialized implementations; with the latter, only the implementation matching the generic constraint will be called. This is why you often see functions declared twice in the standard library. For example, SequenceType declares underestimateCount this way, so that if you pass a collection to a function that takes a sequence, the function can check its length without risking consuming the sequence.
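
Here’s roughly how that looks in the generated interface (a sketch from memory rather than the exact source): the requirement is declared in the protocol, and then specialized implementations are supplied as extensions on SequenceType and CollectionType:

public protocol SequenceType {
    // a requirement, so the best implementation is picked dynamically,
    // even from generic code
    func underestimateCount() -> Int
}

extension SequenceType {
    // default for plain sequences: we can't know without consuming elements
    public func underestimateCount() -> Int
}

extension CollectionType {
    // collections can report their count without being consumed
    public func underestimateCount() -> Int
}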


  1. That’s assuming that there isn’t a legitimate reason to vary elements in subsequences which I don’t… think… there is? 

My Talk at Swift Summit

Earlier in the year, I spoke at Swift Summit, and the video has just been posted.

I also added some annotations to the slides with more info, credits and further links, which you can find here on speaker deck.

Swift Summit was a really fun conference to attend with many great speakers, and assuming it happens again next year I’d heartily recommend it.

Protocols and Generics

In a talk last week at Swift Summit, I spoke a bit about Swift and performance, and about how structs and generic functions in Swift allow you to write high-level code without paying a significant performance penalty. This built on the example I posted a couple of weeks back about sorting nibbles in Swift.

Towards the end, I gave a couple of lines of code that seemingly behave the same, but are actually subtly different:

// given a protocol with no Self or associated type requirements, 
// such as Printable (one of the few like this in the standard library)

// how is this...
func f(x: Printable) { }

// different from this...
func g<T: Printable>(x: T) { }

This article is going to look at the differences between the two. Warning – some of the behaviour below is undocumented, and subject to change. All the code has been tested on Swift 1.2 beta 4, but may behave differently in future versions. Also, all size numbers assume a 64-bit platform.

Functional Differences

So what is the difference between the two? In the first case, the function takes an argument with the type of the protocol Printable, and can then access the argument value via any of the methods that protocol provides.

In the second case, g takes an argument of any type T that conforms to Printable. This means T is the type of whatever was passed in as the argument. If you pass in an Int, then T is an Int. If you pass in an array of integers, then T is an [Int].

For most purposes, this distinction doesn’t matter when implementing the body of the function. Even though T might be an Int or an [Int], you can’t use any properties not guaranteed by Printable because the function needs to be valid for any kind of type that conforms to Printable. So you can’t call x.successor() or x.count. Only x.description.
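
For example (a sketch, with a hypothetical function name):

func describe<T: Printable>(x: T) {
    println(x.description)   // fine – guaranteed by Printable
    // println(x.count)      // would not compile, even when T happens to be [Int]
}

describe([1,2,3])   // T is [Int] here, but only Printable's members are usable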

The most important functional difference between the two forms comes when the function returns a value. Suppose you wanted to implement your own version of the standard library’s abs function. It could look something like this:

func myAbs<T: SignedNumberType>(x: T) -> T {
    if x < 0 {
       return -x
    }
    else {
        return x
    }
}

// myAbs works for any kind of signed number 
// (e.g. Int, Int64, Float, Double etc)
let i: Int8 = -4
let j = myAbs(i)
// j will be of type Int8, with value 4

This function relies on three things provided by SignedNumberType: a negation operator, a less-than operator (via SignedNumberType conforming to Comparable), and the ability to create a zero of the same type for comparison (via IntegerLiteralConvertible). It compares to zero and then returns the original value or its negation. Importantly, it returns the same type that was passed in. If you pass in an Int8, you get back an Int8. If you pass in a Double, you get back a Double. If this function were implemented using protocols,1 you’d get back the type of the protocol. Not only would that be inconvenient for type-inference purposes, you’d probably also need to cast the value back to the type you wanted using as! in order to make use of it.
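
Here’s a sketch of that difference using Printable (hypothetical pass-through functions, since Printable has no Self or associated type requirements):

func passThroughProtocol(x: Printable) -> Printable { return x }
func passThroughGeneric<T: Printable>(x: T) -> T { return x }

let a = passThroughProtocol(42)  // a is a Printable; to treat it as an Int again: a as! Int
let b = passThroughGeneric(42)   // b is an Int, no casting needed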

Poking at Protocols

So aside from that, how else do the two functions differ? Well, there are still some ways you can tell the difference between the protocol and the T placeholder. For example, you can look at the size of the value:

func takesProtocol(x: Printable) {
    // this will print "40" every time:
    println(sizeofValue(x))
}

func takesPlaceholder<T: Printable>(x: T) {
    // this will print whatever the size of
    // the argument type is (for example, Int64
    // is 8, Int8 is 1, class references are 8)
    println(sizeofValue(x))
}

takesProtocol(1 as Int16)      // prints 40
takesPlaceholder(1 as Int16)   // prints 2

class MyClass: Printable {
    var description: String { return "MyClass" }
}

takesProtocol(MyClass())     // prints 40
takesPlaceholder(MyClass())  // prints 8

So it looks like Printable is some kind of fixed-size box that holds any kind of value that is printable. This kind of boxing is quite a common feature in other languages like Java and C#. But here even references to classes are put inside this 40-byte box, which might surprise you if you’re used to thinking of protocols as being like references to pure virtual base classes. This Swift box is geared up to hold both value and reference types.

edit: as @norio_nomura points out, class-only protocols are smaller, at 16 bytes, as they never need to hold a larger-sized payload.
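
If you want to check, something like this sketch (with a hypothetical class-only protocol) shows the difference:

protocol ClassOnly: class { }

sizeof(Printable)   // 40
sizeof(ClassOnly)   // 16 – just a class reference plus a witness table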

So what’s inside those 40 bytes? Mike Ash had a great series of posts in his Friday Q&A column about poking around inside Swift’s memory layout. Here we’re just going to use a much shorter bit of code to look into the contents of the protocol variable:

// function to dump out the contents of a protocol variable
func dumpPrintable(var p: Printable) {
    // you could also do this with unsafeBitCast
    withUnsafePointer(&p) { ptr -> Void in
        let intPtr = UnsafePointer<Int>(ptr)
        for i in stride(from: 0, to: (sizeof(Printable)/8), by: 1) {
            println("\(i):\t 0x\(String(intPtr[i], radix: 16))")
        }
    }
}

let i = Int(0xb1ab1ab1a)
dumpPrintable(i)

// prints out:
0:       0xb1ab1ab1a
1:       0x7fff5ad10f48
2:       0x2000000000
3:       0x10507bfa8
4:       0x105074780

With a single 8-byte integer, the protocol appears to pack the value into the protocol value itself. The next two words look like uninitialized memory (their contents vary on each run), and then the last two are pointers to some metadata about the type that is actually being held. (Mike’s article goes into more detail about this metadata).

If you create a custom struct of size 16 or 24, this is also held within the first 3 words of the protocol. But if you go above this, it switches over to holding a pointer to the referenced value:

struct FourInts: Printable {
    var a = 0xaaaa
    var b = 0xbbbb
    var c = 0xcccc
    var d = 0xdddd
    var description: String { return toString(a,b,c,d) }
}

dumpPrintable(FourInts())

// prints out:
0:       0x7f8b5840fb90  // this is a pointer to a FourInts value
1:       0xaaaa          // uninitialized garbage (?)
2:       0xbbbb          // ditto
3:       0x10dde52a8     // metadata
4:       0x10dde51b8     // metadata

How do we know that first word is a pointer to a FourInts value? Well, we can dereference it and see! We need to amend our dumping function slightly, to tell it what type the protocol is actually referencing:

func dumpPrintable<T>(var p: Printable, type: T.Type) {
    withUnsafePointer(&p) { ptr -> Void in
        let intPtr = UnsafePointer<Int>(ptr)
        for i in stride(from: 0, to: (sizeof(Printable)/8), by: 1) {
            println("\(i):\t 0x\(String(intPtr[i], radix: 16))")
        }
        // if the pointed-to value is too big to fit inline:
        if sizeof(T) > 24 {
            // we have an integer, and we want to turn it into a pointer,
            // so we use the bitPattern: constructor of UnsafePointer<T>
            let valPtr = UnsafePointer<T>(bitPattern: Int(intPtr.memory))
            // and now we can look at the value at that location in memory
            println("value at pointer: \(valPtr.memory)")
        }
    }
}

dumpPrintable(FourInts(),FourInts.self)

// prints out:
0:       0x7f8b5840fb90
1:       0x7fff909c5395
2:       0xaaaa
3:       0x10dde52a8
4:       0x10dde51b8
value at pointer: (43690, 48059, 52428, 56797)

Bingo! Those are the values of the 4 integers.

One final point before we move on. Just because you switch to referring to a value type using a protocol, this does not turn it into a reference type:

protocol Incrementable {
    var x: Int { get }
    mutating func inc()
}

struct S: Incrementable {
    var x = 0
    mutating func inc() {
        ++x
    }
}

var p1: Incrementable = S()
var p2 = p1
p1.inc()
p1.x  // now 1
p2.x  // still 0

Performance Implications

OK, so protocols used like this seem to add some level of indirection. Does that cost us anything compared to using the generic placeholder approach? To test this out, we can construct some trivial structs that perform a basic operation, and then attempt to run that operation multiple times via both a protocol and a generic placeholder.

First here’s the protocol:

protocol NumberGeneratorType {
    mutating func generateNumber() -> Int
}

We’re pretty restricted in what can be done without resorting to associated types, so all it does is spit out a number. Here are three implementations that do different things, along with two harnesses that iterate multiple times, totalling the returned numbers:

struct RandomGenerator: NumberGeneratorType {
    func generateNumber() -> Int {
        return Int(arc4random_uniform(10))
    }
}

struct IncrementingGenerator: NumberGeneratorType {
    var n: Int
    init(start: Int) { n = start }
    mutating func generateNumber() -> Int {
        n += 1
        return n
    }
}

struct ConstantGenerator: NumberGeneratorType {
    let n: Int
    init(constant: Int) { n = constant }
    func generateNumber() -> Int {
        return n
    }
}


func generateUsingProtocol(var g: NumberGeneratorType, count: Int) -> Int {
    return reduce(stride(from: 0, to: count, by: 1), 0) { total, _ in
        total &+ g.generateNumber()
    }
}

func generateUsingGeneric<T: NumberGeneratorType>(var g: T, count: Int) -> Int {
    return reduce(stride(from: 0, to: count, by: 1), 0) { total, _ in
        total &+ g.generateNumber()
    }
}

Running the above, compiled with -O, with a count of 10k iterations, gives the following timings:

(the full code with the timing can be found here)

Generic rand          261829µs
Protocol rand         286625µs
Generic incr            5481µs
Protocol incr          45094µs
Generic const              0µs
Protocol const         43666µs

So what does this tell us? Calling arc4random is quite expensive, so the marginal difference made by the protocol is small but still noticeable. But in the case of the incrementing generator, the overhead is much larger in proportion to the actual operation being performed.

And in the case of the constant generator, the compiler has managed to optimize away all the code of the generic version and turn it into a single multiply operation (number of iterations times the constant)! So it takes a constant tiny amount of time. But the protocol indirection acted as a barrier to this same optimization.
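
In other words, the generic constant case ends up being reduced to something like this (a sketch of the idea, not actual compiler output):

func constantGeneratorSum(constant: Int, count: Int) -> Int {
    // the sum of `count` copies of `constant`, as a single multiplication
    return constant &* count
}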

In fact if you recompile with -Ounchecked, the same happens with the incrementing generator too – it’s only the check for overflow on the increment that was preventing it. The protocol versions remain the same.

For the most part, much of this can probably be put in the “premature optimization” camp. The big wins are really more about the expressiveness generics give you, as seen in the previous article about sorting, and avoiding things like the type erasure shown in the earlier section. But if you’re in the habit of writing generic functions, the performance gains are nice too. And when writing reusable library functions like sort or reduce that are going to be called a lot and perform tight loops, they’re critical.

Of course, all this is subject to change – given that the protocols are not being used to get any kind of dynamic behaviour, perhaps the compiler could have optimized them too. But it doesn’t appear to at the moment.

Dynamic Dispatch

Speaking of dynamic behaviour – that is one occasion when you might prefer to use protocols rather than generics.

Consider the following code:

func f(p: Printable) {
    println(p)
}

func g<T: Printable>(t: T) {
    println(t)
}

let a = [1,2,3]
let i = 1

// this will work fine: either can be converted
// to Printable and passed in, then the appropriate
// version of `description` is called at runtime
f(arc4random()%2 == 0 ? a : i)

// this will _not_ compile.  T needs to be fixed as
// either an array of ints, or an int at compile time, 
// it can't be both
g(arc4random()%2 == 0 ? a : i)

A contrived example obviously, but it shows how even structs in Swift can get dynamic behaviour at runtime via protocols – something you might find a use case for.
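
For example, one place this dynamic behaviour comes in handy is heterogeneous collections of otherwise-unrelated value types (illustrative only):

let mixed: [Printable] = [1, "two", 3.0]
for m in mixed {
    // which description gets called is determined at runtime
    println(m.description)
}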

There’s lots more to explore with generics. The mental model of “Swift just writes a version of your function for each different type” isn’t quite how it works in practice. And associated types and constraints between multiple placeholders allow you to create kinds of generic algorithms that would be very difficult to achieve with protocols. But this article is already pretty long so I’ll leave those for another day.
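
As a tiny taster, here’s the kind of thing constraints between two placeholders let you express (a hypothetical helper, written against Swift 1.2’s free functions):

func allContained<
  S: SequenceType, C: SequenceType
  where
  C.Generator.Element: Equatable,
  S.Generator.Element == C.Generator.Element
>(needles: S, haystack: C) -> Bool {
    for needle in needles {
        // the element types are guaranteed to line up, so contains is happy
        if !contains(haystack, needle) { return false }
    }
    return true
}

allContained([2,3], [1,2,3,4])  // true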


  1. In fact you couldn’t implement this function using protocols rather than generics, because SignedNumberType uses Self and (via IntegerLiteralConvertible) associated types. A topic for a different post.