Protocols and Generics

In a talk last week at Swift Summit, I spoke a bit about Swift and performance, and about how structs and generic functions in Swift allow you to write high-level code without paying a significant performance penalty. This built on the example I posted a couple of weeks back about sorting nibbles in Swift.

Towards the end, I gave a couple of lines of code that seemingly behave the same, but are actually subtly different:

// given a protocol with no Self or associated type requirements, 
// such as Printable (one of the few like this in the standard library)

// how is this...
func f(x: Printable) { }

// different from this...
func g<T: Printable>(x: T) { }

This article is going to look at the differences between the two. Warning – some of the behaviour below is undocumented, and subject to change. All the code has been tested on Swift 1.2 beta 4, but may behave differently in future versions. Also, all size numbers assume a 64-bit platform.

Functional Differences

So what is the difference between the two? In the first case, the function takes an argument with the type of the protocol Printable, and can then access the argument value via any of the methods that protocol provides.

In the second case, g takes an argument of any type T that conforms to Printable. This means T is the type of whatever was passed in as the argument. If you pass in an Int, then T is an Int. If you pass in an array of integers, then T is an [Int].

For most purposes, this distinction doesn’t matter when implementing the body of the function. Even though T might be an Int or an [Int], you can’t use any properties not guaranteed by Printable because the function needs to be valid for any kind of type that conforms to Printable. So you can’t call x.successor() or x.count. Only x.description.

The most important functional difference between the two forms comes when the function returns a value. Suppose you wanted to implement your own version of the standard library’s abs function. It could look something like this:

func myAbs<T: SignedNumberType>(x: T) -> T {
    if x < 0 {
       return -x
    else {
        return x

// myAbs works for any kind of signed number 
// (e.g. Int, Int64, Float, Double etc)
let i: Int8 = -4
let j = myAbs(i)
// j will be of type Int8, with value 4

This function relies on 3 things provided by SignedNumberType: a negation operator, a less-than operator (via SignedNumberType conforming to Comparable), and the ability to create a zero of the same type for comparison (via IntegerLiteralConvertible). It compares to zero and then returns the original value or its negation. Importantly, it returns the same type that was passed in. If you pass in an Int8, you get back an Int8. If you passed in a Double, you get back a Double. If this function were implemented using protocols,1 you’d get back the type of the protocol. Not only would that be inconvenient for type inference purposes, you’d also probably need to cast the value back to the type you wanted using as! in order to make use of it.

Poking at Protocols

So aside from that, how else do the two functions differ? Well, there are still some ways you can tell the difference between the protocol and the T placeholder. For example, you can look at the size of the value:

func takesProtocol(x: Printable) {
    // this will print "40" every time:

func takesPlaceholder<T: Printable>(x: T) {
    // this will print whatever the size of
    // the argument type is (for example, Int64
    // is 8, Int8 is 1, class references are 8)

takesProtocol(1 as Int16)      // prints 40
takesPlaceholder(1 as Int16)   // prints 2

class MyClass: Printable {
    var description: String { return "MyClass" }

takesProtocol(MyClass())     // prints 40
takesPlaceholder(MyClass())  // prints 8

So it looks like Printable is some kind of fixed-sized box that holds any kind of value that is printable. This kind of boxing is quite a common feature in other languages like Java and C#. But here even references to classes are put inside this 40-byte box, which might surprise you if you‘re used to thinking of protocols as like references to pure virtual base classes. This Swift box is geared up to hold both value and reference types.

edit: as @norio_nomura points out, class-only protocols are smaller, at 16 bytes, as they never need to hold a larger-sized payload.

So what’s inside those 40 bytes? Mike Ash had a great series of posts in his Friday Q&A column about poking around inside Swift’s memory layout. Here we’re just going to do a much shorter bit of code to look into contents of the protocol:

// function to dump out the contents of a protocol variable
func dumpPrintable(var p: Printable) {
    // you could also do this with unsafeBitCast
    withUnsafePointer(&p) { ptr -> Void in
        let intPtr = UnsafePointer<Int>(ptr)
        for i in stride(from: 0, to: (sizeof(Printable)/8), by: 1) {
            println("\(i):\t 0x\(String(intPtr[i], radix: 16))")

let i = Int(0xb1ab1ab1a)

// prints out:
0:       0xb1ab1ab1a
1:       0x7fff5ad10f48
2:       0x2000000000
3:       0x10507bfa8
4:       0x105074780

With a single 8-byte integer, the protocol appears to pack the value into the protocol value itself. The next two words look like uninitialized memory (their contents vary on each run), and then the last two are pointers to some metadata about the type that is actually being held. (Mike’s article goes into more detail about this metadata).

If you create a custom struct of size 16 or 24, this is also held within the first 3 words of the protocol. But if you go above this, it switches over to holding a pointer to the referenced value:

struct FourInts: Printable {
    var a = 0xaaaa
    var b = 0xbbbb
    var c = 0xcccc
    var d = 0xdddd
    var description: String { return toString(a,b,c,d) }

dumpPrintable(FourInts(), FourInts.self)

// prints out:
0:       0x7f8b5840fb90  // this is a pointer to a FourInts value
1:       0xaaaa          // uninitialized garbage (?)
2:       0xbbbb          // ditto
3:       0x10dde52a8     // metadata
4:       0x10dde51b8     // metadata

How to know that first part is a pointer to a FourInts type? Well, you can dereference it and see! We need to amend our dumping function slightly to tell it what the real type the protocol is referencing is:

func dumpPrintable<T>(var p: Printable, type: T.Type) {
    withUnsafePointer(&p) { ptr -> Void in
        let intPtr = UnsafePointer<Int>(ptr)
        for i in stride(from: 0, to: (sizeof(Printable)/8), by: 1) {
            println("\(i):\t 0x\(String(intPtr[i], radix: 16))")
        // if the pointed-to value is to big to fit:
        if sizeof(T) > 24 {
            // we have an integer, and we want to turn it into a pointer,
            // so we use the bitPattern: constructor of UnsafePointer<T>
            let valPtr = UnsafePointer<T>(bitPattern: Int(intPtr.memory))
            // and now we can look at the value at that location in memory
            println("value at pointer: \(valPtr.memory)")


// prints out:
0:       0x7f8b5840fb90
1:       0x7fff909c5395
2:       0xaaaa
3:       0x10dde52a8
4:       0x10dde51b8
value at pointer: (43690, 48059, 52428, 56797)

Bingo! Those are the values of the 4 integers.

One final point before we move on. Just because you switch to referring to a value type using a protocol, this does not turn it into a reference type:

protocol Incrementable {
    var x: Int { get }
    mutating func inc()

struct S: Incrementable {
    var x = 0
    mutating func inc() {

var p1: Incrementable = S()
var p2 = p1
p1.x  // now 1
p2.x  // still 0

Performance Implications

OK, so protocols used like this seem to add some level of indirection. Does that cost us anything compared to using the generic placeholder approach? To test this out, we can construct some trivial structs that perform a basic operation, and then attempt to run that operation multiple times via both a protocol and a generic placeholder.

First here’s the protocol:

protocol NumberGeneratorType {
    mutating func generateNumber() -> Int

We’re pretty restricted in what can be done without resorting to associated types, so all it does is spit out a number. Here are 3 implementions that do different things, along with two harnesses that iterate multiple times, totalling the returned numbers:

struct RandomGenerator: NumberGeneratorType {
    func generateNumber() -> Int {
        return Int(arc4random_uniform(10))

struct IncrementingGenerator: NumberGeneratorType {
    var n: Int
    init(start: Int) { n = start }
    mutating func generateNumber() -> Int {
        n += 1
        return n

struct ConstantGenerator: NumberGeneratorType {
    let n: Int
    init(constant: Int) { n = constant }
    func generateNumber() -> Int {
        return n

func generateUsingProtocol(var g: NumberGeneratorType, count: Int) -> Int {
    return reduce(stride(from: 0, to: count, by: 1), 0) { total, _ in
        total &+ g.generateNumber()

func generateUsingGeneric<T: NumberGeneratorType>(var g: T, count: Int) -> Int {
    return reduce(stride(from: 0, to: count, by: 1), 0) { total, _ in
        total &+ g.generateNumber()

Running the above, compiled with -O, with a count of 10k iterations, gives the following timings:

(the full code with the timing can be found here)

Generic rand          261829µs
Protocol rand         286625µs
Generic incr            5481µs
Protocol incr          45094µs
Generic const              0µs
Protocol const         43666µs

So what does this tell us? Calling arc4random is quite expensive, so the marginal difference made by the protocol is negligible but noticeable. But in the case of the incrementing generator, it’s a lot more in proportion to the actual operation being performed.

And in the case of the constant generator, the compiler has managed to optimize away all the code of the generic version and turn it into a single multiply operation (number of iterations times the constant)! So it takes a constant tiny amount of time. But the protocol indirection acted as a barrier to this same optimization.

In fact if you recompile with -Ounchecked, the same happens with the incrementing generator too – it’s only the check for overflow on the increment that was preventing it. The protocol versions remain the same.

For the most part, much of this can probably be put in the “premature optimization” camp. The big wins are really more about the expressiveness generics give you, as seen in the previous article about sorting, and avoiding things like type erasure shown in the earlier section. But if you’re in the habit of writing generic functions, the performance gains are nice too. And when writing re-useable library functions like sort or reduce that are going to be called a lot and perform tight loops, they’re critical.

Of course, all this is subject to change – given that the protocols are not being used to get any kind of dynamic behaviour, perhaps the compiler could have optimized them too. But it doesn’t appear to at the moment.

Dynamic Dispatch

Speaking of dynamic behaviour – that might be one occasion when you might prefer to use protocols rather than generics.

Consider the following code:

func f(p: Printable) {

func g<T: Printable>(t: T) {

let a = [1,2,3]
let i = 1

// this will work fine: either can be converted
// to Printable and passed in, then the appropriate
// version of `description` is called at runtime
f(arc4random()%2 == 0 ? a : i)

// this will _not_ compile.  T needs to be fixed as
// either an array of ints, or an int at compile time, 
// it can't be both
g(arc4random()%2 == 0 ? a : i)

A contrived example obviously, but it shows how even structs in Swift can get dynamic behaviour at runtime via protocols, that you might find a use-case for.

There’s lots more to explore with generics. The mental model of “Swift just writes a version of your function for each different type” isn’t quite how it works in practice. And associated types and constraints between multiple placeholders allow you to create kinds of generic algorithms that would be very difficult to achieve with protocols. But this article is already pretty long so I’ll leave those for another day.

  1. In fact you couldn’t implement this function using protocols rather than generics, because SignedIntegerType uses Self and (via IntegerLiteralConvertible) associated types. A topic for a different post. 

Changes to the Swift standard library in 1.2 betas 2 and 3

Swift v1.2 beta 2 and 3 bring quite a few new enhancements, many of which build on what was introduced in beta 1.

The if let syntax continues to evolve – as of beta 2, you can check for a boolean condition and, if it’s true, then proceed to unwrap values. Nate Cook has a good example of this in his NSScanner extension:

// NSScanner.scanCharactersFromSet returns the scanned string
// by assigning it to a pointer you pass in:
var value: NSString? = ""

// The method returns true only if a string was successfully
// scanned, so check that first:
if scanner.scanCharactersFromSet(set, intoString: &value),
   // then if true, unwrap the value and convert to a String:
   let str = value as? String 
    // now use the valid scanned String as str

Flat Map

Continuing the theme of language features that help with handling optionals: beta 3 introduces flatMap. This is a map operation, followed by a flattening operation. The release notes have a good example of its use when you “want to chain optionals with functions”, so let’s drill into that a little.

The existing map allows you to apply a function to the value inside an optional, if that optional is non-nil. For example, suppose you have an optional integer i and you want to double it. You could write { $0 * 2 }. If i has a value, you get back an optional of that value doubled. On the other hand, if i is nil, no doubling takes place.

Now, suppose instead of doubling an integer, you wanted to perform a mapping on an optional that could itself return an optional. The example in the release notes is find, which searches an array for a value and returns the index if it’s found – or nil if not found:

// an array of arrays
let arr = [[1,2,3],[4,5,6]]
// .first returns an optional of the first element of the array
// (optional because the array could be empty, in which case it's nil)
let fst = arr.first  // fst is now [Int]?, an optional array of ints
// now, if we want to find the index of the value 2, we could use map and find
let idx = { find($0, 2) }

The trouble with this is, idx will be an Int??, because fst was already optional, and then we mapped it with find, that also returns an optional – so now it’s an optional wrapped in an optional. What we want is to “flatten” this nested optional – so instead of map we use flatMap:

// flatMap means idx will be an Int?, not an Int??
let idx = fst.flatMap { find($0, 2) }

Here’s another use of flatMap: since we have an optional array, you could use it to get the first element of the first element:

// getting the first element of an optional array
let innerFst = fst.flatMap(first)

Now, if you’ve been programming Swift for a while, you might say hang on, since arrays have a first method, I could use optional chaining:

// using optional chaining to get the first element of an optional array
let innerFst = fst?.first

And at this point, hopefully it’s clear that flatMap and optional chaining do very similar things. Optional chaining essentially is a compact version of flatMap that only works on methods. When you want the same functionality as optional chaining, but instead with a function you need to pass the optional value into (like with find) instead of calling a method on that value, you need flatMap.

That’s not all flatMap does – it’s also a method on arrays (think of a mapping function that turns elements into arrays of elements), and there’s a generic free function version that works on sequences and collections. I won’t go into those here – I’m sure there will be a flood of Swift blogs coming up that will cover it in more detail. Just grab one as it drifts by, read, and experiment.1

Grab Bag

There are a few other changes to the standard library across the two releases:

  • The precendence of the ?? has been bumped from 110 to 131. This means it now has higher precedence than equality (130). So i == j ?? 0 now replaces a nil value for j with 0 prior to equating it to i. Note, it still doesn’t trump arithmetic operators, so unbracketed monstrocities like i + j ?? 0 won’t compile.
  • Slice has been renamed ArraySlice, making its purpose clearer (strings, for example, are their own slice type). Other than the name, nothing else has changed (except the additional flatMap).
  • &/ and &% are gone, as mentioned in the release notes.
  • Following its introduction in beta 1, various function arguments have been marked @noescape. Examples include various withUnsafeXXXPointer functions, and various map, reduce and filter functions. Of course, lazy‘s map is not marked like this because it does capture self.
  • Similar to UnsafePointer, AutoreleasingUnsafeMutablePointer and CFunctionPointer have lost their static null instance.
  • But they have acquired conformance to CVarArgType, as does UnsafePointer
  • Double and Float, on the other hand, are now conforming to _CVarArgPassedAsDouble, a specialization of CVarArg. This is to enable x86_64-specific handling apparently.
  • Character is now DebugPrintable
  • If you want a “how to speak Swift” tip, in various documenting comments, things are now called “values” rather than “objects”.
  • String.Index, String.UTF16Index and String.UTF8Index now have methods for converting between each other.
  • String.UnicodeScalarView now conforms to ExtensibleCollectionType and RangeReplaceableCollectionType, and its index is now bidirectional and comparable, giving it a lot of the same capabilities of the String itself.
  • UInt8 has a constructor from an ASCII UnicodeScalar (with a warning that the scalar better actually be ASCII!)
  • In a boon to reckless optimizers everywhere, unsafeUnwrap will now force-unwrap your optional without safety-checking non-nilness first.

Trailing Closures and Default Arguments

A subtle change in the language means that the split function has had its arguments reordered.

Previously, its second parameter was the isSeparator function, followed by a couple of defaulted arguments (maxSplit and allowEmptySlices). But as of Swift 1.2, when you supply a function via the “trailing closure syntax”, that trailing closure is always the last parameter. So if you tried to call the old split without supplying those defaulted arguments, you’d get an error, because it isn’t the last paramter, it’s the second. The new split makes the isSeparator function the last parameter.

While we’re talking about default arguments, did you know you that you can supply them in any order?

func f(a: String = "A", b: String = "B") {
    println("(a) (b)")

f(b: "bee")             // prints A bee
f(b: "Bee!", a: "Eh?")  // prints Eh? Bee!

Signed Mismatches

OK this last one’s a bit language-lawyerly so if that sort of stuff bores you, you’ve been warned!

You might be surprised to find that, up until Swift 1.2, the following code would compile:

let x: UInt = 5
let y: Int = 6
let z = x + y
// z will be a UInt of value 11

Given that Swift is usually so strict about not mixing types, how come this is allowed? It comes down to the following definition for + that appeared way back in Swift 1.0 beta 4:

// if a type is strideable, you can add its stride type to it
func +(lhs: T, rhs: T.Stride) -> T

// and strides can be signed
protocol _Strideable {
    /// A type that can represent the distance between two values of `Self`
    typealias Stride: SignedNumberType

And since UInt is strideable, and defines its distance (and thus its stride) as a signed integer, that means you can add signed integers to unsigned integers.2

Swift 1.2 fixes this state of affairs by introducing two bits of code. First, it adds a typealias to the _UnsignedIntegerType protocol:

protocol _UnsignedIntegerType : _IntegerType {
    typealias _DisallowMixedSignArithmetic : SignedIntegerType = Int

Next, it adds a new overload for the + operator:

func +<T : _UnsignedIntegerType>(lhs: T, rhs: T._DisallowMixedSignArithmetic) -> T

The upshot of this is if you try and add an Int to a UInt, there are now two matching overloads for the + operator: the old one, and the new one, and Swift will complain that they’re ambiguous. Hopefully the fact that the compiler error mentions mixed-sign arithmetic will be enough of a hint you that this is the problem. A bit clunky, but it does the job.

  1. At some point, someone will try to explain to you that flatMap is actually something called monadic bind. Don’t panic when they do – the most important thing is to understand what it does. This more abstract concept will come easily once you have a good grasp of optional and array versions of flatMap under your belt. When you’re ready, I’d recommend reading Rob Napier’s article and this series by Alexandros Salazar. 
  2. Though not the other way around i.e. y + x wouldn’t compile – which is another problem, since people expect integer addition to be symmetric. 

Sorting Nibbles in Swift

A couple of months back, John Regehr posted a challenge on his blog – write the fastest possible sorting algorithm for all the nibbles in a 64-bit word.

The problem is to sort the 4-bit pieces of a 64-bit word with (unsigned) smaller values towards the small end of the word. The nibble sort of 0xbadbeef is 0xfeedbba000000000. The function you implement will perform this sorting operation on a buffer of 1024 64-bit integers:

void nibble_sort(unsigned long *buf);

The challenge was in C, but it made me wonder about solving the same problem in Swift. Here’s a straight port of the supplied reference version in Swift:

func read_nibble(w: UInt64, i: UInt64) -> UInt64 {
    let res = w >> (i * 4)
    return res & 0xf

func write_nibble(inout w: UInt64, i: UInt64, v: UInt64) {
    var mask: UInt64 = 0xf
    mask <<= (i * 4);
    w &= ~mask
    var prom = v;
    prom <<= (i * 4)
    w |= prom

func nibble_sort_word_ref(var arg: UInt64) -> UInt64 {
    for var i: UInt64 = 0; i < 16; ++i {
        var min = i;
        for var j = i+1; j < 16; ++j {
            if read_nibble(arg, j) < read_nibble(arg, min) {
                min = j
        if (min != i) {
            var tmp = read_nibble(arg, i)
            write_nibble(&arg, i, read_nibble(arg, min))
            write_nibble(&arg, min, tmp)
    return arg;

func nibble_sort_ref(buf: UnsafeMutablePointer<UInt64>) {
    for var i=0; i<1024; i++ {
        buf[i] = nibble_sort_word_ref(buf[i])

The first bit of good news is that on my machine, when compiled with -Ounchecked, this version performs with pretty much the same speed as the original C version. Sticking with -O instead gives up about ~20% performance, which is not too bad for the sake of safety.

So, this version has two functions to read and write the nibble at a given index, and a function to sort it using them. Get/set at a given index? That sounds a lot like a collection type. What if we were to wrap the word in a collection, and use subscript instead of the read/write functions? Something like this:

struct NibbleCollection: MutableCollectionType {
    var val: UInt64
    var startIndex: UInt64 { return 0 }
    var endIndex: UInt64 { return 16 }
    subscript(idx: UInt64) -> UInt64 {
        get {
            return (val >> (idx*4)) & 0xf
        set(n) {
            let mask = 0xf << (idx * 4)
            val &= ~mask
            val |= n << (idx * 4)
    typealias Generator = IndexingGenerator<NibbleCollection>
    func generate() -> Generator { return Generator(self) }

Once you have this, you can then use the standard library sorting function on it to implement the challenge:

func nibble_sort_word_col(arg: UInt64) -> UInt64 {
    var col = NibbleCollection(val: arg)
    return col.val

“Does all this Collection overhead mean it won’t be as fast as the raw function implementation?” you might worry. But there really is no overhead – since NibbleCollection is a struct, not a class, it really isn’t anything more than the raw underlying UInt64 and some functions. All of the scaffolding sloughs away at compile time, leaving you with pretty much identical code to the function-based version.

With one big difference: Swift.sort is way better than the reference sort implementation above.1 So this version outperforms it, both in Swift and in C. With -O, the collection version is faster than the Swift reference, and the same speed as the C version. With -Ounchecked it outperforms the C version by about 25%.

Which tells you two things you should bear in mind when writing Swift:

  1. Structs implementing protocols, and generic functions, are a good way of getting abstraction without giving up performance.
  2. Always check if the code you want exists as a library function. Using it won‘t just save time and make your code clearer, it could be better optimized than the version you might write.

Of course, the competitive entries in C were orders of magnitute faster than the reference version, using bit-twiddling techniques like bucket sorts and lookup tables that take advantage of the problem being constrained to values less than 16 held in a single word. But since Swift also supports most of the operations they use, many of them can be ported too. A quick and dirty translations I did of a couple of the simpler ones suggests that, like the reference version, Swift can match or almost match them in performance too.

  1. Naming the two sort algorithms in question is an exercise for the reader. They might not be the one you’d guess first.