Changes to the Swift Standard Library in 1.2 beta 1

Posted on February 11, 2015March 26, 2015 by airspeedvelocity

Swift v1.2 is out, and, well, it’s pretty fab actually.

The most exciting part for me is the beefed-up if let syntax. Not only does it allow you to unwrap multiple optionals at once, you can define a later let in terms of an earlier one. For instance, here are 3 functions, each one of which takes a non-optional and returns an optional, chained together:

if let url = NSURL(string: urlString),
   let data = NSData(contentsOfURL: url),
   let image = UIImage(data: data) {
      // display image
}

The key here is that if at any of the preceding lets fail, the whole thing fails and the later functions are not called. If you’re already familiar with optional chaining with ?. this should come naturally to you. For more on this, see NSHipster’s article.

The new ability to defer initialization of let constants makes much of of my “avoiding var” post redundant, as you can now do this: ¹

let direction: Direction    
switch (char) {
    case "a": direction = .Left
    case "s": direction = .Right
    default:  direction = .Forward
}

It’d still be nice to have a form of switch that evaluates to a value – but with the new let, that’s really just a desire for some syntactic sugar (imagine using it inside a short closure expression, without needing any return).

Speaking of switch, the new if, with the ability to have a where clause on the bound variables, means if may now be suitable for several places where you were previously using switch. But there’s still a place for switch – especially when using it for full coverage of each case of an enum. And I feel a bit like pattern matching via ~= is one of switch’s powerful overlooked features that I keep meaning to write a post about. This release adds a ~= operator for ranges, not just intervals.

Library Additions

The big new addition to the standard library is of course the Set collection type. The focus on mathematical set operations is very nice, and also that those operations take any SequenceType, not just another Set:

let alphabet = Set("abcdefghijklmnopqrstuvwxyz")
let consonants = alphabet.subtract("aeiou")
// don't forget, it's "jumps" not "jumped"
let foxy = "the quick brown fox jumped over the lazy dog"
alphabet.isSubsetOf(foxy)  // false

Note that the naming conventions emphasize the pure versions of these operations, which create new sets, with each operation having a mutating equivalent with an InPlace suffix:

// create a new set
var newalpha = consonants.union("aeiou")
// modify an existing set
newalpha.subtractInPlace(consonants)

Note also for those fearful of the evil of custom operators, there’s no - for set subtraction etc. The one operator defined for sets is ==. Unlike with Array, sets are also Equatable (since they mandate their contents are Equatable, via Hashable).

But that’s not all. Here’s a rundown of some of the other library changes:

There‘s no longer a separate countElements call for counting the size of collections vs count for ranges – it’s now just count for both.
Dictionary‘s index, which used to be bi-directional, is now forward-only, as are the lazy collections you get back from keys and values.
The C_ARGC and C_ARGV variables are gone, as is the _Process protocol. In their place there’s now a Process enum with 3 static methods: arguments, which gives you a [String] of the arguments, and argc and argv with give you the raw C-string versions.
Protocols that used to declare class funcs now declare static funcs since the language now requires that.
Array, ClosedInterval and Slice now conform to a new protocol, _DestructorSafeContainer. This protocol indicates “a container is destructor safe if whether it may store to memory on destruction only depends on its type parameters.” i.e. an array of Int is known never to allocate memory when its destroyed since Int doesn’t, whereas a more complex type (with a deinit, or aggregating a type with one) may not be.
They also now conform to a new __ArrayType protocol. Two underscores presumably means you definitely definitely shouldn’t try and conform to this directly yourself.
The builtin-converting Array.convertFromHeapArray function is gone.
The pointer types (AutoreleasingUnsafeMutablePointer, UnsafePointer and UnsafeMutablePointer now conform to a new _PointerType protocol.
The CVaListPointer (the equivalent of C’s va_list) now has an init from an unsafe pointer.
EmptyGenerator now has a public initializer, so you can use it directly.
The various unicode types now have links in their docs to the relevant parts of unicode.org’s glossary
HeapBuffer and HeapBufferStorage are gone, replaced by the more comprehensive and documented ManagedBuffer and ManagedBufferPointer, of which more later.
ObjectIdentifier is now Comparable.
The enigmatic OnHeap struct is gone.
All the views (UnicodeScalarView, UTF8View and UTF8View) inside String are now Printable and DebugPrintable
String.lowercaseString and uppercaseString are back! I missed you guys.
The various convertFrom static functions in StringInterpolationConvertible are now inits.
UnsafePointer and UnsafeMutablePointer no longer have a static null() instance, presumably you should just use the nil literal.
There’s a new zip function, which returns a Zip2 struct – consistent with the lazy function that returns LazyThings, and possibly a gateway to ZipMoreThan2 structs in the future…

The Printable and DebugPrintable protocol docs now point out: if you want a string representation of something, use the toString protocol – in case you were trying to get fancy with if let x as? Printable { use(x.description) }. Which the language will now let you do, since as? and is now work with Swift protocols, which is cool. But still don’t do that – especially considering you may be sorry when you find out String doesn’t conform to Printable.

The Character type is now a struct, not an enum, and no longer exposes a “small representation” and “large representation”. This appears to go along with a performance-improving refactoring that has made a big difference – for example, a simple bit of code that sucks in /usr/share/dict/words into a Swift String and counts the word lengths now runs in a quarter of the time it used to.

Bovine Husbandry

Suppose you wanted to write your very own collection, similar to Array, but managing all the memory yourself. Up until Swift 1.2, this was tricky to do.

The memory allocation part isn’t too hard – UnsafeMutablePointer makes it pretty straightforward (and typesafe) to allocate memory, create and destroy objects in that memory, move those objects to fresh memory when you need to expand the storage etc.

But what about de-allocating the memory? Structs have no deinit, and that unsafely-pointed memory ain’t going to deallocate itself. But you still want your collection to be a struct, just like Array is, so instead, you implement a class to hold the buffer, and make that a member. That class can implement deinit to destroy its contained objects and deallocate the buffer when the struct is destroyed.

But you’re still not done. At this point, you’re in a similar position Swift arrays were in way back in the first ever Swift beta. They were structs, but they didn’t have full value semantics. If someone copies your collection, they’ll get a bitwise copy of the struct, but all this will do is copy the reference to the buffer. The two structs will both point to the same buffer, and updating the values in one would update both. Behaviour when the buffer is expanded would depend on your implementation (if the buffer expanded itself, they might both see the change, if the struct replaced its own buffer, they might not).

How to handle this? One way would be to be able to detect when a struct is copied, and make a copy of its buffer. But just as structs don’t have deinit, they also don’t have copy constructors – this isn’t C++!

So plan B – let the bitwise copy happen, but when the collection is mutated (say, via subscript set), detect whether the buffer is referenced more than once, and at that point, if it is, make a copy of the buffer before changing it. This solves the problem, and also comes with a big side-benefit – your collection is now copy-on-write (COW). Arrays in Swift are also COW, and this means that if you make a copy of your array, but then don’t make any changes, no actual copying of the buffer takes place – the buffer is only copied when it has to be, because you are mutating the values of some shared storage.

Since classes in Swift are reference-counted, it ought to be possible to detect if your buffer class is being referenced by more than one struct. But there wasn’t an easy Swift-native way to do this, until 1.2, which introduces isUniquelyReferenced and isUniquelyReferencedNonObjC function calls:

extension MyFancyCollection: MutableCollectionType {
  // snip...

  subscript(idx: Index) -> T {
    get {
      return _buffer[idx]
    }
    set(newValue) {
      if isUniquelyReferencedNonObjC(&_buffer) {
          // all fine, this buffer is solely owned
          // by this struct:
          _buffer[idx] = newValue
      }
      else {
          // uh-oh, someone else is referencing this buffer
          _buffer = _buffer.cloneAndModify(idx, newValue)
      }
   }
}

Great, COW collections all round. But before you run off and start implementing your own buffer class for use with your collection, you probably want to check out the ManagedBuffer class implementation, and a ManagedBufferPointer struct that can be used to manage it. These provide the ability to create and resize a buffer, check its size, check if its uniquely referenced, access the underlying memory via the usual withUnsafeMutablePointer pattern etc. The commenting docs are pretty thorough so check them out.

Just in case you want to try out this new feature in a playground – this technique does not work with globals. See this thread on the dev forums. ↩

More fun with implicitly wrapped non-optionals

Posted on January 2, 2015March 23, 2015 by airspeedvelocity

As we’ve seen previously, Swift will happily auto-wrap your non-optional value inside an optional temporarily if it helps make the types involved compatible. We’ve seen the occasional instance where this can be surprising. But for the most part, this implicit wrapping of non-optionals is your friend.¹ Here’s a few more examples.

Equating Optionals

The Swift standard library defines an == operator for optionals:

func ==<T : Equatable>(lhs: T?, rhs: T?) -> Bool

So long as your optional wraps a type that’s Equatable, that means you can compare two optionals of the same underlying type:

let i: Int? = 1
let j: Int? = 1
let k: Int? = 2
let m: Int? = nil
let n: Int? = nil

i == j  // true,  1 == 1
i == k  // false, 1 != 2
i == m  // false, non-nil != nil
m == n  // true,  nil == nil

let a: [Int]? = [1,2,3]
let b: [Int]? = [1,2,3]

// error, [Int] is not equatable
// (it does have an == operator, but that's not the same thing)
a == b

let f: Double? = 1.0

// ignore the weird compiler error,
// the problem is Int and Double aren't
// the same type...
i == f

Well, this is useful, but even more useful is this one definition of == allows you to compare optionals against non-optionals too. Suppose you want to know if an optional string contains a specific value. You don’t care if it’s nil or not – just whether it contains that value.

You can just pass your non-optional into that same == that takes optionals:

let x = "Elsa"

let s1: String? = "Elsa"
let s2: String? = "Anna"
let s3: String? = nil

s1 == x  // true
s2 == x  // false
s3 == x  // false

This works because Swift is implicitly wrapping x in an optional each time. If Swift didn’t do this, and you implemented an optional-to-non-optional operator yourself, you’d have to define == an extra two times, to make sure it was symmetric (which is a requirement whenever you implement ==):

// definitions if there weren't implicit optionals
func ==<T:Equatable>(lhs:T, rhs:T?)->Bool { return rhs != nil && lhs == rhs! }
func ==<T:Equatable>(lhs:T?, rhs:T)->Bool { return lhs != nil && lhs! == rhs }

Beware though, optional == isn’t all sunshine and unicorn giggles. Imagine the following code:

let a: Any = "Hello"
let b: Any = "Goodbye"

// false, of course
(a as? String) == (b as? String)

// oops, this is true...
(a as? Int) == (b as? Int)

Depending on how you think, this is either perfectly reasonable or immensley confusing, but either way you should bear it in mind. And maybe it’s another reason to go on your long list “Why not to use Any and as unless you really really have to.”

Comparing Optionals

Unsurprisingly, there’s a similar definition of < that will compare two optionals so long as what they wrap is Comparable:

func <<T : _Comparable>(lhs: T?, rhs: T?) -> Bool

If lhs and rhs are both some value, it compares them. Two nils are considered equal just like before. But a nil is always considered less than any non-nil value, (Comparable must always enforce a strict total order so they have to go somewhere well-defined in the ordering).

So if you sort a set of optionals, the nil values all end up at one end:

let a: [Int?] = [1, 3, nil, 2]
sorted(a, <)
// results in [nil, 1, 2, 3]

This is fairly intuitive, but it can lead to other subtle bugs. Suppose you want to filter a list like so:²

let ages = [
    "Tim":  53,  "Angela": 54,  "Craig":   44,
    "Jony": 47,  "Chris":  37,  "Michael": 34,
]

let people = ["Tim", "Johny", "Chris", "Michael"]

// don't trust anyone over 40
let trust = people.filter { ages[$0] < 40 }

// trust will contain ["Johny", "Chris", "Michael"]

The string "Johny" was not in the ages dictionary so you get a nil – but the way < works, nil is less than 40, so included in the filtered array.

Optional Subscripts

While we're talking about dictionaries, did you know that Dictionary.subscript(key) doesn't just return an optional, it also takes an optional?

var d = ["1":1]

let four = "4"
let nonsense = "nonsense"

// this will add ["4":4] to d
// remember toInt returns an optional
// (the string might not have been a number)
d[four] = four.toInt()

// nothing will be added, because
// the rhs is nil
d[nonsense] = nonsense.toInt()

// this will remove the "1" entry from d
d["1"] = nil

That's quite useful – but also, there wasn't much choice in it working like this.³ Dictionary.subscript(key) { set } has to take an optional, because when you define a setter, you must define a getter. And Dictionary's getter has to be optional (in case the key isn't present). So the setter's input value has to be.

This means every time you assign a value to a dictionary using a key subscript, your non-optional value is getting auto-wrapped in an optional. If this didn’t happen automatically, it would make assigning values to your dictionary a real pain, as you’d have to type d[key] = .Some(value) every time.

To see this in practice, here's a very bare-bones (very inefficient) implementation of a dictionary. Take a look at the subscript to see how the optional is handled:

public struct RubbishDictionary<Key: Equatable, Value> {

    public typealias Element = (Key, Value)

    // just store the key/value pairs in an unsorted
    // array (see, told you it was rubbish)
    private typealias Storage = [Element]
    private var _store: Storage = []

    // there's a reason why this is a private method,
    // which I'll get to later...
    private func _indexForKey(key: Key) -> Storage.Index? {
        // new year's resolution: see if I can avoid
        // calling array[i] completely in 2015
        for (idx, (k, _)) in Zip2(indices(_store),_store) {
            if key == k { return idx }
        }
        return nil
    }

    // Subscript for key-based lookup/storage
    public subscript(key: Key) -> Value? {
        // "get" is pretty simple, either key can be
        // found, or not...
        get {
            if let idx = _indexForKey(key) {
                // implicit optional wrap
                return _store[idx].1
            }
            else {
                return nil
            }
        }
        // remember, newValue is a Value?
        set(newValue) {
            switch (_indexForKey(key), newValue) {

            // existing index, so replace value
            case let (.Some(idx), .Some(value)):
                _store[idx].1 = value

            // existing index, nil means remove value
            case let (.Some(idx), .None):
                _store.removeAtIndex(idx)

            // new index, add non-nil value
            case let (.None, .Some(value)):
                _store.append((key, value))

            // new index, nil value is no-op
            case (.None,.None):
                break
            }
        }
    }
}

Finally, just a reminder that an optional containing an optional containing nil is not the same as nil:

var d: [Int:Int?] = [1:1, 2:nil]
// d now contains two entries, 1:1 and 2:nil
d[2] = nil  // removes nil
d[3] = .Some(nil)  // adds 3:nil

Full imlementation of `RubbishDictionary`

So, in case you've made it all the way down here, and you're interested, I thought I'd end with a fuller but minimal-as-possible implementation of RubbishDictionary. It has almost-feature-parity with Dictionary (it's just missing getMirror and init(minimumCapatity)). It ought to do everything Dictionary can, just slower.

If reading raw code's not your thing, skip it. But it shows how to simply implement several of the standard library's protocols such as CollectionType and DictionaryLiteralConvertible, as well as a few other features like lazily-evaluated key and value collections. I like how Swift extensions allow you to build up the type in logical groups (e.g. each protocol implementation one-by-one). Since this is a post about implicit optional wrapping, I've flagged where it’s also helping keep the code consise.

I have a nagging feeling there are some neater ways to handle some of the optionals – if you spot one, let me know. Oh and bugs. I'm sure there are a couple of them.

// building on the initial definition given above...

// An index type for RubbishDictionary.  In theory, you could perhaps just
// expose an alias of the storage type’s index, however, if the Key type were
// an Int, this would mean subscript(key) and subscript(index) would be
// ambiguous. So instead it's simple wrapper struct that does little more than
// pass-through to the real index.
public struct RubbishDictionaryIndex<Key: Equatable, Value>: BidirectionalIndexType {
    private let _idx: RubbishDictionary<Key,Value>.Storage.Index

    public typealias Index = RubbishDictionaryIndex<Key,Value>

    public func predecessor() -> Index { return Index(_idx: _idx.predecessor()) }
    public func successor() -> Index { return Index(_idx: _idx.successor()) }
}

// Required to make the index conform to Equatable
// (indirectly required by BidirectionalIndexType)
public func ==<Key: Equatable, Value>
  (lhs: RubbishDictionaryIndex<Key,Value>, rhs: RubbishDictionaryIndex<Key,Value>)
   -> Bool { 
    return lhs._idx == rhs._idx 
}

// Conformance to CollectionType
extension RubbishDictionary: CollectionType {
    public typealias Index = RubbishDictionaryIndex<Key,Value>

    // Add a subscript that takes an Index.  This one
    // does not need to be nil, since an indexed entry
    // _must_ exist.  Read-only.
    public subscript(idx: Index) -> Element { return _store[idx._idx] }

    public var startIndex: Index { return Index(_idx: _store.startIndex) }
    public var endIndex: Index { return Index(_idx: _store.endIndex) }

    public typealias Generator = IndexingGenerator<RubbishDictionary>
    public func generate() -> Generator { return Generator(self) }
}

// Conformance to Printable and DebugPrintable
extension RubbishDictionary: Printable, DebugPrintable {
    public var description: String {
        let elements = map(self) { "($0.0):($0.1)" }
        return "[" + ",".join(elements) + "]"
    }

    public var debugDescription: String {
        let elements = map(self) { "(toDebugString($0.0)):(toDebugString($0.1))" }
        return "[" + ",".join(elements) + "]"
    }
}

// Conformance to DictionaryLiteralConvertible
extension RubbishDictionary: DictionaryLiteralConvertible {
    public init(dictionaryLiteral elements: (Key, Value)...) {
        for (k,v) in elements {
            self[k] = v
        }
    }
}

// Remaining features supported by Dictionary
extension RubbishDictionary {

    public var count: Int { return _store.count }
    public var isEmpty: Bool { return _store.isEmpty }

    // now for the public version of indexForKey
    public func indexForKey(key: Key) -> Index? {
        // this maps non-nil results from _indexForKey to
        // the wrapper type, or just returns nil if 
        // _indexForKey did...
        return _indexForKey(key).map { Index(_idx: $0) }
    }

    // a lazily-evaluated collection of keys in the dictionary
    public var keys: LazyBidirectionalCollection<MapCollectionView<RubbishDictionary<Key,Value>,Key>> {
        return lazy(self).map { (k,v) in k }
    }

    // and the same for the values
    public var values: LazyBidirectionalCollection<MapCollectionView<RubbishDictionary<Key,Value>,Value>> {
        return lazy(self).map { (k,v) in v }
    }

    // unlike its subscript equivalent, updateValue:forKey does not take an optional value...
    public mutating func updateValue(value: Value, forKey key: Key) -> Value? {
        if let idx = _indexForKey(key) {
            let old = _store[idx].1
            _store[idx].1 = value
            // implicit optional wrap
            return old
        }
        else {
            _store.append((key,value))
            return nil
        }
    }

    // returns the value removed, if there was one
    public mutating func removeValueForKey(key: Key) -> Value? {
        if let idx = _indexForKey(key) {
            // implicit optional wrap
            return _store.removeAtIndex(idx).1
        }
        else {
            return nil
        }
    }

    public mutating func removeAtIndex(index: Index) {
        _store.removeAtIndex(index._idx)
    }

    public mutating func removeAll(keepCapacity: Bool = false) { 
        _store.removeAll(keepCapacity: keepCapacity)
    }

}

Unlike implicitly-wrapped optionals, which are more like the kind of friend that gets you into fights in bars. ↩
I'm getting a lot of mileage out of this example. ↩
I guess it could have asserted on a nil value. ↩

zipWith, pipe forward, and treating functions like objects

Posted on December 5, 2014December 5, 2014 by airspeedvelocity

In the previous article, about implementing the Luhn algorithm using function composition in Swift, I had a footnote to the part about using mapEveryNth to double every other digit:

An alternative could be a version of map that cycled over a sequence of different functions. You could then give it an array of two functions – one that did nothing, and another that did the conversion.

This post is about doing just that. I’ll define some more generic helper functions, and then use them to build a variant of our checksum function.

First up: cycling through a sequence multiple times. Suppose you have a sequence, and you want to repeat that sequence over and over. For example, passing in 1...3 would result in 1,2,3,1,2,3,1,...etc.

You could do this by returning a SequenceOf object that takes a sequence, and serves up values from its generator, but intercepts when that generator returns nil and instead just resets it to a fresh one:

func cycle
  <S: SequenceType>(source: S) 
  -> SequenceOf<S.Generator.Element> {
    return SequenceOf { _->GeneratorOf<S.Generator.Element> in
        var g = source.generate()
        return GeneratorOf {
            if let x = g.next() {
                return x
            }
            else {
                g = source.generate()
                if let y = g.next() {
                    return y
                }
                else {
                    // maybe assert here, if you want that behaviour
                    // when passing an empty sequence into cycle
                    return nil
                }
            }
        }
    }
}

// seq is a sequence that repeats 1,2,3,1,2,3,1,2...
let seq = cycle(1...3)

This code does something that might be considered iffy – it re-uses a sequence for multiple passes. Per the Swift docs, sequences are not guaranteed to be multi-pass. However, most sequences you encounter from day-to-day are fine with multiple passes. Collection sequences are, for example. But some non-collection sequences like StrideThrough ought to be too. My (possibly misguided) take on this: it seems OK to take multiple passes over a sequence, so long as you make it clear to the caller that this will happen. The caller needs to know if their sequence allows this, and if they pass a sequence that doesn’t support it, the results would be undefined.¹

OK, next, suppose you have two sequences. Maybe they’re both finite, maybe one is finite and one infinite (such as a sequence created by cycle above). You want to combine the elements at the same point in each one together, using a supplied function that takes two arguments. We’ll call that zipWith, and it’s very simple:

func zipWith
  <S1: SequenceType, S2: SequenceType, T>
  (s1: S1, s2: S2, combine: (S1.Generator.Element,S2.Generator.Element) -> T) 
  -> [T] {
    return map(Zip2(s1,s2),combine)
}

// returns an array of 1+1,2+2,3+3,4+4,5+5,
// so [2,4,6,8,10]
zipWith(1...5, 1...5, +)

How does this work? Zip2 is a SequenceType that takes two sequences, and serves up elements from the same position in each as a pair.² For example, if you passed it [1,2,3] and ["a","b","c"], Zip2 would be a sequence of (1,"a"), (2,"b"), (3,"c"). If one sequence is longer than the other, Zip2 stops as soon as it’s used up the shortest sequence.³

Then, that zipped sequence of 2-tuples is passed into map, along with the combiner function. Map does it’s thing, calling a transformation function on each element to produce a new combined value. Remember, Swift functions that take n arguments can instead take one n-tuple. That is, if you have a function f(a,b), and a pair p = (x,y), you can call f as f(p). So if Zip2 produces p = (1,1), then +(p) returns 2.

Finally, map spits out the array of combined pairs, and that’s returned.

So, we have a function that cycles over a given sequence forever, and a function that takes two sequences and combines their elements using a given function. How does this help double every other number in a sequence?

Well, suppose you had a sequence of two functions: one that did nothing, just returned its input unchanged, and one that doubled its input. If you cycled that sequence, you’d have an infinite sequence of functions that alternated between doing nothing and doubling.

How could zipWith be used to combine that cycling pair of functions, with a sequence of numbers, such that every other number was doubled? What would the combiner function need to be? It would have to take a value (the number), and a function (do-nothing or double), and apply the function to the number.

We’ve already written a function that does that, last time: pipe forward.

infix operator |> {
    associativity left
}

public func |> <T,U>(t: T, f: T->U) -> U {
    return f(t)
}

So, if we passed in pipe forward as the function to zipWith, it’d do exactly what we want:⁴

// a function that does nothing to its input
func id<T>(t: T) -> T {
    return t
}

// a function that doubles its input
func double<I: IntegerType>(i: I) -> I {
    return i * 2
}

// an array of those two functions, for Ints:
let funcs: [Int->Int] = [id, double]

// double every other number from one to ten
// returns [1,4,3,8,5,12,7,16,9,20]
zipWith(1...10, cycle(funcs), |> )

Finally, let’s implement the new checksum function. We’re going to re-use the •, mapSome, toInt, sum and isMultipleOf functions from the previous article, just redefining the digit-doubling part:

let doubleThenCombine = { i in
    i < 5
        ? i * 2
        : i * 2 - 9
}

let idOrDouble: [Int->Int] = [
    id,
    doubleThenCombine,
]

let doubleEveryOther: [Int]->[Int] = { zipWith($0, cycle(idOrDouble), |> ) }

let extractDigits: String->[Int] = { mapSome($0, toInt) }

let checksum = isMultipleOf(10) • sum • doubleEveryOther • reverse • extractDigits

let ccnum = "4012 8888 8888 1881"
println( checksum(ccnum) ? "👍" : "👎" )

This shows another benefit of the function composition approach. We completely changed the way the doubleEveryOther function was implemented, but because its interfaces in and out remained the same, everything else could stay untouched. Compare this with the iterative version, where looping and doubling and summing were all tangled up together.

So that’s nice, but you might ask if this version is any better than the version from the last post that used mapEveryNth. And it probably isn’t.

But what it does do is show the power of treating functions like any other object, storing them in a collection and iterating over them.

Suppose the Luhn algorithm actually required you to double the even-positioned numbers, but halve the odd ones. Using the mapIfIndex function, you’d need to take two passes (or write another function that took two transformations). With the zipWith technique, all you’d have to do is replace id with a function to halve its input.

You could take this even further. In this example, the array was built literally. But imagine a scenario where you needed more dynamic behaviour – where the array of functions you passed into zipWith were built at runtime, perhaps chosen via user interaction.

“Big whoop,” you say. “This is just like an array of delegates. I’ve been doing this for years.” And that’s totally fair. But functions can be both lighter-weight and more flexible. No need to create an interface, define methods, create an object that implements the interface, worry about which methods need real implementations and which just need stubbing out. The callee defines the function type, the caller writes a closure expression, maybe captures a few variables, and we’re done. For more on this, see Peter Norvig’s slides on how language features can make design patterns invisible or simpler.⁵

As ever, you can find all the general-purpose functions defined in this post in the SwooshKit framework.

The alternative to this (which the Swift docs actually advocate) is to only write cycle against collections, not sequences. But this seems like a shame, as it’d rule out using it on sequences that don’t have collection equivalents, like strides, but that ought to be multi-pass capable. Perhaps an option is to create a new protocol, MultipassSequenceType: SequenceType { } and to have sequences that can be used this way extend that? ↩
Zip2 is a bit of an anomoly in the Swift standard library. Whereas in most places, the pattern is to have a function that returns an object (for example, the enumerate function returns an instance of EnumerateSequence), Zip2 skips that stage and is just a struct itself, that you initialize with two sequences. Its usage looks almost exactly the same, but that’s why the Z is capitalized (because the convention is to capitalize the names of structs). ↩
If you’re interested in a version of zip that runs to the end of the longer sequence, I used one in this gist. ↩
If you’re of a more Haskelly temperament, you might feel this is the wrong way around (functions should come before values because f(x). In which case you’d want pipe backwards (<|) and flip the first two arguments. ↩
OK, so he was talking about dynamic languages, not just functional ones, but it’s still worth reading. ↩

Which function does Swift call? Part 6: Tinkering with priorities

Posted on November 16, 2014April 5, 2015 by airspeedvelocity

This is part of a series of posts on how Swift resolves ambiguity in overloaded functions. You can find Part 1 here. In the previous article, we looked at why you get a Range from the ... by default. The answer? Because Range has some extra validation that relies on the ... input being comparable, and this makes the Range version more specific.

Defaulting to Neither

So, suppose you didn’t want Range to be the default, but still wanted to benefit from this extra validation?

If your goal was to keep the ambiguity, and force the user to be explicit about whether they wanted a Range or a ClosedInterval, you could declare the following:

func ...
  <T: Comparable where T: ForwardIndexType>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

Now, unless you specify explicitly what type you want, you’ll get an ambiguous usage error when you use the ... operator.

(It might make you a bit nervous to see ForwardIndexType in the declaration of a function for creating intervals, given intervals have nothing to do wth indices. But don’t worry, it’s really only there to carve out an identical domain to the equivalent Range function to force the ambiguity for the same set of possible inputs.)

Defaulting to ClosedInterval

What if you actually wanted ClosedInterval to be the default? This is a little trickier.

If you just wanted to cover the most common case, integers, then you could do it like so:

func ...<T: IntegerType>(start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

// x will now be a ClosedInterval
let x = 1...5

This works because IntegerType conforms to RandomAccessIndex (which eventually comforms to ForwardIndexType).

Now, this would be a very specific carve-out for integers. You could leave it there, because integers are a special case, being as how they’re so fundamental and it’s a bit odd that they’re defined as an index type too.

But If you wanted an interval for all other type that are both comparable and an index, you’d need to handle each one on a case-by-case basis. For example, string indices are comparable:

// note, this doesn’t even have to be generic, since Strings aren’t generic
func ...(start: String.Index, end: String.Index) -> ClosedInterval<String.Index> {
    return ClosedInterval(start, end)
}

let s = "hello"

// y will now be a ClosedInterval
let y = s.startIndex...s.endIndex

Two problems become apparent with this.

First, this gets real old real quickly. Every time you create a new index type you have implement a whole new ... function for it.

Second, even then it’s still out of your control. You want other people to be able to create new index and comparable types and use them with ranges and intervals, and they aren’t necessarily going to do this. So instead of a nice predictable “you get a range by default”, we now have “you’ll probably get an interval, so long as the type has the appropriate ... overload”. That doesnt’t sound good at all.

Your best bet at this point would be to force anyone who wants to their type to be useable with an interval to tag it with a particular protocol. This is what stride does – types have to conform to the Strideable protocol. So let’s try defining an equivalent for ClosedInterval. We’ll call it, uhm, Intervalable.

protocol Intervalable: Equatable { }

extension Int: Intervalable { }

extension String.Index: Intervalable { }

// The following definintions of ... would need
// to REPLACE the current definitions.  So you
// can only do this if you're the author of
// ClosedInterval

func ...<T: Intervalable>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

func ...<T: Intervalable where T: ForwardIndexType>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

// x will be a ClosedInterval
let x = 1...5

let s = "hello"

// y will be a ClosedInterval
let y = s.startIndex...s.endIndex

Of course, this is only something the author of the original type can really implement, not something you can do yourself if you personally prefer the default to be the other way around.

And it’s a bit heavy-handed to force every type to conform to a protocol when really all it needs is to be comparable for the interval to work. But I can’t think of another option (even for the implementor) right now.¹ If anyone else can think of a good solution, let me know.

except possible tinkering with the Comparable hierarchy, which is even more restricting since only Apple could do that. ↩

Which function does Swift call? Part 5: Range vs Interval

Posted on November 10, 2014November 10, 2014 by airspeedvelocity

This is part of a series of posts on how Swift resolves ambiguity in overloaded functions. You can find Part 1 here. In the previous article, we looked at how generics are handled.

So finally…

Now that we’ve covered how generics fit in, we can finally answer the original question – how come you get a Range back from 1...5 by default? To recap the sample code:

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

// if you use doubles though, you don’t need
// to declare the type explicitly.
// floatingpoint_interval is a ClosedInterval<Double>
let floatingpoint_interval = 1.0...5.0

The infix ... function is defined three times in the standard library, with the following signatures:

func ...<T : Comparable>
  (start: T, end: T) -> ClosedInterval<T>

func ...<Pos : ForwardIndexType>
  (minimum: Pos, maximum: Pos) -> Range<Pos>

func ...<Pos : ForwardIndexType where Pos : Comparable>
  (start: Pos, end: Pos) -> Range<Pos>

The first two are pretty similar. They are both generic, and have one generic placeholder that is constrained by a single protocol. There’s nothing favouring one of the two constraints – Comparable and ForwardIndexType both descend from Equatable, but neither is a descendent of the other. Left as just those two functions, you’d get an ambiguous call error.

It’s the third function above that is what makes Swift pick Range over ClosedInterval. In it, Pos is constrained not just by ForwardIndexType but also by Comparable. Constraining by two protocols wins over one, so this is the overload that’s picked.

Well, after 4 long lead-up posts, that was a bit of an anticlimax, eh?

So let’s ask another question – why is that last function there?

One possible explanation: the version with two protocol constrains was implemented purely to break the tie, since ambiguous call errors are annoying. Creating a Range doesn’t actually require comparable, but adding the comparable requirement has the effect of favouring ranges over intervals. If this was the desired outcome, that could be all the extra ... is there for.

But that’s probably not why. There’s another reason why there’s an overload that requires Comparable.

Factory Functions

A recurring pattern in the Swift standard library is to use free functions as factories to construct different types depending on the function name or arguments.

For example, the lazy function is defined 4 times for 4 different argument types: once each for random-access, bidirectional, and forward collections; and once for sequences.¹ Each one returns a different matching lazy type (e.g. LazyRandomAccessCollection). When you call lazy, the compiler picks the most specific match (in the order just given). When you combine this with type inference, you get the most powerful type available declared for you, all determined at compile time.

// a will be a LazyRandomAccessCollection
// since arrays are random access
let a = lazy([1,2,3,4])

// s will be a LazyBidirectionalCollection,
// since strings can't be indexed randomly
let s = lazy("hello")

// r will be a LazySequence, since StrideTo
// isn't a collection, just a sequence
let r = lazy(stride(from: 1, to: 8, by: 2))

// there's nothing stopping you declaring these
// lazy objects manually directly:
let l = LazyRandomAccessCollection([1,2,3,4])

Each version of lazy might not even do anything other than initialize and return the relevant type – it just makes declaring different lazy types more convenient than giving the relevant class name in full. All this is made possible by the overloading priorities described in this series. Most everyday users of lazy don’t need to know any of this though – they just use the function and it does what you’d intuitively guess was the “right” thing.

In some cases, which factory function to call is controlled more directly by the caller. For example, the stride function creates either a StrideTo or a StrideThrough depending on whether the named middle argument is to: or through:.

// x will be a StrideThrough
let x = stride(from: 1, through: 10, by: 5)

// y will be a StrideTo
let y = stride(from: 1, to: 10, by: 5)

In the case of stride, this is the only way you can construct these objects, as their initializers are private (which means stride can call them since it’s declared inside Swift, but you can’t).

Using Factory Functions to Perform Validation

For Range, the ... operator performs two extra tasks on top of constructing the return value. First, if the input supports comparators, it can validate that the begin and end arguments aren’t inverted. If they are, you get a runtime assertion. You only get this validation when using the ... operator. If you try and declare a range like this: Range(start: 5, end: 1), it’ll work.²

By the way, you’ll even get this check with String ranges. Swift string indices aren’t random-access (because of variable-length characters ), but they are comparable. While you can’t know how many characters are between two indices, you can know that one index comes before the another.

The other purpose of the ... function is to normalize closed ranges into half-open ranges. Range is kind of the opposite way around to ClosedInterval and HalfOpenInterval. There is no closed version of Range, ranges are always half-open. When you call ..., it increments the second parameter to make it equivalent to a half-open range. If you type 1...5 into a playground, you’ll see what is actually created is displayed as 1..<6.

This is why you'll get a runtime assertion if you ever try and construct a closed range through the end index of a string, even though in theory it could be a valid thing to do:

let s = "hello"
let start = s.startIndex
let end = s.endIndex

// fatal error: can not increment endIndex
let r = start...end

// same as if you did this:
end.successor()

Could you put this logic into Range.init? In the case of the half-open conversion, maybe you could, by having two versions of init with to: and through: arguments. But it's cleaner to do it using operators.

(You could maybe say the same about stride, but then you'd need custom ternary operators.³)

But in the case of the inversion check that uses comparable, you'd have to make a generic version of init, maybe something like this:

extension Range {
    init<C: Comparable>(start: C, end: C) {
        assert(start <= end, "Can't form Range with end < start")
        self.init(start: start, end: end)
    }
}

Well, this will compile but it won't do what you want. In fact your init will never be called, even if you pass in a comparable type. Why? Because it's generic, and the regular version of Range.init is not. As we've seen previously, generic functions never get called when there's a possible non-generic overload.

“Hey, no wait, the current Range.init is so generic!” you complain. “Look:”

// Excerpt from the Swift definition of Range
struct Range : Equatable ...etc {
    /// Construct a range with `startIndex == start` and `endIndex ==
    /// end`.
    init(start: T, end: T)
}

“See? T is a generic placeholder! And we’re puting more constraints on our function’s placeholder, so it should be the one that’s called.”

OK yes, T is a generic placeholder. But it’s not a placeholder for the function. It’s a placeholder for the struct. In the context of the function, T is already fixed in place as a specific type.⁴ To help understand this, try the following code:

struct S<T> {
    func f(i: Int) { print("f(Int)") }
    func f(t: T) { print("f(T)") }

    func g(i: Int) { print("g(Int)") }

    // note, here scoping rules mean the placeholder 
    // T is a _different_ T to the struct's T
    func g<T: IntegerType>(t: T) { print("g(T)") }
}

// fix T to be an Int
let s = S<Int>()

// error: Ambiguous use of f
s.f(1)

// prints "g(Int)" not "g(T)"
// because generics lose...
s.g(1)

Incidentally, it’s stuff like the potentially confusing scoping of T above (or that placeholders might get randomly renamed) that makes me nervous about re-using a struct’s placeholders when extending that struct. It doesn’t feel right. Usually, nicely-written generic classes typealias their placeholders in some fashion. For example, Range typealiases T as Index. Probably best to use that.

Anyway, for Range, declaring a generic function ..., that doesn’t have to compete with any non-generic alternatives, makes these problems go away. Plus operators look nice.

Next up: what if we didn’t want Range to be the default? What could we do about it?

For more on Swift’s lazy types, see this article ↩
well, depending on your definition of “work” I guess. ↩
If you want to see how these are possible (if perhaps inadvisable) Nate Cook recently wrote a good article about them. ↩
If you really want to tie your brain in knots, try to think about a secenario where the struct placeholder is fixed by which init is chosen, which relies on what T is… ↩

Which function does Swift call? Part 4: Generics

Posted on October 26, 2014October 26, 2014 by airspeedvelocity

This is part 4 of a series on how overloaded functions in Swift are chosen. Part 1 covered overloading by return type, part 2 was about how different functions with simple single arguments were picked on a “best match” basis, and part 3 went into a little more detail about protocols.

We started the series with the question, why do you get a Range type back from 1...5 instead of a ClosedInterval? Today we cover how generics fit into the matching hierarchy, and we’ll finally have enough information to answer that question.

Generics are Lower Priority

Generics are lower down the pecking order. Remember, Swift likes to be as “specific” as possible, and generics are less specific. Functions with non-generic arguments (even ones that are protocols) are always preferred over generic ones:

// This f will match a single argument of any type,
// but is very low priority
func f<T>(t: T) { 
    print("T") 
}

// This f takes a specific implemented type, so will be
// preferred over anything else when passed a String.
func f(s: String) {
    print("String")
}

// This f takes a specific protocol, so would be
// preferred over the generic version (though not
// over a version of f that took an actual Bool)
func f(b: BooleanType) {
    print("BooleanType")
}

// this prints "String"
f("wotcha")

// this prints "BooleanType"
f(true)

// this prints "T"
f(1)

OK that’s simple enough. But once we’re in generic territory, there’s a bunch of additional rules for picking one generic function over another.

Making Generic Placeholders More Specific with Constraints

Within a type parameter list, you can apply various different constraints to your generic parameters. You can require generic placeholders conform to protocols, and that their associated types be equal to a certain type, or conform to a protocol.

Just like before, the rule of thumb is: the more specific you make a function, the more likely it is to be picked in overload resolution.

This means a generic function with a placeholder that has a constraint (such as conforming to a protocol) will be picked over one that doesn’t:

func f<T>(t: T) {
    print("T")
}

func f<T: IntegerType>(t: T) {
    print("T: IntegerType")
}

// prints "T: IntegerType"
f(1)

If the choice is between two functions that both have constraints on their placeholder, and one is more specific than the other, the more specific one is picked. These rules follow a familiar pattern – they mirror the rules for non-generic overloading.

For example, when one protocol inherits from another, the inheriting protocol constraint is preferred:

func g<T: IntegerType>(t: T) {
    print("IntegerType")
}

func g<T: SignedIntegerType>(t: T) {
    print("SignedIntegerType")
}

// prints "SignedIntegerType"
g(1)

Or if the choice is between adhering to one protocol or two, two is preferred:

func f<B: BooleanType>(b: B) {
    print("BooleanType")
}

func f<B: protocol<BooleanType,Printable>>(b: B) {
    print("BooleanType and Printable")
}

// prints "BooleanType and Printable"
f(true)

The where clause introduces the ability to constrain placeholders by their associated types. These additional constraints make the match more specific, bumping it up the priority:

// argument must be an IntegerType
func f<T: IntegerType>(t: T) {
    print("IntegerType")
}

// argument must be an IntegerType 
// AND it's Distance must be an IntegerType
func f<T: IntegerType where T.Distance: IntegerType>(t: T) {
    print("T.Distance: IntegerType")
}

// prints "T.Distance: IntegerType"
f(1)

Here, the extra where clause is analogous to a regular argument needing to conform to two protocols.

Swift will also distinguish between two where clauses, one of which is more specific than another. For example, requiring an associated type to conform to an inheriting protocol:

func f<T: IntegerType where T.Distance: IntegerType>(t: T) {
    print("T.Distance: IntegerType")
}

// where T.Distance: SignedIntegerType is more specific than
// where just T.Distance: IntegerType
func f<T: IntegerType where T.Distance: SignedIntegerType>(t: T) {
    print("T.Distance: SignedIntegerType")
}

// so this prints "T.Distance: SignedIntegerType"
f(1)

Where clauses also allow you to check a value is equal to a specific type, not just that it conforms to a protocol:

func f<T: IntegerType where T.Distance == Int>(t: T) {
    print("Distance == Int")
}

func f<T: IntegerType where T.Distance: IntegerType>(t: T) {
    print("Distance: IntegerType")
}

// prints "Distance == Int"
f(1)

It isn’t surprising that the version that specifies Distance is equal to an exact type is more specific than the version that just requires it conforms to a protocol. This is similar to the rule for non-generic functions that a specific type as an argument is preferred over a protocol.

One interesting point about checking equality in the where clause. You can check for equality to a protocol. But it probably isn’t what you want – equality to a protocol is not the same as conforming to a protocol:

// create a protocol that requires the
// implementor defines an associated type
protocol P {
    typealias L
}

// implement P and choose Ints as the
// associated type
struct S: P {
    typealias L = Int
}
let s = S()

// Ints are printable so this will match
func f<T: P where T.L: Printable>(t: T) {
    print("T.L: Printable")
}

// == is more specific than :, so this will
// be the one that's called, right?
func f<T: P where T.L == Printable>(t: T) {
    print("T.L == Printable")
}

// nope! prints "T.L: Printable"
f(s)

// here is a struct where L really is == Printable
struct S2: P {
    typealias L = Printable
}
let s2 = S2()

// and this time, yes, it prints "T.L == Printable"
f(s2)

It’s actually pretty hard to do this by accident – you can only check for type equality if a protocol you’re equating to has no Self or associated type requirements (i.e. the protocol doesn’t require a typealias. or use Self as a function argument), same as you can’t use those protocols as non-generic arguments. This rules out most of the protocols in the standard library (Printable is one of only a handful that doesn’t).

Beware the Unexpected Overload

So, if given a choice between a generic overload and a non-generic one, Swift choses the non-generic one. If choosing between two generic functions, there’s a set of rules that are very similar to the ones for choosing between non-generic functions.

It’s not a perfect parallel, though – there are some differences. For example, if given a choice between an inheriting constraint or more constraints (“depth versus breadth”), unlike with non-generic protocols, Swift will actually chose breadth:

protocol P { }
protocol Q { }

protocol A: P { }

struct S: A, Q { }
let s = S()

// this f is for an inherited protocol
// (deeper than just P)
func f(p: A) {
    print("A")
}

// this f is for more protocols
// (broader than just P)
func f(p: protocol<P, Q>) {
    print("P and Q")
}

// error: Ambiguous use of 'f'
f(s)

// however, if we define a similar situation
// with generics instead:

func g<T: A>(t: T) {
    print("T: A")
}

func g<T: protocol<P, Q>>(t: T) {
    print("T: P and Q")
}

// the second one will be picked
// prints "T: P and Q"
g(s)

Let’s see that example again with some real-world types from the standard library:

func f<T: SignedIntegerType>(t: T) {
    print("SignedIntegerType")
}

// IntegerType is less specific than SignedIntegerType,
// but depth beats breadth so this one should be called:
func f<T: protocol<IntegerType, Printable>>(t: T) {
    print("T: P and Q")
}

// prints "SignedIntegerType"
// wait, what?
f(1)

If the example with P and Q worked one way, why the difference with SignedIntegerType and Printable?

Well, turns out IntegerType itself conforms to Printable (by way of _IntegerType). Since it already conforms to it, the Printable in protocol is redundant and can be ignored – it doesn’t make that version of f any more specific than T just conforming to IntegerType. Since SignedIntegerType inherits from IntegerType, it is more specific, so that’s the one that gets picked.

I point this out not to be fussy about the hierarchy of standard library protocols, but to point out how easy it is to get an unexpected version of an overloaded function. For this reason, it’s a good rule to never overload a function with different functionality. Overload for performance optimization (like a collection algorithm that extends), or to extend a function to cover your user-defined new type, or to provide the caller with a more powerful but basically equivalent type (like with Range and ClosedInterval). But overload to vary functionality based on the input and sooner or later you’ll be sorry.

Anyway, with that little mini-lecture out of the way, we finally have enough of the details of overloading to answer the original question – why does Range get picked? We’ll cover that in the next article.

Which function does Swift call? Part 3: Protocol Composition

Posted on October 7, 2014October 26, 2014 by airspeedvelocity

Previously, we looked at how Swift functions can be overloaded just by return type, and how Swift picks between different possible overloads based on a best-match mechanism.

All this was working towards the reason why 1...5 returns a Range rather than a ClosedInterval. And to understand that, we’ll have to look at how generics fit into the matching criteria.

But before we do, a brief diversion into protocol composition.

Protocols can be composed by putting zero¹ or more protocols between the angle brackets of a protocol<> statement. The Apple Swift book says:

NOTE
Protocol compositions do not define a new, permanent protocol type. Rather, they define a temporary local protocol that has the combined requirements of all protocols in the composition.

Reading that, you might think that declaring protocol<P,Q> was essentially like declaring an anonymous protocol Tmp: P, Q { }, and then using Tmp, without ever giving it a name. But that’s not quite what this means, and a way to spot the difference is by declaring functions that take arguments declared with protocol.

First, a recap of some of the behaviours of regular protocols. Suppose we define 4 protocols: first P and Q, and then two more, A and B that both just conform to P and Q:

protocol P { }
protocol Q { }

protocol A: P, Q { }
protocol B: P, Q { }

Although A and B are functionally identical, they are two different protocols. This means you can write two functions, one that takes A and one that takes B, and they’ll be two different functions with equal priority, so if you try to call them and either one would work, you’ll get an ambiguous call error:

struct S: A, B { }
let s = S()

func f(a: A) {
    print("A")
}

func f(b: B) {
    print("B")
}

// error: Ambiguous use of 'f'
f(s)

If, on the other hand, we use protocol<P,Q> instead of B, we get a different result:

func g(a: A) {
    print("A")
}

func g(p: protocol<P,Q>) {
    print("protocol<P,Q>")
}

// prints "A"
g(s)

If protocol<P,Q> were really declaring a new anonymous protocol that conformed to P and Q, we’d get the same error as before. Instead, its like saying “an argument that conforms to both P and Q”.

As we saw in the last article, an inherited protocol (in this case A) always trumps its ancestors (in this case P and Q). So the A version of g is called. This is consistent with our rule of thumb, that Swift will always pick the function with more “specific” arguments – here, inheriting protocols are more specific than inherited ones.

Another way to be more specific is to require more protocols. Here’s an example of how three protocols is more specific than two:

protocol R { }
extension S: R { }

// you can assign an alias for any
// protocol<> definition if you prefer
typealias Two = protocol<P,Q>
typealias Three = protocol<P,Q,R>

func h(p: Two) {
    print("Two")
}

func h(p: Three) {
    print("Three")
}

// prints "Three"
h(s)

// you can still force the other to be
// called if you want: this prints "Two"
h(s as Two)

Finally, if given a choice between an inheriting protocol, or more protocols (i.e. depth vs breadth), Swift won’t favour one over the other:

func o(p: protocol<A,B>) {
    print("A and B")
}

func o(p: protocol<P,Q,R>) {
    print("P, Q and R")
}

// error: Ambiguous use of 'o'
o(s)

// prints "A and B"
o(s as protocol<A,B>)
// prints "P, Q and R"
o(s as protocol<P,Q,R>)

One of the conclusions here is that there are quite a few possible ways for you to declare functions that take the same argument type, and to tweak the declarations to favour one function being called over another. When you combine these with different return values, that gives you a tool to nudge Swift into inferring a specific type by default, but allowing the user to get a different one from the same function name if they prefer, while avoiding forcing the user to deal with ambiguity errors. This is what is happening with ... when it defaults to Range.

So now, on to generics, which will take this even further.

zero being an interesting case we’ll look at some other time ↩

Changes to the Swift Standard Library in 1.1 beta 3

Posted on October 1, 2014October 1, 2014 by airspeedvelocity

The biggest deal in this latest beta is the documentation. There are 2,745 new lines of /// comments in beta 3, and even pre-existing documentation for most items has been revised.

That doesn’t mean there aren’t any functional changes to the library, though. Here’s a rundown of what I could find:

As mentioned in the release notes, the various literal convertible protocols are now implemented as inits rather than static functions.
The ArrayBoundType protocol is gone, and the various integer types that conformed to it no longer do.
The CharacterLiteralConvertible protocol is gone. Character never appeared to conform to it – it conformed to ExtendedGraphemeClusterLiteralConvertible which remains.
The StringElementType protocol is gone, and UInt8 and UInt16 no longer conform to it. The UTF16.copy method, that relied on it to do some pointer-based manipulation, is also gone.
Dictionary’s initializer that took a minimumCapacity argument no longer has a default for that minimum capacity. It’s interesting that this worked previously, since if you did the same thing with your own type (i.e. gave it both an init() and an init(arg: Int = 0)) you’d get a compiler error when you tried to actually use init().
In extending RandomAccessIndexType, integers now use their Distance typealias for Int rather than straight Int.
The static from() methods on integers are all gone.
There’s now an EnumerateSequence that returns an EnumerateGenerator, and enumerate() returns that rather than the generator.
The various integer types no longer have initializers from Builtin.SomeType.
The _CocoaArrayType protocol (as used to initialize an Array from a Cocoa array) has been renamed to _SwiftNSArrayRequiredOverridesType.
Numerous _CocoaXXX and _SwiftXXX protocols have acquired an @objc attribute.
_PrintableNSObjectType (which wasn’t explicitly implemented by anything in the std lib) is gone.
The AssertStringType and StaticStringType protocols are gone, as has AssertString. StaticString still exists and is clearly described as a static string that can be known at compile-time.
assert and precondition are much simplified with those string types removed, leaving one version each that just uses String for it’s message parameter. assertionFailure, preconditionFailure and fatalError all take a String for their message now as well, though they still take StaticString for the filename argument as described in the Swift blog’s article.
There are two new sort functions that operate on a ContiguousArray (one for arrays of comparable elements, and one that takes a comparison predicate).
But there are two fewer sorted functions. The ones that take a MutableCollectionType, and return one, are gone. Seems a shame, though it didn’t really make sense for them to take a mutable type given they didn’t need to mutate it. Maybe they’ll be replaced with versions that return ExtensibleCollectionType.
There’s a new unsafeDowncast that is equivalent to x as T but with the safeties removed – to be used only when using as is causing performance problems.
Raise a glass for the snarky “Haskell’s fmap, which was mis-named” comment, which is now replaced by a very straight-laced description of what Optional.map actually does.

In keeping with the documenting theme, there are a lot of argument name changes here as well. Continuing a trend seen in previous betas, specific, descriptive argument names are preferred over more generic ones. This is probably a good Swift style tip for your own code, especially if you’re also writing libraries.

Some examples:

Naming arguments in protocols instead of just using ‘_‘ e.g. func distanceTo(other: Self) instead of func distanceTo(_: Self) (there’s no compulsion to use the same names for your method arguments that the protocol does but you probably should)
Avoiding just naming the variable after the type (e.g. sequence: Sequence), For example, Array.extend has renamed its argument to newElements.
Replacing newValues with newElements in various places, presumably because of the unintended Swift implications of the term “values”.
Avoiding i or v such as subscript(position: Index) instead of subscript(i: Index), and init(_ other : Float) instead of init(_ v : Float).
Descriptive names for predicate arguments, such as isOrderedBefore rather than just pred.

Rather than me regurgitate the comments here, I’d suggest option-clicking import Swift and reading through it in full. There’s lots of informative stuff in there. Major types like Array and String have long paragraphs in front of them with some good clarifications. Many things are explained that you’d otherwise only have picked up on by watching the developer forums like a hawk.¹

Here are a few items of specific interest:

The comment above GeneratorType has been revised. The suggestion that if using a sequence multiple times the algorithm “should probably require CollectionType, since CollectionType implies multi-pass” is gone. Instead it states that “any code that uses multiple generators (or for ... in loops) over a single sequence should have static knowledge that the specific sequence is multi-pass”. The most common case of having that static knowledge being that you got the sequence from a collection I guess.

Above Slice we find “Warning: Long-term storage of Slice instances is discouraged“. It makes it pretty clear Slice is mainly there to help pass around subranges of arrays, rather than being a standalone type in its own right. They give you the value type behaviour you’d get from creating a new array (i.e. if a value in the original array is changed, the value in the slice won‘t) while allowing you to get the performance benefits of copy-on-write for a sub-range of an existing array.

Several of the _SomeProto versions of protocols now have an explicit comment along the lines of the previously implied “this protocol is an implementation detail of SomeProto; do not use it directly”. I’m still not sure why these protocols are written in this way, possibly as a workaround for some compiler issues? If you have an idea, let me know.

edit: a post on the dev forum confirms it’s a temporary workaround for a compiler limitations related to generics, though doesn’t get specific about what limitation. Thanks to @anatomisation and @ivicamil for the link.

The reason for ContiguousArray’s existence is finally made clear. It’s there specifically for better performance when holding references to classes (not value types). There’s also a comment above the various collection withUnsafeBufferPointer methods suggesting you could use them when the optimizer fails to eliminate bounds checks automatically, in a performance-vs-safety trade-off.

Chris Lattner clarified on the dev forum that although this version is a GM, that’s in order to allow you to use it to submit apps to the mac store rather than because this is the final shipping version. We’ll likely see more changes to Xcode 6.1 before that, and with it maybe further changes to the standard lib.

Or reading this blog, of course. Hopefully the comments don’t get too helpful, what would I write about then? ↩

Which function does Swift call? Part 2: Single Arguments

Posted on September 24, 2014 by airspeedvelocity

In the previous article, we saw how Swift allows overloading by just the return type. With these kind of overloads, explicitly stating the type tells Swift which function you want to call:

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

But that still doesn’t explain why you get a Range back by default if you don’t specify what kind of return type you want. To answer that, we need to look at function arguments. We’ll start with just single-argument functions.

Unlike with return values, functions with the same name that take different types for their arguments don’t always need explicit disambiguation. Swift will try its best to pick one for you based on various best-match criteria.

As a rule of thumb, Swift likes to pick the most “specific” function possible based on the arguments passed. Swift’s fave thing, the thing that beats out all other single arguments, is a function that takes the exact type you’re passing.

If it can’t find a function that takes that exact type, the next preferred option is an inherited class or protocol. Child classes or protocols are preferred over parents.

To see this in action:

protocol P { }
protocol Q: P { }

struct S: Q { }

// this is the function that will be picked
// if called with type S
func f(s: S) { print("S") }

// if that weren't defined, this would be the next
// choice for a call passing in an S, since Q is
// more specialized than P:
func f(p: Q) { print("Q") }

// and then finally this
func f(p: P) { print("P") }

f(S())  // prints S

class C: P { }
class D: C { }
class E: D { }

// this is the function that will be picked
func f(d: D) { print("D") }

// if that weren't defined, this would be the next choice
func f(c: C) { print("C") }

// if neither were defined, the version above for the 
// protocol P would be called

f(E())  // prints D

This prioritized overloading allows you to write more specialized functions for specific types. For example, suppose you want an overloaded version of contains for ClosedInterval that used the fact that you can do the check in constant rather than linear time using comparators.

Well, that sounds like a familiar concept – it’s polymorphism. But don’t get too carried away. It’s important to understand that:

Overloaded functions are chosen at compile time

In all the cases above, which function is called is determined statically not dynamically, at compile-time not run-time. This means it’s determined by the type of the variable not what the variable points to:

class C { }
class D: C { }

// function that takes the base class
func f(c: C) { print("C") }
// function that takes the child class
func f(d: D) { print("D") }

// the object x references is of type D,
// but x is of type C
let x: C = D()

// which function to call is determined by the
// type of x, and not what it points to
f(x)    // Prints "C" _not_ "D"

// the same goes for protocols...
protocol P { }
struct S: P { }

func f(s: S) { print("S") }
func f(p: P) { print("P") }

let p: P = S()

// despite p pointing to an object of type S,
// the version that takes a P will be called:
f(p)    // Prints "P" _not_ "S"

If instead you need run-time polymorphism – that is, you want the function to be picked based on what a variable points to and not what the type of the variable is – you should be using methods not functions:

class C {
    func f() { print("C") }
}

class D: C {
    override func f() { print("D") }
}

// function that takes the child class

// the object x references is of type D,
// but x is of type C:
let x: C = D()

// which method to call is determined by 
// what x points to, and not what type it is
x.f()    // Prints "D" _not_ "C"

The two approaches aren’t mutually exclusive, mind you. You could for example overload a function to take a particular protocol, checking for protocol conformance statically, but then invoke a method on that protocol to get dynamic behaviour.

The downside to that approach is that:

Functions that take Protocols can be ambiguous

With protocols, it’s possible to use multiple inheritance to generate ambiguous situations.

protocol P { }
protocol Q { }

struct S: P, Q { }
let s = S()

func f(p: P) { print("P") }
func f(p: Q) { print("Q") }

// error: argument could be either a P or a Q
f(s)

// so you need to disambiguate, either:
f(s as P)  // prints P

// or:
let q: Q = S()
f(q)       // prints Q

// the same happens with classes vs protocols:
class C  { }
class D: C, P { }

func f(c: C) { print("C") }

// error: argument could be either a C or a P
f(D())
// you must specify, either:
f(D() as P)
// or:
f(D() as C)

Defaulting to ClosedInterval for Integers

So, knowing now that a function that takes a struct will be chosen ahead of a function that takes a protocol, we should be able to write a version of the ... operator that takes a struct (Int), that will beat the current versions that take protocols (Comparable to create a ClosedInterval, and ForwardIndex to create a Range):

func ...(start: Int, end: Int) -> ClosedInterval<Int> {
    return ClosedInterval(start, end)
}

// x is now of type ClosedInterval
let x = 1...5

This is not very appealing though. If you pass in other integer types, you still get a Range:

let i: Int32 = 1, j: Int32 = 5

// r will be of type Range still:
let r = i...j

It would be much better to write a generic version that took an IntegerType.

And it still doesn’t explain why ranges are preferred to closed intervals. Comparable and ForwardIndex don’t inherit from each other. Why isn’t it ambiguous, as in our protocol multiple inheritance example above?

Well, because I was lying when I said the current implementations of ... took a protocol. They actually take generic arguments constrained by protocols. The next article will look at how generic versions of functions fit into the matching criteria.

Which function does Swift call? Part 1: Return Values

Posted on September 20, 2014September 24, 2014 by airspeedvelocity

ClosedInterval is shy. You have to coax it out from behind its friend, Range.

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

// if you use doubles though, you don’t need
// to declare the type explicitly.
// floatingpoint_interval is a ClosedInterval<Double>
let floatingpoint_interval = 1.0...5.0

Ever since Swift 1.0 beta 5, Range has supposed to be only for representing collection index ranges. If you’re not operating on indices, ClosedInterval is probably what you want. It has methods like contains, which determines in constant time if an interval contains a value. Range can only use the non-member contains algorithm, which will turn your range into a sequence and iterate over it – don’t accidentally put a loop in your loop!

But Range is not going quietly. If you use the ... operator with an integer, it elbows ClosedRange out the way and dashes onto the stage. Why? Because integers are forward indexes (as used by Array), and the Swift ... operator has 3 declarations:

func ...<T : Comparable>(start: T, end: T) -> ClosedInterval<T>

func ...<Pos : ForwardIndexType>(minimum: Pos, maximum: Pos) -> Range<Pos>

func ...<Pos : ForwardIndexType where Pos : Comparable>(start: Pos, end: Pos) -> Range<Pos>

When you write let r = 1...5, Swift calls the last one of these, and you get back a Range. To understand why, we need to run through the different ways Swift decides which overloaded function to call. There are a lot of them.

Let’s start with:

Differing Return Values

In Swift, you can’t declare the exact same function twice:

func f() {  }

// error: Invalid redeclaration of f()
func f() {  }

What you can do in Swift, unlike in some other languages, is define two versions of a function that differ only by their return type:

struct A { }
struct B { }

func f() -> A { return A() }

// this second f is fine, even though it only differs
// by what it returns:
func f() -> B { return B() } 

// note, typealiases are just aliases not different types,
// so you can't do this:
typealias L = A
// error: Invalid redeclaration of 'f()'
func f() -> L { return A() }

When calling these only-differing-by-return-value functions, you need to give Swift enough information at the call site for it to unambiguously determine which one to call:

// error: Ambiguous use of 'f'
let x = f()

// instead you must specify which return value you want:
let x: A = f()    // calls the A-returning version
                  // x is of type A

// or if you prefer this syntax:
let y = f() as B  // calls the B-returning version
                  // y is of type B

// or, maybe you're passing it in as the argument to a function:
func takesA(a: A) { }

// the A-returning version of f will be called
takesA(f())

Finally, if you want declare a reference to a function f, you need to specify the full type of the function. Note, once you have assigned it, the reference is to a specific function. It doesn’t need further disambiguation:

// g will be a reference to the A-returning version
let g: ()->A = f

// h will be a reference to the B-returning version
let h = f as ()->B

// calling g doesn't need further info, it references 
// a specific f, and z will be of type A:
let z = g()

OK, so ... is a function that differs by return type – it returns either a ClosedInterval or a Range. By explicitly specifying the result needs to be a ClosedInterval, we can force Swift to call the function that returns one.

But if we don’t specify which ... explicitly, we don’t get an error about ambiguity as seen above. Instead, Swift picks the Range version by default. How come?

Because the different versions of ... also differ by the arguments they can take. In the next article, we’ll take a look at how that works.

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

Airspeed Velocity

African or European Swift?

swift

Changes to the Swift Standard Library in 1.2 beta 1

More fun with implicitly wrapped non-optionals

Equating Optionals

Comparing Optionals

Optional Subscripts

Full imlementation of `RubbishDictionary`

zipWith, pipe forward, and treating functions like objects

Which function does Swift call? Part 6: Tinkering with priorities

Which function does Swift call? Part 5: Range vs Interval

Which function does Swift call? Part 4: Generics

Which function does Swift call? Part 3: Protocol Composition

Changes to the Swift Standard Library in 1.1 beta 3

Which function does Swift call? Part 2: Single Arguments

Which function does Swift call? Part 1: Return Values

Equating Optionals

Comparing Optionals

Optional Subscripts

Full imlementation of RubbishDictionary

Full imlementation of `RubbishDictionary`