Which function does Swift call? Part 6: Tinkering with priorities

This is part of a series of posts on how Swift resolves ambiguity in overloaded functions. You can find Part 1 here. In the previous article, we looked at why you get a Range from the ... by default. The answer? Because Range has some extra validation that relies on the ... input being comparable, and this makes the Range version more specific.

Defaulting to Neither

So, what if you didn’t want Range to be the default, but still wanted to benefit from this extra validation?

If your goal was to keep the ambiguity, and force the user to be explicit about whether they wanted a Range or a ClosedInterval, you could declare the following:

func ...
  <T: Comparable where T: ForwardIndexType>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

Now, unless you specify explicitly what type you want, you’ll get an ambiguous usage error when you use the ... operator.
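For example (the exact wording of the error may vary between compiler versions, but the shape of it is this):

// error: ambiguous use of operator '...'
let ambiguous = 1...5

// being explicit either way resolves it:
let r: Range = 1...5
let i: ClosedInterval = 1...5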

(It might make you a bit nervous to see ForwardIndexType in the declaration of a function for creating intervals, given intervals have nothing to do with indices. But don’t worry, it’s really only there to carve out an identical domain to the equivalent Range function to force the ambiguity for the same set of possible inputs.)

Defaulting to ClosedInterval

What if you actually wanted ClosedInterval to be the default? This is a little trickier.

If you just wanted to cover the most common case, integers, then you could do it like so:

func ...<T: IntegerType>(start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

// x will now be a ClosedInterval
let x = 1...5

This works because IntegerType conforms to RandomAccessIndexType (which eventually conforms to ForwardIndexType).

Now, this would be a very specific carve-out for integers. You could leave it there, because integers are a special case, being as how they’re so fundamental and it’s a bit odd that they’re defined as an index type too.

But if you wanted an interval for all the other types that are both comparable and an index, you’d need to handle each one on a case-by-case basis. For example, string indices are comparable:

// note, this doesn’t even have to be generic, since Strings aren’t generic
func ...(start: String.Index, end: String.Index) -> ClosedInterval<String.Index> {
    return ClosedInterval(start, end)
}

let s = "hello"

// y will now be a ClosedInterval
let y = s.startIndex...s.endIndex

Two problems become apparent with this.

First, this gets real old real quickly. Every time you create a new index type, you have to implement a whole new ... function for it.

Second, even then it’s still out of your control. You want other people to be able to create new index and comparable types and use them with ranges and intervals, and they aren’t necessarily going to do this. So instead of a nice predictable “you get a range by default”, we now have “you’ll probably get an interval, so long as the type has the appropriate ... overload”. That doesn’t sound good at all.

Your best bet at this point would be to force anyone who wants their type to be usable with an interval to tag it with a particular protocol. This is what stride does – types have to conform to the Strideable protocol. So let’s try defining an equivalent for ClosedInterval. We’ll call it, uhm, Intervalable.

protocol Intervalable: Equatable { }

extension Int: Intervalable { }

extension String.Index: Intervalable { }

// The following definitions of ... would need
// to REPLACE the current definitions.  So you
// can only do this if you're the author of
// ClosedInterval

func ...<T: Intervalable>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

func ...<T: Intervalable where T: ForwardIndexType>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

// x will be a ClosedInterval
let x = 1...5

let s = "hello"

// y will be a ClosedInterval
let y = s.startIndex...s.endIndex

Of course, this is only something the author of the original type can really implement, not something you can do yourself if you personally prefer the default to be the other way around.

And it’s a bit heavy-handed to force every type to conform to a protocol when really all it needs is to be comparable for the interval to work. But I can’t think of another option (even for the implementor) right now.1 If anyone else can think of a good solution, let me know.


  1. except possibly tinkering with the Comparable hierarchy, which is even more restrictive since only Apple could do that. 

Which function does Swift call? Part 5: Range vs Interval

This is part of a series of posts on how Swift resolves ambiguity in overloaded functions. You can find Part 1 here. In the previous article, we looked at how generics are handled.

So finally…

Now that we’ve covered how generics fit in, we can finally answer the original question – how come you get a Range back from 1...5 by default? To recap the sample code:

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

// if you use doubles though, you don’t need
// to declare the type explicitly.
// floatingpoint_interval is a ClosedInterval<Double>
let floatingpoint_interval = 1.0...5.0

The infix ... function is defined three times in the standard library, with the following signatures:

func ...<T : Comparable>
  (start: T, end: T) -> ClosedInterval<T>

func ...<Pos : ForwardIndexType>
  (minimum: Pos, maximum: Pos) -> Range<Pos>

func ...<Pos : ForwardIndexType where Pos : Comparable>
  (start: Pos, end: Pos) -> Range<Pos>

The first two are pretty similar. They are both generic, and have one generic placeholder that is constrained by a single protocol. There’s nothing favouring one of the two constraints – Comparable and ForwardIndexType both descend from Equatable, but neither is a descendant of the other. Left with just those two functions, you’d get an ambiguous call error.

It’s the third function above that is what makes Swift pick Range over ClosedInterval. In it, Pos is constrained not just by ForwardIndexType but also by Comparable. Constraining by two protocols wins over one, so this is the overload that’s picked.

Well, after 4 long lead-up posts, that was a bit of an anticlimax, eh?

So let’s ask another question – why is that last function there?

One possible explanation: the version with two protocol constraints was implemented purely to break the tie, since ambiguous call errors are annoying. Creating a Range doesn’t actually require Comparable, but adding the Comparable requirement has the effect of favouring ranges over intervals. If this was the desired outcome, that could be all the extra ... is there for.

But that’s probably not why. There’s another reason why there’s an overload that requires Comparable.

Factory Functions

A recurring pattern in the Swift standard library is to use free functions as factories to construct different types depending on the function name or arguments.

For example, the lazy function is defined 4 times for 4 different argument types: once each for random-access, bidirectional, and forward collections; and once for sequences.1 Each one returns a different matching lazy type (e.g. LazyRandomAccessCollection). When you call lazy, the compiler picks the most specific match (in the order just given). When you combine this with type inference, you get the most powerful type available declared for you, all determined at compile time.
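From memory, their declarations look roughly like this (the exact constraints in the standard library may differ slightly):

func lazy<S: SequenceType>(s: S) -> LazySequence<S>

func lazy<S: CollectionType where S.Index: ForwardIndexType>
  (s: S) -> LazyForwardCollection<S>

func lazy<S: CollectionType where S.Index: BidirectionalIndexType>
  (s: S) -> LazyBidirectionalCollection<S>

func lazy<S: CollectionType where S.Index: RandomAccessIndexType>
  (s: S) -> LazyRandomAccessCollection<S>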

// a will be a LazyRandomAccessCollection
// since arrays are random access
let a = lazy([1,2,3,4])

// s will be a LazyBidirectionalCollection,
// since strings can't be indexed randomly
let s = lazy("hello")

// r will be a LazySequence, since StrideTo
// isn't a collection, just a sequence
let r = lazy(stride(from: 1, to: 8, by: 2))

// there's nothing stopping you declaring these
// lazy objects manually directly:
let l = LazyRandomAccessCollection([1,2,3,4])

Each version of lazy might not even do anything other than initialize and return the relevant type – it just makes declaring different lazy types more convenient than giving the relevant class name in full. All this is made possible by the overloading priorities described in this series. Most everyday users of lazy don’t need to know any of this though – they just use the function and it does what you’d intuitively guess was the “right” thing.

In some cases, which factory function to call is controlled more directly by the caller. For example, the stride function creates either a StrideTo or a StrideThrough depending on whether the named middle argument is to: or through:.

// x will be a StrideThrough
let x = stride(from: 1, through: 10, by: 5)

// y will be a StrideTo
let y = stride(from: 1, to: 10, by: 5)

In the case of stride, this is the only way you can construct these objects, as their initializers are private (which means stride can call them since it’s declared inside Swift, but you can’t).

Using Factory Functions to Perform Validation

For Range, the ... operator performs two extra tasks on top of constructing the return value. First, if the input supports comparators, it can validate that the begin and end arguments aren’t inverted. If they are, you get a runtime assertion. You only get this validation when using the ... operator. If you try and declare a range like this: Range(start: 5, end: 1), it’ll work.2
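For example (the exact trap message will differ, this is just the shape of it):

// the operator checks the bounds at runtime:
// let bad = 5...1
// fatal error: Can't form Range with end < start

// but the plain initializer doesn't check, so this "works":
let backwards = Range(start: 5, end: 1)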

By the way, you’ll even get this check with String ranges. Swift string indices aren’t random-access (because of variable-length characters), but they are comparable. While you can’t know how many characters are between two indices, you can know that one index comes before another.

The other purpose of the ... function is to normalize closed ranges into half-open ranges. Range is kind of the opposite way around to ClosedInterval and HalfOpenInterval. There is no closed version of Range, ranges are always half-open. When you call ..., it increments the second parameter to make it equivalent to a half-open range. If you type 1...5 into a playground, you’ll see what is actually created is displayed as 1..<6.

This is why you'll get a runtime assertion if you ever try and construct a closed range through the end index of a string, even though in theory it could be a valid thing to do:

let s = "hello"
let start = s.startIndex
let end = s.endIndex

// fatal error: can not increment endIndex
let r = start...end

// same as if you did this:
end.successor()

Could you put this logic into Range.init? In the case of the half-open conversion, maybe you could, by having two versions of init with to: and through: arguments. But it's cleaner to do it using operators.

(You could maybe say the same about stride, but then you'd need custom ternary operators.3)
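Here’s a hypothetical sketch of what those initializers might look like – the argument labels and the extension itself are invented purely for illustration:

extension Range {
    // closed form: normalize by bumping the upper bound
    init(from start: T, through end: T) {
        self.init(start: start, end: end.successor())
    }
    // half-open form: pass straight through
    init(from start: T, to end: T) {
        self.init(start: start, end: end)
    }
}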

But in the case of the inversion check that uses comparable, you'd have to make a generic version of init, maybe something like this:

extension Range {
    init<C: Comparable>(start: C, end: C) {
        assert(start <= end, "Can't form Range with end < start")
        self.init(start: start, end: end)
    }
}

Well, this will compile but it won't do what you want. In fact your init will never be called, even if you pass in a comparable type. Why? Because it's generic, and the regular version of Range.init is not. As we've seen previously, generic functions never get called when there's a possible non-generic overload.

“Hey, no wait, the current Range.init is so generic!” you complain. “Look:”

// Excerpt from the Swift definition of Range
struct Range : Equatable ...etc {
    /// Construct a range with `startIndex == start` and `endIndex ==
    /// end`.
    init(start: T, end: T)
}

“See? T is a generic placeholder! And we’re putting more constraints on our function’s placeholder, so it should be the one that’s called.”

OK yes, T is a generic placeholder. But it’s not a placeholder for the function. It’s a placeholder for the struct. In the context of the function, T is already fixed in place as a specific type.4 To help understand this, try the following code:

struct S<T> {
    func f(i: Int) { print("f(Int)") }
    func f(t: T) { print("f(T)") }

    func g(i: Int) { print("g(Int)") }

    // note, here scoping rules mean the placeholder 
    // T is a _different_ T to the struct's T
    func g<T: IntegerType>(t: T) { print("g(T)") }
}

// fix T to be an Int
let s = S<Int>()

// error: Ambiguous use of f
s.f(1)

// prints "g(Int)" not "g(T)"
// because generics lose...
s.g(1)

Incidentally, it’s stuff like the potentially confusing scoping of T above (or that placeholders might get randomly renamed) that makes me nervous about re-using a struct’s placeholders when extending that struct. It doesn’t feel right. Usually, nicely-written generic classes typealias their placeholders in some fashion. For example, Range typealiases T as Index. Probably best to use that.
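For example, a made-up little extension on Range can use Index rather than naming T directly:

extension Range {
    // Index is Range's typealias for its placeholder T
    var lowerBound: Index {
        return startIndex
    }
}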

Anyway, for Range, declaring a generic function ..., that doesn’t have to compete with any non-generic alternatives, makes these problems go away. Plus operators look nice.

Next up: what if we didn’t want Range to be the default? What could we do about it?


  1. For more on Swift’s lazy types, see this article 
  2. well, depending on your definition of “work” I guess. 
  3. If you want to see how these are possible (if perhaps inadvisable) Nate Cook recently wrote a good article about them. 
  4. If you really want to tie your brain in knots, try to think about a scenario where the struct placeholder is fixed by which init is chosen, which relies on what T is… 

Which function does Swift call? Part 4: Generics

This is part 4 of a series on how overloaded functions in Swift are chosen. Part 1 covered overloading by return type, part 2 was about how different functions with simple single arguments were picked on a “best match” basis, and part 3 went into a little more detail about protocols.

We started the series with the question, why do you get a Range type back from 1...5 instead of a ClosedInterval? Today we cover how generics fit into the matching hierarchy, and we’ll finally have enough information to answer that question.

Generics are Lower Priority

Generics are lower down the pecking order. Remember, Swift likes to be as “specific” as possible, and generics are less specific. Functions with non-generic arguments (even ones that are protocols) are always preferred over generic ones:

// This f will match a single argument of any type,
// but is very low priority
func f<T>(t: T) { 
    print("T") 
}

// This f takes a specific implemented type, so will be
// preferred over anything else when passed a String.
func f(s: String) {
    print("String")
}

// This f takes a specific protocol, so would be
// preferred over the generic version (though not
// over a version of f that took an actual Bool)
func f(b: BooleanType) {
    print("BooleanType")
}

// this prints "String"
f("wotcha")

// this prints "BooleanType"
f(true)

// this prints "T"
f(1)

OK that’s simple enough. But once we’re in generic territory, there’s a bunch of additional rules for picking one generic function over another.

Making Generic Placeholders More Specific with Constraints

Within a type parameter list, you can apply various constraints to your generic parameters. You can require that generic placeholders conform to protocols, and that their associated types either be equal to a certain type or conform to a protocol.

Just like before, the rule of thumb is: the more specific you make a function, the more likely it is to be picked in overload resolution.

This means a generic function with a placeholder that has a constraint (such as conforming to a protocol) will be picked over one that doesn’t:

func f<T>(t: T) {
    print("T")
}

func f<T: IntegerType>(t: T) {
    print("T: IntegerType")
}

// prints "T: IntegerType"
f(1)

If the choice is between two functions that both have constraints on their placeholder, and one is more specific than the other, the more specific one is picked. These rules follow a familiar pattern – they mirror the rules for non-generic overloading.

For example, when one protocol inherits from another, the inheriting protocol constraint is preferred:

func g<T: IntegerType>(t: T) {
    print("IntegerType")
}

func g<T: SignedIntegerType>(t: T) {
    print("SignedIntegerType")
}

// prints "SignedIntegerType"
g(1)

Or if the choice is between adhering to one protocol or two, two is preferred:

func f<B: BooleanType>(b: B) {
    print("BooleanType")
}

func f<B: protocol<BooleanType,Printable>>(b: B) {
    print("BooleanType and Printable")
}

// prints "BooleanType and Printable"
f(true)

The where clause introduces the ability to constrain placeholders by their associated types. These additional constraints make the match more specific, bumping it up the priority:

// argument must be an IntegerType
func f<T: IntegerType>(t: T) {
    print("IntegerType")
}

// argument must be an IntegerType 
// AND its Distance must be an IntegerType
func f<T: IntegerType where T.Distance: IntegerType>(t: T) {
    print("T.Distance: IntegerType")
}

// prints "T.Distance: IntegerType"
f(1)

Here, the extra where clause is analogous to a regular argument needing to conform to two protocols.

Swift will also distinguish between two where clauses, one of which is more specific than another. For example, requiring an associated type to conform to an inheriting protocol:

func f<T: IntegerType where T.Distance: IntegerType>(t: T) {
    print("T.Distance: IntegerType")
}

// where T.Distance: SignedIntegerType is more specific than
// where just T.Distance: IntegerType
func f<T: IntegerType where T.Distance: SignedIntegerType>(t: T) {
    print("T.Distance: SignedIntegerType")
}

// so this prints "T.Distance: SignedIntegerType"
f(1)

Where clauses also allow you to check a value is equal to a specific type, not just that it conforms to a protocol:

func f<T: IntegerType where T.Distance == Int>(t: T) {
    print("Distance == Int")
}

func f<T: IntegerType where T.Distance: IntegerType>(t: T) {
    print("Distance: IntegerType")
}

// prints "Distance == Int"
f(1)

It isn’t surprising that the version that specifies Distance is equal to an exact type is more specific than the version that just requires it conforms to a protocol. This is similar to the rule for non-generic functions that a specific type as an argument is preferred over a protocol.

One interesting point about checking equality in the where clause: you can check for equality to a protocol. But it probably isn’t what you want – equality to a protocol is not the same as conforming to a protocol:

// create a protocol that requires the
// implementor defines an associated type
protocol P {
    typealias L
}

// implement P and choose Ints as the
// associated type
struct S: P {
    typealias L = Int
}
let s = S()

// Ints are printable so this will match
func f<T: P where T.L: Printable>(t: T) {
    print("T.L: Printable")
}

// == is more specific than :, so this will
// be the one that's called, right?
func f<T: P where T.L == Printable>(t: T) {
    print("T.L == Printable")
}

// nope! prints "T.L: Printable"
f(s)

// here is a struct where L really is == Printable
struct S2: P {
    typealias L = Printable
}
let s2 = S2()

// and this time, yes, it prints "T.L == Printable"
f(s2)

It’s actually pretty hard to do this by accident – you can only check for type equality if the protocol you’re equating to has no Self or associated type requirements (i.e. the protocol doesn’t require a typealias or use Self as a function argument), same as you can’t use those protocols as non-generic arguments. This rules out most of the protocols in the standard library (Printable is one of only a handful that doesn’t).
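To illustrate (the error wording is roughly what the compiler gives you):

// Printable has no Self or associated type requirements,
// so it can be used directly as an argument type:
func show(p: Printable) { print(p.description) }

// but P above requires a typealias, so it can only appear
// as a generic constraint – this won't compile:
// func show(p: P) { }
// error: protocol 'P' can only be used as a generic constraint
// because it has Self or associated type requirements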

Beware the Unexpected Overload

So, if given a choice between a generic overload and a non-generic one, Swift chooses the non-generic one. If choosing between two generic functions, there’s a set of rules that are very similar to the ones for choosing between non-generic functions.

It’s not a perfect parallel, though – there are some differences. For example, if given a choice between an inheriting constraint and more constraints (“depth versus breadth”), unlike with non-generic protocols, Swift will actually choose breadth:

protocol P { }
protocol Q { }

protocol A: P { }

struct S: A, Q { }
let s = S()

// this f is for an inherited protocol
// (deeper than just P)
func f(p: A) {
    print("A")
}

// this f is for more protocols
// (broader than just P)
func f(p: protocol<P, Q>) {
    print("P and Q")
}

// error: Ambiguous use of 'f'
f(s)

// however, if we define a similar situation
// with generics instead:

func g<T: A>(t: T) {
    print("T: A")
}

func g<T: protocol<P, Q>>(t: T) {
    print("T: P and Q")
}

// the second one will be picked
// prints "T: P and Q"
g(s)

Let’s see that example again with some real-world types from the standard library:

func f<T: SignedIntegerType>(t: T) {
    print("SignedIntegerType")
}

// IntegerType is less specific than SignedIntegerType,
// but breadth beat depth last time, so this one should be called:
func f<T: protocol<IntegerType, Printable>>(t: T) {
    print("T: IntegerType and Printable")
}

// prints "SignedIntegerType"
// wait, what?
f(1)

If the example with P and Q worked one way, why the difference with SignedIntegerType and Printable?

Well, turns out IntegerType itself conforms to Printable (by way of _IntegerType). Since it already conforms to it, the Printable in the protocol<> composition is redundant and can be ignored – it doesn’t make that version of f any more specific than T just conforming to IntegerType. Since SignedIntegerType inherits from IntegerType, it is more specific, so that’s the one that gets picked.

I point this out not to be fussy about the hierarchy of standard library protocols, but to show how easy it is to get an unexpected version of an overloaded function. For this reason, it’s a good rule to never overload a function with different functionality. Overload for performance optimization (like a collection algorithm with a faster path for more capable types), or to extend a function to cover your own new type, or to provide the caller with a more powerful but basically equivalent type (like with Range and ClosedInterval). But overload to vary functionality based on the input and sooner or later you’ll be sorry.

Anyway, with that little mini-lecture out of the way, we finally have enough of the details of overloading to answer the original question – why does Range get picked? We’ll cover that in the next article.

Which function does Swift call? Part 3: Protocol Composition

Previously, we looked at how Swift functions can be overloaded just by return type, and how Swift picks between different possible overloads based on a best-match mechanism.

All this was working towards the reason why 1...5 returns a Range rather than a ClosedInterval. And to understand that, we’ll have to look at how generics fit into the matching criteria.

But before we do, a brief diversion into protocol composition.

Protocols can be composed by putting zero1 or more protocols between the angle brackets of a protocol<> statement. The Apple Swift book says:

NOTE
Protocol compositions do not define a new, permanent protocol type. Rather, they define a temporary local protocol that has the combined requirements of all protocols in the composition.

Reading that, you might think that declaring protocol<P,Q> was essentially like declaring an anonymous protocol Tmp: P, Q { }, and then using Tmp, without ever giving it a name. But that’s not quite what this means, and a way to spot the difference is by declaring functions that take arguments declared with protocol<>.

First, a recap of some of the behaviours of regular protocols. Suppose we define 4 protocols: first P and Q, and then two more, A and B that both just conform to P and Q:

protocol P { }
protocol Q { }

protocol A: P, Q { }
protocol B: P, Q { }

Although A and B are functionally identical, they are two different protocols. This means you can write two functions, one that takes A and one that takes B, and they’ll be two different functions with equal priority, so if you try to call them and either one would work, you’ll get an ambiguous call error:

struct S: A, B { }
let s = S()

func f(a: A) {
    print("A")
}

func f(b: B) {
    print("B")
}

// error: Ambiguous use of 'f'
f(s)

If, on the other hand, we use protocol<P,Q> instead of B, we get a different result:

func g(a: A) {
    print("A")
}

func g(p: protocol<P,Q>) {
    print("protocol<P,Q>")
}

// prints "A"
g(s)

If protocol<P,Q> were really declaring a new anonymous protocol that conformed to P and Q, we’d get the same error as before. Instead, it’s like saying “an argument that conforms to both P and Q”.

As we saw in the last article, an inherited protocol (in this case A) always trumps its ancestors (in this case P and Q). So the A version of g is called. This is consistent with our rule of thumb, that Swift will always pick the function with more “specific” arguments – here, inheriting protocols are more specific than inherited ones.

Another way to be more specific is to require more protocols. Here’s an example of how three protocols is more specific than two:

protocol R { }
extension S: R { }

// you can assign an alias for any
// protocol<> definition if you prefer
typealias Two = protocol<P,Q>
typealias Three = protocol<P,Q,R>

func h(p: Two) {
    print("Two")
}

func h(p: Three) {
    print("Three")
}

// prints "Three"
h(s)

// you can still force the other to be
// called if you want: this prints "Two"
h(s as Two)

Finally, if given a choice between an inheriting protocol, or more protocols (i.e. depth vs breadth), Swift won’t favour one over the other:

func o(p: protocol<A,B>) {
    print("A and B")
}

func o(p: protocol<P,Q,R>) {
    print("P, Q and R")
}

// error: Ambiguous use of 'o'
o(s)

// prints "A and B"
o(s as protocol<A,B>)
// prints "P, Q and R"
o(s as protocol<P,Q,R>)

One of the conclusions here is that there are quite a few possible ways for you to declare functions that take the same argument type, and to tweak the declarations to favour one function being called over another. When you combine these with different return values, that gives you a tool to nudge Swift into inferring a specific type by default, but allowing the user to get a different one from the same function name if they prefer, while avoiding forcing the user to deal with ambiguity errors. This is what is happening with ... when it defaults to Range.

So now, on to generics, which will take this even further.


  1. zero being an interesting case we’ll look at some other time 

Changes to the Swift Standard Library in 1.1 beta 3

The biggest deal in this latest beta is the documentation. There are 2,745 new lines of /// comments in beta 3, and even pre-existing documentation for most items has been revised.

That doesn’t mean there aren’t any functional changes to the library, though. Here’s a rundown of what I could find:

  • As mentioned in the release notes, the various literal convertible protocols are now implemented as inits rather than static functions.
  • The ArrayBoundType protocol is gone, and the various integer types that conformed to it no longer do.
  • The CharacterLiteralConvertible protocol is gone. Character never appeared to conform to it – it conformed to ExtendedGraphemeClusterLiteralConvertible which remains.
  • The StringElementType protocol is gone, and UInt8 and UInt16 no longer conform to it. The UTF16.copy method, that relied on it to do some pointer-based manipulation, is also gone.
  • Dictionary’s initializer that took a minimumCapacity argument no longer has a default for that minimum capacity. It’s interesting that this worked previously, since if you did the same thing with your own type (i.e. gave it both an init() and an init(arg: Int = 0)) you’d get a compiler error when you tried to actually use init() (see the sketch after this list).
  • In extending RandomAccessIndexType, integers now use their Distance typealias for Int rather than straight Int.
  • The static from() methods on integers are all gone.
  • There’s now an EnumerateSequence that returns an EnumerateGenerator, and enumerate() returns that rather than the generator.
  • The various integer types no longer have initializers from Builtin.SomeType.
  • The _CocoaArrayType protocol (as used to initialize an Array from a Cocoa array) has been renamed to _SwiftNSArrayRequiredOverridesType.
  • Numerous _CocoaXXX and _SwiftXXX protocols have acquired an @objc attribute.
  • _PrintableNSObjectType (which wasn’t explicitly implemented by anything in the std lib) is gone.
  • The AssertStringType and StaticStringType protocols are gone, as is AssertString. StaticString still exists and is clearly described as a static string that can be known at compile-time.
  • assert and precondition are much simplified with those string types removed, leaving one version each that just uses String for its message parameter. assertionFailure, preconditionFailure and fatalError all take a String for their message now as well, though they still take StaticString for the filename argument as described in the Swift blog’s article.
  • There are two new sort functions that operate on a ContiguousArray (one for arrays of comparable elements, and one that takes a comparison predicate).
  • But there are two fewer sorted functions. The ones that take a MutableCollectionType, and return one, are gone. Seems a shame, though it didn’t really make sense for them to take a mutable type given they didn’t need to mutate it. Maybe they’ll be replaced with versions that return ExtensibleCollectionType.
  • There’s a new unsafeDowncast that is equivalent to x as T but with the safeties removed – to be used only when using as is causing performance problems.
  • Raise a glass for the snarky “Haskell’s fmap, which was mis-named” comment, which is now replaced by a very straight-laced description of what Optional.map actually does.
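To illustrate the point about defaulted initializers clashing – a minimal sketch, not the actual Dictionary code, and the exact error wording may differ:

struct Cache {
    init() { }
    init(minimumCapacity: Int = 0) { }
}

// error: ambiguous use of 'init()'
let c = Cache()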

In keeping with the documenting theme, there are a lot of argument name changes here as well. Continuing a trend seen in previous betas, specific, descriptive argument names are preferred over more generic ones. This is probably a good Swift style tip for your own code, especially if you’re also writing libraries.

Some examples:

  • Naming arguments in protocols instead of just using ‘_’, e.g. func distanceTo(other: Self) instead of func distanceTo(_: Self) (there’s no compulsion to use the same names for your method arguments as the protocol does, but you probably should)
  • Avoiding just naming the variable after the type (e.g. sequence: Sequence). For example, Array.extend has renamed its argument to newElements.
  • Replacing newValues with newElements in various places, presumably because of the unintended Swift implications of the term “values”.
  • Avoiding single letters like i or v, e.g. subscript(position: Index) instead of subscript(i: Index), and init(_ other : Float) instead of init(_ v : Float).
  • Descriptive names for predicate arguments, such as isOrderedBefore rather than just pred.

Rather than me regurgitate the comments here, I’d suggest option-clicking import Swift and reading through it in full. There’s lots of informative stuff in there. Major types like Array and String have long paragraphs in front of them with some good clarifications. Many things are explained that you’d otherwise only have picked up on by watching the developer forums like a hawk.1

Here are a few items of specific interest:

The comment above GeneratorType has been revised. The suggestion that if using a sequence multiple times the algorithm “should probably require CollectionType, since CollectionType implies multi-pass” is gone. Instead it states that “any code that uses multiple generators (or for ... in loops) over a single sequence should have static knowledge that the specific sequence is multi-pass”. The most common case of having that static knowledge is that you got the sequence from a collection, I guess.

Above Slice we find “Warning: Long-term storage of Slice instances is discouraged”. It makes it pretty clear Slice is mainly there to help pass around subranges of arrays, rather than being a standalone type in its own right. They give you the value type behaviour you’d get from creating a new array (i.e. if a value in the original array is changed, the value in the slice won’t) while allowing you to get the performance benefits of copy-on-write for a sub-range of an existing array.

Several of the _SomeProto versions of protocols now have an explicit comment along the lines of the previously implied “this protocol is an implementation detail of SomeProto; do not use it directly”. I’m still not sure why these protocols are written in this way, possibly as a workaround for some compiler issues? If you have an idea, let me know.

edit: a post on the dev forum confirms it’s a temporary workaround for a compiler limitation related to generics, though it doesn’t get specific about what limitation. Thanks to @anatomisation and @ivicamil for the link.

The reason for ContiguousArray’s existence is finally made clear. It’s there specifically for better performance when holding references to classes (not value types). There’s also a comment above the various collection withUnsafeBufferPointer methods suggesting you could use them when the optimizer fails to eliminate bounds checks automatically, in a performance-vs-safety trade-off.

Chris Lattner clarified on the dev forum that although this version is a GM, that’s in order to allow you to use it to submit apps to the mac store rather than because this is the final shipping version. We’ll likely see more changes to Xcode 6.1 before that, and with it maybe further changes to the standard lib.


  1. Or reading this blog, of course. Hopefully the comments don’t get too helpful, what would I write about then? 

Which function does Swift call? Part 2: Single Arguments

In the previous article, we saw how Swift allows overloading by just the return type. With these kind of overloads, explicitly stating the type tells Swift which function you want to call:

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

But that still doesn’t explain why you get a Range back by default if you don’t specify what kind of return type you want. To answer that, we need to look at function arguments. We’ll start with just single-argument functions.

Unlike with return values, functions with the same name that take different types for their arguments don’t always need explicit disambiguation. Swift will try its best to pick one for you based on various best-match criteria.

As a rule of thumb, Swift likes to pick the most “specific” function possible based on the arguments passed. Swift’s fave thing, the thing that beats out all other single arguments, is a function that takes the exact type you’re passing.

If it can’t find a function that takes that exact type, the next preferred option is an inherited class or protocol. Child classes or protocols are preferred over parents.

To see this in action:

protocol P { }
protocol Q: P { }

struct S: Q { }

// this is the function that will be picked
// if called with type S
func f(s: S) { print("S") }

// if that weren't defined, this would be the next
// choice for a call passing in an S, since Q is
// more specialized than P:
func f(p: Q) { print("Q") }

// and then finally this
func f(p: P) { print("P") }

f(S())  // prints S

class C: P { }
class D: C { }
class E: D { }

// this is the function that will be picked
func f(d: D) { print("D") }

// if that weren't defined, this would be the next choice
func f(c: C) { print("C") }

// if neither were defined, the version above for the 
// protocol P would be called

f(E())  // prints D

This prioritized overloading allows you to write more specialized functions for specific types. For example, suppose you want an overloaded version of contains for ClosedInterval that uses the fact that you can do the check in constant rather than linear time using comparators.
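A sketch of what that overload might look like – not actual library code, it just hands off to the member method:

// the interval's own contains does the constant-time check
func contains<T: Comparable>(interval: ClosedInterval<T>, x: T) -> Bool {
    return interval.contains(x)
}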

Well, that sounds like a familiar concept – it’s polymorphism. But don’t get too carried away. It’s important to understand that:

Overloaded functions are chosen at compile time

In all the cases above, which function is called is determined statically not dynamically, at compile-time not run-time. This means it’s determined by the type of the variable not what the variable points to:

class C { }
class D: C { }

// function that takes the base class
func f(c: C) { print("C") }
// function that takes the child class
func f(d: D) { print("D") }

// the object x references is of type D,
// but x is of type C
let x: C = D()

// which function to call is determined by the
// type of x, and not what it points to
f(x)    // Prints "C" _not_ "D"

// the same goes for protocols...
protocol P { }
struct S: P { }

func f(s: S) { print("S") }
func f(p: P) { print("P") }

let p: P = S()

// despite p pointing to an object of type S,
// the version that takes a P will be called:
f(p)    // Prints "P" _not_ "S"

If instead you need run-time polymorphism – that is, you want the function to be picked based on what a variable points to and not what the type of the variable is – you should be using methods not functions:

class C {
    func f() { print("C") }
}

class D: C {
    override func f() { print("D") }
}

// function that takes the child class

// the object x references is of type D,
// but x is of type C:
let x: C = D()

// which method to call is determined by 
// what x points to, and not what type it is
x.f()    // Prints "D" _not_ "C"

The two approaches aren’t mutually exclusive, mind you. You could for example overload a function to take a particular protocol, checking for protocol conformance statically, but then invoke a method on that protocol to get dynamic behaviour.
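Here’s a quick sketch of that combination – the protocol and types are invented for illustration:

protocol Describable {
    func describe() -> String
}

struct Point: Describable {
    func describe() -> String { return "a point" }
}

// the overload is chosen statically – anything Describable lands here –
// but describe() is dispatched dynamically on what d actually is
func show(d: Describable) {
    print(d.describe())
}

show(Point())  // prints "a point"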

The downside to that approach is that:

Functions that take Protocols can be ambiguous

With protocols, it’s possible to use multiple inheritance to generate ambiguous situations.

protocol P { }
protocol Q { }

struct S: P, Q { }
let s = S()

func f(p: P) { print("P") }
func f(p: Q) { print("Q") }

// error: argument could be either a P or a Q
f(s)

// so you need to disambiguate, either:
f(s as P)  // prints P

// or:
let q: Q = S()
f(q)       // prints Q

// the same happens with classes vs protocols:
class C  { }
class D: C, P { }

func f(c: C) { print("C") }

// error: argument could be either a C or a P
f(D())
// you must specify, either:
f(D() as P)
// or:
f(D() as C)

Defaulting to ClosedInterval for Integers

So, knowing now that a function that takes a struct will be chosen ahead of a function that takes a protocol, we should be able to write a version of the ... operator that takes a struct (Int), which will beat the current versions that take protocols (Comparable to create a ClosedInterval, and ForwardIndexType to create a Range):

func ...(start: Int, end: Int) -> ClosedInterval<Int> {
    return ClosedInterval(start, end)
}

// x is now of type ClosedInterval
let x = 1...5

This is not very appealing though. If you pass in other integer types, you still get a Range:

let i: Int32 = 1, j: Int32 = 5

// r will be of type Range still:
let r = i...j

It would be much better to write a generic version that took an IntegerType.
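Something along these lines – a sketch, and whether it actually wins out over the Range-producing versions is a question for the next couple of articles:

func ...<T: IntegerType>(start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}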

And it still doesn’t explain why ranges are preferred to closed intervals. Comparable and ForwardIndex don’t inherit from each other. Why isn’t it ambiguous, as in our protocol multiple inheritance example above?

Well, because I was lying when I said the current implementations of ... took a protocol. They actually take generic arguments constrained by protocols. The next article will look at how generic versions of functions fit into the matching criteria.

Which function does Swift call? Part 1: Return Values

ClosedInterval is shy. You have to coax it out from behind its friend, Range.

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

// if you use doubles though, you don’t need
// to declare the type explicitly.
// floatingpoint_interval is a ClosedInterval<Double>
let floatingpoint_interval = 1.0...5.0

Ever since Swift 1.0 beta 5, Range has been supposed to be only for representing collection index ranges. If you’re not operating on indices, ClosedInterval is probably what you want. It has methods like contains, which determines in constant time if an interval contains a value. Range can only use the non-member contains algorithm, which will turn your range into a sequence and iterate over it – don’t accidentally put a loop in your loop!
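To make that concrete (an illustrative snippet, not library code):

let interval: ClosedInterval = 1...1_000_000
let fast = interval.contains(999_999)  // constant time: two comparisons

let range = 1...1_000_000
let slow = contains(range, 999_999)    // linear time: iterates the range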

But Range is not going quietly. If you use the ... operator with an integer, it elbows ClosedInterval out of the way and dashes onto the stage. Why? Because integers are forward indexes (as used by Array), and the Swift ... operator has 3 declarations:

func ...<T : Comparable>(start: T, end: T) -> ClosedInterval<T>

func ...<Pos : ForwardIndexType>(minimum: Pos, maximum: Pos) -> Range<Pos>

func ...<Pos : ForwardIndexType where Pos : Comparable>(start: Pos, end: Pos) -> Range<Pos>

When you write let r = 1...5, Swift calls the last one of these, and you get back a Range. To understand why, we need to run through the different ways Swift decides which overloaded function to call. There are a lot of them.

Let’s start with:

Differing Return Values

In Swift, you can’t declare the exact same function twice:

func f() {  }

// error: Invalid redeclaration of f()
func f() {  }

What you can do in Swift, unlike in some other languages, is define two versions of a function that differ only by their return type:

struct A { }
struct B { }

func f() -> A { return A() }

// this second f is fine, even though it only differs
// by what it returns:
func f() -> B { return B() } 

// note, typealiases are just aliases not different types,
// so you can't do this:
typealias L = A
// error: Invalid redeclaration of 'f()'
func f() -> L { return A() }

When calling these only-differing-by-return-value functions, you need to give Swift enough information at the call site for it to unambiguously determine which one to call:

// error: Ambiguous use of 'f'
let x = f()

// instead you must specify which return value you want:
let x: A = f()    // calls the A-returning version
                  // x is of type A

// or if you prefer this syntax:
let y = f() as B  // calls the B-returning version
                  // y is of type B

// or, maybe you're passing it in as the argument to a function:
func takesA(a: A) { }

// the A-returning version of f will be called
takesA(f())

Finally, if you want to declare a reference to a function f, you need to specify the full type of the function. Note, once you have assigned it, the reference is to a specific function. It doesn’t need further disambiguation:

// g will be a reference to the A-returning version
let g: ()->A = f

// h will be a reference to the B-returning version
let h = f as ()->B

// calling g doesn't need further info, it references 
// a specific f, and z will be of type A:
let z = g()

OK, so ... is a function that differs by return type – it returns either a ClosedInterval or a Range. By explicitly specifying the result needs to be a ClosedInterval, we can force Swift to call the function that returns one.

But if we don’t specify which ... explicitly, we don’t get an error about ambiguity as seen above. Instead, Swift picks the Range version by default. How come?

Because the different versions of ... also differ by the arguments they can take. In the next article, we’ll take a look at how that works.

Changes to the Swift Standard Library in 1.1 beta 2

Less than a week after the last beta was released, Swift 1.1 beta 2 is up. And yes, it’s officially Swift 1.1, as displayed by running swift -v from the command line.

Presumably we’re gearing up for a new GM alongside Yosemite, as there are almost no changes to the standard library in this release, much like when the GM was almost upon us for 6.0. Not surprising, since the Swift team had already confirmed on the dev forums that 6.1 was going to be a fairly small release.

The only material changes are to the RawRepresentable protocol, and to the _RawOptionSetType protocol which inherits from it:

  • RawRepresentable’s Raw typealias is now RawValue.
  • Instead of a toRaw function, it now has a read-only rawValue property.
  • Instead of a fromRaw class function, it now defines an init? that takes a raw value.
  • _RawOptionSetType also now defines an init that takes a raw value, rather than a class function fromMask.

This matches the release notes, which state that enum construction from raw values has been changed to take advantage of the new failable initializers introduced in the last beta.

Interesting thing to note: while RawRepresentable has an init?(rawValue: RawValue) method, _RawOptionSetType has an init(rawValue: RawValue) method. This might seem odd at first – _RawOptionSetType inherits from RawRepresentable, so shouldn’t they have the same kind of init? But since init is more restrictive than init?, it works: a non-optional value can always be used where an optional is expected. _RawOptionSetType is essentially restating RawRepresentable’s raw initializer as non-optional, after all.
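A minimal sketch of the idea, with made-up names rather than the real protocols:

protocol Maker {
    init?(value: Int)
}

struct AlwaysMakes: Maker {
    // a non-failable init satisfies the failable requirement,
    // since a non-optional can always stand in for an optional
    init(value: Int) { }
}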

Here’s looking forward to a Swift 1.1 GM, and maybe after that on to 1.2?

Accidentally putting a loop in your loop

Joel Spolsky wrote a good article back in 2001 about knowing when the function you’re calling has linear complexity,1 and not accidentally putting a loop in your loop,2 getting quadratic complexity without realizing it. He was talking about C string algorithms like strlen and strcat, but it applies just as much to several Swift standard library functions.

When a function only has versions that take a sequence, like the seductively-useful contains, equal, min- and maxElement, it’s linear.3 Some depend on their input: countElements can be done in constant time if you give it a collection with a random-access index, but linear if not. Same with distance and advance index operations, which is why extending String to support random-access indexing using them is probably a bad idea. Be careful, not everything that could be optimized may have been. For example, ClosedInterval.contains is presumably O(1) but there doesn’t appear to be an optimized overload for the non-member contains, which only has versions that take a sequence.
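As a concrete example of the loop-in-your-loop trap:

let xs = [1, 2, 3], ys = [3, 4, 5]

// looks innocent, but contains is a linear search of ys,
// so the whole thing is quadratic
for x in xs {
    if contains(ys, x) {
        print("\(x) is in both")
    }
}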

The Swift team have helpfully put the complexity of certain functions in the documentation. Array’s reserveCapacity, insert, and replaceRange are O(N).

And so is… removeAtIndex. In a previous article, I mentioned the following code to remove all occurrences of a given value from an array has an efficiency problem:

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, value: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == value {
                collection.removeAtIndex(idx)
            }
            else {
                ++idx
            }
        }
}

The documentation for removeAtIndex says: 4

Remove and return the element at the given index. Worst case complexity:
O(N). Requires: index < count

That is, the worst-case time it takes to remove an element increases in linear proportion to the length of the collection. That's not surprising – if you remove an element of an array, you have to shuffle each element after it down one. The larger the collection the more things to shuffle. Maybe your element is near the end, maybe your collection is a linked list that can remove elements in O(1), maybe it’s magic and has all sorts of cool optimizations – hence O(N) is the worst case. But still, it’s probably not best to call it within another loop.

Here, it shouldn't be necessary. We're already iterating over the collection, so we ought to be able to combine the deletion and the shuffling down into one operation. Here's a new version of remove that does this, modelled on the C++ STL equivalent:

(incidentally, it also removes the need for questionable assumptions about how indexes behave when you remove elements, the subject of the previous article)

func remove
    <C: protocol<RangeReplaceableCollectionType,
                 MutableCollectionType>,
     E: Equatable
     where C.Generator.Element == E>
    (inout col: C, value: E) {
        // find the first entry to remove
        if var advance = find(col, value) {
            // advance points to next element to test,
            // rear points to where to copy it to
            // if it's a keeper
            var rear = advance++
            while advance != col.endIndex {
                if col[advance] != value {
                    col[rear] = col[advance]
                    ++rear
                }
                ++advance
            }
            col.removeRange(rear..<col.endIndex)
        }
}

This version breaks the removal into multiple steps. First, it uses find to locate the first entry to remove (and if it doesn’t find one, does nothing more). Next, one by one it moves the subsequent elements down on top of that entry. When it encounters more entries to remove, it skips over them (i.e. it increments advance, the index of entries to examine and maybe copy, but not rear, the index of where to copy to, and they aren’t copied).

Finally, when all this is done, the collection should have all the non-removed entries at the front, and some meaningless garbage at the end. This end section is as long as the number of removed entries (though it doesn’t contain the removed entries themselves – entries were copied, not swapped). It’s trimmed off by a call to removeRange, which is also O(N). But that’s fine – two O(N) algorithms in series is still O(N).

By the way, in this last step it differs from the C++ STL version which, in a quality bit of user-unfriendliness, requires the caller to do the final remove step, instead leaving the collection with the garbage still at the end. There are good reasons for this (because iterators), but it’s a nasty gotcha for newbies. I’d chalk this one up as a win for Swift’s approach to generic collection algorithms.5

Note that for this to work, the collection also needs to support the MutableCollectionType protocol, because that’s where the assignable version of subscript lives for some reason.6 In fact, that’s all MutableCollectionType adds. By the way, this whole optimization is based on the assumption that the assignment version of subscript runs in constant time. It should do, right? Copying a value into a position in an array shouldn’t affect the rest of the array, so it shouldn’t matter how long it is.

A quick test with a reasonably large array shows that this new version of remove does indeed run quicker (and more consistently) than our first version. Yay.

Then you try and use it on a string, and your celebration is short lived. String doesn’t implement MutableCollectionType. Huh, what’s that about?

What this doesn’t mean is that String is immutable. It’s totally mutable. Mutating is what RangeReplaceableCollectionType is all about. This is a chance to make an important if maybe obvious point: just because your function takes an object via a protocol that doesn’t allow mutation doesn’t mean that object is immutable. It just means you can’t mutate it in your function.

My guess for why strings don’t support MutableCollectionType? Because that assumption above, about subscript assignment being O(1), doesn’t hold. Remember, individual elements of Swift strings are of variable length. What if you replaced a longer character with a shorter one? We’d be back to square one, having to shovel all the subsequent characters down to fill in the gap. Worse, what if you replaced a shorter entry with a longer one? The string would get bigger, maybe even need relocating to a newly allocated chunk of memory.

This would be another example of signalling more from protocols than just what functions are supported. They can tell you about fundamental properties of the object. Perhaps MutableCollectionType is like MutableInConstantTimeCollectionType. Course, on the other hand, I could be reading waaay too much into String not implementing it. It could just be an oversight – only time (or one of the Swift devs) will tell.

String does support replaceRange as an alternative to subscript assignment. But its complexity is, you guessed it, O(N). And after crowbarring it into the second algorithm above, tests suggest it’s no faster than the removeAtIndex version.7 So if you really have a burning desire to remove some characters from a huge string, maybe find another way. Perhaps you can trade some space for that time.
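In case you haven't met it, replaceRange on a String looks something like this (signature from memory, purely illustrative):

var str = "hello"

// replace the first character – O(N), since when lengths differ
// the rest of the string may need shuffling (or reallocating)
str.replaceRange(str.startIndex...str.startIndex, with: "j")

// str is now "jello"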


  1. That wikipedia entry on complexity, like so many wikipedia entries on mathematical topics, is pretty beginner-unfriendly. If anyone has a good beginner’s guide link I could replace it with, let me know. An (admittedly cursory) google search doesn’t turn much up. 
  2. I’d have called this post “Yo, dawg” but then I’ve already done that once. 
  3. Unless it does something like return the first element in the sequence, obvs. Hmmm, maybe I should be less paranoid about people finding errors in my posts. 
  4. Actually it doesn’t, not on the protocol version anyway. The stand-alone function version (that takes that same protocol) does though, so I’m taking the liberty of giving that instead. 
  5. It’s not all wins, mind. The STL’s iterator-based functions are much easier to use with subranges of collections. Don’t get me started on slices… 
  6. Wait for it… 
  7. though this is possibly due to my lack of deftness with a crowbar. 

Changes to the Swift Standard Library in Xcode 6.1

Congratulations to the Swift team on releasing a 1.0 GM! Unsurprisingly, there were no changes that I could spot1 between beta 7 and the GM standard libraries.

But as they said in their blog post, “Swift will continue to advance with new features, improved performance, and refined syntax. In fact, you can expect a few improvements to come in Xcode 6.1 in time for the Yosemite launch.” As such, the first beta of Xcode 6.12 saw more changes to the standard library than we saw in the pre-1.0 beta, and here they are:

  • As flagged in the release notes, all the integer types have acquired truncatingBitPattern initializers for each of their larger counterparts.
  • The functions join, reduce, sort and sorted have acquired descriptions in the various places they’re implemented.
  • AssertString and StaticString are now Printable and DebugPrintable
  • HeapBufferStorage no longer inherits from HeapBufferStorageBase, which is gone.
  • The Process instance is now declared with let rather than var
  • StrideThrough and StrideTo are now Reflectable
  • StaticString now has a utf8Start instead of start (which still returns an UnsafePointer). It also has a withUTF8Buffer method for using the underlying raw buffer, and a unicodeScalar get property.
  • String has lost its compare(other: String.UnicodeScalarView) method.
  • String also now implements several methods extending StringInterpolationConvertible – for a variety of different built-in types.
  • Second versions of assertionFailure and fatalError now take a generic AssertStringType instead of StaticString.
  • equal and startsWith’s predicate functions have been renamed isEquivalent.
  • maxElement and minElement’s input parameter has been renamed elements.
  • The toString function now says it returns the result of printing, not debugPrinting, an object.

There’s a new unsafeAddressOf function that takes an AnyObject. Apparently there’s “not much you can do with this other than use it to identify the object”.

There is a new protocol, UnicodeScalarLiteralConvertible, that follows the now-familiar literal converter pattern: a ThingType typealias, a convertFromThing class function, and a library-level typealias for ThingType for the standard type for these literals to convert into, in this case a String.
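The shape of it is roughly this, reconstructed from the description above (so details may differ):

protocol UnicodeScalarLiteralConvertible {
    typealias UnicodeScalarLiteralType
    class func convertFromUnicodeScalarLiteral(
        value: UnicodeScalarLiteralType) -> Self
}

// plus the library-level default for what the literal converts into:
typealias UnicodeScalarType = String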

The ExtendedGraphemeClusterLiteralConvertible protocol now inherits this protocol, so the objects that implement it (Character, String, AssertString and StaticString) also now implement convertFromUnicodeScalarLiteral. And the UnicodeScalar struct now implements UnicodeScalarLiteralConvertible instead of ExtendedGraphemeClusterLiteralConvertible.

Here’s looking forward to more enhancements as Swift continues to develop.


  1. Unless there’s one hiding in those pesky re-ordered operator overloads… 
  2. Does this mean this is Swift 1.1?