Generic Collections, SubSequences and Overloading

Posted on January 3, 2016January 3, 2016 by airspeedvelocity

So, you’ve figured out the whole indices versus distances situation, and now you’re feeling pretty happy and you start trying out some more generic collection algorithms. And before long you smack your head against one of the more confusing things about generic collections: a sequence’s SubSequence doesn’t necessarily contain the same element type as that sequence. This usually manifests itself in one of those compiler errors that has you yelling “what are you talking about you infernal machine” at the compiler before eventually realizing what you’ve done wrong and sheepishly fixing it.

For example, suppose you want to find the location of the first subsequence in a collection. Like indexOf, but for another sequence rather than an element. The simple brute-force approach would be to iterate over every subsequence, trying to match it. Something like this – except it won’t compile:

extension CollectionType
  where 
  Generator.Element: Equatable {

    func search<
      Other: SequenceType 
        where 
        Other.Generator.Element == Generator.Element
    >(pattern: Other) -> Index? {

        for idx in self.indices {
            // error: cannot convert value of type 'Other' 
            // to expected argument type...
            if suffixFrom(idx).startsWith(pattern) {
                return idx
            }
        }
        return nil
    }
}

Until you know why, this can be very confusing. You’ve constrained the other sequence to have the same elements as the collection, so why can’t you use startsWith on it?

Except you’re not calling startsWith on Self. You’re calling it on Self.SubSequence, because that’s what suffixFrom returns – it’s a slicing operation. And SubSequence.Generator.Element isn’t guaranteed to be the same as Self.Generator.Element.

Here’s the relevant declaration:

protocol CollectionType {
    typealias SubSequence : Indexable, SequenceType = Slice<Self>
}

There’s nothing about this definition that forces the elements of SubSequence to be anything specific. Really, this needs a where clause on the typealias – something that Swift doesn’t currently support:¹

protocol CollectionType {
    typealias SubSequence : Indexable, SequenceType
       // not (yet?) valid Swift
       where SubSequence.Generator.Element == Generator.Element
}

So, in the mean-time, the way around this is to specify any requirements like this as part of the declaration. A fix that might seem to work, but is over-constraining, would be to require subsequences be the same type:

extension CollectionType
  where
  Generator.Element: Equatable,
  SubSequence == Self {
    // etc
}

This would compile, and it would work fine with types whose slices are the same type as themselves, like strings:

// a slice of a String.CharacterView is
// a String.CharacterView, so this is fine 
"hello".characters.search("ll".characters)

but not for arrays, who have an ArraySlice type, or arbitrary collections, that have the default Slice implementation.

// "type of expression is ambiguous without more context", apparently
[1,2,3,4].search(2…4)
// "ambiguous reference to member '..<'" uhhh what?
(0..<5).search(2...4)

Instead, you can constrain the elements to be the same while allowing the slice type itself to differ:

extension CollectionType
  where
  Generator.Element: Equatable,
  // guarantee subsequences contain the same element
  SubSequence.Generator.Element == Generator.Element {
    
    func search<
      Other: SequenceType
        where
        Other.Generator.Element == Generator.Element
    >(pattern: Other) -> Index? {
            
            for idx in self.indices {
                // and now this line will happily compile...
                if suffixFrom(idx).startsWith(pattern) {
                    return idx
                }
            }
            return nil
    }
}

[1,2,3,4].search(2...4) // returns 1
(0..<5).search([3,4])   // returns 3

Optimizing for Random Access

It’s often the case that you can optimize a collection algorithm if you know it has random access. For example, given the above algorithm, you can tweak it to avoid searching for a pattern that is longer than what remains of the collection, so long as both the collection and the pattern are random access and therefore able to calculate their lengths (and the lengths of their collection) in constant time. OK, maybe it’s not Boyer-Moore, but it’s better than nothing and the code to do this is pretty simple, in fact it’s more constraints than code at this point:

extension CollectionType
  where
  Generator.Element: Equatable,
  Index: RandomAccessIndexType,
  SubSequence.Generator.Element == Generator.Element {
    func search<
      Other: CollectionType
      where
      Other.Index: RandomAccessIndexType,
      Other.Index.Distance == Index.Distance,
      Other.Generator.Element == Generator.Element
    >(pat: Other) -> Index? {
            
        // if pattern is longer, this cannot match, exit early
        guard !isEmpty && pat.count <= count else { return nil }
        
        // otherwise, search from the start up to the 
        // end less the space for the pattern
        for i in startIndex...endIndex.advancedBy(-pat.count) {
            // check if a slice from this point
            // starts with the pattern
            if self.suffixFrom(i).startsWith(pat) {
                // if it does, we've found it
                return i
            }
        }
        // otherwise, not found
        return nil
    }
}

(one part of the where clause, Other.Index.Distance == Index.Distance, is maybe overconstraining, but the alternative would be fiddling around with casts – non-Int distances aren’t exactly common, even Bit’s distance is an Int).

We’ve talked about overload priority a long time ago, and it still pertains even with protocol extensions. The most specific function wins. Here, if your two collections are random access, this version will be called because it is more tightly constrained than the first version. So:

// calls the version that works for any index
"hello".characters.search("ll".characters)
// calls the version optimized for random-access
[1,2,3,4].search(2...4) // returns 1?

Now, there is a potentially nasty bug you could stumble into here, and the compiler won’t save you from it. Try removing the constraints requiring the indices be random access on the second version. It still compiles! Except, now this version will be called strings as well. But in the case of non-random indices, it’s not an optimized version, it’s potentially slower. The reason being, pat.count and endIndex.advancedBy aren’t constant time, they’re linear, because they involve incrementing the index one by one. Now it could be worse – at least it’s not accidentally quadratic, but other examples could easily be (imagine implementing a sort that assumed constant-time advance but didn’t get it), so it’s something to watch out for.

Protocols, Extensions and Dynamic Dispatch

Once upon a time, it was harder to have made this mistake. In earlier versions of the standard library there were two ways to advance an index: a free advance function, that was O(1) for random access indices and O(n) otherwise, and then there was an advancedBy method on only the RandomAccessIndexType.

As part of the great extensionizing, the free advance function went away, and advancedBy moved into ForwardIndexType. So now you can call it on any index type, but you’ll get different performance characteristics depending on the type you call it on.

If you look at the declaration of ForwardIndex.advancedBy, you’ll see that it’s declared twice. Once inside the protocol:

public protocol ForwardIndexType : _Incrementable {
    public func advancedBy(n: Self.Distance) -> Self
}

and then again just below, as an extension:

extension ForwardIndexType {
    public func advancedBy(n: Self.Distance) -> Self
}

This seems a little weird. Why declare it twice? Why not just declare it once as an extension? It has to do with static versus dynamic dispatch, and which actual implementation is called.

To show this, let’s create our own toy index, initially without random access:

struct MyIndex {
    let i: Int
}

extension MyIndex: BidirectionalIndexType {
    func successor() -> MyIndex {
        print("successing!")
        return MyIndex(i: i+1)
    }
    func predecessor() -> MyIndex {
        print("predecessing!")
        return MyIndex(i: i-1)
    }
}

// indices need to be Equatable
func == (lhs: MyIndex, rhs: MyIndex) -> Bool {
    return lhs.i == rhs.i
}

I’ve put prints in there so you can tell when the functions are called. Now, this index type will have the default advancedBy implementation, which just calls successor() multiple times. And because of the print, you can see it happening:

func castHolyHandGrenadeOfAntioch<
  I: ForwardIndexType
>(idx: I) {
    idx.advancedBy(3)
}

let idx = MyIndex(i: 1)
castHolyHandGrenadeOfAntioch(idx)
// prints “successing!” 3 times

OK, now, we conform the type to be random access by giving it advance and distance functions:

extension MyIndex: RandomAccessIndexType {
    func advancedBy(n: Int) -> MyIndex {
        return MyIndex(i: i+n)
    }
    func distanceTo(end: MyIndex) -> Int {
        return end.i - i
    }
}

and now, when we call the castHolyHandGrenadeOfAntioch, we see nothing printed. Instead, even though the function only required a forward index, it uses the better advancedBy implementation of the random-access index.

But… this only works because advancedBy had been declared in the original ForwardIndexType protocol declaration. If, on the other hand, we added another method as an extension to different index types – say, a retreatedBy function – this would not happen.

First, let’s do that for BidirectionalIndexType:

extension BidirectionalIndexType {
    // just an example, you don't have to do this!
    // since BidirectionalIndexType also supports
    // advancedBy(-distance)
    func retreatedBy(n: Distance) -> Self {
        var i = n
        var result = self
        // since Distance isn’t strideable, this
        // is a rare case where you might mourn the
        // loss of C-style for and the ++ operator…
        while i > 0 {
            result = result.predecessor()
            i -= 1
        }
        return result
    }
}

and then, an optimized version for RandomAccessIndexType:

extension RandomAccessIndexType {
    func retreatedBy(n: Distance) -> Self {
        return self.advancedBy(-n)
    }
}

But now, if we repeat our earlier experiment but calling this function, we get a different result:

// only requires bidirectional access
func runAway<I: BidirectionalIndexType>(idx: I) {
    idx.retreatedBy(10)
}

// and even though MyIndex is random access
let braveSirRobin = MyIndex(i: 10)

// the bidirectional version is called...
runAway(braveSirRobin)
// prints “predecessing!” 10 times

So we can see the difference between default implementations that are also declared as part of the protocol, versus ones just created as extensions. With the former, generic functions get to take advantage of specialized implementations, whereas with the latter, only the implementation that matches the generic constraint will be called. This is why you often see functions declared twice in the standard library. For example, on SequenceType, underestimateCount is declared like this, so that if you pass in a collection to a function that takes a sequence, it can check it’s length if it’s a collection, without risking consuming the sequence.

That’s assuming that there isn’t a legitimate reason to vary elements in subsequences which I don’t… think… there is? ↩

Changes to the Swift standard library in 2.0 betas 2..<5

Posted on July 23, 2015July 24, 2015 by airspeedvelocity

So, I was waiting for enough library changes across a few betas to justify posting something, and then all of a sudden lots of stuff changed! So let’s start with:

Flat Map

Missed from my 2.0b1 changes post (shame on me) was a third kind of flatMap.

1.2 brought us flatMap for collections – given an array, and a function that transforms elements into arrays, it produces a new array with every element transformed, and then those arrays “flattened” into a single array. For example, suppose you had a function, extractLinks, that took a filename and returned an array of links found in that file:

func extractLinks(markdownFile: String) -> [NSURL]

If you had an array of filename strings (say from a directory listing), and you wanted an array of the links in all the files, you could use this function with flatMap to produce a single array (unlike map, which would generate an array of arrays):

let nestedArrays: [NSURL] = markdownFiles.flatMap(extractLinks)

There was also a flatMap for optionals. Given an optional, it takes a function that takes the possible value inside the optional, and applies a function that also returns an optional, and “flattens” the result of mapping an optional to another optional. For example, Array.first returns the first element of a collection, or nil if the array is empty and has no such element. Int.init(String) takes a string, and if it is a representation of an integer, returns that integer, or nil if it isn’t. To take the first element of an array, and turn it into an optional integer, you could use flatMap (unlike map, which will return an optional of an optional of an integer):

let a = ["1","2","3"]
let i: Int? = a.first.flatMap { Int($0) }

So what’s the new third kind of flatMap? It’s a mixture of the two – for when you have an array, and want to transform it with a function that returns an optionals, discarding any transforms that return nil. This is logical if you think of optionals as like collections with either zero or one element – flatMap would discard the empty results, and flatten the single-element collections into an array of elements.

For example, if you have an array of strings, and you want to transform it into integers, discarding any non-numeric strings:

let strings = ["1","2","elephant","3"]
let ints: [Int] = strings.flatMap { Int($0) }
// ints = [1,2,3]

If you’ve been following this blog for a while, you’ll recognize this is a function we’ve defined before as mapSome.

By the way, all these examples have been pulled from the book on Swift that Chris Eidhof and I have been writing. The chapter on optionals has just been added to the preview.

First-class Initializers

In beta 2 came the ability to use initializers, partially-applied methods, and enum cases, as first-class functions. This gives you the ability to pass them directly into higher-order functions without needing a closure expression:

// equivalent to [1,2,3].map { Optional.Some($0) }
let optionals = [1,2,3].map(Optional.Some)

// equivalent to [1,2,3].map { String($0) }
let strings = [1,2,3].map(String.init)

// equivalent to [1,2,3].map { 1.advancedBy($0) }
let plusones = [1,2,3].map(1.advancedBy)

Given this, you might wonder why I wrote strings.flatMap { Int($0) } in the earlier section, instead of strings.flatMap(Int.init). But this new ability is constrained by the fact that overload resolution works slightly differently for function invocation versus function assignment. If you try the latter version, you will get an error because there is no method for Int.init that takes just a single string as an argument. When you call Int("1"), you are really calling Int("1", radix: 10), with the compiler filling out the default argument for you. But it won’t do that when passing Int.init in to map.

String.init on integers works though, I guess because the overloading is less ambiguous? On the other hand, arrayOfCustomStructs.map(String.init) won’t compile, probably because there is ambiguity between String.init(T) and String.init(reflecting:T) (the debug version). So for now, you’ll have to stick with writing array.map { String($0) }, like an animal.

One other thing to be careful of: this new terse syntax doesn’t work for when you want to apply a method to the self argument. The following will compile, but won’t do what might be intended – reverse each of the arrays:

let reversedArrays = [[1,2,3],[4,5,6]].map(Array.reverse)

Instead, this will produce an array of functions – because Array.reverse returns a function that returns a function where the method will be called on the argument pased.

So instead, you would have to write this:

let reversedArrays = [[1,2,3],[4,5,6]].map { Array.reverse($0)() }
// which is just a long winded way of writing this…
let reversedArrays = [[1,2,3],[4,5,6]].map { $0.reverse() }
// but bear with me...

You could always write a version of map that takes curried functions and does the above for you:

extension CollectionType {
    func map<T>(@noescape transform: (Self.Generator.Element) -> () -> T) -> [T] {
        return self.map { transform($0)() }
    }
}

after defining which, .map(Array.reverse) will do what you probably want.¹

This can also be nice to do for other functions. For example, if you defined a similar version of sort:

extension CollectionType {
    public func sort(isOrderedBefore: Self.Generator.Element -> Self.Generator.Element -> Bool) -> [Self.Generator.Element] {
        return self.sort { isOrderedBefore($0)($1) }
    }
}

then, after also overloading caseInsensitiveCompare to have a version that returns a boolean in Swift’s isOrderedBefore style, you could do the same with methods that compare:

import Foundation

extension String {
    func caseInsensitiveCompare(other: String) -> Bool {
        return self.caseInsensitiveCompare(other) == .OrderedAscending
    }
}

["Hello","hallo","Hullo","Hallo",].sort(String.caseInsensitiveCompare)
// returns ["hallo", "Hallo", "Hello", "Hullo"]

RangeReplaceableType: all your ExtensibleCollectionTypes are belong to us

ExtensibleCollectionType is gone. It used to provide append and extend, which are now amongst the features provided by RangeReplaceableCollectionType.

RangeReplaceableCollectionType is a great example of the power of protocol extensions. You implement one uber-flexible method, replaceRange, which takes a range to replace and a collection to replace with, and from that comes a whole bunch of derived methods for free:

append and extend: replace endIndex..<endIndex (i.e. nothing, at the end) with the
new element/elements.
removeAtIndex and removeRange: replace i...i or subRange with an empty collection.
splice and insertAtIndex: replace atIndex..<atIndex (i.e. don't replace any elements but insert at that point) with a new element/elements.
removeAll: replace startIndex..<endIndex with an empty collection.

The two remaining functions from ExtensibleCollectionType – an empty initializer, and reserveCapacity, have also moved into RangeReplaceableCollectionType.

As always, if a specific collection type can use knowledge about its implementation to perform these functions more efficiently, it can provide custom versions that will take priority over the default protocol extention ones.

You get to be Sliceable. And you get to be Sliceable. Everybody gets to be Sliceable!

Things being Sliceable (i.e. having a subscript that takes a range, and returns a new collection of just that range) was incredibly useful. But it was also limiting, because you needed to rely on things implementing Sliceable.

But you could make anything sliceable if you needed it to be – by writing a wrapper view that took a subrange and implemented CollectionType to present only that subrange:

// note, won't compile in beta 4 because Sliceable is gone!
public struct SubSliceView<Base: CollectionType>: Sliceable {
    let collection: Base
    let bounds: Range<Base.Index>

    public var startIndex: Base.Index { return bounds.startIndex }
    public var endIndex: Base.Index { return bounds.endIndex }

    public subscript(idx: Base.Index) -> C.Generator.Element
    { return collection[idx] }

    typealias SubSlice = SubSliceView<Base>
    public subscript(bounds: Range<Base.Index>) -> SubSliceView<Base> {
        return SubSliceView(collection: collection, bounds: bounds)
    }
}

// AnyRandomAccessCollection is an example of a type that wasn't sliceable,
// but you could make it one by doing this:
extension AnyRandomAccessCollection: Sliceable {
    public subscript(bounds: Range<Index>) -> SubSliceView<AnyRandomAccessCollection> {
        return SubSliceView(collection: self, bounds: bounds)
    }
}

let a = [1,2,3]
let r = AnyRandomAccessCollection(a)
// dropFirst relies on things being sliceable:
print(dropFirst(dropFirst(r)).first)

As of beta 4, the new Slice type does pretty much this exact thing. Including the way the slice's start index is not zero based – if you create a slice starting at index 2, the startIndex of that slice will be 2. Another reason to avoid using indices directly in favour of things like for...in or higher-order functions like map. Index-based operations are so much easier to screw up, and your assumptions (e.g. every integer-indexed collection starts at zero) can easily be invalid.

The Sliceable protocol has been subsumed into CollectionType, and the default implementation for the ranged subscript is to return a Slice view onto the collection. So now every collection supports a slicing subscript.

CollectionType in turn now conforms to Indexable, which is the new home for startIndex, endIndex, and index-based subscript. Note, Indexable does not to conform to SequenceType – these two things are now brought together by CollectionType.

There is still a good reason for collections to implement their own custom slicing code – they can often do it more efficiently.

For example, UnsafeBufferPointer didn't used to be sliceable, but now is, with the default implementation. If you slice it, you'll get back a Slice:

[1,2,3].withUnsafeBufferPointer { buf -> Void in
    let slice = buf[2..<4]
    sizeofValue(slice)  // returns 32 bytes
}

sizeof returns 32, because the type has to contain both the base buffer (which is 16 bytes – the base pointer address and the length), plus the start and end of the subrange (another 16 bytes).²

But alternatively, UnsafeBufferPointer could use itself as its subslice, like so:

extension UnsafeBufferPointer {
    subscript(subRange: Range<Int>) -> UnsafeBufferPointer {
        return UnsafeBufferPointer(start: self.baseAddress + subRange.startIndex,
                                   count: subRange.count)
    }
}

If you do this, you’ll see that UnsafeBufferPointer‘s slice is now another UnsafeBufferPointer, and the memory requirement drops in half:

[1,2,3].withUnsafeBufferPointer { buf -> Void in
    let slice = buf[2..<4]
    sizeofValue(slice)  // returns 16 bytes
}

This approach – making Self your subslice – is taken by String slices, but not by Array, which has a subslice type of ArraySlice.

This brings up another gotcha you might hit when writing generic methods on collections using slices – slices aren’t necessarily of the same type as the thing you are slicing. They are sometimes, but not always.

Even more strangely, the elements contained in the subslice aren’t guaranteed to be the same as the elements of the original collection! Ideally they would be, but this constraint can’t be expressed in Swift right now.

For example, suppose you wanted to overload reduce with a version that didn’t need an initial element – instead it used the first element of the collection, and returned an optional in case the collection was empty. You might try and implement it like this:

extension CollectionType { 
    /// Return the result of repeatedly calling `combine` with an
    /// accumulated value initialized to the first element, and each 
    /// subsequent element of `self`, in turn, i.e. return
    /// `combine(combine(...combine(combine(self[0], self[1]),
    /// self[2]),...self[count-2]), self[count-1])`.
    /// Return `nil` in case of an empty collection.
    func reduce(@noescape combine: (Generator.Element, Generator.Element) -> Generator.Element) -> Generator.Element? {
        return first.map {
            dropFirst(self).reduce($0, combine: combine)
        }
    }
}

This will not compile (and beta 4 will lead you up the garden path with a spurious error for why). The problem being, the elements of the collection returned by dropFirst aren’t guaranteed to be the same as those of the collection. You would probably be right to be annoyed if they weren’t – but the type system doesn’t guarantee it.

The fix would be to constrain the extension to require it:

extension CollectionType where Generator.Element == SubSequence.Generator.Element {
    func reduce(@noescape combine: (Generator.Element, Generator.Element) -> Generator.Element) -> Generator.Element? {
        return first.map {
            dropFirst(self).reduce($0, combine: combine)
        }
    }
}

// so now you can do this:
[1,2,3,4].reduce(+)  // returns 10

Grab Bag

There are a few other changes to the standard library:

Reflectable became _Reflectable and MirrorType became _MirrorType
QuickLookObject has been renamed PlaygroundQuickLookObject
_MirrorType and PlaygroundQuickLookObject acquired documenting comments.
_MirrorDisposition is now gone from the visible library, though _MirrorType still refers to it for now.
And the reflect function is gone: Xcode tells you to use Mirror.init(reflecting:) instead.
RawOptionSetType was removed
The gradual disappearing of the _ protocols continues. _UnsignedIntegerType is now gone (its methods – toUIntMax and init(_: UIntMax) moved into UnsignedIntegerType)
Views onto other things (such as FilterCollection, a lazily filtered viewonto a collection) have dropped their View suffix.
Free functions continue to disappear into protocol extensions.
As do unnecessarily non-default methods, such as generate for collections that now use the default indexing generator implementation.
LazyRandomAccessCollection.reverse now returns a random-access reversed collection, not a bi-directional one.
The Zip2 type has been renamed Zip2Sequence – but you already switched to creating it using the zip function, right? Though perhaps the protocol extension wolves are circling its campfire.
SinkType and SinkOf are gone. SinkType was used in two places – UTF encoding methods, where you now pass a closure, and UnsafeMutablePointer, where you could now write (ptr++).memory = x instead, I guess.

Beta 3 didn’t introduce many functional changes, but did do a lot of renaming, changing the placeholder names for generic types to something more descriptive than T, U etc:

Sequence and collection-like things now use Element
Pointer-like things now use Memory
Intervals use Bound
Views onto other types use Base for the thing they are viewing
Unmanaged things are Instances.
Optionals still wrap a T though.

We’re not quite out of gotcha territory yet though. After defining that map, try and guess what ["foo","bar","baz"].map(String.hasPrefix("b")) will do… ↩
On 64-bit platforms, anyway. Cut all these numbers in half for 32-bit. ↩

Changes to the Swift Standard Library in 2.0 beta 1

Posted on June 9, 2015June 9, 2015 by airspeedvelocity

OK don’t panic – it might look like a lot has changed, but not really. It’s more… transformed. For the better. Nothing the migration assistant shouldn’t be able to handle.

By far the biggest change in the standard library is due to the new protocol extensions. The free map function is dead, long live the map extensions to CollectionType and SequenceType. Which means now you only ever call map as a method – no more shuttling back and forth between functions and methods like you’re watching a tennis match.

To show how this works, let’s define my usual example, mapSome:

extension SequenceType {
    /// Return an `Array` containing the results of mapping `transform`
    /// over `self`, discarding any elements where the result is `nil`.
    ///
    /// - Complexity: O(N).
    func mapSome<U>(@noescape transform: Generator.Element -> U?) -> [U] {
        var result: [U] = []
        for case let x? in self.map(transform) {
            result.append(x)
        }
        return result
    }
}

You can use this in the method-calling style, for example, with the brand new double-from-string failable initializer:

let a = ["3.14", "foo", "6.02e23"]
let doubles = a.mapSome { Double($0) }
print(doubles) // no more println!
// prints [3.14, 6.02e+23]

Incidentally, this implementation of mapSome uses the new pattern-matching capabilities of for to filter out nils. The case let syntax is matching an enumeration – and since optionals are enumerations, it works for them too. for has also acquired where clauses:¹

let isEven = { $0%2 == 0 }
for even in 0..<10 where isEven(even) {
    print(even)
}

But there’s more. You can constrain the extension, similar to how you would constrain a placeholder in a generic function. And this means you can give protocols methods that only apply when they have certain properties. For example, there’s now a version of sort on arrays (or any other collection type) that doesn’t need to be given a isOrderedBefore argument, so long as the array contents are Comparable. There’s still an overload that takes a closure if you want to do something more custom, but if you just want the default behaviour (ascending), you don’t need one.

For example, here’s an implementation of all that returns true if all the values are equal to a certain value:

// a version that only applies when the contents of the
// sequence are equatable
extension SequenceType where Generator.Element: Equatable {
    /// Return `true` iff every element of `self` is `x`.
    func all(equalTo: Generator.Element) -> Bool {
        // of course, contains is now a method too
        return !self.contains { $0 != equalTo }
    }
}

// and an unconstrained version where the caller supplies a predicate
extension SequenceType {
    /// Return `true` iff every element of `self` satisfies `predicate`.
    func all(criteria: Generator.Element -> Bool) -> Bool {
        return !self.contains { !criteria($0) }
    }
}

[1,1,1].all(1)       // true

let isEven = { $0%2 == 0 }
[2,4,6].all(isEven)  // true

As a result, the number of free functions in the standard library has dropped from 101 down to 77. I wonder how far this will go – will it eventually just be a motly crew of unsafeBitCast and company left? Should abs() become an extension of SignedNumberType? And now it can be, should it be a property? This and more in the next exciting installment of Swift…

Grab Bag

Here are a few other changes to the standard library:

Array.withUnsafeMutableBufferPointer has acquired a warning: do not use the array itself while calling it, only refer to the contents via the pointer. Presumably so updates to the array can potentially be deferred to after it’s finished executing. Same for ContiguousArray and Slice.
They’ve also had some of their internal-type initializers privated.
Some of the _-prefixed protocols have started to disappear. For example _BidirectionalIndexType is gone, and BidirectionalIndexType now has its predecessor method. And _Comparable is gone, Comparable now containing its < operator.
CollectionOfOne and EmptyCollection now have a count property – just in case you want to check they’re not misleading you.
So now for the start of the great extensionizing… CollectionType now has isEmpty, count, map, filter, first, last, indexOf, indices and reverse. So these are now available as methods on all collections. Corresponding methods in existing collections have been removed.
indexOf is what find used to be called. And it now has a version that takes a predicate.
An ErrorType protocol has been added to support the new error handling feature.
Printable and DebugPrintable protocols are now the more descriptive CustomStringConvertible and CustomDebugStringConvertible
There are also CustomReflectable, CustomLeafReflectable and CustomPlaygroundQuickLookable protocols for tweaking how your type behaves in a playground.
And there’s a new Mirror type for returning from the CustomReflectable protocol.
There’s a new DictionaryLiteral type, so you can now use the [:] syntax for more than just dictionaries without worrying about duplicate keys getting coalesced.
As mentioned above, Double, Float and Float80 have acquired failable initializers from strings.
And the integers now have the same, replacing the toInt method on string. And they take a radix argument! You still can’t tell from the headers what default values are for defaulted arguments, but I'm guessing this one is 10.
FloatingPointType’s _toBitPattern and _fromBitPattern are gone. I guess you can use unsafeBitCast if you like bits.
All the lazy collections have acquired an underestimateCount.
MutableCollectionType is a good example of extensions with where clauses – such as requiring the collection also be random-access in order for it to have partition and sortInPlace support.
SequenceType sucks in contains, sort, underestimateCount, enumerate, minElement, maxElement, elementsEqual, lexicographicalCompare and flatMap
elementsEqual being the new name for the equal function that compares two sequences.
and minElement and maxElement now return optionals in case of empty sequences, and also now have versions that take isOrderedBefore closures.
There’s a new SetAlgebraType protocol, for types that are “a generalized set whose distinct elements are not necessarily disjoint”. Set does not conform to it (though it has all the properties it requires).
A type that does conform to it is OptionSetType protocol, for bitfield enums.
print is gone! Now println is print and old print is print(1, newLine: false)
toString is no more. Just use the String initializer that takes any type (and has a similar hierarchy of fallbacks)
readLine reads from the standard input, returning an optional String so you can while let over it.

Protocol Extensions are Defaults

Silly jokes aside, there’s a good reason for why CollectionOfOne and EmptyCollection have implementations of count. If a protocol has a default implementation, but you know your specific type can do it faster because of how it works internally, a specific implementation will take precedence over the protocol version.

So for example, suppose we implemented the next (but fairly pointless) logical collection type: CollectionOfTwo:

struct CollectionOfTwo<T>: CollectionType {
    let first: T, second: T
    subscript(idx: Int) -> T {
        precondition(idx < 2, "Index out of bounds")
        return idx == 0 ? first : second
    }
    var startIndex: Int { return 0 }
    var endIndex: Int { return 2 }
}

let two = CollectionOfTwo(first: "42", second: "420")
",".join(two)  // “42,420"

Notice, that I didn’t need to define a generator at all – due to this handy new protocol extension:

// CollectionType conforms to _CollectionGeneratorDefaultsType 
// which is extended with:
extension _CollectionGeneratorDefaultsType { 
    func generate() -> IndexingGenerator<Self>
}

Anyway because CollectionOfTwo conforms to CollectionType it gets a count property for free. But it gets them by subtracting the end index from the start, which is quite a roundabout way of doing it. So you can instead give it a hardcoded value by explicitly implementing it:

extension CollectionOfTwo {
    var count: Int { return 2 }
}

Now, when count is called, it will run this code instead of the default.

Protocols can also replace the default implementations of other protocols they conform to – hence CollectionType defines map even though SequenceType does too.

While we’re implementing simple little types – the default string rendering of custom types is now a lot nicer. Even though it doesn’t yet conform to CustomStringConvertible, if you print out CollectionOfTwo you’ll get something that looks like CollectionOfTwo(first: 42, second: 420) which is much nicer that the __lldb_expr_11.CollectionOfTwo you used to get.

Strings are No Longer Collections

Something that might take you by surprise – String no longer conforms to CollectionType. Though it has all the required properties (just writing extension String: CollectionType { } without any implementation works), it’s no longer tagged as such. This is apparently due to concerns that, even with the Character representation, there were ways in which using collection algorithms on strings could produce non-Unicode-correct results.

Of course, you still need to manipulate strings, so rather than just resorting to staring at them extra intensely, you can use the new CharacterView accessible via the characters property:

let s = "comma,separated,strings"
let fields = split(s.characters) { $0 == "," }.map { String($0) }

Since String's Index is just a typealias for String.CharacterView.Index, you can use them interchangeably:

let s = "Hello, world!"
if let comma = s.characters.indexOf(",") {
    print(s[s.startIndex..<comma])
}

Nonetheless, the fact that you have to switch to a character-based view on the string rather than operate on it directly should act as a reminder that you might be doing something that doesn’t behave correctly under all edge cases.

Type-Erased Containers

Finally, there are now a collection of “type-erased” container types available: AnyBidirectionalCollection, AnyRandomAccessCollection and AnyForwardCollection (and associated indexes), plus AnyGenerator and AnySequence. These could be used, for example, to bully different kinds of collection into being in the same container:

let set: Set = [1,2,3]
let array = [4,5,6]
let anySet = AnyForwardCollection(set)
let anyArray = AnyForwardCollection(array)
let anys = [anySet,anyArray]
anys.flatMap { $0 } // [2, 3, 1, 4, 5, 6]

However, this is probably not all that useful. Despite the name, this isn’t like the Any type – you can’t cast back into the original type, it’s more a one-way trip.

This is more useful when you want to expose an internal collection without exposing exactly what that collection type is. Instead you could create an AnyXXXCollection from your internal value, and return that, safe in the knowledge users of your class won’t complain when you switch to a different internal data structure later on.

Well that’s it. The standard library has some sparkly new online documentation. And as part of the WWDC sample code, there’s a playground all about the Swift standard library that is worth checking out. Also it looks like the developer forums for Swift no longer require a developer login to read either, if you want to take a look without registering.

Here’s to a future using all this stuff on the back end once the port to Linux is available!

Of course, you wouldn’t write this would you – you’d write for even in stride(from: 0, to: 10, by: 2), right? ↩

Changes to the Swift standard library in 1.2 betas 2 and 3

Posted on March 13, 2015March 26, 2015 by airspeedvelocity

Swift v1.2 beta 2 and 3 bring quite a few new enhancements, many of which build on what was introduced in beta 1.

The if let syntax continues to evolve – as of beta 2, you can check for a boolean condition and, if it’s true, then proceed to unwrap values. Nate Cook has a good example of this in his NSScanner extension:

// NSScanner.scanCharactersFromSet returns the scanned string
// by assigning it to a pointer you pass in:
var value: NSString? = ""

// The method returns true only if a string was successfully
// scanned, so check that first:
if scanner.scanCharactersFromSet(set, intoString: &value),
   // then if true, unwrap the value and convert to a String:
   let str = value as? String 
{
    // now use the valid scanned String as str
}

Flat Map

Continuing the theme of language features that help with handling optionals: beta 3 introduces flatMap. This is a map operation, followed by a flattening operation. The release notes have a good example of its use when you “want to chain optionals with functions”, so let’s drill into that a little.

The existing map allows you to apply a function to the value inside an optional, if that optional is non-nil. For example, suppose you have an optional integer i and you want to double it. You could write i.map { $0 * 2 }. If i has a value, you get back an optional of that value doubled. On the other hand, if i is nil, no doubling takes place.

Now, suppose instead of doubling an integer, you wanted to perform a mapping on an optional that could itself return an optional. The example in the release notes is find, which searches an array for a value and returns the index if it’s found – or nil if not found:

// an array of arrays
let arr = [[1,2,3],[4,5,6]]
// .first returns an optional of the first element of the array
// (optional because the array could be empty, in which case it's nil)
let fst = arr.first  // fst is now [Int]?, an optional array of ints
// now, if we want to find the index of the value 2, we could use map and find
let idx = fst.map { find($0, 2) }

The trouble with this is, idx will be an Int??, because fst was already optional, and then we mapped it with find, that also returns an optional – so now it’s an optional wrapped in an optional. What we want is to “flatten” this nested optional – so instead of map we use flatMap:

// flatMap means idx will be an Int?, not an Int??
let idx = fst.flatMap { find($0, 2) }

Here’s another use of flatMap: since we have an optional array, you could use it to get the first element of the first element:

// getting the first element of an optional array
let innerFst = fst.flatMap(first)

Now, if you’ve been programming Swift for a while, you might say hang on, since arrays have a first method, I could use optional chaining:

// using optional chaining to get the first element of an optional array
let innerFst = fst?.first

And at this point, hopefully it’s clear that flatMap and optional chaining do very similar things. Optional chaining essentially is a compact version of flatMap that only works on methods. When you want the same functionality as optional chaining, but instead with a function you need to pass the optional value into (like with find) instead of calling a method on that value, you need flatMap.

That’s not all flatMap does – it’s also a method on arrays (think of a mapping function that turns elements into arrays of elements), and there’s a generic free function version that works on sequences and collections. I won’t go into those here – I’m sure there will be a flood of Swift blogs coming up that will cover it in more detail. Just grab one as it drifts by, read, and experiment.¹

Grab Bag

There are a few other changes to the standard library across the two releases:

The precendence of the ?? has been bumped from 110 to 131. This means it now has higher precedence than equality (130). So i == j ?? 0 now replaces a nil value for j with 0 prior to equating it to i. Note, it still doesn’t trump arithmetic operators, so unbracketed monstrocities like i + j ?? 0 won’t compile.
Slice has been renamed ArraySlice, making its purpose clearer (strings, for example, are their own slice type). Other than the name, nothing else has changed (except the additional flatMap).
&/ and &% are gone, as mentioned in the release notes.
Following its introduction in beta 1, various function arguments have been marked @noescape. Examples include various withUnsafeXXXPointer functions, and various map, reduce and filter functions. Of course, lazy‘s map is not marked like this because it does capture self.
Similar to UnsafePointer, AutoreleasingUnsafeMutablePointer and CFunctionPointer have lost their static null instance.
But they have acquired conformance to CVarArgType, as does UnsafePointer
Double and Float, on the other hand, are now conforming to _CVarArgPassedAsDouble, a specialization of CVarArg. This is to enable x86_64-specific handling apparently.
Character is now DebugPrintable
If you want a “how to speak Swift” tip, in various documenting comments, things are now called “values” rather than “objects”.
String.Index, String.UTF16Index and String.UTF8Index now have methods for converting between each other.
String.UnicodeScalarView now conforms to ExtensibleCollectionType and RangeReplaceableCollectionType, and its index is now bidirectional and comparable, giving it a lot of the same capabilities of the String itself.
UInt8 has a constructor from an ASCII UnicodeScalar (with a warning that the scalar better actually be ASCII!)
In a boon to reckless optimizers everywhere, unsafeUnwrap will now force-unwrap your optional without safety-checking non-nilness first.

Trailing Closures and Default Arguments

A subtle change in the language means that the split function has had its arguments reordered.

Previously, its second parameter was the isSeparator function, followed by a couple of defaulted arguments (maxSplit and allowEmptySlices). But as of Swift 1.2, when you supply a function via the “trailing closure syntax”, that trailing closure is always the last parameter. So if you tried to call the old split without supplying those defaulted arguments, you’d get an error, because it isn’t the last paramter, it’s the second. The new split makes the isSeparator function the last parameter.

While we’re talking about default arguments, did you know you that you can supply them in any order?

func f(a: String = "A", b: String = "B") {
    println("(a) (b)")
}

f(b: "bee")             // prints A bee
f(b: "Bee!", a: "Eh?")  // prints Eh? Bee!

Signed Mismatches

OK this last one’s a bit language-lawyerly so if that sort of stuff bores you, you’ve been warned!

You might be surprised to find that, up until Swift 1.2, the following code would compile:

let x: UInt = 5
let y: Int = 6
let z = x + y
// z will be a UInt of value 11

Given that Swift is usually so strict about not mixing types, how come this is allowed? It comes down to the following definition for + that appeared way back in Swift 1.0 beta 4:

// if a type is strideable, you can add its stride type to it
func +(lhs: T, rhs: T.Stride) -> T

// and strides can be signed
protocol _Strideable {
    /// A type that can represent the distance between two values of `Self`
    typealias Stride: SignedNumberType
}

And since UInt is strideable, and defines its distance (and thus its stride) as a signed integer, that means you can add signed integers to unsigned integers.²

Swift 1.2 fixes this state of affairs by introducing two bits of code. First, it adds a typealias to the _UnsignedIntegerType protocol:

protocol _UnsignedIntegerType : _IntegerType {
    typealias _DisallowMixedSignArithmetic : SignedIntegerType = Int
}

Next, it adds a new overload for the + operator:

func +<T : _UnsignedIntegerType>(lhs: T, rhs: T._DisallowMixedSignArithmetic) -> T

The upshot of this is if you try and add an Int to a UInt, there are now two matching overloads for the + operator: the old one, and the new one, and Swift will complain that they’re ambiguous. Hopefully the fact that the compiler error mentions mixed-sign arithmetic will be enough of a hint you that this is the problem. A bit clunky, but it does the job.

At some point, someone will try to explain to you that flatMap is actually something called monadic bind. Don’t panic when they do – the most important thing is to understand what it does. This more abstract concept will come easily once you have a good grasp of optional and array versions of flatMap under your belt. When you’re ready, I’d recommend reading Rob Napier’s article and this series by Alexandros Salazar. ↩
Though not the other way around i.e. y + x wouldn’t compile – which is another problem, since people expect integer addition to be symmetric. ↩

Sorting Nibbles in Swift

Posted on March 8, 2015March 26, 2015 by airspeedvelocity

A couple of months back, John Regehr posted a challenge on his blog – write the fastest possible sorting algorithm for all the nibbles in a 64-bit word.

The problem is to sort the 4-bit pieces of a 64-bit word with (unsigned) smaller values towards the small end of the word. The nibble sort of 0xbadbeef is 0xfeedbba000000000. The function you implement will perform this sorting operation on a buffer of 1024 64-bit integers:

void nibble_sort(unsigned long *buf);

The challenge was in C, but it made me wonder about solving the same problem in Swift. Here’s a straight port of the supplied reference version in Swift:

func read_nibble(w: UInt64, i: UInt64) -> UInt64 {
    let res = w >> (i * 4)
    return res & 0xf
}

func write_nibble(inout w: UInt64, i: UInt64, v: UInt64) {
    var mask: UInt64 = 0xf
    mask <<= (i * 4);
    w &= ~mask
    var prom = v;
    prom <<= (i * 4)
    w |= prom
}

func nibble_sort_word_ref(var arg: UInt64) -> UInt64 {
    for var i: UInt64 = 0; i < 16; ++i {
        var min = i;
        for var j = i+1; j < 16; ++j {
            if read_nibble(arg, j) < read_nibble(arg, min) {
                min = j
            }
        }
        if (min != i) {
            var tmp = read_nibble(arg, i)
            write_nibble(&arg, i, read_nibble(arg, min))
            write_nibble(&arg, min, tmp)
        }
    }
    return arg;
}

func nibble_sort_ref(buf: UnsafeMutablePointer<UInt64>) {
    for var i=0; i<1024; i++ {
        buf[i] = nibble_sort_word_ref(buf[i])
    }
}

The first bit of good news is that on my machine, when compiled with -Ounchecked, this version performs with pretty much the same speed as the original C version. Sticking with -O instead gives up about ~20% performance, which is not too bad for the sake of safety.

So, this version has two functions to read and write the nibble at a given index, and a function to sort it using them. Get/set at a given index? That sounds a lot like a collection type. What if we were to wrap the word in a collection, and use subscript instead of the read/write functions? Something like this:

struct NibbleCollection: MutableCollectionType {
    var val: UInt64
    
    var startIndex: UInt64 { return 0 }
    var endIndex: UInt64 { return 16 }
    
    subscript(idx: UInt64) -> UInt64 {
        get {
            return (val >> (idx*4)) & 0xf
        }
        set(n) {
            let mask = 0xf << (idx * 4)
            val &= ~mask
            val |= n << (idx * 4)
        }
    }
    
    typealias Generator = IndexingGenerator<NibbleCollection>
    func generate() -> Generator { return Generator(self) }
}

Once you have this, you can then use the standard library sorting function on it to implement the challenge:

func nibble_sort_word_col(arg: UInt64) -> UInt64 {
    var col = NibbleCollection(val: arg)
    sort(&col)
    return col.val
}

“Does all this Collection overhead mean it won’t be as fast as the raw function implementation?” you might worry. But there really is no overhead – since NibbleCollection is a struct, not a class, it really isn’t anything more than the raw underlying UInt64 and some functions. All of the scaffolding sloughs away at compile time, leaving you with pretty much identical code to the function-based version.

With one big difference: Swift.sort is way better than the reference sort implementation above.¹ So this version outperforms it, both in Swift and in C. With -O, the collection version is faster than the Swift reference, and the same speed as the C version. With -Ounchecked it outperforms the C version by about 25%.

Which tells you two things you should bear in mind when writing Swift:

Structs implementing protocols, and generic functions, are a good way of getting abstraction without giving up performance.
Always check if the code you want exists as a library function. Using it won‘t just save time and make your code clearer, it could be better optimized than the version you might write.

Of course, the competitive entries in C were orders of magnitute faster than the reference version, using bit-twiddling techniques like bucket sorts and lookup tables that take advantage of the problem being constrained to values less than 16 held in a single word. But since Swift also supports most of the operations they use, many of them can be ported too. A quick and dirty translations I did of a couple of the simpler ones suggests that, like the reference version, Swift can match or almost match them in performance too.

Naming the two sort algorithms in question is an exercise for the reader. They might not be the one you’d guess first. ↩

Changes to the Swift Standard Library in 1.2 beta 1

Posted on February 11, 2015March 26, 2015 by airspeedvelocity

Swift v1.2 is out, and, well, it’s pretty fab actually.

The most exciting part for me is the beefed-up if let syntax. Not only does it allow you to unwrap multiple optionals at once, you can define a later let in terms of an earlier one. For instance, here are 3 functions, each one of which takes a non-optional and returns an optional, chained together:

if let url = NSURL(string: urlString),
   let data = NSData(contentsOfURL: url),
   let image = UIImage(data: data) {
      // display image
}

The key here is that if at any of the preceding lets fail, the whole thing fails and the later functions are not called. If you’re already familiar with optional chaining with ?. this should come naturally to you. For more on this, see NSHipster’s article.

The new ability to defer initialization of let constants makes much of of my “avoiding var” post redundant, as you can now do this: ¹

let direction: Direction    
switch (char) {
    case "a": direction = .Left
    case "s": direction = .Right
    default:  direction = .Forward
}

It’d still be nice to have a form of switch that evaluates to a value – but with the new let, that’s really just a desire for some syntactic sugar (imagine using it inside a short closure expression, without needing any return).

Speaking of switch, the new if, with the ability to have a where clause on the bound variables, means if may now be suitable for several places where you were previously using switch. But there’s still a place for switch – especially when using it for full coverage of each case of an enum. And I feel a bit like pattern matching via ~= is one of switch’s powerful overlooked features that I keep meaning to write a post about. This release adds a ~= operator for ranges, not just intervals.

Library Additions

The big new addition to the standard library is of course the Set collection type. The focus on mathematical set operations is very nice, and also that those operations take any SequenceType, not just another Set:

let alphabet = Set("abcdefghijklmnopqrstuvwxyz")
let consonants = alphabet.subtract("aeiou")
// don't forget, it's "jumps" not "jumped"
let foxy = "the quick brown fox jumped over the lazy dog"
alphabet.isSubsetOf(foxy)  // false

Note that the naming conventions emphasize the pure versions of these operations, which create new sets, with each operation having a mutating equivalent with an InPlace suffix:

// create a new set
var newalpha = consonants.union("aeiou")
// modify an existing set
newalpha.subtractInPlace(consonants)

Note also for those fearful of the evil of custom operators, there’s no - for set subtraction etc. The one operator defined for sets is ==. Unlike with Array, sets are also Equatable (since they mandate their contents are Equatable, via Hashable).

But that’s not all. Here’s a rundown of some of the other library changes:

There‘s no longer a separate countElements call for counting the size of collections vs count for ranges – it’s now just count for both.
Dictionary‘s index, which used to be bi-directional, is now forward-only, as are the lazy collections you get back from keys and values.
The C_ARGC and C_ARGV variables are gone, as is the _Process protocol. In their place there’s now a Process enum with 3 static methods: arguments, which gives you a [String] of the arguments, and argc and argv with give you the raw C-string versions.
Protocols that used to declare class funcs now declare static funcs since the language now requires that.
Array, ClosedInterval and Slice now conform to a new protocol, _DestructorSafeContainer. This protocol indicates “a container is destructor safe if whether it may store to memory on destruction only depends on its type parameters.” i.e. an array of Int is known never to allocate memory when its destroyed since Int doesn’t, whereas a more complex type (with a deinit, or aggregating a type with one) may not be.
They also now conform to a new __ArrayType protocol. Two underscores presumably means you definitely definitely shouldn’t try and conform to this directly yourself.
The builtin-converting Array.convertFromHeapArray function is gone.
The pointer types (AutoreleasingUnsafeMutablePointer, UnsafePointer and UnsafeMutablePointer now conform to a new _PointerType protocol.
The CVaListPointer (the equivalent of C’s va_list) now has an init from an unsafe pointer.
EmptyGenerator now has a public initializer, so you can use it directly.
The various unicode types now have links in their docs to the relevant parts of unicode.org’s glossary
HeapBuffer and HeapBufferStorage are gone, replaced by the more comprehensive and documented ManagedBuffer and ManagedBufferPointer, of which more later.
ObjectIdentifier is now Comparable.
The enigmatic OnHeap struct is gone.
All the views (UnicodeScalarView, UTF8View and UTF8View) inside String are now Printable and DebugPrintable
String.lowercaseString and uppercaseString are back! I missed you guys.
The various convertFrom static functions in StringInterpolationConvertible are now inits.
UnsafePointer and UnsafeMutablePointer no longer have a static null() instance, presumably you should just use the nil literal.
There’s a new zip function, which returns a Zip2 struct – consistent with the lazy function that returns LazyThings, and possibly a gateway to ZipMoreThan2 structs in the future…

The Printable and DebugPrintable protocol docs now point out: if you want a string representation of something, use the toString protocol – in case you were trying to get fancy with if let x as? Printable { use(x.description) }. Which the language will now let you do, since as? and is now work with Swift protocols, which is cool. But still don’t do that – especially considering you may be sorry when you find out String doesn’t conform to Printable.

The Character type is now a struct, not an enum, and no longer exposes a “small representation” and “large representation”. This appears to go along with a performance-improving refactoring that has made a big difference – for example, a simple bit of code that sucks in /usr/share/dict/words into a Swift String and counts the word lengths now runs in a quarter of the time it used to.

Bovine Husbandry

Suppose you wanted to write your very own collection, similar to Array, but managing all the memory yourself. Up until Swift 1.2, this was tricky to do.

The memory allocation part isn’t too hard – UnsafeMutablePointer makes it pretty straightforward (and typesafe) to allocate memory, create and destroy objects in that memory, move those objects to fresh memory when you need to expand the storage etc.

But what about de-allocating the memory? Structs have no deinit, and that unsafely-pointed memory ain’t going to deallocate itself. But you still want your collection to be a struct, just like Array is, so instead, you implement a class to hold the buffer, and make that a member. That class can implement deinit to destroy its contained objects and deallocate the buffer when the struct is destroyed.

But you’re still not done. At this point, you’re in a similar position Swift arrays were in way back in the first ever Swift beta. They were structs, but they didn’t have full value semantics. If someone copies your collection, they’ll get a bitwise copy of the struct, but all this will do is copy the reference to the buffer. The two structs will both point to the same buffer, and updating the values in one would update both. Behaviour when the buffer is expanded would depend on your implementation (if the buffer expanded itself, they might both see the change, if the struct replaced its own buffer, they might not).

How to handle this? One way would be to be able to detect when a struct is copied, and make a copy of its buffer. But just as structs don’t have deinit, they also don’t have copy constructors – this isn’t C++!

So plan B – let the bitwise copy happen, but when the collection is mutated (say, via subscript set), detect whether the buffer is referenced more than once, and at that point, if it is, make a copy of the buffer before changing it. This solves the problem, and also comes with a big side-benefit – your collection is now copy-on-write (COW). Arrays in Swift are also COW, and this means that if you make a copy of your array, but then don’t make any changes, no actual copying of the buffer takes place – the buffer is only copied when it has to be, because you are mutating the values of some shared storage.

Since classes in Swift are reference-counted, it ought to be possible to detect if your buffer class is being referenced by more than one struct. But there wasn’t an easy Swift-native way to do this, until 1.2, which introduces isUniquelyReferenced and isUniquelyReferencedNonObjC function calls:

extension MyFancyCollection: MutableCollectionType {
  // snip...

  subscript(idx: Index) -> T {
    get {
      return _buffer[idx]
    }
    set(newValue) {
      if isUniquelyReferencedNonObjC(&_buffer) {
          // all fine, this buffer is solely owned
          // by this struct:
          _buffer[idx] = newValue
      }
      else {
          // uh-oh, someone else is referencing this buffer
          _buffer = _buffer.cloneAndModify(idx, newValue)
      }
   }
}

Great, COW collections all round. But before you run off and start implementing your own buffer class for use with your collection, you probably want to check out the ManagedBuffer class implementation, and a ManagedBufferPointer struct that can be used to manage it. These provide the ability to create and resize a buffer, check its size, check if its uniquely referenced, access the underlying memory via the usual withUnsafeMutablePointer pattern etc. The commenting docs are pretty thorough so check them out.

Just in case you want to try out this new feature in a playground – this technique does not work with globals. See this thread on the dev forums. ↩

More fun with implicitly wrapped non-optionals

Posted on January 2, 2015March 23, 2015 by airspeedvelocity

As we’ve seen previously, Swift will happily auto-wrap your non-optional value inside an optional temporarily if it helps make the types involved compatible. We’ve seen the occasional instance where this can be surprising. But for the most part, this implicit wrapping of non-optionals is your friend.¹ Here’s a few more examples.

Equating Optionals

The Swift standard library defines an == operator for optionals:

func ==<T : Equatable>(lhs: T?, rhs: T?) -> Bool

So long as your optional wraps a type that’s Equatable, that means you can compare two optionals of the same underlying type:

let i: Int? = 1
let j: Int? = 1
let k: Int? = 2
let m: Int? = nil
let n: Int? = nil

i == j  // true,  1 == 1
i == k  // false, 1 != 2
i == m  // false, non-nil != nil
m == n  // true,  nil == nil

let a: [Int]? = [1,2,3]
let b: [Int]? = [1,2,3]

// error, [Int] is not equatable
// (it does have an == operator, but that's not the same thing)
a == b

let f: Double? = 1.0

// ignore the weird compiler error,
// the problem is Int and Double aren't
// the same type...
i == f

Well, this is useful, but even more useful is this one definition of == allows you to compare optionals against non-optionals too. Suppose you want to know if an optional string contains a specific value. You don’t care if it’s nil or not – just whether it contains that value.

You can just pass your non-optional into that same == that takes optionals:

let x = "Elsa"

let s1: String? = "Elsa"
let s2: String? = "Anna"
let s3: String? = nil

s1 == x  // true
s2 == x  // false
s3 == x  // false

This works because Swift is implicitly wrapping x in an optional each time. If Swift didn’t do this, and you implemented an optional-to-non-optional operator yourself, you’d have to define == an extra two times, to make sure it was symmetric (which is a requirement whenever you implement ==):

// definitions if there weren't implicit optionals
func ==<T:Equatable>(lhs:T, rhs:T?)->Bool { return rhs != nil && lhs == rhs! }
func ==<T:Equatable>(lhs:T?, rhs:T)->Bool { return lhs != nil && lhs! == rhs }

Beware though, optional == isn’t all sunshine and unicorn giggles. Imagine the following code:

let a: Any = "Hello"
let b: Any = "Goodbye"

// false, of course
(a as? String) == (b as? String)

// oops, this is true...
(a as? Int) == (b as? Int)

Depending on how you think, this is either perfectly reasonable or immensley confusing, but either way you should bear it in mind. And maybe it’s another reason to go on your long list “Why not to use Any and as unless you really really have to.”

Comparing Optionals

Unsurprisingly, there’s a similar definition of < that will compare two optionals so long as what they wrap is Comparable:

func <<T : _Comparable>(lhs: T?, rhs: T?) -> Bool

If lhs and rhs are both some value, it compares them. Two nils are considered equal just like before. But a nil is always considered less than any non-nil value, (Comparable must always enforce a strict total order so they have to go somewhere well-defined in the ordering).

So if you sort a set of optionals, the nil values all end up at one end:

let a: [Int?] = [1, 3, nil, 2]
sorted(a, <)
// results in [nil, 1, 2, 3]

This is fairly intuitive, but it can lead to other subtle bugs. Suppose you want to filter a list like so:²

let ages = [
    "Tim":  53,  "Angela": 54,  "Craig":   44,
    "Jony": 47,  "Chris":  37,  "Michael": 34,
]

let people = ["Tim", "Johny", "Chris", "Michael"]

// don't trust anyone over 40
let trust = people.filter { ages[$0] < 40 }

// trust will contain ["Johny", "Chris", "Michael"]

The string "Johny" was not in the ages dictionary so you get a nil – but the way < works, nil is less than 40, so included in the filtered array.

Optional Subscripts

While we're talking about dictionaries, did you know that Dictionary.subscript(key) doesn't just return an optional, it also takes an optional?

var d = ["1":1]

let four = "4"
let nonsense = "nonsense"

// this will add ["4":4] to d
// remember toInt returns an optional
// (the string might not have been a number)
d[four] = four.toInt()

// nothing will be added, because
// the rhs is nil
d[nonsense] = nonsense.toInt()

// this will remove the "1" entry from d
d["1"] = nil

That's quite useful – but also, there wasn't much choice in it working like this.³ Dictionary.subscript(key) { set } has to take an optional, because when you define a setter, you must define a getter. And Dictionary's getter has to be optional (in case the key isn't present). So the setter's input value has to be.

This means every time you assign a value to a dictionary using a key subscript, your non-optional value is getting auto-wrapped in an optional. If this didn’t happen automatically, it would make assigning values to your dictionary a real pain, as you’d have to type d[key] = .Some(value) every time.

To see this in practice, here's a very bare-bones (very inefficient) implementation of a dictionary. Take a look at the subscript to see how the optional is handled:

public struct RubbishDictionary<Key: Equatable, Value> {

    public typealias Element = (Key, Value)

    // just store the key/value pairs in an unsorted
    // array (see, told you it was rubbish)
    private typealias Storage = [Element]
    private var _store: Storage = []

    // there's a reason why this is a private method,
    // which I'll get to later...
    private func _indexForKey(key: Key) -> Storage.Index? {
        // new year's resolution: see if I can avoid
        // calling array[i] completely in 2015
        for (idx, (k, _)) in Zip2(indices(_store),_store) {
            if key == k { return idx }
        }
        return nil
    }

    // Subscript for key-based lookup/storage
    public subscript(key: Key) -> Value? {
        // "get" is pretty simple, either key can be
        // found, or not...
        get {
            if let idx = _indexForKey(key) {
                // implicit optional wrap
                return _store[idx].1
            }
            else {
                return nil
            }
        }
        // remember, newValue is a Value?
        set(newValue) {
            switch (_indexForKey(key), newValue) {

            // existing index, so replace value
            case let (.Some(idx), .Some(value)):
                _store[idx].1 = value

            // existing index, nil means remove value
            case let (.Some(idx), .None):
                _store.removeAtIndex(idx)

            // new index, add non-nil value
            case let (.None, .Some(value)):
                _store.append((key, value))

            // new index, nil value is no-op
            case (.None,.None):
                break
            }
        }
    }
}

Finally, just a reminder that an optional containing an optional containing nil is not the same as nil:

var d: [Int:Int?] = [1:1, 2:nil]
// d now contains two entries, 1:1 and 2:nil
d[2] = nil  // removes nil
d[3] = .Some(nil)  // adds 3:nil

Full imlementation of `RubbishDictionary`

So, in case you've made it all the way down here, and you're interested, I thought I'd end with a fuller but minimal-as-possible implementation of RubbishDictionary. It has almost-feature-parity with Dictionary (it's just missing getMirror and init(minimumCapatity)). It ought to do everything Dictionary can, just slower.

If reading raw code's not your thing, skip it. But it shows how to simply implement several of the standard library's protocols such as CollectionType and DictionaryLiteralConvertible, as well as a few other features like lazily-evaluated key and value collections. I like how Swift extensions allow you to build up the type in logical groups (e.g. each protocol implementation one-by-one). Since this is a post about implicit optional wrapping, I've flagged where it’s also helping keep the code consise.

I have a nagging feeling there are some neater ways to handle some of the optionals – if you spot one, let me know. Oh and bugs. I'm sure there are a couple of them.

// building on the initial definition given above...

// An index type for RubbishDictionary.  In theory, you could perhaps just
// expose an alias of the storage type’s index, however, if the Key type were
// an Int, this would mean subscript(key) and subscript(index) would be
// ambiguous. So instead it's simple wrapper struct that does little more than
// pass-through to the real index.
public struct RubbishDictionaryIndex<Key: Equatable, Value>: BidirectionalIndexType {
    private let _idx: RubbishDictionary<Key,Value>.Storage.Index

    public typealias Index = RubbishDictionaryIndex<Key,Value>

    public func predecessor() -> Index { return Index(_idx: _idx.predecessor()) }
    public func successor() -> Index { return Index(_idx: _idx.successor()) }
}

// Required to make the index conform to Equatable
// (indirectly required by BidirectionalIndexType)
public func ==<Key: Equatable, Value>
  (lhs: RubbishDictionaryIndex<Key,Value>, rhs: RubbishDictionaryIndex<Key,Value>)
   -> Bool { 
    return lhs._idx == rhs._idx 
}

// Conformance to CollectionType
extension RubbishDictionary: CollectionType {
    public typealias Index = RubbishDictionaryIndex<Key,Value>

    // Add a subscript that takes an Index.  This one
    // does not need to be nil, since an indexed entry
    // _must_ exist.  Read-only.
    public subscript(idx: Index) -> Element { return _store[idx._idx] }

    public var startIndex: Index { return Index(_idx: _store.startIndex) }
    public var endIndex: Index { return Index(_idx: _store.endIndex) }

    public typealias Generator = IndexingGenerator<RubbishDictionary>
    public func generate() -> Generator { return Generator(self) }
}

// Conformance to Printable and DebugPrintable
extension RubbishDictionary: Printable, DebugPrintable {
    public var description: String {
        let elements = map(self) { "($0.0):($0.1)" }
        return "[" + ",".join(elements) + "]"
    }

    public var debugDescription: String {
        let elements = map(self) { "(toDebugString($0.0)):(toDebugString($0.1))" }
        return "[" + ",".join(elements) + "]"
    }
}

// Conformance to DictionaryLiteralConvertible
extension RubbishDictionary: DictionaryLiteralConvertible {
    public init(dictionaryLiteral elements: (Key, Value)...) {
        for (k,v) in elements {
            self[k] = v
        }
    }
}

// Remaining features supported by Dictionary
extension RubbishDictionary {

    public var count: Int { return _store.count }
    public var isEmpty: Bool { return _store.isEmpty }

    // now for the public version of indexForKey
    public func indexForKey(key: Key) -> Index? {
        // this maps non-nil results from _indexForKey to
        // the wrapper type, or just returns nil if 
        // _indexForKey did...
        return _indexForKey(key).map { Index(_idx: $0) }
    }

    // a lazily-evaluated collection of keys in the dictionary
    public var keys: LazyBidirectionalCollection<MapCollectionView<RubbishDictionary<Key,Value>,Key>> {
        return lazy(self).map { (k,v) in k }
    }

    // and the same for the values
    public var values: LazyBidirectionalCollection<MapCollectionView<RubbishDictionary<Key,Value>,Value>> {
        return lazy(self).map { (k,v) in v }
    }

    // unlike its subscript equivalent, updateValue:forKey does not take an optional value...
    public mutating func updateValue(value: Value, forKey key: Key) -> Value? {
        if let idx = _indexForKey(key) {
            let old = _store[idx].1
            _store[idx].1 = value
            // implicit optional wrap
            return old
        }
        else {
            _store.append((key,value))
            return nil
        }
    }

    // returns the value removed, if there was one
    public mutating func removeValueForKey(key: Key) -> Value? {
        if let idx = _indexForKey(key) {
            // implicit optional wrap
            return _store.removeAtIndex(idx).1
        }
        else {
            return nil
        }
    }

    public mutating func removeAtIndex(index: Index) {
        _store.removeAtIndex(index._idx)
    }

    public mutating func removeAll(keepCapacity: Bool = false) { 
        _store.removeAll(keepCapacity: keepCapacity)
    }

}

Unlike implicitly-wrapped optionals, which are more like the kind of friend that gets you into fights in bars. ↩
I'm getting a lot of mileage out of this example. ↩
I guess it could have asserted on a nil value. ↩

Changes to the Swift Standard Library in 1.1 beta 3

Posted on October 1, 2014October 1, 2014 by airspeedvelocity

The biggest deal in this latest beta is the documentation. There are 2,745 new lines of /// comments in beta 3, and even pre-existing documentation for most items has been revised.

That doesn’t mean there aren’t any functional changes to the library, though. Here’s a rundown of what I could find:

As mentioned in the release notes, the various literal convertible protocols are now implemented as inits rather than static functions.
The ArrayBoundType protocol is gone, and the various integer types that conformed to it no longer do.
The CharacterLiteralConvertible protocol is gone. Character never appeared to conform to it – it conformed to ExtendedGraphemeClusterLiteralConvertible which remains.
The StringElementType protocol is gone, and UInt8 and UInt16 no longer conform to it. The UTF16.copy method, that relied on it to do some pointer-based manipulation, is also gone.
Dictionary’s initializer that took a minimumCapacity argument no longer has a default for that minimum capacity. It’s interesting that this worked previously, since if you did the same thing with your own type (i.e. gave it both an init() and an init(arg: Int = 0)) you’d get a compiler error when you tried to actually use init().
In extending RandomAccessIndexType, integers now use their Distance typealias for Int rather than straight Int.
The static from() methods on integers are all gone.
There’s now an EnumerateSequence that returns an EnumerateGenerator, and enumerate() returns that rather than the generator.
The various integer types no longer have initializers from Builtin.SomeType.
The _CocoaArrayType protocol (as used to initialize an Array from a Cocoa array) has been renamed to _SwiftNSArrayRequiredOverridesType.
Numerous _CocoaXXX and _SwiftXXX protocols have acquired an @objc attribute.
_PrintableNSObjectType (which wasn’t explicitly implemented by anything in the std lib) is gone.
The AssertStringType and StaticStringType protocols are gone, as has AssertString. StaticString still exists and is clearly described as a static string that can be known at compile-time.
assert and precondition are much simplified with those string types removed, leaving one version each that just uses String for it’s message parameter. assertionFailure, preconditionFailure and fatalError all take a String for their message now as well, though they still take StaticString for the filename argument as described in the Swift blog’s article.
There are two new sort functions that operate on a ContiguousArray (one for arrays of comparable elements, and one that takes a comparison predicate).
But there are two fewer sorted functions. The ones that take a MutableCollectionType, and return one, are gone. Seems a shame, though it didn’t really make sense for them to take a mutable type given they didn’t need to mutate it. Maybe they’ll be replaced with versions that return ExtensibleCollectionType.
There’s a new unsafeDowncast that is equivalent to x as T but with the safeties removed – to be used only when using as is causing performance problems.
Raise a glass for the snarky “Haskell’s fmap, which was mis-named” comment, which is now replaced by a very straight-laced description of what Optional.map actually does.

In keeping with the documenting theme, there are a lot of argument name changes here as well. Continuing a trend seen in previous betas, specific, descriptive argument names are preferred over more generic ones. This is probably a good Swift style tip for your own code, especially if you’re also writing libraries.

Some examples:

Naming arguments in protocols instead of just using ‘_‘ e.g. func distanceTo(other: Self) instead of func distanceTo(_: Self) (there’s no compulsion to use the same names for your method arguments that the protocol does but you probably should)
Avoiding just naming the variable after the type (e.g. sequence: Sequence), For example, Array.extend has renamed its argument to newElements.
Replacing newValues with newElements in various places, presumably because of the unintended Swift implications of the term “values”.
Avoiding i or v such as subscript(position: Index) instead of subscript(i: Index), and init(_ other : Float) instead of init(_ v : Float).
Descriptive names for predicate arguments, such as isOrderedBefore rather than just pred.

Rather than me regurgitate the comments here, I’d suggest option-clicking import Swift and reading through it in full. There’s lots of informative stuff in there. Major types like Array and String have long paragraphs in front of them with some good clarifications. Many things are explained that you’d otherwise only have picked up on by watching the developer forums like a hawk.¹

Here are a few items of specific interest:

The comment above GeneratorType has been revised. The suggestion that if using a sequence multiple times the algorithm “should probably require CollectionType, since CollectionType implies multi-pass” is gone. Instead it states that “any code that uses multiple generators (or for ... in loops) over a single sequence should have static knowledge that the specific sequence is multi-pass”. The most common case of having that static knowledge being that you got the sequence from a collection I guess.

Above Slice we find “Warning: Long-term storage of Slice instances is discouraged“. It makes it pretty clear Slice is mainly there to help pass around subranges of arrays, rather than being a standalone type in its own right. They give you the value type behaviour you’d get from creating a new array (i.e. if a value in the original array is changed, the value in the slice won‘t) while allowing you to get the performance benefits of copy-on-write for a sub-range of an existing array.

Several of the _SomeProto versions of protocols now have an explicit comment along the lines of the previously implied “this protocol is an implementation detail of SomeProto; do not use it directly”. I’m still not sure why these protocols are written in this way, possibly as a workaround for some compiler issues? If you have an idea, let me know.

edit: a post on the dev forum confirms it’s a temporary workaround for a compiler limitations related to generics, though doesn’t get specific about what limitation. Thanks to @anatomisation and @ivicamil for the link.

The reason for ContiguousArray’s existence is finally made clear. It’s there specifically for better performance when holding references to classes (not value types). There’s also a comment above the various collection withUnsafeBufferPointer methods suggesting you could use them when the optimizer fails to eliminate bounds checks automatically, in a performance-vs-safety trade-off.

Chris Lattner clarified on the dev forum that although this version is a GM, that’s in order to allow you to use it to submit apps to the mac store rather than because this is the final shipping version. We’ll likely see more changes to Xcode 6.1 before that, and with it maybe further changes to the standard lib.

Or reading this blog, of course. Hopefully the comments don’t get too helpful, what would I write about then? ↩

Changes to the Swift Standard Library in 1.1 beta 2

Posted on September 15, 2014September 16, 2014 by airspeedvelocity

Less than a week after the last beta was released, Swift 1.1 beta 2 is up. And yes, it’s officially Swift 1.1, as displayed by running swift -v from the command line.

Presumably we’re gearing up for a new GM alongside Yosemite, as there are almost no changes to the standard library in this release, much like when the GM was almost upon us for 6.0. Not surprising, since the Swift team had already confirmed on the dev forums that 6.1 was going to be a fairly small release.

The only material change are to the RawRepresentable protocol, and the _RawOptionSetType which inherits from it:

RawRepresentable’s Raw typealias is now RawValue.
Instead of a toRaw function, it now has a read-only rawValue property.
Instead of a fromRaw class function, it now defines an init? that takes a raw value.
_RawOptionSetType also now defines an init that takes a raw value, rather than a class function fromMask.

This matches the release notes, which state that enums construction from raw values have been changed to take advantage of the new failable initializers introduced in the last beta.

Interesting thing to note: while RawRepresentable has an init?(rawValue: RawValue) method, _RawOptionSetType has an init(rawValue: RawValue) method. This might seem odd at first – _RawOptionSetType inherits from RawRepresentable, shouldn’t they have the same kind of init? But since init is more restrictive than init?, it works, as an optional type can always be substituted with its non-optional equivalent. _RawOptionSetType is essentially restating RawRepresentable’s raw initializer as non-optional after all.

Here’s looking forward to a Swift 1.1 GM, and maybe after that on to 1.2?

Accidentally putting a loop in your loop

Posted on September 14, 2014September 14, 2014 by airspeedvelocity

Joel Spolsky wrote a good article back in 2001 about knowing when the function you’re calling has linear complexity,¹ and not accidentally putting a loop in your loop,² getting quadratic complexity without realizing it. He was talking about C string algorithms like strlen and strcat, but it applies just as much to several Swift standard library functions.

When a function only has versions that take a sequence, like the seductively-useful contains, equal, min– and maxElement, it’s linear.³ Some depend on their input: countElements can be done in constant time if you give it a collection with a random-access index, but linear if not. Same with distance and advance index operations, which is why extending String to support random-access indexing using them is probably a bad idea. Be careful, not everything that could be optimized may have been. For example, ClosedInterval.contains is presumably O(1) but there doesn’t appear to be an optimized overload for the non-member contains, which only has versions that take a sequence.

The Swift team have helpfully put the complexity of certain functions in the documentation. Array’s reserveCapacity, insert, and replaceRange are O(N).

And so is… removeAtIndex. In a previous article, I mentioned the following code to remove all occurrences of a given value from an array has an efficiency problem:

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, value: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == value {
                collection.removeAtIndex(idx)
            }
            else {
                ++idx
            }
        }
}

The documentation for removeAtIndex says: ⁴

Remove and return the element at the given index. Worst case complexity:
O(N). Requires: index < count

That is, the worst-case time it takes to remove an element increases in linear proportion to the length of the collection. That's not surprising – if you remove an element of an array, you have to shuffle each element after it down one. The larger the collection the more things to shuffle. Maybe your element is near the end, maybe your collection is a linked list that can remove elements in O(1), maybe it’s magic and has all sorts of cool optimizations – hence O(N) is the worst case. But still, it’s probably not best to call it within another loop.

Here, it shouldn't be necessary. We're already iterating over the collection, so ought to be able to combine the deletion and shuffling down together as one operation. Here's a new version of remove that does this, modelled on the C++ STL equivalent:

(incidentally, it also removes the need for questionable assumptions about how indexes behave when you remove elements, the subject of the previous article)

func remove
    <C: protocol<RangeReplaceableCollectionType,
                 MutableCollectionType>,
     E: Equatable
     where C.Generator.Element == E>
    (inout col: C, value: E) {
        // find the first entry to remove
        if var advance = find(col, value) {
            // advance points to next element to test,
            // rear points to where to copy it to
            // if it's a keeper
            var rear = advance++
            while advance != col.endIndex {
                if col[advance] != value {
                    col[rear] = col[advance]
                    ++rear
                }
                ++advance
            }
            col.removeRange(rear..<col.endIndex)
        }
}

This version breaks the removal into multiple steps. First, it uses find to locate the first entry to remove (and if it doesn’t find one, does nothing more). Next, one by one it moves the subsequent elements down on top of that entry. When it encounters more entries to remove, it skips over them (i.e. it increments advance, the index of entries to examine and maybe copy, but not rear, the index of where to copy to, and they aren’t copied).

Finally, when all this is done, the collection should have all the non-removed entries at the front, and some meaningless garbage at the end. This end section is the length of the number of removed entries (though it doesn‘t contain the removed entries – entries were copied, not swapped). This is trimmed off by a call to removeRange, which is also O(N). But that’s fine – two O(N) algorithms in series is still O(N).

By the way, in this last step it differs from the C++ STL version which, in a quality bit of user-unfriendliness, requires the caller to do the final remove step, instead leaving the collection with the garbage still at the end. There‘s good reasons for this (because iterators), but it’s a nasty gotcha for newbies. I’d chalk this one up as a win for Swift’s approach to generic collection algorithms. ⁵

Note that for this to work, the collection also needs to support the MutableCollectionType protocol, because that’s where the assignable version of subscript lives for some reason.⁶ In fact, that’s all MutableCollectionType adds. By the way, this whole optimization is based on the assumption that the assignment version of subscript runs in constant time. It should do, right? Copying a value into a position in an array shouldn’t affect the rest of the array, so it shouldn’t matter how long it is.

A quick test with a reasonably large array shows that this new version of remove does indeed run quicker (and more consistently) than our first version. Yay.

Then you try and use it on a string, and your celebration is short lived. String doesn’t implement MutableCollectionType. Huh, what’s that about?

What this doesn‘t mean is that String is immutable. It’s totally mutable. Mutating is what RangeReplaceableCollectionType is all about. This is a chance to make an important if maybe obvious point: just because your function takes an object via a protocol that doesn’t allow mutation doesn‘t mean that object is immutable. It just means you can‘t mutate it in your function.

My guess for why strings don‘t support MutableCollectionType? Because that assumption above, about subscript assignment being O(1), doesn’t hold. Remember, individual elements of Swift strings are of variable length. What if you replaced a longer character with a shorter one? We’d be back to square one, having to shovel all the subsequent characters down to fill in the gap. Worse, what if you replaced a shorter entry with a longer one? The string would get bigger, maybe even need relocating to a newly allocated chunk of memory.

This would be another example of signalling more from protocols than just what functions are supported. They can tell you about fundamental properties of the object. Perhaps MutableCollectionType is like MutableInConstantTimeCollectionType. Course, on the other hand, I could be reading waaay too much into String not implementing it. It could just be an oversight – only time (or one of the Swift devs) will tell.

String does support replaceRange as an alternative to subscript assign. But its complexity is, you guessed it, O(N). And after crowbarring it into the second algorithm above, tests suggest it’s no faster than the removeAtIndex version.⁷ So if you really have a burning desire to remove some characters from a huge string, maybe find another way. Perhaps you can trade some space for that time.

That wikipedia entry on complexity, like so many wikipedia entries on mathematical topics, is pretty beginner-unfriendly. If anyone has a good beginner’s guide link I could replace it with, let me know. An (admittedly cursory) google search doesn’t turn much up. ↩
I’d have called this post “Yo, dawg” but then I’ve already done that once. ↩
Unless it does something like return the first element in the sequence, obvs. Hmmm, maybe I should be less paranoid about people finding errors in my posts. ↩
Actually it doesn’t, not on the protocol version anyway. The stand-alone function version (that takes that same protocol) does though, so I’m taking the liberty of giving that instead. ↩
It‘s not all wins, mind. The STL’s iterator-based functions are much easier to use with subranges of collections. Don‘t get me started on slices… ↩
Wait for it… ↩
though this is possibly due to my lack of deftness with a crowbar. ↩

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

Airspeed Velocity

African or European Swift?

swift standard library

Generic Collections, SubSequences and Overloading

Optimizing for Random Access

Protocols, Extensions and Dynamic Dispatch

Changes to the Swift standard library in 2.0 betas 2..<5

Flat Map

First-class Initializers

RangeReplaceableType: all your ExtensibleCollectionTypes are belong to us

You get to be Sliceable. And you get to be Sliceable. Everybody gets to be Sliceable!

Grab Bag

Changes to the Swift Standard Library in 2.0 beta 1

Changes to the Swift standard library in 1.2 betas 2 and 3

Sorting Nibbles in Swift

Changes to the Swift Standard Library in 1.2 beta 1

More fun with implicitly wrapped non-optionals

Equating Optionals

Comparing Optionals

Optional Subscripts

Full imlementation of `RubbishDictionary`

Changes to the Swift Standard Library in 1.1 beta 3

Changes to the Swift Standard Library in 1.1 beta 2

Accidentally putting a loop in your loop

Optimizing for Random Access

Protocols, Extensions and Dynamic Dispatch

Flat Map

First-class Initializers

RangeReplaceableType: all your ExtensibleCollectionTypes are belong to us

You get to be Sliceable. And you get to be Sliceable. Everybody gets to be Sliceable!

Grab Bag

Equating Optionals

Comparing Optionals

Optional Subscripts

Full imlementation of RubbishDictionary

Full imlementation of `RubbishDictionary`