Changes to the Swift Standard Library in Xcode 6.1

Posted on September 10, 2014September 10, 2014 by airspeedvelocity

Congratulations to the Swift team on releasing a 1.0 GM! Unsurprisingly, there were no changes that I could spot¹ between beta 7 and the GM standard libraries.

But as they said in their blog post, “Swift will continue to advance with new features, improved performance, and refined syntax. In fact, you can expect a few improvements to come in Xcode 6.1 in time for the Yosemite launch.” As such, the first beta of Xcode 6.1² saw more changes to the standard library than we saw in the pre-1.0 beta, and here they are:

As flagged in the release notes, all the integer types have acquired truncatingBitPattern initializers for each of their larger counterparts.
The functions join, reduce, sort and sorted have acquired descriptions in the various places they’re implemented.
AssertString and StaticString are now Printable and DebugPrintable
HeapBufferStorage no longer inherits from HeapBufferStorageBase, which is gone.
The Process instance is now declared with let rather than var
StrideThrough or StrideTo are now Reflectable
StaticString now has a utf8Start instead of start (which still returns an UnsafePointer). It also has a withUTF8Buffer method for using the underlying raw buffer, a unicodeScalar get property.
String has lost its compare(other: String.UnicodeScalarView) method.
String also now implements several methods extending StringInterpolationConvertible – for a variety of different built-in types.
Second versions of assertionFailure and fatalError now take a generic AssertStringType instead of StaticString.
equal and startsWith‘s predicate functions have been renamed isEquivalent.
maxElement and minElement‘s input parameter has been renamed elements.
The toString function now says it returns the result of printing, not debugPrinting, an object.

There’s a new unsafeAddressOf function that takes an AnyObject. Apparently there’s “not much you can do with this other than use it to identify the object”.

There is a new protocol, UnicodeScalarLiteralConvertible, that follows the now-familiar literal converter pattern: a ThingType typealias, a convertFromThing class function, and a library-level typealias for ThingType for the standard type for these literals to convert into, in this case a String.

The ExtendedGraphemeClusterLiteralConvertible protocol now inherits this protocol, so the objects that implement it (Character, String, AssertString and StaticString) also now implement convertFromUnicodeScalarLiteral. And the UnicodeScalar struct now implements UnicodeScalarLiteralConvertible instead of ExtendedGraphemeClusterLiteralConvertible.

Here’s looking forward to more enhancements as Swift continues to develop.

Unless there’s one hiding in those pesky re-ordered operator overloads… ↩
Does this mean this is Swift 1.1? ↩

Changes in the Swift Standard Library in Beta 7

Posted on September 2, 2014 by airspeedvelocity

There aren’t any.

Well, actually there’s one:

The ImplicitlyUnwrappedOptional type, which ceased to be of BooleanType in beta 6, no longer has the associated vestigial boolValue property.

Ok, Ok there are a couple more:

The comparison function taken by non-member sort, sorted and partition is now labelled isOrderedBefore (matching the member versions). Not that it’s an external parameter, so no code change needed.
Further enhancements to some function documentation comments (for example min() now has a description).

I think… that’s it. I have to admit, I don’t diff all the operators each time, as they seem to constantly reorder with each beta. That’s how I missed that, between beta 3 and beta 4, you stopped being able to use === to determine if two arrays referenced the same underlying storage. Shame on me!

There’s one operator that hasn’t changed. Despite what the release notes say, you still seem to be able to use + to concatenate Characters:

let c1: Character = "a"
let c2: Character = "b"
let s = c1 + c2
// s is now the String "ab"

…but presumably it’s not long for this world. This is in keeping with the trend of having + only work between two collection types, not between collection types and their contained types.

I guess this stabilization of the Swift library is in readyness for the Swift 1.0 release, which is pretty great news. The interesting question is, how will the library keep evolving post-1.0?

Protocols and Assumptions

Posted on August 29, 2014January 10, 2015 by airspeedvelocity

edit: subsequent to this article being written, the Swift standard library has been updated, and documentation-comments above the relevant methods of RangeReplaceableCollectionType now explicitly state: “Invalidates all indices with respect to self.” Bear this in mind as you read on:

What does it mean to implement a protocol? Possibly more than supporting methods with specific names.

Obviously the methods ought to do what their names imply – isEmpty shouldn’t fire the torpedoes. But as well as basic functionality, there are also guarantees about things like the method’s complexity, and possibly wider implications for how the class itself behaves.

More importantly, protocols might not guarantee a behaviour you think they do and are relying on. Using protocols with generics can sometimes give you the illusion of more guarantees than you actually have.

Suppose you want to write a function remove that removes entries from a collection in-place – that is, similar to the sort function, it takes a collection as an inout parameter, and removes from it all the entries that match a given value. ¹

Unlike sort, which just requires MutableCollection, remove would need to use the new-in-beta6 RangeReplaceableCollectionType, which includes a removeAtIndex method. Armed with this, you might write the following: ²

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, value: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == value {
                collection.removeAtIndex(idx)
            }
            else {
                ++idx
            }
        }
}

Embedded in this code are a lot of assumptions about how collections and their indices behave when you mutate them, and some of these might not be valid, depending on the collection type. It works with String, the only explicitly range-replaceable collection in the standard library currently, as well as the secretly range-replaceable Array, ContiguousArray and Slice. But you could easily implement a new type of collection that was range-replaceable for which the above code would explode in flames.

The biggest assumption is that removing an element from a collection does not completely invalidate an index. That is, after you call collection.removeAtIndex(idx), idx remains a legitimate index into the collection rather than just becoming junk.

Next, there’s the assumption that when you remove a entry at an index, that index will now point to the next element in the collection. That’s why, after removing the entry, the code above just goes straight back around the loop without incrementing idx. You could put it another way – when you remove an element, the next element “moves down” to the position indexed by the removed element.

Finally, there’s the assumption that if the element that idx points to is the last element, then what idx will point to after you remove it will be endIndex. Or, to put it another way, as you remove the last element, endIndex “moves down” to equal the index of the last element.

By the way, this last assumption is why the code uses while idx != collection.endIndex rather than the more-elegant for idx in indices(collection). indices returns a Range object between the collection’s start and end, but it would be created before we start looping. Because endIndex is a moving target as we remove some entries from the collection, it won’t work for our purposes. A cleverer version of indices that returned something more dynamic might help, but that could have different undesirable side-effects.

Are all these assumptions legit? Well you can see they obviously are for the Array types, because these just use an integer for their index, with endIndex equal to count. When elements are removed, the rest of the array shuffles down. Even if that resulted in a full reallocation of the array’s memory, the assumptions would still hold because all the index does is represent a relative offset from the start of the array, not point to a specific location.

Strings are trickier, because their index type is opaque. Chances are it’s still an offset from the start of the string, but because Swift characters aren’t of uniform width, ³ that offset doesn’t necessarily increment by a constant amount each time. Still, if this is the implementation, the above assumptions would hold, and experimentation suggests they do.

What kind of collection might not adhere to these assumptions? Well, a very simple doubly-linked list implemention might not. ⁴ If the index for a linked list were a pointer to each node, then removing that node could leave the index pointing at a removed entry. You couldn’t just loop around without first pointing to the next node in the list:

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, element: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == element {
              // first grab the index of the next element
              let next = idx.successor()
              // then remove this one
              collection.removeAtIndex(idx)
              // and repoint
              idx = next
            }
            else {
                ++idx
            }
        }
}

But then this algorithm would no longer work correctly with arrays and strings! ⁵

So what’s the solution – should RangeReplaceableCollectionType mandate the kind of index validity behaviour our remove algorithm relies on? Or are the assumptions invalid and we need a better algorithm? (of which more in a later article) The Swift standard library is still evolving rapidly with each beta so it’s possibly a little early to tell. For now, be careful about the assumptions you make – just because all the current implementations of a particular protocol work a certain way doesn’t mean other implementations will.

As opposed to a version that returned a copy, which would be called removed. I thought I didn’t like this convention at first, but I’m warming to it. ↩
This code has an efficiency deficiency, which we’ll talk about in a later article. ↩
For an in-depth explanation of Swift strings, see Ole Begemann’s article. ↩
Singly-linked lists couldn’t implement removeAtIndex easily, and would probably have some kind of removeAfterIndex operation instead. ↩
The C++ STL resolves this by having erase return an iterator for the entry just after the erased elements – along with a fairly draconian assertion that any removal invalidates all other iterators (not just ones at and beyond the removed value). But Swift’s removeAtIndex currently returns the removed value, rather than a new index. ↩

filter, String and ExtensibleCollectionType

Posted on August 22, 2014August 23, 2014 by airspeedvelocity

String was extended in beta 6 to implement RangeReplaceableCollectionType. This means that, via inheritance, it also implements ExtensibleCollectionType.¹

ExtensibleCollectionType is interesting, because it requires the collection to support an empty initializer. This means that, without having to resort to shenanigans, you can write a generic function that takes an ExtensibleCollectionType and returns a new one.

Since they were changed to return eagerly-evaluated results, the non-member filter and map have returned arrays, no matter what. This is a bit frustrating when working with some non-array types, such as String:²

let vowels = "eaoiu"
let isConsonant = { !contains(vowels, $0) }
let s = "hello, i must be going"
// filtered will be an array
let filtered = filter(s, isConsonant)
// and then we have to turn it back into a string
let only_consonants = String(seq: filtered)
// only_consonants is "hll,  mst b gng"

It would be nice to have a version of filter that took a String and returned a String instead of an Array. ³ Even better, it would be nice to have a single generic version that worked on both arrays and strings.

Here’s one:

func my_filter
  <C: ExtensibleCollectionType>
  (source: C, includeElement: (C.Generator.Element)->Bool)
  -> C {
    // use the `init()` from `ExtensibleCollectionType`
    var result = C()
    for element in source {
        if(includeElement(element)) {
            // append is also part of `ExtensibleCollectionType`
            result.append(element)
        }
    }
    return result
}

// my_filter returns a String when passed one:
let only_consonants = my_filter(s, isConsonant)

Since this is possible, should Swift’s filter and map be changed to be like this? Maybe, but I can think of a couple of reasons why not.

First, it’d be a bit inconsistent and possibly surprising. Not all collections are extensible collections. Dictionary isn’t. Range and StrideTo even less so – they’re like “virtual” collections that don’t really have individual elements at all. So there’d still need to be versions that took these collections and returned an array. So when calling filter, you’d need to know whether your collection was extensible to know whether you were going to get back the same collection type or an array.

There’s precedent for this kind of thing. lazy gives you back different types depending on what you pass in. But lazy is very explicit. map and filter would be a bit more subtle, and bear in mind subtle maybe-unexpected behaviour was probably the reason lazy evaluation was moved into the lazy family in the first place.

Second, maybe you do want an array back. This can be catered for – declare a second version of my_filter like so: ⁴

func my_filter
    <C1: ExtensibleCollectionType, C2: ExtensibleCollectionType
    where C1.Generator.Element == C2.Generator.Element>
    (source: C1, includeElement: (C1.Generator.Element)->Bool)
    -> C2 {
        var result = C2()
        for element in source {
            if(includeElement(element)) {
                result.append(element)
            }
        }
        return result
}

// same-type version will be used by default
let consonant_string = my_filter(s, isConsonant)
// but if you declare the result as a specific type, the
// second version will be used:
let consonant_array = my_filter(s, isConsonant) as Array

Third, there’s the big gotcha that means this wouldn’t be a good idea, but that I haven’t thought of. If you have, leave a comment or tweet me.

or rather, _ExtensibleCollectionType, which contains the goods, and that ExtensibleCollectionType just inherits without any additions. I’m not sure why it’s done this way, though I’m guessing it’s not for no reason. ↩
This is of course a horrible piece of code, ignoring upper-case characters, not to mention accented characters, but let’s keep the examples simple. ↩
Using lazy(s).filter is probably more efficient, since it won’t require the construction of temporary Array. But the issue of it being a two-step process remains. ↩
Having written the second version, you should probably implement the first one in terms of the second to avoid code duplication. ↩

Changes in the Swift Standard Library in Beta 6

Posted on August 18, 2014August 19, 2014 by airspeedvelocity

I managed to catch a copy of beta 6 before it was pulled. Though not a copy of the release notes, so apologies if I duplicate some items (and hopefully don’t misspeak about stuff better explained in them!). On the assumption the binaries will be the same except re-signed, here’s a rundown of the changes to the standard library. (edit: they were)

Feels like Swift might be approaching the 1.0 home-stretch, with the focus moving to stability and Objective-C API interfacing. Nevertheless, plenty of changes to the Swift standard library in beta 6.

By far the largest swathe of changes are additional comments on existing types and functions, which are definitely worth a read and clarify several things. For example, a comment above Comparable makes it clear you only need to define < to be comparable, despite Comparable defining the comparators that aren't <.

Some small bits and pieces:

The ?? operator has been updated to include a version to explicitly handle both the LHS and RHS being of the same optional type. I've updated my post with a comment, but it's still worth reading as a case study if you're writing a similar function.
Array now has an init that takes a _CocoaArrayType, as well as a noCopy flag, only to be set if the source array cannot be further mutated.
AutoreleasingUnsafeMutablePointer is no longer a BooleanType, so no longer has a boolValue property
Bit.Zero and Bit.One are now capitalized (don't say we don't pay attention to detail here!)
The Bool constructor, which previously took a parameter of BooleanType (which worked because BooleanType has no associated type requirements unlike, say, IntegerType), now takes a generic parameter T that must be of BooleanType. Interesting question to ponder is how this changes the function.
COpaquePointer has new constructors from raw memory addresses (these are described as “fundamentally unsafe”, you have been warned)
Character is now Comparable
The value properties of Float, Float80 and Double (which were of Builtin.FPIEEExx) are gone.
The FloatingPointType protocol now includes constructors from all the built-in integer types.
ImplicitlyUnwrappedOptional no longer conforms to BooleanType, though it still has its boolValue property (I should avoid using that if I were you).
Optional no longer has a hasValue property. You should just use != nil
RawOptionSetType no longer implements BooleanType and Equatable but instead implements BitwiseOperationsType
The FIXMEs about how StrideThrough and StrideTo should be collections not sequences are gone. They're still sequences.
UnicodeScalarView is now reflectable.
String has a new extend method that takes another string. This is in addition to the existing extend that takes a sequence of characters.
It also has an append function that takes a UnicodeScalar.
String is also now Comparable.
Strings unicodeScalars property is now writeable.
UnicodeScalar now has an init for UInt16 and UInt8 in addition to UInt32
UnsafeMutableBufferPointer has several changes. First, it is heavily commented. It's now a RandomAccessIndexType (so you can calculate distances between them, and advance them). And its constructors have been changed to be consistent with COpaquePointer.
_ExtensibleCollectionType has added an append function that appends a single element (all the implementors already support this)
_RawOptionSetType (and thus RawOptionSetType) is now Equatable.
contains now has a version that takes an equatable element rather than a predicate.
sorted now takes any sequence, which is way less restrictive as before it required a mutable random-access collection.
startsWith now has a version that takes a comparison predicate (but like equal requires both sequences to contain the same type, even if the predicate could handle two different types).

transcode, which I think officially has the longest function signature in the whole Swift library, got a teeny bit shorter as it now returns a (still a bit odd-looking) 1-tuple containing a Bool, rather than a 1-tuple containing a Bool labelled hadError. Oh, by the way, you're not allowed to return 1-tuples with labels as of beta 6.

There is a new AssertString type, which has a lower precedence for overloading purposes than StaticString. How this is achieved is interesting, and a demonstration of how there are still inheritance hierarchies with structs: AssertString implements a new AssertStringType protocol. StaticString also now implements a new protocol, StaticStringType. StaticStringType inherits from AssertStringType. This means StaticString is more specific than AssertString and will therefore “win” in choices for which overload to pick (in the same way a function taking a CollectionType wins over SequenceType if an object supports it, or RandomAccessIndexType wins over ForwardIndexType). The protocols StaticString previously implemented have moved to AssertStringType

There is a family of a new kind of assertion function, precondition. The signatures are very similar to that of assert. The comments suggest precondition is a little stronger – they will still stop program execution even if assertions are turned off. Only with -Ounchecked will they not check the condition. There's also a @noreturn preconditionFailure function that doesn't check anything, just stops execution immediately.

Various new comments in the Swift library suggest use of these new preconditions. For example, a comment above GeneratorType.next suggests calling preconditionFailure if called a second time after nil has already been returned.

There's a new protocol, RangeReplaceableCollectionType, that defines several new operations on collections such as removing or replacing ranges, as well as insert (insert an element into the middle of a collection), and splice (insert a collection into the middle of the collection).

String implements this new protocol. Interestingly, Array (alongside ContiguousArray and Slice) does not appear to, though it does support all the methods (including new splice and removeRange functions) and if you write a generic function that takes a RangeReplaceableCollectionType, you can pass an Array into it. Declared somewhere more private I guess? Unless I'm missing a bit of indirection somewhere. If you spot it, let me know on twitter.

Following a familiar pattern, several of these new functions, such as slice and the remove operations, are also availabe as non-member functions as well.

Finally, _BridgedToObjectiveCType and _ConditionallyBridgedToObjectiveCType appear to have coalesced into _ObjectiveCBridgeable, but as ever I'll steer clear of discussing bridging topics.

The case against making Array subscript return an optional

Posted on August 8, 2014August 8, 2014 by airspeedvelocity

I got a lot of interesting feedback from my previous article, regarding the proposal to change array subscripts to return optionals. Some pro, some anti.

The pro came mostly from people who’d been burned by out of bounds exceptions often and wanted Swift to change in a way that would help them not crash.

The anti camp was obviously not pro crashing, but felt that using optionals was too strict – that it would end up being counterproductive because in fighting with the optionals, developers would likely introduce as many bugs as they eliminated. Their arguments are pretty convincing.

A question tied up in this is, what are optionals for? Should nil be rare like an exception, or commonplace? Should you avoid using optionals unless you absolutely have to, or use them often, to represent things like unknown information? For a good discussion on how they should be rare, see this article by @nomothetis.

There’s definitely a lot of optional mis-use out there, in sample code on the internet – optionals for delayed initialization of members (where lazy stored properties would be better), or for representing collections that might not have contents (rather than just returning an empty collection).

For guidance, we could look at the Swift standard library and how it uses optionals:

In sequences, optionals are routine and informational – nil means you’ve reached the end of the sequence.
In the case of String.toInt(), its more to return an error – you passed in a bad string. Using toInt() to detect whether a string is an integer feels iffy.
Array.first seems in-between – is getting back nil an error or information? If you were expecting your array to always have values, the former. If you’re writing something generic that needs to handle empty arrays, it’s the latter, a convenience for combining the check for empty and getting the value.
In find, nil means not found, because maybe the collection doesn’t contain that value. This isn’t an error. But there was an alternative – it could have returned Collection.endIndex, which is what the C++ find does. The optional forces you to check, whereas with endIndex you could forget/not bother.

Array subscripts are closest to the last example. You’ve got an index, but maybe it’s out of bounds, so you should be encouraged to check before you use it. So it seems like a good fit.

The problem is when it becomes annoying without benefit. Take this code:

for idx in indices(a) {
    // this is dumb, of course it has a value!
    if let val = a[idx] {
        // do something
    }
}

Faced with the above case too often, developers would probably start to get unwrap fatigue. They inspect the code, see that there’s no way the index could not be valid (there’s no index arithmetic going on there, just use of an index that is guaranteed to be within bounds), and just force unwrap instead.

This is a slippery slope. Once you start doing that, you do it all the time. Not just with guaranteed-safe cases but others that aren’t, and one time you use a closed rather than half-open range and bang, run-time error. Only it’s not a array bounds error, it’s a force-unwrap nil error. So we’re back to where we started, only with less helpful debug info and a bunch of exclamation marks all over our code.

So a better solution is needed – one that stays out of the way when doing things that can’t go wrong, but helps with handling the cases that can, like at the edges of ranges or when performing arithmetic on indices.

At this point, I’m pinning my hopes on the second option in my original article – a kind of index type that could never not point to a value in the collection. But implementing this is tricky – especially if it might involve changing the index type protocols, which would break everything that builds on top of them.

In the mean-time, if you feel your code would be better off with an optional array index function, you can of course extend array to provide it. The useful swiftz library has one, safeIndex, that you could re-use.

Null Pointer Exceptions fixed, next up…

Posted on August 6, 2014December 22, 2014 by airspeedvelocity

Edit: there is a follow-up to this post, giving the case against this idea, which you should read after this one.

My not-statistically-proven assertion (after working in the LOB development mines for years) is that Null Pointer Exception is the #1 cause of crashes for apps written in memory-managed languages.¹

The hope is, by introducing optionals, that this kind of error get pushed way down the leaderboard in Swift. Sure you can still force-unwrap a nil value. But you have to load the gun, cock it, and then point it at your foot. I love that the force-unwrap operator is an exclamation mark.

Who is the aspiring hopeful cause of crashes, eyeing that vacated top spot? I’m guessing the Array Out of Bounds exception.

This one is still very much at large in Swift. Int is the index for arrays, and there’s nothing stopping you blatting straight past the end, getting a runtime assertion or possibly even scribbling into memory depending on what you’re doing and how you’re compiling.

This is really just a step up from pointers.² There’s just too much leeway for unreformed C programmers to write the same crappy old code:

// But it worked when I tested it!  
// (with an odd-sized array)
func reverseInPlace<T>(inout a: Array<T>) {
    var start = 0
    var end = a.count - 1
    while start != end {
        swap(&a[start], &a[end])
        start++; end--
    }
}

It doesn't have to be this way. It's totally fixable, just like null pointers were. There are two good options, an easy one and a (slightly) harder one.

The easy one

Make Array.subscript return an optional. Only if your index is within bounds will it return a value, otherwise it'll return nil. This is like the approach Dictionary takes with lookups. If your dictionary doesn't contain a particular key, you get a nil.

“But that's because dictionaries are different” you say. No they aren't. Dictionary has a method to check if a key is present. You could call that first and then, if it is, get the value. But people don't want to do that, so they skip straight to the getting out the value part, and because that's a bit risky, it returns nil if the key isn't there. Likewise, Array has methods for checking if the array is empty or if an index is beyond the end. You don't have to bother checking for that, but if you mess up, boom!

The most common case is probably getting the first element with a[0]. So common is this that beta 5 introduced Array.first. Which returns… an optional. If that doesn't convince you, I don't know what will.

I expect this change would elicit some moaning. Similar to the complaints about how you can't index randomly into the middle of a Swift string. But as with the string case, the question really is, how often do you need to do this anyway? Use for...in, use find, use first and last. Don't hack at your array like a weed.

Developers who hate it and want the old behaviour could just stick a ! after their subscript access and it'd be back to the way it was. Presumably these people are compiling -Ounchecked. Let's see whether their users find the snappy performance makes up for the random crashing.

The harder one

Stop using Int for indexing into random-access collections. Use a real index object.

Again, Dictionary already does this. To index over/into a dictionary you have to get a DictionaryIndex object, which has no public initializers so can only be fetched via methods on Dictionary.

If an index type is random-access (which Dictionary’s isn't), then they can be compared, added together, advanced by numeric amounts in O(n), just like an integer can.

To allow arrays to return a non-optional value when subscripted with this index, you'd need to tweak the index object to have successor() etc return an optional, with nil for if it's gone beyond the collection's bounds. This way, the index can be guaranteed to point to a valid entry.³

Again, this would make them more of a pain to use, in exchange for safety. But the other downside to this approach is to implement this, indices would need to know about the specific collection they index. Which means they'd need a reference to it, which introduces other complications like reference cycles and an increased size. Not a deal-breaker though.

Indices knowing about their container would have the side benefit of allowing them to become dereferenceable (see a previous post for more on this).

It would also allow them to guard against the following, which compiles and runs but is presumably a bad idea:⁴

let d1 = [1:1, 2:2, 3:3]
let d2 = [7:7, 2:2, 1:1, 4:4]

// get an index from d1
let idx = d1.startIndex.successor()
// and use it on d2...
d2[idx]  // returns (2,2)

No need to pick one

These two approaches are not mutually exclusive. Array could provide both an Int subscript and an index one – the former giving back an optional, the latter not.

I like that idea. The index option has downsides though. There's also a compromise, where the Int version is checked, but the index object version is unchecked, on the basis that if you're using indices you're thinking a bit more carefully about what you're doing.

But here's hoping at least the first option makes it into a subsequent beta before the current implementation's set in stone.

If there's a reason I'm missing that means these schemes are hopelessly naive, let me know at @airspeedswift on twitter.

In non-managed code, the number one cause of crashes is crap, I still haven’t found the problem, where did my whole day go? ↩
Obviously that’s a long step. You can’t do in Swift my favourite silly thing to do in C: char s[] = "hello"; int i = 3; cout<<i[s]<<endl; ↩
All this is assuming the indices are into collections that aren't changing underneath them. Indexing mutating collections is a whole different ball-game. ↩
Maybe it's ok? If the index is to a key common to both, it works. If it's an index to a key not in d2, you get an invalid index assertion. Still probably a bad idea. ↩

Changes in the Swift Standard Library in Beta 5

Posted on August 4, 2014August 4, 2014 by airspeedvelocity

Hey, happy Monday, none of your Swift code will compile!

Nothing a quick find and replace can’t cure. “Several” protocols have been renamed with the -Type suffix to avoid confusion, say the release notes. “A crapload” may have been a more accurate term but that probably wouldn’t pass Apple QA. Mostly this is a lot clearer – especially with IntegerType, as there were lots of confused beginners wondering why they couldn’t declare an Integer variable.

Unmentioned, but in a similar vein in the opposite direction, various typealiases within protocols have had their Type suffix removed. IndexType, GeneratorType, KeyType, ValueType and SliceType are now just Index, Generator, Key, Value and SubSlice.

This I guess makes up for the extra verbosity of now having to type CollectionType that bugs me a bit, though it will give me a reason to just write “collection” in future articles without always feeling guilty that I haven’t capitalized and monospace-fonted it.

A quick list of small items:

All the operators declared at the top now have visible precedence levels and associativity, which is handy if you are trying to target the precedence of your own operators.
Lots of new operators for strides and intervals.
The weird use of [T], [K:V] and T? when extending Array, Dictionary and Optional is gone, as has ArrayType.
Character is now hashable.
The trend of replacing proxies of C-style things with pointer objects, that started last beta with replacing CString, continues with UnsafeArray and UnsafeMutableArray becoming UnsafeBufferPointer and UnsafeMutableBufferPointer.
The law finally caught up with Array and made it mark its various mutable methods as mutable. Same for ContiguousArray and Slice.
Various integer types have had the getArrayBoundValue function changed to an arrayBoundValue get property.
String now has inits that take integers. You can also supply a radix, as well as whether you want digits above 9 to be in upper or lower case.
In addition to Array acquiring first and last, the lazy collections also implement first (and last if they can index bidirectionally), as well as isEmpty.
There’s also a non-member first and isEmpty that takes a collection, and a last that takes a collection if it has a bidirectional iterator.
Optional now has a hasValue property instead of a getLogicValue() function.
String.compare is gone.
UInt and family’s asSigned() has gone. Instead, use numericCast which appears to have been revamped a bit to do more at compile time.
UInt now has a bitwise initializer from its signed equivalent, which you need to explicitly call with a bitPattern: argument.
Relatedly, reinterpretCast has been renamed unsafeBitCast. You have been warned.
New prefix and suffix functions take a Sliceable and return the start or end of it as a slice.
You can no longer access the underlying sequences of a Zip2 sequence.
There’s now a version of assert that takes a BooleanType expression, not just a bool.
The comment descriptions for sort now have a helpful link to wikipedia about strict weak ordering!
The function passed to withExtendedLifetime actually takes x as an argument now.

Poor old Range has been thoroughly demoted. One minute it’s on top of the world, then last time stride takes its spot for non-contiguous increments. Now ClosedInterval and HalfOpenInterval replace it for general purpose ranges. They also follow the pattern adopted by StrideTo and StrideThrough of having completely different types for the two kinds, determined at compile time. They require their initializing values to conform to ComparableType, which means they can detect if they are inverted and can efficiently detect overlap between two intervals.

Range is now relegated entirely to managing collection indices. The odd thing about this is that Int, probably the most common thing to use to initialize a range, is also an index type, so Range still gets to come out and play all the time.

By the way, I love how the definitions of StrideTo and StrideThrough have a FIXME in their comments about how they ought to be collections.

Finally, ever wonder why when you declared your class to be a Collection, you got chewed out by the compiler for not implementing _Collection? Well out of the shadows emerges the (newly renamed) _CollectionType. Here you will find the missing bits of what made collections work – startIndex, IndexType etc. Same goes for _Sequence, _IntegerArithmeticType etc. It’s not clear why these aren’t just pulled up into their non-underscore equivalents, but at least now you can go look at them instead of having to reverse-engineer what you needed to implement.

Tuple-wrangling

Posted on August 2, 2014June 26, 2015 by airspeedvelocity

EDIT: updated for Swift 2.0

Attention conservation notice: this is a bit of a ramble with no real payoff.

@OldManKris posted some code on github earlier in the week, with a version of zip that could zip together 3 arrays instead of 2. He also implemented an unzip function, that took an array of tuples and unzipped it into a tuple of arrays, that is:

let tuples = [
    ( 1, "1", 1.0), 
    ( 2, "2", 2.0), 
    ( 3, "3", 3.0),
]

let arrays = unzip(tuples)

// arrays now contains:
// ([1,2,3],["1","2","3"],[1.0,2.0,3.0])

Given my recent posts, my mind immediately went to making unzip lazy. But before we get to that, let’s muck about with tuples some.

The Swift standard library has a class for zipping together two sequences into a sequence of tuples. It’s called Zip2, and it’s a lazy sequence that only zips on demand, returning a 2-tuple of the corresponding elements of that position in the two collections:

let r = 1...3
let a = [6, 7, 8]
let z = Zip2(a,r)
for (x, y) in z {
    // iterate over (1,6),
    // (2,7) and (3,8)
}

Why Zip2? Why not just Zip, a class that takes an arbitrary number of sequences and zips them together into an n-tuple? Its init could take a variadic argument for the sequences. Why isn’t it that easy?

Because types, that’s why. Specifically, the type of of Zip2.GeneratorType.Element. It’s a 2-tuple, containing the types of the elements of the two sequences. It’s generic, sure – the first and second element can be of any type. But it’s emphatically a 2-tuple: ZipGenerator2 defines typealias Element = (E0.Element, E1.Element). If you wanted to create another version that supported a 3-tuple, you’d have to create a whole new class Zip3 that defined Element to be of type (E0.Element, E1.Element, E2.Element). There’s no general way of representing an n-tuple.

Similarly, Kris’s code defines unzip3 for operating on 3-tuples. Is there any way of unzipping an arbitrary tuple? The name unzip3 is taken from Haskell, which follows the same convention (as well as unzip4, unzip5 etc.). That Haskell (and Scala, and F#) doesn’t just define a generalized zip/unzip suggests it’s probably hard to do.¹ ²

edit: Swift 1.2 introduced a zip function, that takes two sequences and returns a Zip2. This could be a step towards overloading to return Zip3, Zip4 etc. for more than two sequences. So you can now write:

let r = 1...3
let a = [6, 7, 8]
let z = zip(a,r)
for (x, y) in z {
    // iterate over (1,6),
    // (2,7) and (3,8)
}

Unless… we make the switch from compile-time to run-time type identification. Then we have more tools at our disposal – Any, a type which can stand in for any type, and Mirror, the class that allows you to reflect and inspect the contents of a type.

Using Any might put some people off. Possibly if you’re a C programmer, you might worry it’s like void* and you’re one mis-cast away from an explosion. But I think of Any much like optionals – safe to use with the support the language gives you (such as if let x = y as? Type, similar to if let x = y for optionals).

Just like optionals, before you use it, you should ask the question “is there any other way?”. Just as it’s a pain to have to check an optional when it didn’t need to be an optional, it’s a pain to have to cast an Any, but if there’s no neater way then it’s OK.

The Mirror protocol allows you to examine any type at runtime. You get an object’s mirror by calling reflect(obj), passing in your object. If the object implements the Reflectable protocol, then the mirror you get back will be that type’s specific implementation of Mirror. If it doesn’t, you’ll get back the default implementation. Playgrounds use mirrors to determine how to display the values of variables, and a custom mirror would enable a type to pretty-print in the playground, for example like Array does.

The fact that Mirror is used by playgrounds has led some to infer it’s not to be relied on for other purposes – that it’s just there for debug. I’m not so sure – that it’s there in the standard library, and objects can implement their own mirrors, suggests it’s more solidly part of the language proper, like IntegerLiteralConvertible or Collection. As much as anything is in this beta is, anyway.³

Mirror isn’t a collection, but it is “collection-like”, in that it has a count property and supports subscripting. This is a good point, by the way – you don’t have to be a collection to be subscriptable.

The default implementation counts and exposes an object’s member variables, with the subscript returning a pair with the name and value of each member. It also provides a disposition enum, which indicates what kind of object it is – a struct, an enum, an index container (such as an array).

Or… a tuple. That means we could use mirrors to inspect a tuple at runtime to determine how many elements it has. Here is a struct, TupleCollectionView, that uses mirrors to take a tuple and make it look like a collection:

struct TupleCollectionView<T>: CollectionType {
    private let _mirror: MirrorType
    init(_ tuple: T) {
        _mirror = reflect(tuple)
    }

    var startIndex: Int { return 0 }

    var endIndex: Int {
        guard case .Tuple = _mirror.disposition
          else { return 1 }

        return _mirror.count
    }

    subscript(idx: Int) -> Any {
        guard case .Tuple = _mirror.disposition
          else { return _mirror.value }

        return _mirror[idx].1.value
    }
}

let t = (7,8,9)
let c = TupleCollectionView(t)
c[0]  // returns 7
c[2]  // returns 9

But wait, you ask, what if I pass something other than a tuple in? Well, I answer (in a deeply philosophical tone), what really is a variable if not a tuple of one element? Every variable in Swift is a 1-tuple. In fact, every non-tuple variable has a .0 property that is the value of that variable.⁴ Open up a playground and try it. So if you pass in a non-tuple variable into TupleCollectionView, it’s legit for it to act like a collection of one. If you’re unconvinced, read that justification again in the voice of someone who sounds really confident.

By using this view, we can easily write a version of unzip that takes a collection of n-tuples and returns them unzipped, if not as an array of tuples, then at least as an array of arrays.⁵

func unzip<C: CollectionType>(tuples: C) -> [[Any]] {
    var unzipped: [[Any]] = []

    var g = tuples.generate()
    if let t = g.next() {
        // take the first element and set up
        // the return array
        let c = TupleCollectionView(t)
        for e in c {
            unzipped.append([e])
        }

        // slot the remaining tuples into it
        // (need GeneratorSequence because not all
        //  collection generators are sequences)
        for t in GeneratorSequence(g) {
            let c = TupleCollectionView(t)
            for (i,e) in c.enumerate() {
                unzipped[i].append(e)
            }
        }
    }
    return unzipped
}

let zipped = [(1,"one", 1.0),(2,"two", 2.0)]
let unzipped = unzip(zipped)
// unzipped is now [[1, 2], ["one", "two"], [1.0, 2.0]]

We return [[Any]] for two reasons. First, because it’s what the mirror returns for the tuple elements. But second, because it’s the only way to dynamically return an array of different types without using tuples.

But note we don’t have to take in an Any as an argument. The input type T can be fixed at compile time – we just don’t know what arity it is until inspecting it at run-time. But because every element in the array is of the same fixed type, we can safely construct the size of the array to return based on the first element.⁶

Is there any way to lock in the return type at compile time? Sorta. If you’re willing to be constrained only to tuples of all of the same type, you could specify the return type explicitly and then cast the Any to it:

// note, additional U placeholder
struct TupleCollectionView<T,U>: CollectionType {
// ...

    subscript(idx: Int) -> U {
        guard case .Tuple = _mirror.disposition
            else { return _mirror.value as! U }

        return _mirror[idx].1.value as! U
    }
}

let t = (1,2)
let c = TupleCollectionView<(Int,Int),Int>(t)


func unzip<C: CollectionType,U>(tuples: C) -> [[U]] {
    var unzipped: [[U]] = []

    var g = tuples.generate()
    if let t = g.next() {
        let c = TupleCollectionView<C.Generator.Element,U>(t)
        for e in c {
            unzipped.append([e])
        }

        for t in GeneratorSequence(g) {
            let c = TupleCollectionView<C.Generator.Element,U>(t)
            for (i,e) in c.enumerate() {
                unzipped[i].append(e)
            }
        }
    }
    return unzipped
}

let zipped = [(1,2), (3,4)]
let unzipped: [[Int]] = unzip(zipped)

There are a few downsides to this, aside from being constrained to homogenous tuples.

First, having to declare both the in and out types is unweildy, though a little less unweildy with unzip, which at least can infer everything but the return type.

Second, what if you do this?

// this won't compile
let ints = [(1,2), ("3","4")]

// but this will
let ints = [(1,2), (3,4)]
let strings: [[String]] = unzip(ints)

// and so will this
let mixed = [(1,"2"), (3,"4")]
let ints: [[Int]] = unzip(mixed)

In both the latter cases, you get a kaboom at runtime.

We could replace the as with as?, but that would require unzip to return an optional, as in let ints: [[Int?]] = unzip(mixed), which is as much inconvenience as returning an Any.

My hope is there’s a clever way of determining, at compile time, that every element of the tuple T is of type U. But if there is, it’s beyond me. If someone smarter at compile-time type-wrangling than me thinks of a solution, hit me up on twitter.⁷

There is actually a solution in Haskell, involving Applicative Types. But there’s no way I’m lunging head-first into a description of that based on my iffy Haskell skills. ↩
Erlang can do it! Erlang tuples can be composed at runtime. Erlang gets no love from the Swift crowd :-( ↩
Unlike, say, __conversion() which feels like an implementation detail leaked out. Those underscores are basically a sign on the door saying “Beware of the Leopard”. EDIT: __conversion was disappeared in later versions of Swift. ↩
Interestingly, if you pass a tuple into a function that takes a generic argument, .0 will refer to the entire tuple, not the first element of the tuple. Well, it’s interesting to me, anyway. ↩
Once again I find myself wishing I’d written that generic collection head and tail function. ↩
OK, not entirely safely. You can break this by passing in an [Any] that contains tuples of variable length. Using Any is not quite as crash-proof as I might have led you to believe. ↩
This would be cool because you could then probably use the same technique to confirm every element of the tuple conformed to a protocol. If every tuple element was equatable, you could then write a n-tuple implementation of ==. ↩

Randomly Lazy

Posted on July 30, 2014July 31, 2014 by airspeedvelocity

In the previous article on extending lazy, we extended the lazy views with first_n, which returned a sequence of the first n elements of a sequence.

But this isn’t great if what you passed in was a random-access indexed collection. Where possible, it’s nice to retain random access on the output of your lazy function. The lazy map does this, for example. If you pass an array into lazy then call map, the collection you get back is a LazyRandomAccessView, and you can index randomly into it just like you could into an array.

To do this for first_n, we need a struct that implements Collection, takes a Collection as an initializer plus a count n, and layers a view over the top of that collection that only exposes the first n elements.

Or, in the more general case, a view that takes a collection and a sub-range within that collection. Here it is:

struct SubrangeCollectionView<Base: Collection>: Collection {
    private let _base: Base
    private let _range: Range<Base.IndexType>
    init(_ base: Base, subrange: Range<Base.IndexType>) {
        _base = base
        _range = subrange
    }

    var startIndex: Base.IndexType {
      return _range.startIndex
    }
    var endIndex: Base.IndexType {
      return _range.endIndex
    }

    subscript(idx: Base.IndexType)
        -> Base.GeneratorType.Element {
        return _base[idx]
    }

    typealias GeneratorType
        = IndexingGenerator<SubrangeCollectionView>
    func generate() -> GeneratorType {
        return IndexingGenerator(self)
    }
}

This sub-range collection view is so useful that I was sure it was somewhere in the Swift standard library. That’s partly why I wrote this article on collection and sequence helpers – as a side-effect of going line by line looking for it. I’m still expecting someone to reply to this article saying “no you fool, you just use X”. If you are that person, call me a fool here.

The Sliceable protocol seems designed to provide this function. It requires a collection to typealias a SliceType and implement a subscript(Range)->SliceType method. But you need the collection to implement sliceable, which seems overly restrictive, whereas the view above works for any collection.

SubrangeCollectionView enables you to pass in a sub-range of any collection to any algorithm that takes a collection or sequence. For example:

let r = 1...10
let halfway = r.startIndex.advancedBy(5)
let top_half = SubrangeCollectionView(r,
                subrange: halfway..<r.endIndex)
reduce(top_half,0,+)  // returns 6+7+8+9+10

A nice feature is that if you pass a subrange collection into an algorithm that returns an index, such as find, the index returned can also be used on the base collection. So in the example above, if you ran if let idx = find(top_half,8), you could then use idx on not just top_half[idx], but also r[idx].

Operating on subranges is a capability that C++ developers, used to the STL, will be missing from Swift collection algorithms. In the STL, operations on containers are performed not by passing the container itself to the algorithm, but instead by passing two iterators on that container defining a range.

For example, here is the definition of the find algorithm in the C++ STL compared to the Swift version:

// C++ std::find
template <class InputIterator, class T>
  InputIterator find
    (InputIterator first, InputIterator last, const T& val);

// Swift.find
func find
  <C: Collection where C.GeneratorType.Element: Equatable>
    (domain: C, value: C.GeneratorType.Element) -> C.IndexType?

While Swift.find takes a collection as its first argument, std::find takes a first and last iterator.

STL container iterators are similar to Swift collection indices, in that they can be forward, bidirectional or random, and are used to move up and down the collection. Like Swift indices, the end iterator points to one past the last element – the equivalent of Swift.find returning nil for not found would be std::find returning end.

But unlike Swift indices, STL iterators model C pointer-like behaviour, so they are dereferencable. To get the value the iterator points to, you don’t have to have access to the underlying container, using something like Swift’s subscript. To get the value at first, you call *first. This is why std::find doesn’t need to take a container in it’s input. It just increments first until either it equals last, or *first equals val.

There are lots of reasons why the STL models iterators like this: because it’s like pointers into C arrays; because it dodges memory management issues (which won’t apply to Swift) etc. But also because, with this model, every algorithm will work just as well on a sub-range of a container as on the whole container. You don’t have to pass the starting or ending iterator for the container into find, you can pass in any two arbitrary points.¹

Swift indices, on the other hand, are pretty dumb beasts. They don’t have to know about what value they point to or what collection they index. Heck, the index for Array is just an Int.²

For all the advantages of Swift’s just passing in collections as arguments has (it looks a lot cleaner, doesn’t confuse beginners),³ it lacks this ability to apply algorithms to subranges. SubrangeCollectionView gives you that ability back.

Anyway, back to the original problem, which was to enhance LazyRandomAccessView with a version of first_n that returns another LazyRandomAccessView. With the help of SubrangeCollectionView, here it is:

extension LazyRandomAccessCollection {
  func first_n
    (n: LazyRandomAccessCollection.IndexType.DistanceType)
    -> LazyRandomAccessCollection<SubrangeCollectionView<LazyRandomAccessCollection>> {
      let start = self.startIndex
      let end = min(self.endIndex, start.advancedBy(n))
      let range = start..<end
      let perm = SubrangeCollectionView(self, subrange: range)
      return lazy(perm)
    }
}

let a = [1, 2, 3, 4, 5, 6, 7]
let first3 = lazy(a).first_n(3)
first3[1]  // returns 2

One final interesting question is whether to do the same for forward and bidirectional lazy views as well. The problem here is computing the endIndex for the view. Unlike with a random-access index, you can’t just advance them by n in constant time. It takes O(n) because the index has to be walked up one by one. For this reason, I’d stick with returning sequences for these.

Of course, you can also pass a later iterator to first than to last, at which point, kaboom! A risk which passing in the container itself eliminates. ↩
Of course, you could write an über-index type for your collection, that did understand about what it pointed at. Maybe you could even overload the * operator for it to dereference itself. ↩
And lets face it, you operate on the whole collection 99.9% of the time. ↩

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

Airspeed Velocity

African or European Swift?

swift standard library