Protocols and Assumptions

Posted on August 29, 2014 (updated January 10, 2015) by airspeedvelocity

edit: subsequent to this article being written, the Swift standard library has been updated, and documentation-comments above the relevant methods of RangeReplaceableCollectionType now explicitly state: “Invalidates all indices with respect to self.” Bear this in mind as you read on:

What does it mean to implement a protocol? Possibly more than supporting methods with specific names.

Obviously the methods ought to do what their names imply – isEmpty shouldn’t fire the torpedoes. But as well as basic functionality, there are also guarantees about things like the method’s complexity, and possibly wider implications for how the class itself behaves.

More importantly, protocols might not guarantee a behaviour you think they do and are relying on. Using protocols with generics can sometimes give you the illusion of more guarantees than you actually have.

Suppose you want to write a function remove that removes entries from a collection in-place – that is, similar to the sort function, it takes a collection as an inout parameter, and removes from it all the entries that match a given value. ¹

Unlike sort, which just requires MutableCollection, remove would need to use the new-in-beta6 RangeReplaceableCollectionType, which includes a removeAtIndex method. Armed with this, you might write the following: ²

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, value: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == value {
                collection.removeAtIndex(idx)
            }
            else {
                ++idx
            }
        }
}

Embedded in this code are a lot of assumptions about how collections and their indices behave when you mutate them, and some of these might not be valid, depending on the collection type. It works with String, the only explicitly range-replaceable collection in the standard library currently, as well as the secretly range-replaceable Array, ContiguousArray and Slice. But you could easily implement a new type of collection that was range-replaceable for which the above code would explode in flames.
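For readers on a current toolchain, here is a rough sketch of the same loop in present-day Swift syntax. The spelling (RangeReplaceableCollection, remove(at:), formIndex(after:)) is the modern one rather than the beta 6 one used in this post, and the name removeAllMatching is made up for illustration. It carries the same assumptions about index behaviour as the original, so it is only exercised on an Array here.

```swift
// Hypothetical modern-syntax translation of the loop above.
// It still assumes that an index stays usable after a removal.
func removeAllMatching<C: RangeReplaceableCollection>(
    _ value: C.Element, from collection: inout C
) where C.Element: Equatable {
    var idx = collection.startIndex
    while idx != collection.endIndex {
        if collection[idx] == value {
            // no increment: we assume the next element "moves down" to idx
            _ = collection.remove(at: idx)
        } else {
            collection.formIndex(after: &idx)
        }
    }
}

var numbers = [1, 2, 2, 3, 2]
removeAllMatching(2, from: &numbers)
// numbers is now [1, 3]
```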

The biggest assumption is that removing an element from a collection does not completely invalidate an index. That is, after you call collection.removeAtIndex(idx), idx remains a legitimate index into the collection rather than just becoming junk.

Next, there’s the assumption that when you remove an entry at an index, that index will now point to the next element in the collection. That’s why, after removing the entry, the code above just goes straight back around the loop without incrementing idx. You could put it another way – when you remove an element, the next element “moves down” to the position indexed by the removed element.

Finally, there’s the assumption that if the element that idx points to is the last element, then what idx will point to after you remove it will be endIndex. Or, to put it another way, as you remove the last element, endIndex “moves down” to equal the index of the last element.

By the way, this last assumption is why the code uses while idx != collection.endIndex rather than the more-elegant for idx in indices(collection). indices returns a Range object between the collection’s start and end, but it would be created before we start looping. Because endIndex is a moving target as we remove some entries from the collection, it won’t work for our purposes. A cleverer version of indices that returned something more dynamic might help, but that could have different undesirable side-effects.

Are all these assumptions legit? Well you can see they obviously are for the Array types, because these just use an integer for their index, with endIndex equal to count. When elements are removed, the rest of the array shuffles down. Even if that resulted in a full reallocation of the array’s memory, the assumptions would still hold because all the index does is represent a relative offset from the start of the array, not point to a specific location.
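A tiny sketch of those array behaviours, written in current Swift syntax (remove(at:) rather than removeAtIndex). The variable names are just for illustration.

```swift
var numbers = [10, 20, 30]
let idx = 1

// Remove the element at idx: the next element shuffles down into its place.
numbers.remove(at: idx)
let movedDown = numbers[idx]                  // 30

// Remove the (now last) element: endIndex moves down to meet idx.
numbers.remove(at: idx)
let reachedEnd = (idx == numbers.endIndex)    // true
```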

Strings are trickier, because their index type is opaque. Chances are it’s still an offset from the start of the string, but because Swift characters aren’t of uniform width, ³ that offset doesn’t necessarily increment by a constant amount each time. Still, if this is the implementation, the above assumptions would hold, and experimentation suggests they do.

What kind of collection might not adhere to these assumptions? Well, a very simple doubly-linked list implementation might not. ⁴ If the index for a linked list were a pointer to each node, then removing that node could leave the index pointing at a removed entry. You couldn’t just loop around without first pointing to the next node in the list:

func remove
    <C: RangeReplaceableCollectionType,
     E: Equatable
     where C.Generator.Element == E>
    (inout collection: C, element: E) {
        var idx = collection.startIndex
        while idx != collection.endIndex {
            if collection[idx] == element {
              // first grab the index of the next element
              let next = idx.successor()
              // then remove this one
              collection.removeAtIndex(idx)
              // and repoint
              idx = next
            }
            else {
                ++idx
            }
        }
}

But then this algorithm would no longer work correctly with arrays and strings! ⁵
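To see the failure concretely, here is the grab-the-successor-first loop applied to an Array, written in current Swift syntax as a sketch. The input is chosen so the loop terminates cleanly; with a match in the last position, idx would step past endIndex and the subscript would trap.

```swift
var numbers = [1, 2, 2, 3]
var idx = numbers.startIndex
while idx != numbers.endIndex {
    if numbers[idx] == 2 {
        // grab the successor first, as a linked list would need
        let next = numbers.index(after: idx)
        numbers.remove(at: idx)
        // on an array the next element has already moved down to idx,
        // so jumping to `next` skips over it
        idx = next
    } else {
        idx = numbers.index(after: idx)
    }
}
// numbers is [1, 2, 3]: the second 2 was skipped
```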

So what’s the solution – should RangeReplaceableCollectionType mandate the kind of index validity behaviour our remove algorithm relies on? Or are the assumptions invalid and we need a better algorithm? (of which more in a later article) The Swift standard library is still evolving rapidly with each beta so it’s possibly a little early to tell. For now, be careful about the assumptions you make – just because all the current implementations of a particular protocol work a certain way doesn’t mean other implementations will.
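One way to avoid the question altogether is to never hold an index across a mutation. As a sketch only (and not necessarily the algorithm the later article has in mind): current versions of the standard library provide removeAll(where:) on RangeReplaceableCollection, so a generic remove can delegate to it. The function name removeMatches is hypothetical.

```swift
// Sketch: no index is kept across a removal, so no assumption about
// index validity is needed. `removeMatches` is a made-up name.
func removeMatches<C: RangeReplaceableCollection>(
    of value: C.Element, from collection: inout C
) where C.Element: Equatable {
    collection.removeAll(where: { $0 == value })
}

var text = "hello"
removeMatches(of: "l", from: &text)
// text is "heo"

var numbers = [1, 2, 2, 3, 2]
removeMatches(of: 2, from: &numbers)
// numbers is [1, 3]
```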

¹ As opposed to a version that returned a copy, which would be called removed. I thought I didn’t like this convention at first, but I’m warming to it.
² This code has an efficiency deficiency, which we’ll talk about in a later article.
³ For an in-depth explanation of Swift strings, see Ole Begemann’s article.
⁴ Singly-linked lists couldn’t implement removeAtIndex easily, and would probably have some kind of removeAfterIndex operation instead.
⁵ The C++ STL resolves this by having erase return an iterator for the entry just after the erased elements – along with a fairly draconian assertion that any removal invalidates all other iterators (not just ones at and beyond the removed value). But Swift’s removeAtIndex currently returns the removed value, rather than a new index.

filter, String and ExtensibleCollectionType

Posted on August 22, 2014 (updated August 23, 2014) by airspeedvelocity

String was extended in beta 6 to implement RangeReplaceableCollectionType. This means that, via inheritance, it also implements ExtensibleCollectionType.¹

ExtensibleCollectionType is interesting, because it requires the collection to support an empty initializer. This means that, without having to resort to shenanigans, you can write a generic function that takes an ExtensibleCollectionType and returns a new one.

Since they were changed to return eagerly-evaluated results, the non-member filter and map have returned arrays, no matter what. This is a bit frustrating when working with some non-array types, such as String:²

let vowels = "eaoiu"
let isConsonant = { !contains(vowels, $0) }
let s = "hello, i must be going"
// filtered will be an array
let filtered = filter(s, isConsonant)
// and then we have to turn it back into a string
let only_consonants = String(seq: filtered)
// only_consonants is "hll,  mst b gng"

It would be nice to have a version of filter that took a String and returned a String instead of an Array. ³ Even better, it would be nice to have a single generic version that worked on both arrays and strings.

Here’s one:

func my_filter
  <C: ExtensibleCollectionType>
  (source: C, includeElement: (C.Generator.Element)->Bool)
  -> C {
    // use the `init()` from `ExtensibleCollectionType`
    var result = C()
    for element in source {
        if(includeElement(element)) {
            // append is also part of `ExtensibleCollectionType`
            result.append(element)
        }
    }
    return result
}

// my_filter returns a String when passed one:
let only_consonants = my_filter(s, isConsonant)
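For anyone following along on a current toolchain, a sketch of the same idea in present-day syntax. ExtensibleCollectionType no longer exists under that name; this version assumes RangeReplaceableCollection as the stand-in, since it also requires an empty init() and append. The name myFilter is just a camel-cased version of the one above.

```swift
// Sketch in modern syntax; RangeReplaceableCollection is assumed here
// as the present-day home of init() and append(_:).
func myFilter<C: RangeReplaceableCollection>(
    _ source: C, _ includeElement: (C.Element) -> Bool
) -> C {
    var result = C()
    for element in source where includeElement(element) {
        result.append(element)
    }
    return result
}

let vowels = "eaoiu"
let onlyConsonants = myFilter("hello, i must be going") { !vowels.contains($0) }
// onlyConsonants is a String: "hll,  mst b gng"
```

As far as I know, current standard libraries already give range-replaceable collections a filter that returns the same collection type, so this is more of an illustration than something you would need to write today.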

Since this is possible, should Swift’s filter and map be changed to be like this? Maybe, but I can think of a couple of reasons why not.

First, it’d be a bit inconsistent and possibly surprising. Not all collections are extensible collections. Dictionary isn’t. Range and StrideTo even less so – they’re like “virtual” collections that don’t really have individual elements at all. So there’d still need to be versions that took these collections and returned an array. So when calling filter, you’d need to know whether your collection was extensible to know whether you were going to get back the same collection type or an array.

There’s precedent for this kind of thing. lazy gives you back different types depending on what you pass in. But lazy is very explicit. map and filter would be a bit more subtle, and bear in mind subtle maybe-unexpected behaviour was probably the reason lazy evaluation was moved into the lazy family in the first place.

Second, maybe you do want an array back. This can be catered for – declare a second version of my_filter like so: ⁴

func my_filter
    <C1: ExtensibleCollectionType, C2: ExtensibleCollectionType
    where C1.Generator.Element == C2.Generator.Element>
    (source: C1, includeElement: (C1.Generator.Element)->Bool)
    -> C2 {
        var result = C2()
        for element in source {
            if(includeElement(element)) {
                result.append(element)
            }
        }
        return result
}

// same-type version will be used by default
let consonant_string = my_filter(s, isConsonant)
// but if you declare the result as a specific type, the
// second version will be used:
let consonant_array = my_filter(s, isConsonant) as Array
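A sketch of the choose-your-result-type version in current syntax. To keep it self-contained it is a single function with a made-up name, filterInto, so the result type has to be stated at the call site; whether a current compiler would still prefer a same-type overload by default is not something this sketch relies on.

```swift
// Hypothetical name; the source can be any Sequence, the result any
// range-replaceable collection with the same element type.
func filterInto<S: Sequence, C: RangeReplaceableCollection>(
    _ source: S, _ includeElement: (S.Element) -> Bool
) -> C where S.Element == C.Element {
    var result = C()
    for element in source where includeElement(element) {
        result.append(element)
    }
    return result
}

let vowels = "eaoiu"
let isConsonant: (Character) -> Bool = { !vowels.contains($0) }
let s = "hello, i must be going"

// the declared type picks the result collection
let consonantString: String = filterInto(s, isConsonant)
let consonantArray: [Character] = filterInto(s, isConsonant)
```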

Third, there’s the big gotcha that means this wouldn’t be a good idea, but that I haven’t thought of. If you have, leave a comment or tweet me.

¹ or rather, _ExtensibleCollectionType, which contains the goods, and that ExtensibleCollectionType just inherits without any additions. I’m not sure why it’s done this way, though I’m guessing it’s not for no reason.
² This is of course a horrible piece of code, ignoring upper-case characters, not to mention accented characters, but let’s keep the examples simple.
³ Using lazy(s).filter is probably more efficient, since it won’t require the construction of a temporary Array. But the issue of it being a two-step process remains.
⁴ Having written the second version, you should probably implement the first one in terms of the second to avoid code duplication.

Changes in the Swift Standard Library in Beta 6

Posted on August 18, 2014 (updated August 19, 2014) by airspeedvelocity

I managed to catch a copy of beta 6 before it was pulled. Though not a copy of the release notes, so apologies if I duplicate some items (and hopefully don’t misspeak about stuff better explained in them!). On the assumption the binaries will be the same except re-signed, here’s a rundown of the changes to the standard library. (edit: they were)

Feels like Swift might be approaching the 1.0 home-stretch, with the focus moving to stability and Objective-C API interfacing. Nevertheless, plenty of changes to the Swift standard library in beta 6.

By far the largest swathe of changes are additional comments on existing types and functions, which are definitely worth a read and clarify several things. For example, a comment above Comparable makes it clear you only need to define < to be comparable, despite Comparable defining the comparators that aren't <.
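As a small illustration of the Comparable point, here is a sketch in current Swift syntax. Version is a made-up type. Note one assumption: in current Swift the == that Comparable also needs (via Equatable) is synthesized for a struct like this, which was not the case in beta 6.

```swift
// Hypothetical type: only < is written by hand.
struct Version: Comparable {
    let major: Int
    let minor: Int

    static func < (lhs: Version, rhs: Version) -> Bool {
        return (lhs.major, lhs.minor) < (rhs.major, rhs.minor)
    }
}

let older = Version(major: 1, minor: 2)
let newer = Version(major: 1, minor: 10)

// <=, > and >= all come for free from the definition of <
let lessOrEqual = older <= newer
let greater = newer > older
let greaterOrEqual = newer >= older
```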

Some small bits and pieces:

The ?? operator has been updated to include a version to explicitly handle both the LHS and RHS being of the same optional type. I've updated my post with a comment, but it's still worth reading as a case study if you're writing a similar function.
Array now has an init that takes a _CocoaArrayType, as well as a noCopy flag, only to be set if the source array cannot be further mutated.
AutoreleasingUnsafeMutablePointer is no longer a BooleanType, so no longer has a boolValue property
Bit.Zero and Bit.One are now capitalized (don't say we don't pay attention to detail here!)
The Bool constructor, which previously took a parameter of BooleanType (which worked because BooleanType has no associated type requirements unlike, say, IntegerType), now takes a generic parameter T that must be of BooleanType. Interesting question to ponder is how this changes the function.
COpaquePointer has new constructors from raw memory addresses (these are described as “fundamentally unsafe”, you have been warned)
Character is now Comparable
The value properties of Float, Float80 and Double (which were of Builtin.FPIEEExx) are gone.
The FloatingPointType protocol now includes constructors from all the built-in integer types.
ImplicitlyUnwrappedOptional no longer conforms to BooleanType, though it still has its boolValue property (I should avoid using that if I were you).
Optional no longer has a hasValue property. You should just use != nil
RawOptionSetType no longer implements BooleanType and Equatable but instead implements BitwiseOperationsType
The FIXMEs about how StrideThrough and StrideTo should be collections not sequences are gone. They're still sequences.
UnicodeScalarView is now reflectable.
String has a new extend method that takes another string. This is in addition to the existing extend that takes a sequence of characters.
It also has an append function that takes a UnicodeScalar.
String is also now Comparable.
String's unicodeScalars property is now writeable.
UnicodeScalar now has an init for UInt16 and UInt8 in addition to UInt32
UnsafeMutableBufferPointer has several changes. First, it is heavily commented. It's now a RandomAccessIndexType (so you can calculate distances between them, and advance them). And its constructors have been changed to be consistent with COpaquePointer.
_ExtensibleCollectionType has added an append function that appends a single element (all the implementors already support this)
_RawOptionSetType (and thus RawOptionSetType) is now Equatable.
contains now has a version that takes an equatable element rather than a predicate.
sorted now takes any sequence, which is way less restrictive: before, it required a mutable random-access collection.
startsWith now has a version that takes a comparison predicate (but like equal requires both sequences to contain the same type, even if the predicate could handle two different types).

transcode, which I think officially has the longest function signature in the whole Swift library, got a teeny bit shorter as it now returns a (still a bit odd-looking) 1-tuple containing a Bool, rather than a 1-tuple containing a Bool labelled hadError. Oh, by the way, you're not allowed to return 1-tuples with labels as of beta 6.

There is a new AssertString type, which has a lower precedence for overloading purposes than StaticString. How this is achieved is interesting, and a demonstration of how there are still inheritance hierarchies with structs: AssertString implements a new AssertStringType protocol. StaticString also now implements a new protocol, StaticStringType. StaticStringType inherits from AssertStringType. This means StaticString is more specific than AssertString and will therefore “win” in choices for which overload to pick (in the same way a function taking a CollectionType wins over SequenceType if an object supports it, or RandomAccessIndexType wins over ForwardIndexType). The protocols StaticString previously implemented have moved to AssertStringType

There is a family of a new kind of assertion function, precondition. The signatures are very similar to that of assert. The comments suggest precondition is a little stronger – they will still stop program execution even if assertions are turned off. Only with -Ounchecked will they not check the condition. There's also a @noreturn preconditionFailure function that doesn't check anything, just stops execution immediately.
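A minimal sketch of how the three might be used side by side. The functions average and weekdayName are invented for illustration, and the syntax is current Swift rather than beta 6.

```swift
// Hypothetical helper showing assert vs precondition.
func average(of values: [Double]) -> Double {
    // assert: a debugging aid, not evaluated when assertions are off
    assert(values.allSatisfy({ $0.isFinite }), "non-finite input")
    // precondition: still checked with assertions off (only -Ounchecked skips it)
    precondition(!values.isEmpty, "cannot average an empty array")
    return values.reduce(0, +) / Double(values.count)
}

// Hypothetical helper showing preconditionFailure.
func weekdayName(_ number: Int) -> String {
    switch number {
    case 1: return "Monday"
    case 2: return "Tuesday"
    default:
        // checks nothing, just stops execution
        preconditionFailure("unsupported weekday number \(number)")
    }
}

let mean = average(of: [1, 2, 3])
let name = weekdayName(2)
```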

Various new comments in the Swift library suggest use of these new preconditions. For example, a comment above GeneratorType.next suggests calling preconditionFailure if called a second time after nil has already been returned.

There's a new protocol, RangeReplaceableCollectionType, that defines several new operations on collections such as removing or replacing ranges, as well as insert (insert an element into the middle of a collection), and splice (insert a collection into the middle of the collection).
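For a feel of these operations, here is a sketch using the names they go by in current Swift, which differ from the beta 6 ones: insert(_:at:), insert(contentsOf:at:) in place of splice, replaceSubrange and removeSubrange. The values are arbitrary.

```swift
var letters = ["a", "b", "e"]

// insert a single element into the middle
letters.insert("c", at: 2)                      // a b c e
// insert a whole collection into the middle (beta 6's splice)
letters.insert(contentsOf: ["d"], at: 3)        // a b c d e
// replace a range with another collection
letters.replaceSubrange(0..<2, with: ["x"])     // x c d e
// remove a range
letters.removeSubrange(1..<3)                   // x e

// String supports the same operations, using its own index type
var greeting = "hello world"
let wordEnd = greeting.index(greeting.startIndex, offsetBy: 5)
greeting.replaceSubrange(greeting.startIndex..<wordEnd, with: "howdy")
// greeting is "howdy world"
```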

String implements this new protocol. Interestingly, Array (alongside ContiguousArray and Slice) does not appear to, though it does support all the methods (including new splice and removeRange functions) and if you write a generic function that takes a RangeReplaceableCollectionType, you can pass an Array into it. Declared somewhere more private I guess? Unless I'm missing a bit of indirection somewhere. If you spot it, let me know on twitter.

Following a familiar pattern, several of these new functions, such as splice and the remove operations, are also available as non-member functions.

Finally, _BridgedToObjectiveCType and _ConditionallyBridgedToObjectiveCType appear to have coalesced into _ObjectiveCBridgeable, but as ever I'll steer clear of discussing bridging topics.

Implicitly converting functions to return optionals

Posted on August 17, 2014 (updated August 18, 2014) by airspeedvelocity

One final post on upconversion of non-optional values to optional. As we saw previously, if you pass a non-optional value to a function argument that expects an optional, it will get automatically converted into an optional by the compiler.

Ken Ferry points out something that goes one step further. If a function takes an argument of a function that returns an optional, and you pass into it a function that does not return an optional, the function will automatically be converted to return an optional.

That is:

func foo()->Int {
    return 1
}

func bar(fun: ()->Int?) {
    println(fun())
}

// Even though foo returns an Int, 
// you can pass it to bar.
// Inside bar it will return an Int?,
// so this prints "Optional(1)"
bar(foo)

What’s more, you can even store or return the converted function:

// alter bar to return the passed-in function
func bar(fun: ()->Int?)->()->Int? {
    return fun
}

var i = 0
let f = { ++i }

f() // returns 1
f() // returns 2

let g = bar(f)

g() // returns {Some 3}
g() // returns {Some 4}
f() // returns 5

This feature makes sense when you think about it in the context of other implicit conversions – if you can pass non-optional values in to functions and have them implicitly converted to optionals, the logical next step is to be able to do the same for the return value of functions you pass in as parameters.

This only works on the return value of the function, though. The following won’t compile:

func foo(i: Int)->Int {
    return i
}

func bar(fun: (Int?)->Int?) {
    println(fun(1))
}

// won't compile - no implicit conversion
// of foo's argument to an Int?
bar(foo)

This is because foo is expecting a non-nil argument – it wouldn’t know how to handle nil if it received one. However, what would compile is if foo could handle an optional, but bar was expecting a function with a non-optional argument:

func foo(i: Int?)->Int {
    return i ?? 0
}

func bar(fun: (Int)->Int?) {
    println(fun(1))
}

// will compile - implicit conversion
// of foo's argument to an Int
bar(foo)

This works because any call by bar of foo could convert the non-optional parameter it passes in to an optional implicitly. (thanks to @westacular and @jvasileff for pointing this out)
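Both directions can be seen together in a short sketch, written in current Swift syntax on the assumption that these function conversions still behave as described here. All four function names are invented, and the functions return their results rather than printing them so the outcome can be checked.

```swift
// Return-value direction: () -> Int passed where () -> Int? is expected.
func one() -> Int { return 1 }
func callIt(_ fun: () -> Int?) -> Int? { return fun() }
let wrapped = callIt(one)          // an Int? containing 1

// Argument direction: (Int?) -> Int passed where (Int) -> Int? is expected.
func orZero(_ i: Int?) -> Int { return i ?? 0 }
func applyToOne(_ fun: (Int) -> Int?) -> Int? { return fun(1) }
let applied = applyToOne(orZero)   // an Int? containing 1
```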

Just like you can think of passing non-optionals into optional arguments as the compiler automatically wrapping your values in an optional, you can think of the compiler silently wrapping your non-optional-returning function in a closure that calls your function and returns its value wrapped in an optional:

func foo()->Int {
    return 1
}

func bar(fun: ()->Int?) {
    println(fun())
}

// to do an explicit equivalent 
// of the compiler's conversion:
bar( { Optional(foo()) } )

In practice, the compiler probably does something a bit more low-level. If you wrote the above code, and then stepped through it with the debugger, and you put a breakpoint inside foo, you’d see an extra entry in the stack trace between foo and bar showing the in-between closure. But if you leave the compiler to do it implicitly, you’ll see no such entry.

If you’re interested in digging a bit further into exactly where the compiler is doing the conversion, you can use a feature of the Swift compiler to dump the syntax tree. Here is a dump from the first piece of code in this article:¹

% xcrun swiftc -dump-ast main.swift
(source_file
  (func_decl "foo()" type='() -> Int' access=internal
    (body_params
      (pattern_tuple type='()'))
    (result
      (type_ident
        (component id='Int' bind=type)))
    (brace_stmt
      (return_stmt
        (call_expr implicit type='Int' location=main.swift:2:12 range=[main.swift:2:12 - line:2:12]
          (dot_syntax_call_expr type='(Int2048) -> Int' location=main.swift:2:12 range=[main.swift:2:12 - line:2:12]
            (declref_expr implicit type='Int.Type -> (Int2048) -> Int' location=main.swift:2:12 range=[main.swift:2:12 - line:2:12] decl=Swift.(file).Int._convertFromBuiltinIntegerLiteral specialized=no)
            (type_expr implicit type='Int.Type' location=main.swift:2:12 range=[main.swift:2:12 - line:2:12] typerepr='<<IMPLICIT>>'))
          (integer_literal_expr type='Int2048' location=main.swift:2:12 range=[main.swift:2:12 - line:2:12] value=1)))))
  (func_decl "bar(_:)" type='(() -> Int?) -> ()' access=internal
    (body_params
      (pattern_tuple type='(fun: () -> Int?)'
        (pattern_typed type='() -> Int?'
          (pattern_named type='() -> Int?' 'fun')
          (type_function
            (type_tuple)
))))
    (brace_stmt
      (call_expr type='()' location=main.swift:6:5 range=[main.swift:6:5 - line:6:18]
        (declref_expr type='(Int?) -> ()' location=main.swift:6:5 range=[main.swift:6:5 - line:6:5] decl=Swift.(file).println [with T=Int?] specialized=no)
        (paren_expr type='(Int?)' location=main.swift:6:13 range=[main.swift:6:12 - line:6:18]
          (call_expr type='Int?' location=main.swift:6:13 range=[main.swift:6:13 - line:6:17]
            (declref_expr type='() -> Int?' location=main.swift:6:13 range=[main.swift:6:13 - line:6:13] decl=main.(file).func decl.fun@main.swift:5:10 specialized=no)
            (tuple_expr type='()' location=main.swift:6:16 range=[main.swift:6:16 - line:6:17]))))))
  (top_level_code_decl
    (brace_stmt
      (call_expr type='()' location=main.swift:9:1 range=[main.swift:9:1 - line:9:8]
        (declref_expr type='(() -> Int?) -> ()' location=main.swift:9:1 range=[main.swift:9:1 - line:9:1] decl=main.(file).bar@main.swift:5:6 specialized=no)
        (paren_expr type='(() -> Int?)' location=main.swift:9:5 range=[main.swift:9:4 - line:9:8]
          (function_conversion_expr implicit type='() -> Int?' location=main.swift:9:5 range=[main.swift:9:5 - line:9:5]
            (declref_expr type='() -> Int' location=main.swift:9:5 range=[main.swift:9:5 - line:9:5] decl=main.(file).foo@main.swift:1:6 specialized=no))))))

This is fairly simple to tie back to the original code. We have three sections under source_file: two func_decls, one for foo and one for bar, followed by a top_level_code_decl showing the call to bar passing in foo. And we see that the compiler inserts a function_conversion_expr implicit type='() -> Int?' (line 36).

Incidentally, if you run -dump-ast on a call that requires a conversion from regular type like Int to an optional, you’d see inject_into_optional implicit type='Int?'. And for an implicit conversion to an Any type, you’d see erasure_expr implicit type='Any'.

¹ swiftc dumps to stderr. If you want to pipe longer examples through less, and like me you need to trial-and-error where the ampersand goes every damn time despite how many thousand times you might have typed it before, it’s 2>&1

Bachman Ternary Overdrive

Posted on August 13, 2014 by airspeedvelocity

(ok I think I’ve pushed the puns too far with that headline)

David Owens rightly called me out for pulling a bit of a fast one at the start of my last article on how ?: and ?? differ in behaviour. Not that they don’t, but I was acting as if what the ternary operator version was doing was something perfectly sensible, and that ?? was the funny one. But I glossed over what the ternary operator itself was doing.

Look at this code again:

let i: Int? = nil
let j: Int? = 5
let k: Int? = 6

// this returns {Some 5}
let x = i != nil ? i! : j
// this returns {Some 6}
let y = k != nil ? k! : j

What are the types of x and y? Int?, right? And how is that decided? From the type of the ternary expression. But which part? If you look at the second and third part of the expression, they’re different types. i! and k! are of type Int, whereas j is of type Int?.

That’s no good! Both possible values from the ternary expression have to be able to become the same type, because the type of x is determined at compile time, before you know what i or k contains.

For example, this won’t compile:

let s = "hello"
let i = 123
// x can't be either a String or an Int
let x = i > 0 ? s : i

But note I said become the same type, not be the same type. As we saw, the compiler is willing to put your values into optionals to make them fit. It can convert i!, an Int, to be of type Int?, and that would match what j is. Since that’s the only possible way this would work, it does it. I’m guessing it rewrites the expression as:

let x = i != nil ? Optional(i!) : j

This looks similar to what was happening with ?? – the left-hand side is getting upconverted from an Int to an Int?. But in this case, it’s happening after the comparison to nil, whereas in the ?? function, it was upconverted beforehand. Which explains why, with the ternary statement version, j is picked over i.

Of course, none of this is likely the behaviour you wanted. You probably just fat-fingered an optional onto the right-hand side, meaning for it to be a regular Int. If so, you might find being explicit about the type would help avoid this confusion:

// if j is an Int?, this will not compile:
let x: Int = i != nil ? i! : j

// and neither will this:
let y: Int = i ?? j

In the case of the ?? version, the compiler’s error message is pretty helpful – it asks you if you meant to have type Int? for j, and offers to stick a ! after it to unwrap it for you. This makes it clear that, in both cases, passing an optional on the right-hand side is user error (which is why I don’t think the way ?? behaves is necessarily a bug).

Finally, here’s a silly thing to try. Up until now, we’ve been fixing the type of the result by converting one of the two types to another. What if you had two types, both of which could be implicitly converted to a third type? Would that work too?

Yes it would. We can create a struct that can be constructed from either a string literal or an integer literal, and then use a ternary expression that returns one or the other:

struct Both: IntegerLiteralConvertible, StringLiteralConvertible {
    typealias ExtendedGraphemeClusterLiteralType = String
    let str: String

    static func convertFromIntegerLiteral(value: IntegerLiteralType) -> Both {
        return Both(str: String(value))
    }
    static  func convertFromStringLiteral(value: StringLiteralType) -> Both {
        return Both(str: value)
    }
    static func convertFromExtendedGraphemeClusterLiteral(value: ExtendedGraphemeClusterLiteralType) -> Both {
        return Both(str: value)
    }
}

let b = true
// x will be of type String
let x = b ? "one" : "two"
// y will be of type Int
let y = b ? 1 : 2
// z1 and z2 will be of type Both
let z1 =  b ? "one" : 2  // {str "one"}
let z2 = !b ? "one" : 2  // {str "2"}

Well, it amused me anyway.
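If you want to try this on a current toolchain, here is a sketch using the present-day protocol names (ExpressibleByIntegerLiteral and ExpressibleByStringLiteral, with init requirements instead of the static convertFrom functions). One difference is an assumption on my part: the result type is annotated explicitly, since I am not certain a current compiler would settle on Both unaided.

```swift
struct Both: ExpressibleByIntegerLiteral, ExpressibleByStringLiteral {
    let str: String

    init(integerLiteral value: Int) {
        str = String(value)
    }
    init(stringLiteral value: String) {
        str = value
    }
}

let b = true
// each branch is a literal that can become a Both
let z1: Both =  b ? "one" : 2
let z2: Both = !b ? "one" : 2
```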

Yo, dawg

Posted on August 12, 2014 (updated August 18, 2014) by airspeedvelocity

EDIT: the behaviour of ?? has been altered as of Swift beta 6. There is now a special case for both the LHS and RHS being of T?, that matches the ternary version. However, the below is still of interest for details of implementing generic functions that take optionals.

The nil coalescing operator (a ?? b) is described in the Swift docs as shorthand for the following code:

a != nil ? a! : b

That is, if a is nil you get b, otherwise you get the unwrapped a.

Ok great, that’s pretty intuitive and very useful. You might be surprised then if you try the following:

let i: Int? = nil
let j: Int? = 5

// i is nil, so this evaluates
// to j i.e. {Some 5}
i != nil ? i! : j

// but this returns nil
i ?? j

Why is it returning nil? Is the analogy with the ?: operator a simplification of what’s actually happening? Let’s see, by implementing our own version of the ?? operator, using the exact same ternary logic. Here it is: ¹

infix operator ~~ {
    associativity right
    precedence 110
}

func ~~<T>(a: T?, b: @autoclosure () -> T) -> T {
    return a != nil ? a! : b()
}

// nope, still nil
i ~~ j

Hum. What’s going on?

Well, to be fair, this isn’t playing by the rules for ??. The docs also state “The expression b must match the type that is stored inside a.” Instead, I’ve passed the same type for both a and b, so the results are undefined.

What is actually happening is that the compiler is fixing the type of T to be the type of the argument on the right-hand side. So T is an Int?. That means the left-hand side T? is an Int??, or to write it longhand, an Optional of an Optional Int. ²

Then, in the call, i is being “upgraded” from an optional to an optional optional. This upconversion is a feature that allows you to pass in plain values to functions that take optional arguments, and have them be automatically wrapped in an optional, rather than having to manually construct an optional to pass in yourself.

This is useful when defining a function with default arguments (arguments which are, ahem, optional). For example the dump command takes a second parameter name: that you can choose not to pass in, that will prefix your dumped object data if you do. An implementation could look like this:

func mydump<T>(x: T, name: String? = nil) {
    if let name = name {
        // if the caller supplied a name, use it
        print(name); print(": ")
    }
    println(x)
}

// this won't prefix the dump with a name:
mydump(a)

// you could call mydump like this:
mydump(a, name: Optional("My object"))

// but there's no need, you can just do this:
mydump(a, name: "My object")
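The same upconversion can be checked in a self-contained sketch in current syntax. The function describe is a made-up variant of mydump that returns the text instead of printing it, so the three call styles can be compared directly.

```swift
// Hypothetical variant of mydump that returns a String.
func describe<T>(_ x: T, name: String? = nil) -> String {
    if let name = name {
        // if the caller supplied a name, use it as a prefix
        return "\(name): \(x)"
    }
    return "\(x)"
}

let plain = describe(42)
// wrapping the argument by hand...
let explicit = describe(42, name: Optional("My object"))
// ...gives the same result as letting the compiler wrap it
let implicit = describe(42, name: "My object")
```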

So back to our operator example. With the left-hand side being upconverted, by the time we’re inside our ~~ function, i has become Optional(i). Even if i == nil, Optional(i) != nil. It contains a value (of an optional that doesn’t contain a value).

To simulate what ?? is doing outside of a function, we can do this:

let ii = Optional(i)

// this actually evaluates to nil
ii != nil ? ii! : j
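Spelled out with explicit types in current syntax, as a sketch, the two layers look like this. The variable names are only for illustration.

```swift
let i: Int? = nil
// wrap the nil optional in another optional
let ii: Int?? = .some(i)

// the outer optional contains a value, so it is not nil...
let outerIsNil = (ii == nil)       // false

// ...but the value it contains is itself a nil optional
var innerIsNil = false
if let inner = ii {
    innerIsNil = (inner == nil)    // true
}
```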

That an optional that contains a nil optional is not equal to nil is pretty important. If it were equal to nil, then iterating over sequences would be a problem. In the following example, the last two entries wouldn’t be reached:

let a: [Int?] = [1, 2, nil, 3, 4]
// remember, for...in is short for the following:
var g = a.generate()
// while the result of next() isn't nil
while let i = g.next() {
    // i is now an optional, that will
    // contain nil on the 3rd iteration
}
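
As a quick check that every entry really is reached, here is a sketch of the same loop in current Swift syntax, where generate() is spelled makeIterator(). The counters are only there for illustration.

```swift
let a: [Int?] = [1, 2, nil, 3, 4]

var visited = 0
var nilElements = 0

// for...in written out longhand: next() returns an Int??, and the
// loop stops only when the outer optional is nil.
var g = a.makeIterator()
while let element = g.next() {
    // element is an Int?, and is nil on the 3rd iteration
    visited += 1
    if element == nil { nilElements += 1 }
}
```

All five entries are visited, one of them being the nil in the middle.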

Is this behaviour of ?? a bug? I dunno, probably not. You could prefer it to behave like the raw ternary operator, or fail to compile by somehow mandating the right-hand type really be what’s contained in the left-hand optional. But you could also say it’s behaving correctly, based on how the language works, and you might even need it to behave this way in some scenarios.

Either way, it’s a useful case study if you plan on implementing a generic function that takes optionals yourself.

¹ If you’re unclear on why the right-hand side is being declared with an autoclosure, read this post
² I would have loved to put some angle brackets in there, but WordPress had other ideas. It certainly knows how to re-capitalize its name when I get that wrong, though!

The case against making Array subscript return an optional

Posted on August 8, 2014 by airspeedvelocity

I got a lot of interesting feedback from my previous article, regarding the proposal to change array subscripts to return optionals. Some pro, some anti.

The pro came mostly from people who’d been burned by out of bounds exceptions often and wanted Swift to change in a way that would help them not crash.

The anti camp was obviously not pro crashing, but felt that using optionals was too strict – that it would end up being counterproductive because in fighting with the optionals, developers would likely introduce as many bugs as they eliminated. Their arguments are pretty convincing.

A question tied up in this is, what are optionals for? Should nil be rare like an exception, or commonplace? Should you avoid using optionals unless you absolutely have to, or use them often, to represent things like unknown information? For a good discussion on how they should be rare, see this article by @nomothetis.

There’s definitely a lot of optional mis-use out there, in sample code on the internet – optionals for delayed initialization of members (where lazy stored properties would be better), or for representing collections that might not have contents (rather than just returning an empty collection).

For guidance, we could look at the Swift standard library and how it uses optionals:

In sequences, optionals are routine and informational – nil means you’ve reached the end of the sequence.
In the case of String.toInt(), it’s more to return an error – you passed in a bad string. Using toInt() to detect whether a string is an integer feels iffy.
Array.first seems in-between – is getting back nil an error or information? If you were expecting your array to always have values, the former. If you’re writing something generic that needs to handle empty arrays, it’s the latter, a convenience for combining the check for empty and getting the value.
In find, nil means not found, because maybe the collection doesn’t contain that value. This isn’t an error. But there was an alternative – it could have returned Collection.endIndex, which is what the C++ find does. The optional forces you to check, whereas with endIndex you could forget/not bother.
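
To make the four cases concrete, here is a sketch in current Swift syntax. Note the spellings have changed since this was written: `Int(_:)` stands in for toInt(), `firstIndex(of:)` for find, and `makeIterator()` for generate().

```swift
// Sequences: nil from next() marks the end, not an error.
var generator = [10, 20].makeIterator()
let firstValue = generator.next()
_ = generator.next()
let exhausted = generator.next()

// String to integer: nil reports bad input.
let parsed = Int("42")
let rejected = Int("forty-two")

// first: nil doubles as the check for an empty array.
let noInts: [Int] = []
let firstOfEmpty = noInts.first

// Searching: nil means "not found", and the result has to be
// unwrapped before it can be used as an index.
let letters = ["a", "b", "c"]
let found = letters.firstIndex(of: "b")
let missing = letters.firstIndex(of: "z")
```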

Array subscripts are closest to the last example. You’ve got an index, but maybe it’s out of bounds, so you should be encouraged to check before you use it. So it seems like a good fit.

The problem is when it becomes annoying without benefit. Take this code:

for idx in indices(a) {
    // this is dumb, of course it has a value!
    if let val = a[idx] {
        // do something
    }
}

Faced with the above case too often, developers would probably start to get unwrap fatigue. They inspect the code, see that there’s no way the index could not be valid (there’s no index arithmetic going on there, just use of an index that is guaranteed to be within bounds), and just force unwrap instead.

This is a slippery slope. Once you start doing that, you do it all the time. Not just with guaranteed-safe cases but others that aren’t, and one time you use a closed rather than half-open range and bang, run-time error. Only it’s not an array bounds error, it’s a force-unwrap nil error. So we’re back to where we started, only with less helpful debug info and a bunch of exclamation marks all over our code.

So a better solution is needed – one that stays out of the way when doing things that can’t go wrong, but helps with handling the cases that can, like at the edges of ranges or when performing arithmetic on indices.

At this point, I’m pinning my hopes on the second option in my original article – a kind of index type that could never not point to a value in the collection. But implementing this is tricky – especially if it might involve changing the index type protocols, which would break everything that builds on top of them.

In the meantime, if you feel your code would be better off with an optional array index function, you can of course extend Array to provide it. The useful swiftz library has one, safeIndex, that you could re-use.
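
If you would rather roll your own, a minimal sketch in current Swift syntax might look like this. The `checked:` label is a name I've made up for illustration, and this is not how swiftz implements safeIndex.

```swift
extension Array {
    // Hypothetical helper: returns nil instead of trapping
    // when the index is out of bounds.
    subscript(checked index: Int) -> Element? {
        return indices.contains(index) ? self[index] : nil
    }
}

let values = [10, 20, 30]
let inBounds = values[checked: 1]
let pastEnd = values[checked: 3]
let negative = values[checked: -1]
```

Because it sits alongside the regular subscript rather than replacing it, the loop-over-indices case stays free of unwrapping, and the optional version is there for the risky cases.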

Null Pointer Exceptions fixed, next up…

Posted on August 6, 2014 (updated December 22, 2014) by airspeedvelocity

Edit: there is a follow-up to this post, giving the case against this idea, which you should read after this one.

My not-statistically-proven assertion (after working in the LOB development mines for years) is that Null Pointer Exception is the #1 cause of crashes for apps written in memory-managed languages.¹

The hope is, by introducing optionals, that this kind of error gets pushed way down the leaderboard in Swift. Sure you can still force-unwrap a nil value. But you have to load the gun, cock it, and then point it at your foot. I love that the force-unwrap operator is an exclamation mark.

Who is the aspiring hopeful cause of crashes, eyeing that vacated top spot? I’m guessing the Array Out of Bounds exception.

This one is still very much at large in Swift. Int is the index for arrays, and there’s nothing stopping you blatting straight past the end, getting a runtime assertion or possibly even scribbling into memory depending on what you’re doing and how you’re compiling.

This is really just a step up from pointers.² There’s just too much leeway for unreformed C programmers to write the same crappy old code:

// But it worked when I tested it!  
// (with an odd-sized array)
func reverseInPlace<T>(inout a: Array<T>) {
    var start = 0
    var end = a.count - 1
    while start != end {
        swap(&a[start], &a[end])
        start++; end--
    }
}
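
For the record, the bug is in the loop condition: with an even-sized array, start and end step past each other without ever being equal, so the loop runs on until an index goes out of bounds. One possible fix, sketched in current Swift syntax (where inout has moved and ++ and -- are gone), is to compare with < instead:

```swift
func reverseInPlace<T>(_ a: inout [T]) {
    // Nothing to do for an empty array.
    guard !a.isEmpty else { return }
    var start = 0
    var end = a.count - 1
    // < rather than != : with an even count the two indices
    // cross without ever being equal.
    while start < end {
        a.swapAt(start, end)
        start += 1
        end -= 1
    }
}

var odd = [1, 2, 3]
reverseInPlace(&odd)
var even = [1, 2, 3, 4]
reverseInPlace(&even)
```

That works, but only because someone spotted the mistake. Nothing in the language pushed the author towards getting it right.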

It doesn't have to be this way. It's totally fixable, just like null pointers were. There are two good options, an easy one and a (slightly) harder one.

The easy one

Make Array.subscript return an optional. Only if your index is within bounds will it return a value, otherwise it'll return nil. This is like the approach Dictionary takes with lookups. If your dictionary doesn't contain a particular key, you get a nil.

“But that's because dictionaries are different” you say. No they aren't. Dictionary has a method to check if a key is present. You could call that first and then, if it is, get the value. But people don't want to do that, so they skip straight to the getting out the value part, and because that's a bit risky, it returns nil if the key isn't there. Likewise, Array has methods for checking if the array is empty or if an index is beyond the end. You don't have to bother checking for that, but if you mess up, boom!
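
Side by side, the difference looks like this. It's a small sketch in current Swift syntax, with made-up data:

```swift
let ages = ["alice": 30, "bob": 25]

// Dictionary: ask for the value directly, get nil if the key is absent.
let known = ages["alice"]
let unknown = ages["carol"]

// Array: the checks exist, but nothing forces you to make them first.
let names = ["alice", "bob"]
let index = 2
let checkedName: String? = names.indices.contains(index) ? names[index] : nil
```

Drop the bounds check on the last line and the code still compiles, it just crashes at run time.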

The most common case is probably getting the first element with a[0]. So common is this that beta 5 introduced Array.first. Which returns… an optional. If that doesn't convince you, I don't know what will.

I expect this change would elicit some moaning. Similar to the complaints about how you can't index randomly into the middle of a Swift string. But as with the string case, the question really is, how often do you need to do this anyway? Use for...in, use find, use first and last. Don't hack at your array like a weed.

Developers who hate it and want the old behaviour could just stick a ! after their subscript access and it'd be back to the way it was. Presumably these people are compiling -Ounchecked. Let's see whether their users find the snappy performance makes up for the random crashing.

The harder one

Stop using Int for indexing into random-access collections. Use a real index object.

Again, Dictionary already does this. To index over/into a dictionary you have to get a DictionaryIndex object, which has no public initializers so can only be fetched via methods on Dictionary.

If an index type is random-access (which Dictionary’s isn't), then its values can be compared, added together, and advanced by numeric amounts in O(1), just like integers can.

To allow arrays to return a non-optional value when subscripted with this index, you'd need to tweak the index object to have successor() etc return an optional, with nil for if it's gone beyond the collection's bounds. This way, the index can be guaranteed to point to a valid entry.³

Again, this would make them more of a pain to use, in exchange for safety. But the other downside to this approach is to implement this, indices would need to know about the specific collection they index. Which means they'd need a reference to it, which introduces other complications like reference cycles and an increased size. Not a deal-breaker though.
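
To give a rough idea of the shape this could take, here is a sketch in current Swift syntax. Everything in it is hypothetical: `BoundedIndex`, `firstBoundedIndex` and the `bounded:` subscript are names invented for illustration, and the index remembers only the bounds of its array rather than holding a reference to it, which sidesteps the reference-cycle question entirely.

```swift
// Hypothetical index type, not a standard library API. It can only
// be obtained from an array, and it remembers that array's bounds.
struct BoundedIndex {
    fileprivate let position: Int
    fileprivate let bounds: Range<Int>

    // Returns nil instead of stepping past the last element.
    func successor() -> BoundedIndex? {
        let next = position + 1
        return bounds.contains(next)
            ? BoundedIndex(position: next, bounds: bounds)
            : nil
    }
}

extension Array {
    // nil for an empty array, since there is no element to point at.
    var firstBoundedIndex: BoundedIndex? {
        return isEmpty ? nil : BoundedIndex(position: startIndex, bounds: indices)
    }

    // Non-optional result: a BoundedIndex always points at an element,
    // as long as the array has not changed since the index was made.
    subscript(bounded index: BoundedIndex) -> Element {
        return self[index.position]
    }
}

let items = ["a", "b", "c"]
var collected: [String] = []
var idx = items.firstBoundedIndex
while let current = idx {
    collected.append(items[bounded: current])
    idx = current.successor()
}
```

The optional has moved from the subscript to the act of advancing, which is where the risk actually is.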

Indices knowing about their container would have the side benefit of allowing them to become dereferenceable (see a previous post for more on this).

It would also allow them to guard against the following, which compiles and runs but is presumably a bad idea:⁴

let d1 = [1:1, 2:2, 3:3]
let d2 = [7:7, 2:2, 1:1, 4:4]

// get an index from d1
let idx = d1.startIndex.successor()
// and use it on d2...
d2[idx]  // returns (2,2)

No need to pick one

These two approaches are not mutually exclusive. Array could provide both an Int subscript and an index one – the former giving back an optional, the latter not.

I like that idea. The index option has downsides though. There's also a compromise, where the Int version is checked, but the index object version is unchecked, on the basis that if you're using indices you're thinking a bit more carefully about what you're doing.

But here's hoping at least the first option makes it into a subsequent beta before the current implementation's set in stone.

If there's a reason I'm missing that means these schemes are hopelessly naive, let me know at @airspeedswift on twitter.

¹ In non-managed code, the number one cause of crashes is crap, I still haven’t found the problem, where did my whole day go?
² Obviously that’s a long step. You can’t do in Swift my favourite silly thing to do in C: char s[] = "hello"; int i = 3; cout<<i[s]<<endl;
³ All this is assuming the indices are into collections that aren't changing underneath them. Indexing mutating collections is a whole different ball-game.
⁴ Maybe it's ok? If the index is to a key common to both, it works. If it's an index to a key not in d2, you get an invalid index assertion. Still probably a bad idea.

Changes in the Swift Standard Library in Beta 5

Posted on August 4, 2014 by airspeedvelocity

Hey, happy Monday, none of your Swift code will compile!

Nothing a quick find and replace can’t cure. “Several” protocols have been renamed with the -Type suffix to avoid confusion, say the release notes. “A crapload” may have been a more accurate term but that probably wouldn’t pass Apple QA. Mostly this is a lot clearer – especially with IntegerType, as there were lots of confused beginners wondering why they couldn’t declare an Integer variable.

Unmentioned, but in a similar vein in the opposite direction, various typealiases within protocols have had their Type suffix removed. IndexType, GeneratorType, KeyType, ValueType and SliceType are now just Index, Generator, Key, Value and SubSlice.

This I guess makes up for the extra verbosity of now having to type CollectionType that bugs me a bit, though it will give me a reason to just write “collection” in future articles without always feeling guilty that I haven’t capitalized and monospace-fonted it.

A quick list of small items:

All the operators declared at the top now have visible precedence levels and associativity, which is handy if you are trying to target the precedence of your own operators.
Lots of new operators for strides and intervals.
The weird use of [T], [K:V] and T? when extending Array, Dictionary and Optional is gone, as has ArrayType.
Character is now hashable.
The trend of replacing proxies of C-style things with pointer objects, that started last beta with replacing CString, continues with UnsafeArray and UnsafeMutableArray becoming UnsafeBufferPointer and UnsafeMutableBufferPointer.
The law finally caught up with Array and made it mark its various mutating methods as mutating. Same for ContiguousArray and Slice.
Various integer types have had the getArrayBoundValue function changed to an arrayBoundValue get property.
String now has inits that take integers. You can also supply a radix, as well as whether you want digits above 9 to be in upper or lower case.
In addition to Array acquiring first and last, the lazy collections also implement first (and last if they can index bidirectionally), as well as isEmpty.
There’s also a non-member first and isEmpty that takes a collection, and a last that takes a collection if it has a bidirectional iterator.
Optional now has a hasValue property instead of a getLogicValue() function.
String.compare is gone.
UInt and family’s asSigned() has gone. Instead, use numericCast which appears to have been revamped a bit to do more at compile time.
UInt now has a bitwise initializer from its signed equivalent, which you need to explicitly call with a bitPattern: argument.
Relatedly, reinterpretCast has been renamed unsafeBitCast. You have been warned.
New prefix and suffix functions take a Sliceable and return the start or end of it as a slice.
You can no longer access the underlying sequences of a Zip2 sequence.
There’s now a version of assert that takes a BooleanType expression, not just a Bool.
The comment descriptions for sort now have a helpful link to wikipedia about strict weak ordering!
The function passed to withExtendedLifetime actually takes x as an argument now.
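
A few of the items above are easier to see in code. This sketch uses current Swift syntax, so some details differ from Beta 5: prefix and suffix, for instance, are now methods rather than free functions.

```swift
// Bitwise reinterpretation of a signed value has to be requested explicitly.
let allBits = UInt8(bitPattern: Int8(-1))

// numericCast converts between integer types, trapping if the value doesn't fit.
let small: UInt8 = 200
let widened: Int = numericCast(small)

// Integer-to-string initializer with a radix and a choice of letter case.
let upperHex = String(255, radix: 16, uppercase: true)
let lowerHex = String(255, radix: 16)

// prefix and suffix return the start or end of a collection as a slice.
let digits = [1, 2, 3, 4, 5]
let firstTwo = Array(digits.prefix(2))
let lastTwo = Array(digits.suffix(2))
```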

Poor old Range has been thoroughly demoted. One minute it’s on top of the world, then last time stride takes its spot for non-contiguous increments. Now ClosedInterval and HalfOpenInterval replace it for general purpose ranges. They also follow the pattern adopted by StrideTo and StrideThrough of having completely different types for the two kinds, determined at compile time. They require their initializing values to conform to Comparable, which means they can detect if they are inverted and can efficiently detect overlap between two intervals.
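
Here is a sketch of the two-types-per-kind pattern, in current Swift syntax. One caveat: the interval types have since been folded back into ClosedRange and Range, so that is what the ... and ..< operators produce below, but the Comparable requirement and the overlap check are the same idea.

```swift
// Stride types: the end point is excluded or included depending
// on which function you call.
let upTo = Array(stride(from: 0, to: 10, by: 5))
let through = Array(stride(from: 0, through: 10, by: 5))

// Intervals only need Comparable bounds, so they work for
// non-index types such as Double or String.
let closed = 1.0...5.0
let halfOpen = 4.0..<10.0
let word = "a"..."f"

let containsEnd = closed.contains(5.0)
let excludesEnd = halfOpen.contains(10.0)
let inWord = word.contains("c")
let overlap = closed.overlaps(5.0...6.0)
```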

Range is now relegated entirely to managing collection indices. The odd thing about this is that Int, probably the most common thing to use to initialize a range, is also an index type, so Range still gets to come out and play all the time.

By the way, I love how the definitions of StrideTo and StrideThrough have a FIXME in their comments about how they ought to be collections.

Finally, ever wonder why when you declared your class to be a Collection, you got chewed out by the compiler for not implementing _Collection? Well out of the shadows emerges the (newly renamed) _CollectionType. Here you will find the missing bits of what made collections work – startIndex, IndexType etc. Same goes for _Sequence, _IntegerArithmeticType etc. It’s not clear why these aren’t just pulled up into their non-underscore equivalents, but at least now you can go look at them instead of having to reverse-engineer what you needed to implement.

Swift is like Visual Basic

Posted on August 4, 2014 by airspeedvelocity

I’m trolling a little with the title. But it is in a way. Let me explain myself.

My first job out of college at the end of the 90s was writing Visual Basic GUIs, along with C++ DLLs underneath them. That was quite a culture shock. I went from coding in Scheme and Haskell (or C and assembly, depending on the class I was taking), to writing basic. Basic! The kiddy language! The C++/MFC developers down the hall called us “paint monkeys”.

Except after a few months, under the patient tutelage of a couple of experienced devs who took my immature scorn for their language in their stride, I realized I actually enjoyed it. More importantly, I was productive in it. It was so much quicker to express business logic in VB than C++. I started writing my DLLs in VB as well, except when the problem demanded something more powerful, and when it did, it was the power of the STL it demanded rather than the C++ language. When I did have to go back and forth with C++, it was easy to switch – my VB and C++ interoperated so easily that I could pick my language on a class-by-class basis.

The interop was smooth for a few reasons. First, VB was compiled to binary executables,¹ rather than bytecode to run on a VM. You could call C functions directly from your VB code, passing in the addresses of raw VB Integers or Arrays or Strings. This meant most of the Win32 API was directly available without any extra wrappers or native interface hoops to jump through.

VB was also reference counted rather than garbage collected. COM objects could be instantiated in C++ or VB and passed directly back and forth between the two languages. Most VB devs didn’t know about the ref counting, and it rarely mattered, but understanding the need to break reference cycles was our stock screening interview question.

Any of this sound familiar?

Swift’s easy interop with its platform’s older established language makes trying it out a fairly painless experience (or it should once the beta ends and the language and compiler stabilize). My feeling is, once interested Objective-C developers try writing real code in it, and get accustomed to type inference and cleaner syntax and see some of the other benefits it brings like generics and functional paradigms, it’ll be a one-way trip. But without the downside of having to port old codebases.

The similarities with VB are only superficial. Just enough to give me a vague feeling of déjà vu. Swift is way more powerful, picking the best of the innovations in programming languages from the last 20 years. It does this while making the language as accessible as VB was. Whatever the academic superiority of Haskell or Clojure or F#, a new generation of developers is not going to adopt any of them as their language of choice.

VB was still Basic of course. Those Begin, End, and Dim keywords are hard to live down. When Java took the Enterprise development world by storm, Microsoft did their knee-jerk reaction thing and shipped a garbage-collected Java/JVM clone.² As a sop to the existing developers, they gave us Visual Basic.NET which was just a stupid cousin to C#.

Those that adopted it plunged themselves into a world of interop pain as they tried to make their CLR code work with non-CLR C++ (or even worse, tried to convert that to “managed” C++). The MFC developers that laughed at us paint monkeys stuck with C++, and plenty are still using it today. Microsoft can’t make up its mind what to back, continuing to make a hash of things with Silverlight or their weird HTML5 proposals for Windows 8 app development. I’m certain Apple won’t make the same mistake. In 5 years’ time, Swift will be the language you write apps in.

There are still some die-hard VB6 developers out there, god bless them, and Microsoft’s obsession with backwards compatibility means they can keep using the version of Visual Studio that was released in ’98. I gave up on Microsoft in disgust and switched to C++ on Linux. Part of why I’ve been enjoying writing Swift is because it reminds me of the old times. Now if only I could figure out how to be in my 20s again, I’d be set.

Assuming you chose the compiled rather than the interpreted route. VB code could also be interpreted in a script mode. All our batch files were written in VB. ↩
Yes, I know C# has been way more innovative than Java in recent years. But at the beginning it was just a pale imitation. ↩


Airspeed Velocity

African or European Swift?

Month: August 2014